What makes a data scientist?
The big data phenomenon trained a bright spotlight on those who perform deep information analysis and can combine quantitative and statistical modeling expertise with business acumen and a talent for finding hidden patterns. Here's a closer look.
Data scientists rely on analytics, predictive models, statistical analysis and modeling, data mining, sentiment and what-if analysis, and more to do their jobs. Cleansing raw data and building models are detailed work, and the right tools make the process much more efficient.
The IBM® BigInsights™ Data Scientist module accelerates data science with advanced analytics to extract valuable insights from Hadoop. Stable machine learning algorithms are optimized for Hadoop. Text analytics extract insight from unstructured data with existing tooling so analytic applications don't have to be developed from scratch. Big R statistical analysis and distributed frames allow data scientists to use the entire Hadoop cluster, not just a limited sample.

Good data scientists select and address the business problems that have the most value to the organization. Armed with data and analytical results, they must present their informed conclusions and recommendations to technical and nontechnical stakeholders.
New in BigInsights 4.0: Apache Spark 1.2.1 for dramatic performance improvements that leverage in-memory distributed computation.
Data that isn't accurate, secure and available isn't useful. Data scientists must be able to quickly and efficiently sort, structure, categorize and present data from internal and external sources, and they need to know the data is trustworthy so they can be confident in their findings and recommendations.
The BigInsights Analyst module lets data scientists use their existing skills to find data across the organization and visualize it without extra coding. IBM BigSheets is a spreadsheet-style data manipulation and visualization tool that gives business users direct access to data through a recognizable interface. IBM-designed Big SQL offers HDFS caching and high availability benefits as well as query optimization, without forcing data scientists to learn a new skill set.
The IBM BigInsights Enterprise Management module helps ensure the scalability, performance and security of Hadoop clusters. For example, multi-tenant scheduling and multi-instance support enhance scalability and performance by allowing multiple installs of BigInsights on the same cluster with data isolation and resource sharing.