BigInsights is IBM's analytics platform, based on the industry-standard Apache Hadoop and Apache Spark, which combines open-source software with enterprise-grade capabilities. It is built on 100% open-source components, grouped under the IBM Open Platform component, on top of which IBM has developed add-ons for analysts, data scientists and administrators (hence the other three major components of the platform: the IBM BigInsights Data Scientist module, the IBM BigInsights Analyst module and the IBM Enterprise Management module).

IBM Open Platform for Apache Hadoop is the foundation of the whole analytics platform and provides a flexible solution for processing large volumes of data. It includes Apache Hadoop and many other popular open-source projects from the Hadoop ecosystem, and it supports a large variety of data and popular APIs.

Using Hadoop, it enables applications to work with thousands of nodes (commodity machines, each with its own CPU and disks) and petabytes of data in a highly parallel and cost-effective manner. Through Hadoop, nodes can be combined into clusters, and new nodes can be added as needed without changing the data formats, how data is loaded or how jobs are written.
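The idea that a job stays the same while the cluster grows can be sketched in plain Python. This is only an illustration under stated assumptions, not Hadoop code: threads stand in for nodes, and the names `partition`, `job` and `run_on_cluster` are hypothetical helpers invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def partition(records: Sequence[int], num_nodes: int) -> List[List[int]]:
    """Split the records round-robin into one chunk per simulated node."""
    return [list(records[i::num_nodes]) for i in range(num_nodes)]


def job(chunk: List[int]) -> int:
    """The per-node work: here, a sum of squares over the local chunk."""
    return sum(x * x for x in chunk)


def run_on_cluster(records: Sequence[int], num_nodes: int) -> int:
    """Run the same job on every chunk in parallel and combine the partial results."""
    chunks = partition(records, num_nodes)
    # Assumption: a thread pool stands in for a set of cluster nodes.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(job, chunks))
    return sum(partials)


if __name__ == "__main__":
    data = list(range(1, 101))
    # The job definition is untouched; only the number of simulated nodes changes.
    print(run_on_cluster(data, 2))
    print(run_on_cluster(data, 5))
```

Both calls produce the same total, because neither the data format nor the `job` function depends on how many workers the data is spread across.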

For the development of large-scale data processing jobs, the platform includes Apache Spark, which runs on top of HDFS (but can also run outside a Hadoop environment). Spark provides in-memory processing and distributed computing, and for many workloads it is much faster than the standard MapReduce mechanism implemented in Hadoop. It also offers a great deal of flexibility by supporting a wide range of workloads and providing built-in libraries and APIs for Scala, Java and Python.
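For readers unfamiliar with the MapReduce model mentioned above, a minimal word-count sketch in plain Python shows its three phases. It uses no Hadoop or Spark API and runs in a single process; the function names `map_phase`, `shuffle_phase`, `reduce_phase` and `word_count` are assumptions made for this illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(split: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(splits: Iterable[str]) -> Dict[str, int]:
    """Chain the three phases over all input splits."""
    mapped = [pair for split in splits for pair in map_phase(split)]
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    splits = ["big data on hadoop", "spark on hadoop", "big clusters"]
    print(word_count(splits))
```

In classic Hadoop MapReduce, intermediate results between such phases are written to disk, whereas Spark can keep them in memory across steps, which is a large part of the speed difference.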

On top of the Open Platform, IBM has built a series of add-ons for different types of users, of which the most important are:

The full list of features is longer than that. IBM BigInsights is a complete analytics platform that gets all the benefits of the open-source, industry-standard Hadoop project and builds on them through a set of add-ons that greatly improve the analytics process and the quality of the results.