Analytics 3.0: After predictive analytics, now prescriptive
Fig. 1: The development of analytics according to Davenport
Analytics 3.0 is regarded as the new era in big data. The concept was developed in the USA by Prof. Thomas Davenport. In addition to "What will happen?", the question of predictive analytics, prescriptive analytics also asks "Why will it happen?". These "what" and "why" questions lead to new relationships and to insights into the chain of cause and effect.
Besides conventional BI technology, Analytics 3.0 also incorporates new big data technologies that allow large quantities of live data to be streamed in a variety of data formats. The data analysis runs on distributed infrastructures and on in-memory technologies, using machine learning and data mining algorithms. In contrast to conventional data warehousing, there are no restrictions on data formats, and data modeling is dramatically simplified.
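The streaming idea can be illustrated with a minimal sketch: instead of loading a whole data set into memory, values are aggregated one at a time as they arrive. The simulated sensor feed below is an assumption made for the demo; a real system would read from a message broker or a machine interface.

```python
# Sketch of stream processing: aggregate values as they arrive instead of
# materializing the full data set in memory.
def sensor_feed(n):
    """Simulated live stream; a stand-in for a real data source."""
    for i in range(n):
        yield i % 100  # fake measurement value

def running_average(stream):
    """Consume the stream once, keeping only a running total and count."""
    total = count = 0
    for value in stream:
        total += value
        count += 1
    return total / count

avg = running_average(sensor_feed(10_000))
```

The generator never holds more than one value at a time, which is the essential property that lets the same pattern scale to data that does not fit in RAM.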
Algorithms lie at the heart of Analytics 3.0. Machine learning algorithms automatically extract information from data. This happens without a human-machine interface: to make it possible, the algorithms are first trained on smaller amounts of data and models are built from them. The algorithms are also partly self-learning: over time, the models improve and so do their predictions.
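The training idea described above can be sketched as follows, assuming a deliberately simple linear model and an invented noisy process; the point is only that the same algorithm's predictions improve as more data flows in.

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

def train(n_samples, lr=0.01):
    """Fit y ~ w * x by online gradient descent, one sample at a time."""
    w = 0.0
    for _ in range(n_samples):
        x = random.uniform(0.0, 1.0)
        y = 3.0 * x + random.gauss(0.0, 0.1)  # hidden process being learned
        w += lr * (y - w * x) * x             # one learning step per sample
    return w

error_small = abs(train(100) - 3.0)     # model built from little data
error_large = abs(train(10_000) - 3.0)  # same algorithm, much more data
```

With only 100 samples the estimated weight is still far from the true value of 3.0; with 10,000 samples the error shrinks considerably, mirroring the "self-learning" behavior described in the text.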
Data mining also has an important place within Analytics 3.0. Here, however, a person is always involved in the discovery or prediction process. The focus is typically on solving a concrete, complex problem. One might, for example, use pattern recognition to gain a better understanding of a complex situation with several unknown influencing factors. Data mining draws on many machine learning algorithms, and vice versa.
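As a hedged illustration of such pattern discovery, the sketch below separates invented sensor-like readings into two previously unknown groups with a tiny k-means, one of the classic data mining algorithms. The values and the choice of two clusters are assumptions made for the example.

```python
def two_means(values, iters=20):
    """Cluster 1-D values into two groups with plain Lloyd iterations."""
    centers = [min(values), max(values)]  # simple initial guesses
    for _ in range(iters):
        groups = ([], [])
        for v in values:
            # assign each value to its nearest center
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # move each center to the mean of its assigned values
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return sorted(centers)

readings = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.8, 5.1]
centers = two_means(readings)
```

Without being told that two regimes exist, the algorithm recovers the two underlying groups (around 1.05 and 5.03), the kind of structure a data miner would then interpret.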
Analytics 3.0 is a combination of technology and mathematics. It is reality and future at the same time: Analytics 3.0 has been in use for many years, and many universities and companies are researching it intensively.
Today, the number of usable algorithms is already very large, and a strong transformation process is underway. Every day, new algorithms are added and existing ones are improved. Most of these algorithms are public and available through open source packages such as R, Mahout and Weka. Other algorithms are encapsulated in commercial products and are therefore proprietary.
Additional software is required for the algorithms to work optimally with big data technology (distribution across CPU and RAM). Here too, there are open source options as well as commercial products, both under continuous development.
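The distribution idea can be sketched with the classic split-apply-combine (map-reduce) pattern that underlies distributed analytics engines. In this minimal version the partitions are handed to a thread pool from the Python standard library; real big data frameworks distribute them across processes and machines, and the sum-of-squares task is a placeholder for a heavier workload.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum_of_squares(chunk):
    """The 'map' step, run independently on each data partition."""
    return sum(x * x for x in chunk)

data = list(range(100_000))
partitions = [data[i::4] for i in range(4)]  # split the data across 4 workers

with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(partial_sum_of_squares, partitions))

total = sum(partials)  # the 'reduce' step combines the partial results
```

Because each partition is processed independently, the same pattern scales out simply by adding more workers, which is exactly what the distribution software mentioned above manages for CPU and RAM.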
One thing is clear: The possibilities of predictive and prescriptive analytics are far from being exhausted.
Industry 4.0: The computerization of industry
Pursued as a German/European project since 2011, Industry 4.0 encompasses the computerization of manufacturing technology. The automation technology required for Industry 4.0 is meant to become smarter through the introduction of procedures for self-optimization, self-configuration, self-diagnosis and cognition, and to support people better in their increasingly complex work. The result is an intelligent factory (Smart Factory) that is adaptable and resource-efficient and integrates optimally into a company's business processes.
The ideas of Industry 4.0 are omnipresent in Swiss industry and are not limited to manufacturing technology. The degree of maturity of Swiss industry with regard to Industry 4.0 varies hugely. A few pioneers are already running remote maintenance systems in which machines installed anywhere in the world feed data back to the manufacturer, which in turn, for example, triggers manufacturing processes. Some operators have networked their facilities at various locations, or even entire factories, so centrally that the data can be evaluated jointly. But these implementations are only the first steps into a world of Industry 4.0, because the "self", that is, the logical and physical networking of machines, is growing only very selectively.
An important element of Industry 4.0 is the development of sensor technology itself. For example, the sensors used in machine vision, the field of image acquisition, including the infrared and X-ray wavelength ranges, open up new possibilities for online quality measurement while also being very data intensive. Spectroscopy is increasingly integrated directly into the processes and delivers large amounts of data. Ever more modern sensor technology produces ever larger data streams that must be handled.
From our perspective, analysis still receives too little emphasis within Industry 4.0. Analytics 3.0 and Industry 4.0 remain largely separate worlds. Why? Both worlds are complex and only partially controllable. The potential of their intersection is great, yet precisely the skills needed to unite the two worlds are missing today.
Analytics Industry 4.0: A new star is born
Fig. 3: Towards Analytics Industry 4.0 with the Cloud
If we now consider the intersection of the two worlds, we may call it Analytics Industry 4.0. The Cloud will be central to bringing these worlds together, because it ultimately comes down to networking data in a central location. Analytics Industry 4.0 is a branch of Industry 4.0 that emphasizes the analytical part of this fourth industrial revolution. What is the purpose of such a branch? To answer this, let us return to the definition of Industry 4.0 and consider the importance of analysis:
- Self-optimization: The self-optimization of the manufacturing process is, in addition to the physical operation of the machine, a mathematical optimization process based on data. It relies on precisely the algorithms described under Analytics 3.0.
Self-optimization has two aspects: on the one hand, the optimization of the production process itself; on the other, the manufactured product is also in focus. Self-optimization of a manufactured product can be described as automated quality optimization. It requires automated quality measurements, which generate huge amounts of data that must be processed. Large, high-performance analytical infrastructures are therefore necessary so that this data can flow back into the production process in time.
- Self-diagnosis: The purpose of self-diagnosis is to detect possible machine breakdowns in advance. This extends far beyond simple notifications. Self-diagnosis is only possible through the combination of measurement data, their algorithmic processing, and the feeding of the derived information back into the production process for further physical processing.
- Cognition: Cognition is the totality of the mental activities associated with thinking, knowing, remembering and communicating. Just as the human brain needs one, industry also needs a data pool as a basis for generating knowledge. This basis is the (Cloud) infrastructure of Analytics 3.0.
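The data-driven optimization behind the self-optimization point above can be sketched as a simple search loop. The quadratic "defect rate" model and the parameter range below are invented stand-ins for a quality measurement fed back from production.

```python
def defect_rate(setting):
    """Stand-in for a quality measurement returned by the process."""
    return (setting - 7.5) ** 2 + 0.2  # unknown optimum at setting = 7.5

def self_optimize(lo, hi, steps=60):
    """Ternary search over one machine parameter: repeatedly narrow the
    interval toward the setting with the lowest measured defect rate."""
    for _ in range(steps):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if defect_rate(m1) < defect_rate(m2):
            hi = m2  # the optimum lies left of m2
        else:
            lo = m1  # the optimum lies right of m1
    return (lo + hi) / 2

best = self_optimize(0.0, 10.0)
```

The loop needs nothing but measured data, which is exactly why self-optimization depends on the analytical infrastructure described earlier: each iteration assumes a fresh quality measurement can be obtained and processed quickly.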
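Likewise, the self-diagnosis point can be illustrated as anomaly detection on a measurement stream: flag any reading that deviates strongly from the recent baseline. The vibration values, the window size and the threshold are assumptions made for the demo.

```python
import statistics

def detect_anomalies(stream, window=5, threshold=3.0):
    """Return indices of readings that deviate more than `threshold`
    standard deviations from the mean of the preceding `window` readings."""
    flagged = []
    for i in range(window, len(stream)):
        recent = stream[i - window:i]
        mu = statistics.mean(recent)
        sigma = statistics.stdev(recent)
        if sigma > 0 and abs(stream[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged

# invented vibration readings with one sudden outlier
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 1.0, 0.95, 4.0, 1.0, 1.1]
alarms = detect_anomalies(vibration)
```

A real self-diagnosis system would combine many such signals with learned models, but the principle is the same: measurement data, algorithmic processing, and an alarm fed back into the process before the breakdown occurs.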
The aim is therefore to align Analytics 3.0 with the ongoing fourth industrial revolution. This affects not only manufacturing technology, but also storage technology, process engineering, air-conditioning technology and energy technology. Both the data infrastructure and the algorithms are industry-specific tools that still have to be developed. In our view, open source will play a crucial role here. We believe that the goal of Industry 4.0 will be reached fastest through existing and new open source projects in the analytics and big data area. Open source initiatives will generate new products for Analytics Industry 4.0.