Why anomaly detection for predictive maintenance and where are the limits?

On 19.11.2019 we presented our project experience to the Predictive Maintenance Expert Group of the Swiss Alliance for Data Intensive Services. We would like to share some of this experience in this blog.

When we discuss predictive maintenance with our industrial customers, the main topic is often the inadequate data situation: “Our machines are so good that they hardly ever produce errors.” Customers then conclude prematurely that predictive maintenance cannot be implemented with reasonable time and effort. In this blog we want to explain how we deal with this topic.

Let’s take our Predictive Maintenance project in the distribution centres of Swiss Post as an example. In the various distribution centres, 8 sorter systems with approximately 700 wagons each are in continuous operation. Each of these wagons can have various defects, which of course happen very rarely. The next picture shows the possible defects:

Fig. 1: Maintenance topics on the chassis of a tilting tray trolley

The important conclusion to be drawn from this project is that machine learning can achieve great results, even when very few defective cases are available for training. To explain how we do it, here are some basics:

In the following figure you can see the principle of classification, which belongs to the field of supervised learning. The three pictures show that a reliable classification is not possible if too little data is available. In the picture on the far right, new measurements (marked “unknown”) are classified as “normal wagon” even if the wagon has a defect. A classification model gives incorrect results when its training basis is insufficient, that is, when the problematic cases seen in training are not representative of all the problematic cases we want to identify later. In such cases the solution becomes unusable.

Fig. 2: Incorrect classification, if the model has only a few cases of damage for training

The situation is different with anomaly detection, an ML modelling technique that assumes hardly any problematic cases during training and is thus considered unsupervised. During the training phase of anomaly detection, we essentially model the “normal” behavior of wagons, and we only need very few examples of defects (if available) to tune the model parameters. The model can then detect a new defect in the field very well, because it only has to detect a deviation from the normal state, which is a correspondingly easier task. See the second picture in the next diagram.

Fig. 3: Difference between classification and anomaly detection
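
To make the idea concrete, here is a minimal sketch of anomaly detection on a single measurement. It is not our production model: the bending values, the wagon names and the threshold of 4 standard deviations are assumptions chosen for illustration. The "training" uses healthy wagons only; a new wagon is flagged when it deviates too far from that normal state.

```python
from statistics import mean, stdev

def fit_normal_model(normal_values):
    """Learn the "normal" state from measurements of healthy wagons only."""
    return mean(normal_values), stdev(normal_values)

def anomaly_score(value, model):
    """Distance from the normal state, expressed in standard deviations."""
    mu, sigma = model
    return abs(value - mu) / sigma

# Hypothetical bending measurements (mm) of healthy wagons
normal_bending_mm = [0.02, -0.01, 0.00, 0.03, -0.02, 0.01, -0.03, 0.00]
model = fit_normal_model(normal_bending_mm)

# Assumed threshold; in practice it would be tuned with the few known defects
THRESHOLD = 4.0

# Hypothetical new measurements from the field
new_measurements_mm = {"wagon_17": 0.01, "wagon_42": 0.40}
flagged = [
    wagon
    for wagon, value in new_measurements_mm.items()
    if anomaly_score(value, model) > THRESHOLD
]
print(flagged)
```

Note that no defective wagon was needed to fit the model. A real system would use many features per wagon and a multivariate model, but the principle stays the same.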

But an anomaly detection model can do much more, as we see in the Swiss Post case. We use stationary laser triangulation to measure bending on different parts of the chassis. We measure various points and surfaces of the wagon in the submillimeter range as it passes the measuring device. The evaluation of these measurements takes only a few seconds, so we can upload the measurements of each wagon to our LeanBI Cloud online during operation. This makes it possible to detect changes over time and thus to make more than just statements about the current status. You can see at an early stage whether a problem could occur in the future, and this information is then passed on to the maintenance personnel. This is predictive maintenance.
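
One simple way to turn repeated measurements into a statement about the future is to fit a trend and extrapolate it. The following sketch assumes a linear drift and a hypothetical tolerance limit of 0.30 mm; the function names and all numbers are illustrative and not taken from the Swiss Post system.

```python
def linear_trend(times, values):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, v_mean - slope * t_mean

def days_until_limit(times, values, limit):
    """Extrapolate the trend to the limit.

    Returns the remaining time after the last measurement,
    or None if the value is not drifting towards the limit.
    """
    slope, intercept = linear_trend(times, values)
    if slope <= 0:
        return None
    t_cross = (limit - intercept) / slope
    return max(0.0, t_cross - times[-1])

# Hypothetical history of one measuring point on one wagon
days = [0, 7, 14, 21, 28]
deflection_mm = [0.10, 0.12, 0.14, 0.16, 0.18]

remaining = days_until_limit(days, deflection_mm, limit=0.30)
print(f"Limit reached in about {remaining:.0f} days")
```

A linear trend is the simplest possible assumption. Real wear often accelerates, so such an estimate should be read as an early warning rather than as an exact date.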

The conventional opinion that anomaly detection only detects that there is an anomaly, but not what the individual problem is, is not always correct. With appropriate feature extraction, i.e. the selection of the measuring points, we can pass on precise information about the anomaly. For example: the chassis coordinate (x, y, z) has shifted by 0.4 mm in the y direction. In this case this is already a very accurate localization of the problem.
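
As a sketch of how such a localization can fall out of the features themselves, the following example compares a measured point with its reference position axis by axis. The coordinates, the tolerance of 0.1 mm and the function name are assumptions for illustration.

```python
def localize_anomaly(reference, measured, tolerance_mm):
    """Return (axis, shift_mm) for the axis with the largest shift,
    or None if every axis is within tolerance."""
    shifts = {axis: measured[axis] - reference[axis] for axis in reference}
    worst_axis = max(shifts, key=lambda axis: abs(shifts[axis]))
    if abs(shifts[worst_axis]) <= tolerance_mm:
        return None
    return worst_axis, round(shifts[worst_axis], 3)

# Hypothetical reference and measured position of one chassis point (mm)
reference = {"x": 120.0, "y": 45.0, "z": 10.0}
measured = {"x": 120.02, "y": 45.4, "z": 9.99}

print(localize_anomaly(reference, measured, tolerance_mm=0.1))
```

Because each feature corresponds to a physical measuring point, the maintenance staff are told where to look and not only that something is wrong.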

But surely there must be a catch? Yes: trained on only a few damage patterns, the models are not yet very precise. But they are accurate enough that all damage cases fall within the 10% of wagons that our model flags, so no damage slips through unnoticed. It can happen that damage is reported where there is none (a “false positive”). But the maintenance staff are 10 times faster than before, because they only have to inspect 10% of the wagons. And we detect faults faster (online) and more reliably than manual inspection. For example, the test measurements revealed damage to a number of wagons that had been overlooked during periodic maintenance.
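
The trade-off described here can be sketched in a few lines: the threshold is set so that every known defect is still flagged, and false positives are accepted in return. All scores and wagon names below are made up for illustration.

```python
def recall_preserving_threshold(scores, known_defects):
    """Highest threshold that still flags every known defective wagon."""
    return min(scores[wagon] for wagon in known_defects)

def inspection_list(scores, threshold):
    """Wagons whose anomaly score reaches the threshold."""
    return sorted(wagon for wagon, score in scores.items() if score >= threshold)

# Hypothetical anomaly scores for 30 wagons (higher = more suspicious)
scores = {f"wagon_{i}": 0.1 + 0.01 * i for i in range(30)}
scores["wagon_3"] = 0.9   # known defect
scores["wagon_11"] = 0.7  # known defect
scores["wagon_5"] = 0.8   # healthy, but scores high: a false positive

known_defects = ["wagon_3", "wagon_11"]
threshold = recall_preserving_threshold(scores, known_defects)
to_inspect = inspection_list(scores, threshold)

share = len(to_inspect) / len(scores)
speedup = len(scores) / len(to_inspect)
print(to_inspect, f"{share:.0%} of wagons, {speedup:.0f}x fewer inspections")
```

A threshold tuned this way only guarantees that the defects known so far are caught. Whether new defect types are also flagged has to be checked against field data, as we did with the test measurements.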

With this predictive maintenance solution, we prevent unplanned downtime, which is the main benefit of such a system. After all, Swiss Post’s customers want their parcels to arrive on time for Christmas. Unplanned downtime can also be very expensive if a wagon is severely damaged. At the same time, we reduce the workload of the maintenance staff and draw their attention to the specific problematic wagons. This also helps when less experienced personnel are on duty.

All in all, the business case pays off for Swiss Post.

In the area of predictive maintenance, there are problem areas that are much more complex than this case study. Can anomaly detection fail there? In such cases, anomaly detection alone is at least not sufficient. In more complex subject areas, the aim is to feed the know-how of the specialists into the system continuously and more strongly. The model thus becomes “self-learning”. We are working on this topic in other projects and will report on it in upcoming blogs.

What AI can really do today

“People underestimate the capability of AI. They think it is a smart human. It will be much more than that.” Elon Musk, 29.8.2019

In this blog we would like to discuss the limits of artificial intelligence today. We will cover a wide range of topics from definition and history of artificial intelligence to today’s practical limits of Artificial Intelligence.

AI Definition

Artificial Intelligence (AI) deals with the automation of intelligent behaviour and is closely related to the term Machine Learning (ML). The goal of AI is to emulate or at least simulate intelligent behaviour with algorithms.

AI is particularly relevant today in its areas of knowledge-based systems (expert systems), pattern recognition and prediction, and robotics.
Data science is also strongly linked to AI. Data science refers to the extraction of knowledge from data, using methods, processes and algorithms such as statistical and machine learning methods. Deep Learning is a branch of Machine Learning.

AI History

AI developed out of many theoretical considerations from the 1920s onwards and took concrete shape for the first time at the beginning of the 1950s. From then on, AI and its sub-areas underwent several hype cycles. As early as the 1950s there were almost limitless expectations of computers becoming intelligent, e.g. becoming world chess champion. But the IBM computer Deep Blue defeated the reigning world chess champion only much later, in 1997. Expert systems have also existed since the beginning of the 1970s; even then they supported physicians’ diagnostic decisions, but only in very narrow sub-areas of medicine. The problem at that time was that the limits of their competence were blurred, which sometimes led to false diagnoses. Fuzzy logic was another achievement that has established itself particularly in various machine controls, but did not achieve a broad breakthrough. Every AI hype has therefore characteristically been followed by a so-called AI winter.

Today’s AI hype was triggered in the public eye by two events: IBM Watson, which defeated the Jeopardy! champions in 2011, and AlphaGo, which defeated top professional Lee Sedol in 2016 and the world’s top-ranked Go player in 2017. The latter was thanks to Deep Learning, which is based on artificial neural networks inspired by the networks of the human brain. Deep Learning is also the basis of many developments in the field of image recognition, speech and text analysis.

Fig.1 Neural network with several layers (Lit. Towards Data Science, Training Deep Learning Networks, Rawindra Parmar, 11.9.2018)

AI breakthroughs today

Networking and cloud services have led to an enormous increase in global data volumes. Big Data has become both a buzzword and a curse. A few years ago, Big Data helped AI to make a great leap forward (because large amounts of data are indispensable for many models of modern AI), but the Big Data hype was soon replaced by the AI hype. The topic of Big Data is still important, but nobody talks about it anymore. The same will happen to AI in a few years’ time. What is interesting, however, is that AI has had a strong avalanche effect. Many new courses of study have emerged, and governments continue to invest enormous amounts of money in research and development. In contrast to Big Data, AI became political, and today’s political opinions go so far that individual states believe AI could govern them and the world.

Much is exaggerated, but there have been some astonishing technological breakthroughs in recent years that suggest AI is omnipotent. And so many people, even AI experts, believe that artificial intelligence will overtake human intelligence in less than 20 years. Fortunately, history teaches us that this will take much longer, if it happens at all.

Notable breakthroughs of recent years include:

  • Language translation machines (DeepL, Skype, Google Translate, etc.) that are pretty good. Speech to text and vice versa also works very well on
    the basis of AI.
  • Personal assistants based on the Internet (Amazon Alexa, Google Assistant, Healthcare, etc.), which have become very popular especially in America.
  • Recommendation systems on Internet platforms that are increasingly becoming the norm.
  • Automated behavioural analysis such as sentiment analysis for marketing.
  • AI Security: AI helps us to fight crime in many cases. Fraud detection, cyber security and a better understanding of crime.
  • And a lot of applications based on image and video analysis, which is also the basis for autonomous driving. Here, for example, our project of defect detection
    (cracks, wet spots, etc.) in tunnels also falls into this category.
  • It is estimated that the predictive maintenance market will reach $11 billion by 2022 (Market Research Future Roland Berger, 2018), a very strong development over
    the next few years.

The intelligence of AI

Fig.2 shows where the intelligence of AI is compared to the animal world.
A comparison can best be made using Deep Learning, as the concept of the neural network is similar to that of the brain.

A human brain has about 100 billion neurons (nerve cells) and 100 trillion synapses. The comparison of AI neurons to organic neurons is somewhat vague, but today’s networks tend to be somewhere near a honeybee; some claim near a frog. Thus at least a factor of 10,000 still lies between AI and the intelligence of a human being.

Fig.2: The intelligence of Deep Learning in comparison to the animal world

The bee can master specific behaviour patterns very well, for example finding its way, perhaps even better than humans. The same applies to AI. AI can detect a specific tumour on an x-ray better than a doctor (even if not all doctors agree). But the same model cannot also detect a different disease, say in an image of the brain, unless it has been specifically trained to do so.

Unfortunately, it has been found that the accuracy of pattern recognition decreases as soon as a neural network tries to solve many problems at the same time. At least currently, the intelligence of deep learning still has a limit that may not be broken. A Deep Learning model will never have the intelligence of a human being; to move in that direction, completely different concepts are needed.

The missing intelligence of AI

  • AI doesn’t have what’s called common sense. AI can only draw conclusions in the specific field in which it has been trained. AI does not work beyond these limits.
  • Deep learning only works through probabilities. There is no direct wiring of input and output, so deep learning will never produce 100% accurate results.
  • Deep learning is only as good as the data used to train the network. This can often lead to bias, essentially big mistakes learned from the data. For example, a model trained with
    historical mortgage decision data can acquire a racial or gender bias, making it more reluctant to grant a mortgage to an African American woman.
  • Deep learning only works with a lot of training data. Furthermore, if the problem area changes over time, new data is needed to retrain the network.
  • Human intelligence functions through the exchange of information and the cross-domain application of knowledge. Deep learning cannot do this. Deep learning cannot be described as
    creative (even though a special type of networks – the GANs – can solve specific “creative” tasks given the right training data).

Where does AI (still) make little sense?

Here are just a few examples to warn against hasty euphoria:

AI for HR

A few years ago, AI applications came onto the market that support the application process for new employees. One system proposed to infer the suitability of an applicant from his or her facial expressions during the interview. What amused me personally was that the system was (at least initially) positively received by several companies.
The suitability for a job is usually very individual. In addition to the skills for the role advertised, the person must fit into a very specific team. It is hardly possible that a deep learning algorithm can provide a decisive opinion for such an individual decision-making process.

Deep learning for the stock market

Ask AI to predict stock prices: this is not possible for entire stock markets and certainly not for individual companies. The stock market is emotionally charged and reacts to economic and political influences, such as instabilities, that are unpredictable. In the past, AI has even destabilized stock exchanges. But the tendency is for stock exchanges to become less emotional with AI, so that the swings will be less dramatic.

AI as a doctor

Expert systems in healthcare will support the physician and become better and better over time, the more data these systems have (historical data and online patient data). In a few years, it will be impossible to imagine medical practice without expert systems. But it will be many years before the function of a doctor has to be reconsidered.

Autonomous driving

Autonomous driving, especially in cities, will unfortunately be delayed, even though many prototypes have been in use for years. The forecasts for its introduction are constantly being pushed back.

It is now assumed that the City Pilot will be introduced in 2030 (ADAC study, 2018), with a penetration rate of 10% to 20% in cities in 2050. That’s quite small, and I hope that this will in fact happen a little faster. Hope dies last.

Fig. 3: Stock penetration until 2050: total stock; ADAC studies by Prognos, Aug 2018

Elon Musk is a visionary. But he will make it to Mars sooner than AI becomes more intelligent than human beings. Many hopes, but (fortunately) also fears, are exaggerated beyond measure.

What worries maintenance managers. Fmpro Round Table of maintenance managers – Digital Retrofit

On June 26th 2019, LeanBI, in cooperation with the company OST, was invited to present at the fmpro Round Table on the topic of “Digital Retrofit of Systems”. In addition to our presentation with many use cases, the joint discussion at the Round Table was of great importance. We summarize the outcome and report on what worries the maintenance managers of large companies.

Fmpro is Switzerland’s largest association in the field of facility management and maintenance. Its aim is to support education, know-how transfer and networking in the industry. In this context, fmpro organizes round tables on the subject of maintenance. The maintenance managers of selected companies such as DSM, BASF, Comet, ABB, Migros, Bilfinger, Nestle, Axpo, Post CH and Altran were invited to the “Digital Retrofit of Plants” event to share their opinions.

Digital retrofitting is a major topic in the industry, and particularly predictive maintenance in this context. The market volume in Switzerland is estimated to reach $247 million within 5 years, five times higher than this year (see chart). A good enough reason for maintenance managers to keep dealing with this topic again and again.
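
To put these market figures into perspective, here is a short worked calculation. It reads “five times higher” as a factor of 5 and assumes constant annual growth; both are simplifying assumptions on our side, not figures from the market study.

```python
target_volume_musd = 247.0  # market volume in 5 years, from the text
growth_factor = 5.0         # assumed reading of "five times higher than this year"
years = 5

# Implied market volume today and implied constant annual growth rate
current_volume_musd = target_volume_musd / growth_factor
annual_growth = growth_factor ** (1 / years) - 1

print(f"Implied volume today: ${current_volume_musd:.1f} million")
print(f"Implied annual growth: {annual_growth:.1%}")
```

Under these assumptions the market would have to grow by well over a third every year.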

In their keynote speech, Yvan Jacquard, CEO of OST, and Marc Tesch, CEO of LeanBI, demonstrated how predictive maintenance is approached in the industry and underpinned this with 3 use cases. In summary, the following 3 points are crucial for a successful retrofit of logistics and production facilities:

  • Choose a broad IoT approach, but start small
  • If available, use “Out of the Box” solutions, in combination with AI and sensor technology
  • Staggered implementation: benefit/cost/feasibility oriented – few risks and better performance

Our central focus in this blog is the point of view of the maintenance managers, because these well-known and experienced people seem to be rather cautious towards predictive maintenance. So, why is that?

Some maintenance managers doubt that failures of machine components can be predicted well enough to determine the time of failure. Many companies have had negative experiences. For example, in one of the cases we presented, two identical systems were operated until failure, and the failures occurred after very different operating times. This implies that environmental influences that were not measured have a dominant effect on the components.

In another case, in process technology, many parameters were recorded for components such as pumps and heat exchangers. Based on physical principles, fouling (i.e. deposits on the heat exchanger walls) could thus be better understood. That was a success. For the pumps, too, broad measurements made it possible to introduce design improvements that reduce failures. However, despite the success of the data analysis, predictive maintenance was not tackled in either case.

Another important point for the maintenance specialists is that it is difficult to combine their experience with fully automatic AI and Machine Learning models. The world of production is too complex, and today’s predictive maintenance solutions are too simple to incorporate the experts’ logic.

To summarize the input of the maintenance managers: market research and our own experience stand in contrast to the experiences of many specialists. So who is right? – BOTH

Transferring AI models across different operating states and products, even on identical or nearly identical systems, cannot be easily automated even today. If, however, relevant data exist for normal operation in the operating states actually run, anomalies can be detected early on. If we know the course of events over time up to the failure, AI can forecast when a failure is likely to occur. The model can be tuned in such a way that a failure is prevented at all costs. We, like other companies, have proven this several times. But a residual risk remains. The trick is to make the residual risk visible and thus create transparency and trust in the AI models.

But how good is such an AI model at predicting failures? Of course, only as good as the data. If ambient humidity or temperature is needed to determine a failure but is not recorded, the model cannot forecast correctly. It is therefore up to us, in cooperation with the engineers and maintenance specialists, to determine the correct input data. Another difficulty is the fact that environmental influences are constantly changing. For example, input materials are never the same. The model must therefore be “calibrated” to the respective operating conditions. If we are talking about fluctuations in shorter cycles, we are convinced that real-time measurement in the plants can lead to an improvement.
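
One simple form of such a calibration is to keep a separate “normal” baseline per operating condition, so that a value is only compared with what is normal under the same conditions. The sketch below assumes the operating condition is known for every measurement; the condition names and temperature values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

def fit_baselines(records):
    """Fit one (mean, standard deviation) baseline per operating condition.

    records: iterable of (operating_condition, value) from normal operation.
    """
    grouped = defaultdict(list)
    for condition, value in records:
        grouped[condition].append(value)
    return {c: (mean(v), stdev(v)) for c, v in grouped.items()}

def calibrated_score(condition, value, baselines):
    """Deviation from normal under the same operating condition."""
    mu, sigma = baselines[condition]
    return abs(value - mu) / sigma

# Hypothetical bearing temperatures (°C) in normal operation
normal_records = [
    ("material_A", 60.0), ("material_A", 61.0), ("material_A", 59.0),
    ("material_B", 80.0), ("material_B", 82.0), ("material_B", 78.0),
]
baselines = fit_baselines(normal_records)

# The same reading is normal for one input material and alarming for the other
print(calibrated_score("material_B", 80.0, baselines))
print(calibrated_score("material_A", 80.0, baselines))
```

Without the split, the model would either raise constant alarms for material B or miss real problems with material A.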

Is predictive maintenance worthwhile at all? Does the business case pay off? Or is it not easier to fall back on experienced maintenance specialists? Our answer is twofold:

  1. Benefit: The goal of predictive maintenance is not to merely reduce maintenance costs. It is always a wider bundle of improvements that ultimately makes the difference. The top priority is the availability of the systems, the readiness to deliver and the quality of the products produced. But process and energy optimization are also important issues in digital retrofit projects. The maintenance departments in the companies are under great cost pressure. Predictive maintenance cannot be provided by these departments alone. Predictive maintenance is therefore always a management task and budgets must be available at a higher level.
  2. Automate Experience: The experience of the maintenance specialists cannot be so easily encapsulated in the AI models. But the mapping of rules based on the results of the models is not too complex in an IoT environment. If we go one step further, then we encapsulate this experience again with AI via various feedback loops. This is a development topic of LeanBI.
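
The second point, mapping rules onto the results of the models, can be sketched as an ordered list of expert rules that is evaluated on top of an anomaly score. The rules, field names and thresholds below are invented for illustration and do not describe an existing LeanBI product.

```python
# Each rule: (description, condition on the reading, resulting action).
# Rules are checked in order; the first matching rule wins.
RULES = [
    ("warm-up phase", lambda r: r["minutes_since_start"] < 10, "ignore"),
    ("very high score", lambda r: r["anomaly_score"] >= 0.9, "stop_and_inspect"),
    ("elevated score", lambda r: r["anomaly_score"] >= 0.6, "schedule_inspection"),
]

def decide(reading, rules=RULES, default="no_action"):
    """Return (action, rule description) for the first rule that matches."""
    for description, condition, action in rules:
        if condition(reading):
            return action, description
    return default, None

print(decide({"anomaly_score": 0.95, "minutes_since_start": 5}))
print(decide({"anomaly_score": 0.95, "minutes_since_start": 60}))
print(decide({"anomaly_score": 0.70, "minutes_since_start": 60}))
```

Keeping the rules as plain data means that maintenance specialists can review and change them without touching the model itself.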

The biggest problem for the maintenance departments in Switzerland is the “shortage of skilled workers”. A survey at the Round Table clearly showed this. One way of tackling this problem is to transfer the specialists’ experience to the systems, and automate it. And here we can help, despite healthy skepticism in the industry, to create a large added value in the area of maintenance.

LeanBI is going to ICLR 2019 (6-9 May, New Orleans)

Artificial Intelligence is arguably one of the fastest growing fields of technology in the last 10 years. While machine learning is often used as a black box in data science, this approach is limiting the range of problems the data scientist can solve. In our experience, at least half of the projects need more than that: we have to either customize known machine learning techniques, or devise completely custom solutions based on the math of machine learning. Therefore, for successful data science projects in the industry a very good overview of the latest advances of machine learning tools is needed.

LeanBI is happy to participate this year in the most important Deep Learning conference in the world, the International Conference on Learning Representations (ICLR).

ICLR is a relatively new conference, as is the field of deep learning. It is only in its 7th year, but it has gained a place among the most important machine learning conferences in the world. This year, more than 4000 participants will gather in New Orleans from 6 to 9 May.

One thing that is great about ICLR (besides being THE deep learning conference) is that it combines academic and industrial research. This is much needed in the ML community, as we often see the best professors leaving academia for industry to join the research labs of Google, Facebook and other machine-learning-fuelled companies.

While the occasion for us is to present some of the academic work from the PhD of our colleague Vassilis Kalofolias (Large Scale Graph Learning), we will also have the opportunity to see and discuss the latest ways of tackling machine learning problems from the top universities and ML-based companies of the world.

In a later post we will summarize some of the highlights of the conference. Until then, more information and the program of ICLR 2019 are to be found here: https://iclr.cc/