Success Story
How Verbund, Austria’s largest electricity producer, modelled sensor time series data for interpretable and robust condition-based monitoring.
Austria’s largest electricity producer, Verbund Hydro Power, needs to monitor the state of its water turbines. To automatically detect anomalous process conditions, data scientists at Verbund modelled the turbine’s expected oil temperatures based on multiple years of historical data from water temperature and other sensors. However, the modelling required a careful selection of meaningful periods for training and validation. Times of regular turbine operation were repeatedly followed by shut-down and start-up procedures that had to be excluded from the modelling. Moreover, upgrades to the plant made it unclear to what extent historical data was representative of the current turbine behavior. Selecting relevant process data was thus considered a significant time and cost factor.
The data scientists needed a way to label periods of the process data as suitable training and test data while discussing the data with the process experts. This labeling step typically has to involve the experts’ opinions and thus cannot be automated by prepared scripts. However, even labeling multiple years of data should take no more than a few hours, to limit the demands on the experts’ time. Furthermore, both the data selection and the modelling had to be transparent to get the buy-in of the experts. The models had to be interpretable and checked for physical plausibility before they could be used for operational condition monitoring.
With Visplore, the data scientists could interactively label multiple years of process data live while discussing the data with the process experts.
Intuitive tools enabled the users to graphically select suitable periods and label them as training data with a click. Labels could easily be corrected based on the feedback of the process experts and exported to Excel or Python. Selecting useful test data from four years of process data and dozens of tags took much less than an hour.
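Once labels are available in Python, applying them to the raw sensor data is a small step. The following is a minimal sketch of that step, not the format Visplore actually exports: the `(start, end, label)` tuples, the function names `label_for` and `split_by_label`, and all sample values are assumptions made purely for illustration.

```python
from datetime import datetime

# Hypothetical labeled periods as (start, end, label) tuples.
# The layout is an assumption for illustration, not an actual export format.
periods = [
    (datetime(2020, 1, 1), datetime(2020, 1, 3), "train"),
    (datetime(2020, 1, 5), datetime(2020, 1, 6), "test"),
]


def label_for(timestamp, periods):
    """Return the label of the first period containing timestamp, else None."""
    for start, end, label in periods:
        if start <= timestamp < end:
            return label
    return None


def split_by_label(samples, periods):
    """Group (timestamp, value) samples by label; unlabeled samples are dropped."""
    groups = {}
    for timestamp, value in samples:
        label = label_for(timestamp, periods)
        if label is not None:
            groups.setdefault(label, []).append((timestamp, value))
    return groups


# Made-up oil temperature readings.
samples = [
    (datetime(2020, 1, 1, 12), 41.0),  # falls inside the training period
    (datetime(2020, 1, 4, 0), 55.0),   # unlabeled (e.g. a start-up phase)
    (datetime(2020, 1, 5, 6), 42.5),   # falls inside the test period
]
groups = split_by_label(samples, periods)
```

Samples outside every labeled period, such as shut-down and start-up phases, simply never reach the training or test set.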
For the predictive modelling itself, the interactive cockpits of Visplore likewise enabled the data scientists to work live with the process experts. Intuitive visualization made the modelled dependencies interpretable and allowed for a quick plausibility assessment under different operating conditions. Automatic ordering of all variables by their influence on the modelled oil temperature made it possible to select relevant features quickly while bringing the process expertise into the modelling. Within a few minutes, a predictive model had been created, understood and approved by the process experts, and was ready to be exported as Python code.
The exported Python code could easily be used to define dynamic tolerance corridors for condition-based monitoring. The corridor robustly adapted to different process conditions, minimizing false positive alerts while being narrow enough to detect real process anomalies quickly. Using Visplore, the know-how of the plant engineers could be included in a transparent and highly efficient process for cleaning and modelling the sensor data. This saved considerable effort in data preparation and markedly shortened the overall project duration. Furthermore, it led to a plausible model that was widely accepted because it incorporated the necessary domain knowledge.
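To make the idea of a dynamic tolerance corridor concrete, here is a minimal sketch under strong simplifying assumptions. It is not the model built at Verbund or the code Visplore exports: it assumes a single input (water temperature), a plain least-squares line as the predictive model, and a corridor of `k` residual standard deviations around the prediction. All names and numbers are hypothetical.

```python
from statistics import mean, pstdev


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def build_corridor(x_train, y_train, k=3.0):
    """Return a function mapping an input x to (lower, upper) bounds.

    The corridor is centred on the model prediction, so it moves with the
    operating condition; its half-width is k times the spread of the
    training residuals (an illustrative choice, not a recommendation).
    """
    slope, intercept = fit_line(x_train, y_train)
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x_train, y_train)]
    sigma = pstdev(residuals)

    def corridor(x):
        expected = slope * x + intercept
        return expected - k * sigma, expected + k * sigma

    return corridor


def is_anomalous(corridor, x, y):
    """True if the measured value y lies outside the corridor at input x."""
    low, high = corridor(x)
    return not (low <= y <= high)


# Made-up training data: water temperature vs. oil temperature (deg C).
water_temp = [4.0, 8.0, 12.0, 16.0, 20.0]
oil_temp = [40.2, 41.9, 44.1, 45.8, 48.0]

corridor = build_corridor(water_temp, oil_temp)
normal_reading = is_anomalous(corridor, 10.0, 43.0)   # close to the prediction
suspect_reading = is_anomalous(corridor, 10.0, 46.0)  # far above the prediction
```

Because the bounds follow the prediction rather than a fixed threshold, an oil temperature that is normal at a high water temperature can still be flagged at a low one, which is what keeps the corridor narrow without raising false alerts.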