The Visual Approach to AI
May 2025 – Dr. Harald Piringer (hp@visplore.com)
Visplore has AI and advanced analytics at its core, but takes a human-centric data visualization approach to AI – why?
Making the best possible use of data and AI is increasingly imperative for staying competitive in a highly dynamic environment. Most companies are currently investigating which strategies best fit their own journey towards digital transformation, and AI is a leading trend in this context.
In practice, many decisions and daily data usage scenarios require collaboration between human experts and AI. Let’s take a guiding example: a company operates a boiler as a critical asset in its process, for example in thermal power generation or paper production. Now assume operators report recent problems with the boiler. The decision to be made is when and how to service this equipment.
Why does it make sense to involve subject-matter experts such as maintenance engineers rather than triggering maintenance in a fully automated way?
Reason 1: Reducing the risk of wrong decisions with high costs.
A wrong call on servicing the boiler is expensive in both directions: intervening too early or unnecessarily causes avoidable downtime and maintenance costs, while intervening too late risks severe damage to a critical asset or an unplanned outage of the whole process. Given the uncertainty that automated predictions inevitably carry, having maintenance engineers review and confirm suggested interventions substantially reduces the risk of such costly mistakes.
Reason 2: There is often no single “true” answer, but many alternatives to be assessed within context.
When deciding on servicing the boiler, many contextual aspects must still be taken into account that go far beyond the information contained in the sensor data from the asset itself. For example, production capacity planning, prioritization over other maintenance tasks, and the availability of required persons and resources are highly relevant for defining the best timing of servicing the boiler. Parts of this context may come from other sources such as ERP systems and then need to be merged with the asset operation data. And much relevant information may not be represented explicitly at all: it is not accessible in any system, but known only to the engineers.
Reason 3: In highly dynamic environments, there are, and will continue to be, many situations with insufficient amounts of relevant data for AI.
Advanced AI models work best (or at all) if they are trained on huge amounts of relevant data. “Relevant” means that the training data captures all aspects required for assessing the present situation of the process or asset operation, while avoiding misleading information such as outdated data. In the daily practice of operating and optimizing industrial processes, however, there are often situations that lack relevant historical data for training a model that could assess the current situation. Production processes are often unique and tend to be subject to frequent changes: new products being produced, new materials being used, varying ambient conditions, assets being serviced, and many other factors are permanently changing (see Image 1). So, while there might be much data available in total, oftentimes little of it adequately represents the situation at hand. This is not a data collection problem that will eventually be overcome, but an inherent complexity that, at best, introduces significant uncertainty into the output of advanced AI models, and at worst rules out their use for certain questions altogether. In such situations, the experience and background knowledge of subject-matter experts are indispensable for making informed decisions and for making use of the rather small amount of data that actually reflects the current situation.
Image 1: In batch production, such as pharmaceuticals and chemicals, it is a common situation that each production campaign comprises only a few dozen highly complex batch profiles, and campaigns differ significantly in their characteristics. This is a typical situation in which the complexity and variance of the process exceed the available amount of relevant training data by far, precluding the use of complex statistical models such as deep learning.
Reason 4: Acceptance by staff.
For sustainable work environments, companies need their own staff to buy into new technology. Subject-matter experts are well aware of the aforementioned limitations, such as permanently varying process conditions, and of the fact that relevant data may not be adequately available in digitized form at all. Solutions that ignore these aspects will likely not find broad acceptance. On the other hand, highly qualified subject-matter experts are confronted with a high workload and are in most cases aware that there is no way around an increased use of data in decision making.
While general-purpose AI tools such as ChatGPT have been adopted rather quickly, it can be much more challenging to convince subject-matter experts to trust AI-based results that refer to their very specific processes and assets. One reason is that subject-matter experts are typically well aware of the incompleteness and limited quality of the data basis and of the complexity of the decision-making processes. Still, the question is not whether or not to use AI. The actual question is how to integrate AI into their work in an optimal and widely accepted way, because subject-matter experts will continue to play a key role in the operation of processes and assets, as reasoned above. There are some fundamental approaches to foster acceptance, which focus on transparency, explainability, and ways to communicate efficiently with the AI and algorithms.
Data transparency: The principle of “garbage in – garbage out” holds true more than ever in the age of AI. Industrial sensor data is prone to numerous problems, ranging from simple data-acquisition issues at the sensor level to contextual pitfalls such as comparisons across unrelated process situations (“apples and oranges”). Therefore, the traceability and validation of data used as input for predictions and suggestions is key for trusting the outcome. For a digital AI solution, this means that users should be able to inspect the underlying (raw) data on demand to assess its quality and relevance, and also to make adjustments such as excluding irrelevant and outdated time periods.
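As a minimal sketch of what such a transparency check can look like in code (assuming a pandas DataFrame of sensor readings; the file name, column name, and cutoff date are purely illustrative placeholders), one might exclude an outdated period and flag frozen signals before any model sees the data:

```python
import pandas as pd

# Minimal sketch of a data-transparency check before any modeling.
# File name, column name, and the cutoff date are illustrative assumptions.
sensors = pd.read_csv("boiler_sensors.csv", parse_dates=["timestamp"], index_col="timestamp")

# Exclude an outdated period, e.g. everything recorded before a known retrofit.
sensors = sensors.loc["2024-01-01":]

# Flag frozen signals: a sensor repeating the same value for an hour is more
# likely a data-acquisition problem than a real process state.
frozen = sensors["boiler_pressure"].rolling("1h").std() == 0

print(f"{(~frozen).mean():.1%} of samples pass this simple quality screen")
```

In a visual tool, the same inspection happens interactively in charts; the point is that the data feeding any prediction remains traceable and adjustable.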
Data contextualization: Many decisions and predictions go beyond sensor data and require multiple sources, such as data from ERP systems, logs of maintenance interventions, product quality measurements, energy price time series, and also knowledge that exists only in the heads of the staff, such as current management priorities. AI solutions that do not take this data diversity into account are prone to missing relevant aspects; many equipment-specific predictive maintenance solutions suffer from this drawback. Comprehensive solutions for AI-based decision making require a way to easily access multiple sources and merge the data so that algorithms can combine heterogeneous information for a holistic assessment.
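The sketch below illustrates one common form of such contextualization under simplifying assumptions: joining a sensor time series with an ERP export of production orders, so that every sample knows which paper grade was being produced. All file and column names are hypothetical; in a tool like Visplore this merging happens without coding, but the sketch shows the underlying idea.

```python
import pandas as pd

# Minimal sketch: attach the paper grade from an ERP export to each sensor
# sample so that algorithms only compare like with like.
sensors = pd.read_csv("boiler_sensors.csv", parse_dates=["timestamp"]).sort_values("timestamp")
orders = pd.read_csv("erp_production_orders.csv", parse_dates=["start_time"]).sort_values("start_time")

# Each sensor sample inherits the most recent production order started before it,
# i.e. the paper grade being produced at that moment.
contextualized = pd.merge_asof(
    sensors,
    orders[["start_time", "paper_grade"]],
    left_on="timestamp",
    right_on="start_time",
    direction="backward",
)

# Comparisons can now be restricted to a single grade ("apples with apples").
print(contextualized.groupby("paper_grade")["steam_flow"].mean())
```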
Suggest and compare alternatives: For many decisions, there is no single true answer, but many possible alternatives that might reflect different priorities. Comprehensive solutions for AI-based decision making therefore present these alternatives to users, allow them to ask “what if” questions, and enable them to browse and compare the alternatives efficiently. This approach not only increases trust, it is also a very efficient way to involve subject-matter experts and their knowledge in order to ultimately reach the best decision.
Use explainable and interpretable models where possible: As a rule in statistics, the complexity of a model should not exceed the complexity of the underlying problem and data, in order to avoid “overfitting”. In other words, more complex models do not necessarily imply better outcomes. Moreover, when it comes to acceptance and trust, simple models such as decision trees and polynomial regression are significantly easier to understand and to validate. This makes them favorable where possible, because they can be cross-checked against domain intuition, which is essential in industrial settings where human expertise is irreplaceable and, as explained above, the amount of relevant data may be limited.
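A small, self-contained experiment can illustrate the overfitting argument. The sketch below uses synthetic data (not taken from this article) standing in for a small campaign of measurements and compares a shallow, interpretable decision tree with a much deeper one via cross-validation; all parameters are illustrative.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for a small campaign: 50 samples, 3 "sensors", and a
# noisy, essentially linear relationship. Purely illustrative.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 3))
y = 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 3.0, size=50)

# Compare a shallow (interpretable) tree with a much deeper one.
for depth in (2, 20):
    model = DecisionTreeRegressor(max_depth=depth, random_state=0)
    score = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"max_depth={depth:>2}: mean cross-validated R^2 = {score:.2f}")
```

On data this small and noisy, the shallow tree typically scores at least as well in cross-validation, while remaining simple enough to be checked against domain intuition.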
Enable sharing across teams: In general, the validation of AI outcomes benefits if multiple subject-matter experts can participate in assessing and discussing whether assumptions and results align with their experience. Systems that allow results to be shared across teams thus promote acceptance and better decisions overall. But in which form should results be shared?
Chat-like interaction with AI has reached tremendous popularity in recent years. For many applications, natural language is the most intuitive way of articulating interest as well as of consuming results. However, text has inherent limitations when it comes to representing large amounts of numeric information, and in particular the complex patterns and relations that are relevant in the context of historical sensor data. Graphics are far better suited to supporting human perception of this type of information, as nicely summarized by the famous proverb “A picture is worth a thousand words”. As a quick thought experiment: how much text would you need to precisely describe the batch campaign profiles shown in Image 1? It would not only be next to impossible to write, it would also be highly inefficient to read and understand.
Therefore, plots and graphs have a long tradition in mathematics, engineering, and science for adequately representing complex information. Engineers are typically used to interpreting plots as part of their work. For all these reasons, data visualization will not lose its relevance for engineering-oriented use cases with sensor data in the age of AI.
This is not only true for the output of AI, but also for giving input to AI models: charts have many advantages over text for identifying relevant events, time periods, and patterns in general. They are far more precise, inform the specification of thresholds, and support approaches like “query by example”, which searches for relevant patterns such as anomalies or process operations based on one or more good examples the user simply marks up (see Image 2). Therefore, communicating with AI graphically via suitable data visualization can be far easier and significantly more efficient than would ever be possible via chat-like user interfaces.
Image 2: “Query by example” in action. The user graphically selects a pattern with a specific shape – in this case a significant level shift of a sensor in a paper production process. Based on this example, advanced algorithms find and suggest similar patterns. When contextualizing the sensor data with ERP data, it becomes clear that the patterns are changeovers from one paper type to another.
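To illustrate the principle behind “query by example” (this is a generic sliding-window distance, not Visplore’s actual algorithm), the following sketch ranks all windows of a signal by their similarity to one marked example; the synthetic signal and window positions are purely illustrative.

```python
import numpy as np

def znorm(x):
    """Normalize a window so that only its shape matters, not its scale."""
    return (x - x.mean()) / (x.std() + 1e-9)

def find_similar(series, example, top_k=5):
    """Rank all windows of `series` by Euclidean distance to `example`."""
    m = len(example)
    ex = znorm(example)
    dists = [(float(np.linalg.norm(znorm(series[s:s + m]) - ex)), s)
             for s in range(len(series) - m + 1)]
    return sorted(dists)[:top_k]  # best matches first

# Usage: mark one level shift, then search the whole signal for similar shapes.
rng = np.random.default_rng(1)
signal = np.concatenate([np.ones(150), 3 * np.ones(150),
                         np.ones(150), 3 * np.ones(150)]) + rng.normal(0, 0.1, 600)
matches = find_similar(signal, example=signal[135:165])
print(matches)  # the query window itself and the second up-shift (~start 435) rank first
```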
That being said, chat-like user interaction can still be beneficial in analytics tools for engineering-oriented applications. For example, it may help novice users to find the right software features for their task, or may provide additional information for interpreting results. Due to the inherent advantages of graphical over textual representations, however, data visualization and graphical approaches to data selection will very likely stay relevant in the foreseeable future.
The considerations given so far informed the design of Visplore: its goal is to support effective collaboration between human experts and AI for engineering-oriented use cases in industry and energy supply. It puts special emphasis on user acceptance by ensuring transparency and easy contextualization of data, and it takes a graphical, visualization-based approach to specifying the inputs and consuming the outputs of advanced algorithms and AI.
The workflow can be summarized in seven steps: Connect, Select, Analyze, Inspect, Refine, Decide, and Automate.
Let’s walk through the steps of this workflow by revisiting the example of servicing a boiler. The immediate goal is to decide on the necessity and timing of interventions by comparing the current asset operation to a suitable reference period in the past. While we assume this boiler to be part of a paper production process, most aspects of the example also apply to other settings, other asset types, and root-cause analyses in general.
1. Connect: Visplore fetches and merges data from multiple sources. Historic and current sensor data may come directly from historians such as the AVEVA PI system. Contextual data about the type of paper being produced resides in SAP. The data may be enriched with logs of process interventions, and much more. Notably, merging data with such different characteristics is by itself a key challenge in practice.
2. Select: Before starting the AI-powered analysis, it is necessary to define time periods that represent the current operation as well as reference periods of a healthy asset state. The selection typically excludes downtimes and ensures that only similar products, such as the same paper grades, are compared to each other. This step can be pre-configured for standardized use, in which case it requires no user interaction. For ad-hoc analysis, however, subject-matter experts can easily define suitable periods using intuitive graphical selection tools (a simplified sketch of this and the following steps is given after this walkthrough).
3. Analyze: This is the step in which Visplore employs advanced algorithms to identify relevant results for the user. In this specific example, it identifies differences in characteristics between the compared time periods, such as altered value levels of sensors, changes in dependencies, and differences in patterns such as start-ups and shut-downs of the boiler. For other usage scenarios, Visplore offers pattern search, predictive models, outlier detection, trend detection, and much more.
4. Inspect: The result of this analysis is a list of suggested plots, sorted by significance. Users can easily browse the list and see chart types that appropriately display each difference, for example scatterplots showing changed dependencies between sensor values, or time series charts showing different shut-down patterns (faster pressure drops may indicate boiler leakages). The role of the user is to assess the plots in order to gain an informed understanding of the actual health of the boiler, and to judge which of the suggestions are most actionable.
5. Refine: When inspecting results in the previous step, a possible outcome is the need for adjustments before re-running the analytics. Additional data may be needed, or the data may require further filtering to better represent the current process conditions with respect to steam demand, ambient temperatures, and so on. Such feedback mechanisms to the AI contribute a lot to building up trust in the approach and to eventually achieving better results and decisions.
6. Decide: When it comes to deciding on intervention planning, the generated plots are a suitable way of communicating insights to colleagues. Annotated graphics can easily be shared for discussion and collaborative decision making. Even at this stage, spontaneous questions from colleagues can still be answered, for example by adding further data sources or looking at additional historical data.
7. Automate: Changes in the characteristics of the asset operation may be detected automatically in case they recur in the future. The plots obtained during the root-cause analysis of an anomaly can be a great starting point for defining expressive and robust triggers for such alerts. For this purpose, Visplore allows users to set up automated alerts directly via graphical queries. It should be noted that automated alerts, too, may require data from multiple sources beyond the sensor data itself; for example, triggers may differ between product types.
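To make the walkthrough more tangible, here is a minimal, purely illustrative sketch of the Select, Analyze, and Automate steps. All file names, column names, dates, and thresholds are assumptions chosen for this example; the code only demonstrates the kind of logic involved, not Visplore’s implementation.

```python
import numpy as np
import pandas as pd

# Assumed input: sensor data already contextualized with ERP information.
df = pd.read_csv("boiler_sensors_with_context.csv", parse_dates=["timestamp"])
sensor_cols = ["boiler_pressure", "steam_flow", "flue_gas_temp"]  # assumed names

# --- Select: healthy reference period vs. current operation ----------------
running = df["steam_flow"] > 5.0                    # crude downtime filter
same_grade = df["paper_grade"] == "GradeA"          # compare like with like
reference = df[running & same_grade & df["timestamp"].between("2024-03-01", "2024-06-30")]
current = df[running & same_grade & (df["timestamp"] >= "2025-04-01")]

# --- Analyze: rank sensors by how strongly their level has shifted ---------
shifts = {}
for col in sensor_cols:
    ref, cur = reference[col].dropna(), current[col].dropna()
    pooled_std = np.sqrt((ref.var() + cur.var()) / 2) + 1e-9
    shifts[col] = abs(cur.mean() - ref.mean()) / pooled_std  # standardized shift

for col, score in sorted(shifts.items(), key=lambda kv: -kv[1]):
    print(f"{col}: standardized level shift = {score:.2f}")  # most significant first

# --- Automate: turn a confirmed insight into a recurring, contextual alert --
THRESHOLDS_PER_GRADE = {"GradeA": -0.8, "GradeB": -1.2}       # bar per minute, assumed

def shutdown_alert(pressure_drop_rate: float, paper_grade: str) -> bool:
    """True if the pressure drops faster than allowed for this product type."""
    return pressure_drop_rate < THRESHOLDS_PER_GRADE.get(paper_grade, -1.0)

if shutdown_alert(pressure_drop_rate=-1.1, paper_grade="GradeA"):
    print("Alert: unusually fast pressure drop - possible boiler leakage")
```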
Numerous use cases in the operation of industrial assets and processes benefit greatly from AI and advanced analytics, but require human experts to be part of the decision-making loop. Effective data visualization is not a contradiction to AI; it complements AI-powered workflows with the most suitable means of defining the necessary input and consuming the results. Visplore offers a visual, human-centric approach to AI that has the assessment of auto-generated charts by subject-matter experts at its core. This makes it possible to reach informed and transparent decisions faster than ever, and to specify robust triggers that provide early warnings via automated alerts in the future.
Harald Piringer studied informatics at the Vienna University of Technology and completed his PhD in 2011. For more than ten years, he headed the Visual Analytics group at the VRVis research center in Vienna, Austria, where he did applied research in close collaboration with partners from industry, energy, healthcare, and other sectors. He has (co-)authored more than 30 international publications in the fields of data visualization and visual analytics. In 2020, he co-founded Visplore GmbH, where he serves as CEO.