Comparing events

Many analysis questions involve the comparison of certain events. An example is comparing ramp-up procedures or other operating conditions in a power plant. Another example is analyzing process anomalies and explaining under what conditions they occur.

Visplore supports defining the event conditions interactively and visualizes the compared data subsets in various ways (statistics, colored histograms, and much more). All events can also be compared to the time immediately before (or after) them. This supports root-cause analysis and the quick identification of sensors that already behaved suspiciously before the events happened.

Select time periods -> Compare in various plots

In this tutorial, you learn how to compare explicitly selected events.

Note: This section describes how to compare all events where a variable meets a certain condition, such as “Power > 40kW”. If you instead want to compare a selection of two or more intervals on a time axis, please refer to the part on Comparing time periods.



Preparation (to follow this lesson using demo data)



Defining events for analysis

The first section is about defining events. In this example, we are interested in periods with high production. This is done by selecting a value range and creating a comparison for each of the resulting events.

As a result, all visualizations are filtered to show only the compared events.
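To make the idea of a condition-based event concrete, here is a minimal sketch outside of Visplore. It is not how Visplore works internally. It only assumes that an event is a run of consecutive samples whose value exceeds a threshold. The function name `find_events`, the sample data, and the threshold are hypothetical.

```python
from datetime import datetime, timedelta


def find_events(timestamps, values, threshold):
    """Return (start, end) timestamp pairs for each run of consecutive
    samples whose value is strictly above the threshold."""
    events = []
    start = None
    last_inside = None
    for t, v in zip(timestamps, values):
        if v > threshold:
            if start is None:
                start = t          # a new event begins here
            last_inside = t        # latest sample that is still in the event
        elif start is not None:
            events.append((start, last_inside))
            start = None
    if start is not None:          # the series ended while inside an event
        events.append((start, last_inside))
    return events


# Hypothetical hourly power readings in kW (illustration only)
t0 = datetime(2024, 1, 1)
timestamps = [t0 + timedelta(hours=i) for i in range(8)]
power_kw = [10, 45, 50, 42, 20, 41, 15, 12]

events = find_events(timestamps, power_kw, threshold=40)
for start, end in events:
    print(start, "->", end)
```

In this made-up series, the second event consists of a single sample, which mirrors the very short events mentioned in this lesson.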

In the timeline, the events are shown next to each other, and the gaps in between are collapsed. Note that some events are short and contain only a single data point (zoom in to see this more clearly). Later, we will show how you can get rid of very short events in the comparison.

You will also notice that the bar chart in the top right corner was replaced with a pivot table (see image below). It shows the time and duration of each event together with some key metrics such as the mean value. At this point, you might be interested in adjusting the statistics and exporting your findings.
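The kind of per-event summary described above can be sketched as follows. This is an illustration of the concept only, not the pivot table's actual computation. The function `summarize_events`, the field names, and the sample data are hypothetical, and the duration is assumed to be the time between the first and last sample of an event.

```python
from datetime import datetime, timedelta
from statistics import mean


def summarize_events(events):
    """Build one summary row per event.

    Each event is a list of (timestamp, value) samples in time order.
    """
    rows = []
    for samples in events:
        times = [t for t, _ in samples]
        values = [v for _, v in samples]
        rows.append({
            "start": times[0],
            # Assumption: duration = last sample time minus first sample time
            "duration_h": (times[-1] - times[0]).total_seconds() / 3600,
            "mean": mean(values),
        })
    return rows


# Hypothetical events (illustration only)
t0 = datetime(2024, 1, 1)
events = [
    [(t0 + timedelta(hours=1), 45), (t0 + timedelta(hours=2), 50), (t0 + timedelta(hours=3), 42)],
    [(t0 + timedelta(hours=5), 41)],
]

rows = summarize_events(events)
for row in rows:
    print(row["start"], row["duration_h"], round(row["mean"], 2))
```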



Set a minimal duration for considered events

As we saw, some events span longer periods while others consist of a single data point only. To exclude these very short events from the analysis, you can apply a minimal duration of 3 hours. If you did not set this in the dialog at the beginning, you can also change the setting afterwards, as seen below. This change applies to the event definition as a whole and not only to the pivot table.
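Conceptually, a minimal duration is a filter on the list of events. The sketch below illustrates this under stated assumptions: events are (start, end) pairs, the duration is end minus start, and an event of exactly the minimal duration is kept. Whether Visplore treats the boundary the same way is not stated here. The function name and the data are hypothetical.

```python
from datetime import datetime, timedelta


def filter_by_min_duration(events, min_duration):
    """Keep only events whose (end - start) is at least min_duration."""
    return [(start, end) for start, end in events if end - start >= min_duration]


# Hypothetical events (illustration only)
day = datetime(2024, 1, 1)
events = [
    (day + timedelta(hours=0), day + timedelta(hours=5)),    # 5 hours
    (day + timedelta(hours=8), day + timedelta(hours=8)),    # single data point
    (day + timedelta(hours=10), day + timedelta(hours=13)),  # exactly 3 hours
]

long_events = filter_by_min_duration(events, timedelta(hours=3))
print(len(long_events), "of", len(events), "events kept")
```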



Plotting all events as overlaid time series

In the following, you will see how the previously selected events can be overlaid in one combined plot, for example to identify the "typical" shape of events, or outliers. For this, switch to the "Pattern Search and Comparison cockpit".

In the "Segmented curves" view, you see each event on a common x-axis indicating the duration in hours. In this case, it is helpful to align them all by their maximum value. This places the maximum value of each event at time zero.
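Aligning by maximum amounts to shifting each event's time axis so that its peak sits at zero. The following is a minimal sketch of that idea, not Visplore's implementation. Curves are assumed to be lists of (hours since event start, value) pairs, and if a curve has several equal maxima, the first one is used. The function name and the data are hypothetical.

```python
def align_by_maximum(curve):
    """Shift the time axis of one event so that its maximum value is at t = 0.

    curve is a list of (hours_since_start, value) pairs.
    """
    peak_time, _ = max(curve, key=lambda point: point[1])
    return [(t - peak_time, v) for t, v in curve]


# Hypothetical event curves (illustration only)
curve_a = [(0, 41), (1, 48), (2, 55), (3, 44)]
curve_b = [(0, 52), (1, 47)]

aligned_a = align_by_maximum(curve_a)
aligned_b = align_by_maximum(curve_b)
print(aligned_a)  # [(-2, 41), (-1, 48), (0, 55), (1, 44)]
print(aligned_b)  # [(0, 52), (1, 47)]
```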

To visualize an average event, use the "Show aggregated curves" feature. It allows you to calculate and export the mean curve. Additionally, you can add boundaries defined by the standard deviation.
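A mean curve with standard-deviation boundaries can be sketched as a per-time-step aggregation over all aligned events. The example below is an assumption-laden illustration: it uses the population standard deviation and requires that all curves share the same relative time steps. Visplore's exact aggregation settings may differ. The function name and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def aggregate_curves(curves):
    """Return (time, mean, lower, upper) per relative time step.

    curves is a list of events, each a list of (relative_time, value) pairs.
    lower and upper are the mean minus/plus one population standard deviation.
    """
    values_by_time = defaultdict(list)
    for curve in curves:
        for t, v in curve:
            values_by_time[t].append(v)

    result = []
    for t in sorted(values_by_time):
        values = values_by_time[t]
        m = mean(values)
        s = pstdev(values)
        result.append((t, m, m - s, m + s))
    return result


# Hypothetical aligned event curves (illustration only)
curves = [
    [(-1, 40), (0, 50), (1, 44)],
    [(-1, 44), (0, 54), (1, 40)],
]

band = aggregate_curves(curves)
for t, m, lower, upper in band:
    print(t, m, lower, upper)
```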



Comparing events to the periods before/after

For root-cause analyses of anomalies or other events, a common approach is to compare sensor values during the events to the values before the events. The idea is to identify sensors with suspicious patterns that occur before many of the events. This kind of analysis is supported by including a context time in the comparison of events.

Again, let's say we are interested in high production events, but this time with 2 hours of context before and after each event:

The histogram now shows distributions for the event in green, distributions before the event in orange, and distributions after the event in blue. Also, the time series plot now has differently colored segments. Depending on your setup and previous usage of Visplore, these colors may be different.

With this categorization of the time before, during, and after the events, we can investigate which other variables had significantly different values for these three categories.
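The categorization into before, during, and after can be illustrated with a small sketch. It is not Visplore's implementation. It assumes events are (start, end) pairs, that "during" takes priority when the context of one event overlaps another event, and that samples outside every event and context get no label. The function name and the data are hypothetical.

```python
from datetime import datetime, timedelta


def label_sample(t, events, context):
    """Return 'before', 'during', 'after', or None for timestamp t."""
    # Assumption: being inside any event wins over being in a context window
    for start, end in events:
        if start <= t <= end:
            return "during"
    for start, end in events:
        if start - context <= t < start:
            return "before"
        if end < t <= end + context:
            return "after"
    return None


# Hypothetical event with 2 hours of context (illustration only)
day = datetime(2024, 1, 1)
events = [(day + timedelta(hours=10), day + timedelta(hours=12))]
context = timedelta(hours=2)

labels = [label_sample(day + timedelta(hours=h), events, context) for h in range(7, 16)]
print(labels)
```

For the hourly timestamps from 07:00 to 15:00, this labels 08:00 and 09:00 as before, 10:00 to 12:00 as during, and 13:00 and 14:00 as after.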

The "Histograms" view shows a list of all variables, sorted by how different their value distributions are for the three temporal categories. Sensors whose values were significantly different before/during/after the events are ranked at the top - guiding the user towards sensors that may explain why the events happened. Sensors whose values were the same all the time are ranked lower.
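The ranking idea can be sketched with any score that grows when the three categories differ. The measure Visplore uses is not described in this lesson, so the score below (spread of the category means divided by the overall standard deviation) is purely an assumption chosen for illustration. The variable names and values are hypothetical as well.

```python
from statistics import mean, pstdev


def separation_score(groups):
    """Score how strongly a variable differs between categories.

    groups maps a category name to a list of values. The score is the range
    of the category means divided by the overall standard deviation.
    This is an illustrative measure only, not the one used by Visplore.
    """
    all_values = [v for values in groups.values() for v in values]
    overall_sd = pstdev(all_values)
    if overall_sd == 0:
        return 0.0
    group_means = [mean(values) for values in groups.values()]
    return (max(group_means) - min(group_means)) / overall_sd


# Hypothetical sensor values per temporal category (illustration only)
variables = {
    "ambient_temp": {"before": [15, 16], "during": [15, 16], "after": [16, 15]},
    "power": {"before": [20, 22], "during": [45, 50], "after": [21, 19]},
    "harmonics": {"before": [8, 9], "during": [3, 2], "after": [9, 8]},
}

ranking = sorted(variables, key=lambda name: separation_score(variables[name]), reverse=True)
print(ranking)
```

With this made-up data, the variable whose category means are identical ends up last, which corresponds to sensors being ranked lower when their values are the same all the time.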

In this example, you see that the power generation, the generator current, and the apparent power of BrightCounty all have similar distributions for the event and its context, which is expected. The fourth variable also differs significantly between the categories, but its distribution looks different from those of the first three. After selecting the fourth variable, its histogram shows that the events of highest power generation have the lowest harmonics, as seen in green (use the right mouse button to draw a range to zoom into, which makes this clearer). This insight is confirmed by the technical characteristics of the inverters.

Use this feature of comparing value distributions with sorted relevance to generate other insights and detect hidden root causes of problematic events.

Great! You have mastered the workflow for comparing events.




>> Continue with Next lesson: Comparing categories




License Statement for the Photovoltaic and Weather dataset used for Screenshots:
"Contains public sector information licensed under the Open Government Licence v3.0."
Source of Dataset (in its original form): https://data.london.gov.uk/dataset/photovoltaic--pv--solar-panel-energy-generation-data
License: UK Open Government Licence OGL 3: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
Dataset was modified (e.g. columns renamed) for easier communication of Visplore USPs.