Cockpit: "Multivariate Drill-down"
Analyze distributions and trends of numerical variables and quality-related KPIs. Investigate the effect of independent variables on these KPIs. Summarize variables as pivot tables and drill down to single values.
This cockpit is only available in Visplore Professional.
Overview
- Normalization and Used Color-Aggregate of the Selected Variable: Here, you can optionally normalize the variables for the analysis (e.g., z-standardization). Moreover, you can select the statistical summary that is color-coded per category in various views (like "Category (1 axis)").
- Overview and Selection of Variables: The views in this group show an overview of all variables, e.g., statistics, the distribution over categories, or histograms for each variable. Select variables in these views for a detailed inspection - all other views in the cockpit refer to the variable (or variables) selected here.
- Selection of Categories in Focus: These views can be used to define the focus of the analysis in terms of categories. By selecting categories, you can restrict the analysis to categories like a specific location or plant.
- Distributions: These views are the core of the cockpit. For example, you can find a pivot table with statistics for each category / category combination ("Pivot Table"), a histogram, or a view showing the selected variable’s box plots per category. All views in this group only show data entries that are in focus.
- Access to Single Values: This group of views allows access to single values. The "Timeline" view shows the selected variable, including process borders and regions for existing target values and upper and lower tolerance limits.
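To illustrate the z-standardization mentioned under normalization, the following minimal Python sketch subtracts the mean from each value and divides by the standard deviation. It is not Visplore's implementation: the function name, the sample values, and the use of the sample standard deviation (n - 1) are assumptions for illustration only.

```python
from statistics import mean, stdev


def z_standardize(values):
    """Return (x - mean) / standard deviation for each value."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (n - 1); an assumption
    if sigma == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(x - mu) / sigma for x in values]


# Hypothetical measurements of one variable
diffusion_ratio = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z_scores = z_standardize(diffusion_ratio)
```

After standardization, the values have a mean of 0 and a standard deviation of 1, which makes variables with different units comparable.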
Starting the cockpit - assigning semantic roles
The following roles can be given to data attributes in this cockpit. Use the icon in the toolbar to adjust them.
- Time axis: This role can be given to a data attribute defining a temporal ordering of the data records. It can be time stamps or numerical values. It will be used to determine the temporal context for all variables, e.g., measurement times, consumption, production, etc. If the role is assigned to a data attribute of date/time type, periods like 'Month', 'Hour', etc., are extracted and available for defining filters and categorical plots like bar charts.
- Category: Some views aggregate values by categories (e.g., per day of week, per month, etc.). When this role is assigned to a categorical data attribute, its contents are available for such aggregations. The role can also be given to numerical data attributes, resulting in one category per distinct attribute value (e.g., different states encoded as 0 or 1). If the role is given to a data attribute of date/time type, periods like 'Month', 'Hour', etc., are extracted and available for defining filters and categorical plots like bar charts.
- Asset ID: Defines a categorical channel to split the data between assets you want to compare or work on independently. Find more information on working with assets in the last section of this guide.
- Variable: Numerical data attributes with this role can be inspected in the cockpit and are considered in calculations. Unassign this role if you want to exclude a variable from all considerations.
- Independent Variable: Data attributes with this role are independent variables whose influence on the numerical properties / KPIs is to be investigated. Example: Design parameters in a simulation or industrial process, hyperparameters for models, etc. Any number of numerical variables can have this role.
- Upper Limit: A numerical data attribute with this role defines an upper limit for a variable. It will be shown along with the referenced data attribute in timeline views and histograms. Moreover, it is used in process statistics like tolerance violations.
- Lower Limit: Analogous to Upper Limit, but for lower limits.
- Setpoint: Analogous to Upper Limit, but numerical data attributes with this role represent a setpoint (= desired state) for the referenced data attribute. Data attributes with this role are additionally displayed in views.
- Curves: This optional role is for data attributes with one curve per data record, e.g., a short time series per production batch or a spectrum.
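As a rough illustration of how the "Upper Limit" and "Lower Limit" roles can feed a statistic such as tolerance violations, the sketch below counts records outside their limits. The function name, the result keys, and the sample data are hypothetical, and the exact definition used by Visplore may differ.

```python
def count_violations(values, lower, upper):
    """Count records below the lower limit and above the upper limit.

    All three sequences hold one entry per data record, mirroring the
    "Variable", "Lower Limit", and "Upper Limit" roles.
    """
    below = sum(1 for v, lo in zip(values, lower) if v < lo)
    above = sum(1 for v, hi in zip(values, upper) if v > hi)
    share = (below + above) / len(values)
    return {"below": below, "above": above, "share": share}


# Hypothetical data: one measured variable with constant limits
values = [9.8, 10.4, 11.2, 8.9, 10.0]
lower = [9.0] * len(values)
upper = [11.0] * len(values)
result = count_violations(values, lower, upper)
```

Because limits are given per record, the same function also covers limits that change over time.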
Selection of Variables
The views "Statistics", "Heatmaps", "Histograms", and "Recent changes" provide an overview of the variables available for the analysis (role "Variable"). Selecting a variable in one of these views means that all other views in the entire cockpit now refer to that variable. For example, the "Timeline" view now shows the time series of the selected variable.
Selection of Categories in Focus
In the views "Category", "Category Combination", and "Categories", you can obtain a summary of the values of the currently selected variable over categories in the data. The chosen aggregation measure determines the color-coding of the views. If, for example, "Mean" is chosen as a measure, the color of the areas shows the mean value of the measured variable for each category of one or multiple categorical data attributes. The categories by which the views are subdivided can be set via drop-down menus accessible via the corresponding axis labels.
These views are mainly used to select subsets of the data. This affects the rest of the views in the cockpit. For example, the calculations of the statistical values in the "KPI Statistics" view are restricted to the data of the selected cells of these views (i.e., of a specific category or category combination). Furthermore, the entries of the selected category are brought into focus in the "Timeline" view (in full-color intensity), and all other entries are displayed as context (grayed out). This allows you to view the individual values of the selected category (or category combination) and their distribution over time.
Distributions
The views in this area form the core of the analysis. In contrast to the views in the "Selection of Categories" area, the views in this area are restricted to the focus. The views show detailed information on the currently focused data of the selected variable / variables. All other cockpit areas can be minimized to provide enough screen space for the views in this area. In this way, the data in focus can be viewed and analyzed in detail under various aspects.
Statistics per Category
The "Pivot Table" view shows statistics for one or more selected variables, broken down by (combinations of) categories. The displayed statistics and subdivision levels can be freely configured by clicking the axis labels and the '+' buttons next to them. The columns of the table can be rearranged using drag and drop. If you want to relate specific categories to one another, this view supports the display of relative aggregates (click the view title, then "absolute / relative results").
In the example above, a pivot table for the variable "Diffusion Ratio" was created with a few clicks and subdivided on the Y-axis by "Plant" and "Location". A configured pivot table can be exported to the clipboard or to a CSV file via the corresponding view options (click "Export" in the top right of this view) and then imported into other programs such as Microsoft Excel for further use.
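Conceptually, such a pivot table groups the values of a variable by category combination and computes statistics per group. The following Python sketch reproduces that idea outside of Visplore and writes the result as CSV text. The records, the chosen statistics, and the column names are assumptions for illustration and do not reflect Visplore's export format.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical records: (Plant, Location, Diffusion Ratio)
records = [
    ("A", "North", 1.0),
    ("A", "North", 3.0),
    ("A", "South", 5.0),
    ("B", "North", 4.0),
]


def pivot(rows):
    """Group values by (Plant, Location) and compute statistics per group."""
    groups = defaultdict(list)
    for plant, location, value in rows:
        groups[(plant, location)].append(value)
    return [
        {"Plant": p, "Location": loc, "Count": len(v),
         "Mean": mean(v), "Min": min(v), "Max": max(v)}
        for (p, loc), v in sorted(groups.items())
    ]


table = pivot(records)

# Write the table as CSV text, e.g. for use in a spreadsheet program
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(table[0]))
writer.writeheader()
writer.writerows(table)
csv_text = buffer.getvalue()
```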
Histogram
The "Histogram" view shows the distribution of the values in focus for the selected variables using histograms. Additionally, all data records, including those not in focus, are shown as grey context (see image). If available, the histograms show the variables' tolerance limits and target values. The view also allows displaying the normal distribution based on the mean value and standard deviation of the underlying values on top of each histogram as a black line (can be enabled in the menu when clicking the view title). Finally, the view can display statistical tests (t-test, chi-square-test) for the data in focus vs. the rest or categories (turn on/off in the view title menu).
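The two statistical overlays can be sketched in a few lines of Python: a normal curve derived from the mean and standard deviation of the values in focus, and a t statistic comparing the focus with the rest. This is a hedged illustration only. The sample values are hypothetical, Welch's variant of the t-test is an assumption (the guide does not state which variant Visplore uses), and the p-value is omitted because it would require the t distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

# Hypothetical values of one variable: data in focus vs. the rest
focus = [10.1, 10.4, 9.9, 10.6, 10.2]
rest = [9.2, 9.5, 9.1, 9.8, 9.4, 9.6]

# Normal curve based on the mean and standard deviation of the focus
fitted = NormalDist.from_samples(focus)
curve = [(x, fitted.pdf(x)) for x in (9.5, 10.0, 10.5, 11.0)]


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


t_stat = welch_t(focus, rest)
```

A large absolute t statistic suggests that the mean of the data in focus differs from the mean of the rest by more than random variation would explain.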
Bar Chart, Heatmap, Multi-Variable Bars
These three views offer functionality similar to that of the views in the "Select data" section, with one difference: they only show the data that is currently in focus. These views are beneficial if, for example, you want to find out which categories or category combinations occur in an existing selection and how they are distributed.
Box Plot
Box plots for the selected variable are displayed in the "Box Plot" view to analyze a variable's distributions by different categories. The subdivision by categories can be set as desired using the drop-down menus in the axis labels. In the example above, box plots are displayed for the measured variable "Diffusion Ratio" per combination of "Plant" and "Location".
The width of the individual box plots results from the number of values in the corresponding category combination. To compare box plots of different variables, the view shows box plots of multiple selected variables side-by-side, distinguished by color.
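The quantities behind such a box plot can be sketched as follows: a five-number summary per category combination, plus the number of values that drives the box width. The quartile method ("inclusive") and the linear width scaling are assumptions, since the guide does not specify how Visplore computes either. The group data is hypothetical.

```python
from statistics import quantiles


def box_stats(values):
    """Five-number summary plus the count that could drive the box width."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {"n": len(values), "min": min(values), "q1": q1,
            "median": median, "q3": q3, "max": max(values)}


# Hypothetical values per category combination (Plant, Location)
groups = {
    ("A", "North"): [1.0, 2.0, 3.0, 4.0, 5.0],
    ("B", "North"): [2.0, 4.0, 6.0],
}
summary = {key: box_stats(vals) for key, vals in groups.items()}

# Assumption: box width scales linearly with the number of values
largest = max(s["n"] for s in summary.values())
widths = {key: s["n"] / largest for key, s in summary.items()}
```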
Independent Parameters (perform input-output analyses)
When the role 'Independent Variable' is given to some numerical variables, additional views of these independent variables become available:
Selecting values for independent variables (as part of the 'Select' block)
- Ind.: A histogram of one selected independent variable to choose a value range. Select the displayed independent variable by clicking the axis label.
- Ind. x Ind.: Scatterplot of two selected independent variables to choose combinations of values. Select the displayed independent variables by clicking the axis labels.
- All Ind.: A parallel coordinates plot of all independent variables to select value ranges for multiple variables. Axes can be re-ordered with drag and drop.
Dependent variables vs. independent variables (as part of the 'Single Values' block)
- KPI vs. Ind.: A scatterplot of the currently selected variables (selected in 'Select KPIs') on the y-axis vs. one selected independent variable (select by clicking the x-axis label). Supports a sensitivity analysis of one or multiple objectives over one independent variable.
- KPI vs. Ind. x Ind.: Shows one selected variable (selected in 'Select KPIs') as a colored background over two independent variables (select by clicking the labels of the x- and y-axis).
Access to Single Values
Timeline
The "Timeline" view displays the currently selected variable as a time series. The time series view allows selecting and exploring parts of a time series. If an area of a time series is selected, other representations in the cockpit are adapted to this subset of the data.
Timeline (stacked)
If two or more variables are selected in the cockpit, another view appears next to the "Timeline" view: "Timeline (stacked)", which shows the variables one above the other.
Parallel Coordinates
The "Parallel Coordinates" view compares the selected variables on parallel axes. Each line running across the axes represents one data record (data row); its values can be read where it crosses each axis. The brighter a line, the fewer records share the corresponding values. This lets you see in which ranges the values lie, where most of them are concentrated, and to what extent the compared variables correlate.
In this view, you can select data rows in the value ranges of the parallel axes. Selected rows remain in focus (full-color intensity), and all others fall into context (as gray background). You can then observe how the selected lines are distributed over other axes (and their value ranges).
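The selection logic described here can be pictured as a combined range filter: a row stays in focus only if it lies inside the selected range on every brushed axis. The sketch below is a conceptual illustration with hypothetical variable names and values, not Visplore's implementation.

```python
def brush(rows, ranges):
    """Mark each row as in focus if it lies inside every selected range."""
    return [
        all(low <= row[axis] <= high for axis, (low, high) in ranges.items())
        for row in rows
    ]


# Hypothetical data rows with two variables shown as parallel axes
rows = [
    {"Temperature": 20.0, "Pressure": 1.0},
    {"Temperature": 35.0, "Pressure": 1.4},
    {"Temperature": 50.0, "Pressure": 2.1},
]

# Ranges selected on two axes; rows outside any range fall into context
in_focus = brush(rows, {"Temperature": (30.0, 60.0), "Pressure": (1.0, 1.5)})
```

Here only the second row is in focus, because it is the only one that satisfies both ranges.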
Data Table
The "Data Table" view displays values of the currently selected data rows, including (if available) target values and upper and lower tolerances.
Working with Assets
If Visplore finds candidates for Asset IDs, or for a variable-as-category transformation, in the data set you are loading, it prompts you to set these options before the data is fully loaded, so that you can work with the correct presets from the start. You can still set these options later using the Data Source Manager. Once an Asset ID is assigned, the cockpit adapts as follows:
- A default asset filter that is always present automatically gets assigned and can be used to limit all analysis panes to selected assets.
- Heatmaps split per asset: instead of showing the ‘Statistics’ view to select data attributes, the heatmaps view is shown per default, where each column represents one asset.
- New asset selection view: an additional view, similar to the pivot table, is added to the ‘select data’ section where assets can be selected, and some basic statistics are shown for each.
- Distribution axes pre-set to distinguish assets: all distribution views will start with the asset ID as the selected value for the x-axis.
- Asset-based coloring in single value views: A shortcut to color views by asset ID is added to the ‘single value’ views (see image below).
The selection behavior in the ‘single value’ views also changes to facilitate working with assets. For example, after selecting a range of data in the timeline view, you can limit that range selection to a single asset by selecting the asset in the ‘bar chart’ or ‘asset’ view.