CSV Data Import
To import data from CSV ("Comma Separated Values") files, choose the "File (CSV, parquet, ...)" option when starting a data import. Files to be loaded via CSV import can end with either ".csv" or ".txt".
CSV Import Dialog
After selecting a file, you get a preview of the first lines of the file:
Advanced Options: Most settings, such as the separator, comma, and data types, are detected automatically from the first 200 lines. If something is detected incorrectly, "Advanced Options" offers many ways to override and customize these settings:
- The data type of each column can be changed using a drop-down menu (see "Specifying data types" below for details).
- A preview table of the first 200 lines, loaded according to the current import parameters. It updates immediately when you change a setting such as a data type.
- Setting the delimiter that separates columns.
- Whether the first line contains column names, the second line contains units, etc.
- Skipping lines that contain too few delimiters (e.g. comment lines at the beginning), or skipping a number of lines by hand (hint: use the line numbers on the left of the preview dialog).
- Setting the decimal and thousands separators.
- Specifying whether certain characters should be ignored in columns with numerical values and (below) whether leading zeros in categories should be ignored.
- Loading multiple files at once from the same folder. The files must have the same structure (regarding column names), but may contain different values, for example one CSV file per day. In Visplore, they are appended vertically to each other. There are several options for which files are loaded from the folder:
  - "All similar files": All similar files are loaded.
  - "Most recently modified": The N most recently modified files are loaded.
  - "Modified after": All files that were last modified after a specified date are loaded.
- After data types have been changed manually, these overrides are stored for the user and applied again on the next import. In "Default data types..", you can change and clear these overrides.
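The multi-file options above can be illustrated outside of Visplore with a minimal Python sketch. This is not Visplore's implementation: the function names `select_files` and `append_files`, the mode strings, and the assumption that "similar" simply means "all .csv files in the folder with identical column names" are hypothetical and only serve to show the idea of selecting files by modification time and appending them vertically.

```python
import csv
from pathlib import Path


def select_files(folder, mode="all", n=None, after=None):
    """Pick CSV files from a folder (hypothetical helper, not a Visplore API).

    mode="all"            -> every .csv file in the folder
    mode="most_recent"    -> the n most recently modified files
    mode="modified_after" -> files modified after the datetime `after`
    """
    # Sort by modification time, oldest first.
    files = sorted(Path(folder).glob("*.csv"), key=lambda p: p.stat().st_mtime)
    if mode == "most_recent":
        return files[-n:] if n else []
    if mode == "modified_after":
        cutoff = after.timestamp()
        return [p for p in files if p.stat().st_mtime > cutoff]
    return files


def append_files(paths, delimiter=","):
    """Vertically append files that share the same column names."""
    header, rows = None, []
    for path in paths:
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.reader(f, delimiter=delimiter)
            file_header = next(reader, None)
            if file_header is None:
                continue  # skip empty files
            if header is None:
                header = file_header
            elif file_header != header:
                raise ValueError(f"{path} has different column names")
            rows.extend(reader)  # data rows are stacked below earlier files
    return header, rows
```

For example, `append_files(select_files("daily_exports", "most_recent", n=7))` would stack the seven most recently modified files of a (hypothetical) folder `daily_exports` into one table, provided they all share the same header line.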
Specifying Data Types
You can specify the type of each column using the drop-down menu in the preview dialog. The most important types are:
- Value = interpret column as numbers
- Categorical = interpret column as categories, keeping the strings exactly as they appear in the file
- Date/Time = interpret column as timestamps, with automatic time format detection
The CSV import interface tries to determine the data types itself by analyzing the first 200 lines, but this does not always deliver the desired result. Before loading your dataset, it is therefore worth scrolling to the right to check the types of all columns and adjusting them where necessary.
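To illustrate why sample-based detection can go wrong, here is a minimal Python sketch of the general idea. It is not Visplore's detection logic: the function names `guess_column_type` and `guess_types` are hypothetical, the sample is assumed to be a header line plus up to 200 data rows, and Date/Time detection is simplified to ISO 8601 timestamps only.

```python
import csv
import io
from datetime import datetime
from itertools import islice


def _all_parse(values, parser):
    """Return True if every value can be parsed by `parser`."""
    try:
        for v in values:
            parser(v)
        return True
    except ValueError:
        return False


def guess_column_type(values):
    """Guess one column's type from sample values (simplified assumption)."""
    values = [v for v in values if v.strip()]  # ignore empty cells
    if not values:
        return "Categorical"
    if _all_parse(values, float):
        return "Value"
    if _all_parse(values, datetime.fromisoformat):
        return "Date/Time"
    return "Categorical"


def guess_types(text, delimiter=",", sample_lines=200, overrides=None):
    """Guess types for all columns, then apply manual overrides on top."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    sample = list(islice(reader, sample_lines))  # only the sample is inspected
    types = {}
    for i, name in enumerate(header):
        column = [row[i] for row in sample if i < len(row)]
        types[name] = guess_column_type(column)
    types.update(overrides or {})  # manual choices win over detection
    return types
```

In this sketch, a column of postal codes such as "01234" consists only of digits and would be guessed as "Value", although it is better treated as "Categorical". Passing `overrides={"zip": "Categorical"}` for such a (hypothetical) column corresponds to correcting the type by hand in the preview dialog.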