Reshaping long data tables to wide data tables
Here, the user can perform long-to-wide transformations.
- 1. Key columns: The columns to be used as keys while transformation can be selected here.
- 2. Select/unselect all: Use to select or unselect all the columns for reshaping
- 3. Search: This bar could be used for searching data attributes
- 4. Columns to be reshaped: Select all the data attributes to be reshaped using the keys.
- 5. Splitter columns: Select additional columns that are to be used as splitters
Example:
Let’s examine the following data. There are two plants and three machines per plant generating a power output. Thus, for each hourly timestamp, there are six entries.
data:image/s3,"s3://crabby-images/c84f1/c84f159abef594ba8c70881d780c320ef8df7e9e" alt=""
Note that at Plant 1, the operator recorded the values in the order Machine1-Machine2-Machine3. However, the operator at Plant 2 used the opposite order.
The analyst wants to use the plant information as a key and roll out other columns.
data:image/s3,"s3://crabby-images/ce864/ce86416376c5ef30c93da04818636fb24a802e6a" alt=""
data:image/s3,"s3://crabby-images/6ac32/6ac32296f7e47d9d5bd9f24486d6959c3a458ff4" alt=""
The analyst is unsatisfied with the table format and decides not to reshape the machine column by unchecking it.
data:image/s3,"s3://crabby-images/7e4e5/7e4e51bc541b594c79799628c8861000f9bd4f4e" alt=""
data:image/s3,"s3://crabby-images/da389/da3892ad36b6ff446ce9f2a523215a9e6f882404" alt=""
Notice that the ‘Machine’ column used the first value per rolled-out row. The analyst is not happy about that because the data is misleading now. For example, the transformed table indicates that at 8:00, generation was 100 and 130 kW for machine 1 at the respective plants. However, this is incorrect, as seen from the original data.
The analyst uses the machine column as a splitter to correct that mistake.
data:image/s3,"s3://crabby-images/1b8ff/1b8ffb2efa4a347cacb221e0b7b20a0c8c52f42c" alt=""
data:image/s3,"s3://crabby-images/8d7d7/8d7d7c38cecafbe769ff07301719c65de0979f44" alt=""
Please take a moment to notice the difference between this and the earlier resulting tables. In this case, the reshaping is performed per timestamp and machine ID.