I've been given a few sets of MS Excel worksheets with a lot of nested data in places, and I have been researching for a few hours, looking for a way to collapse the rows for each 'ID' into a single row. Specifically, I want to merge 'Step ID', 'Install Step', and 'Expected step' into single cells with some formatting.
Here is a shortened sample of the data in the Excel sheets I need to convert.
| Name | ID | Host | Step ID | Install Step | Expected step | Extra |
|---|---|---|---|---|---|---|
| Test1 | 4 | Cat | 1 | Move x to y | x is with y | x will protest |
| | | | 2 | move x away from y | x and y are not together | y will protest |
| Test2 | 5 | Dog | 1 | remove x from tank | y is alone | |
| | | | 2 | Drop duplicate of y, y2 in tank | y1 is not alone | y1 will protest |
| | | | 3 | Drop more duplicates of y into tank, y3 and y4 | | y1 and y2 will protest |
| Test3 | 6 | Dog | 1 | empty tank | nothing is in tank | |
I am looking to transform this Excel sheet into the following:
| Name | ID | Host | Install Step | Expected step | Extra |
|---|---|---|---|---|---|
| Test1 | 4 | Cat | 1 - Move x to y<br />2 - move x away from y | 1 - x is with y<br />2 - x and y are not together | 1 - x will protest<br />2 - y will protest |
| Test2 | 5 | Dog | 1 - remove x from tank<br />2 - Drop duplicate of y, y2 in tank<br />3 - Drop more duplicates of y into tank, y3 and y4 | 1 - y is alone<br />2 - y1 is not alone | 2 - y1 will protest<br />3 - y1 and y2 will protest |
| Test3 | 6 | Dog | 1 - empty tank | 1 - nothing is in tank | |
I have tested a few of the other Stack Overflow questions and responses for pandas, but the few that closely match my need just fill in the empty areas with duplicate data.
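
For reference, here is a minimal sketch of one way the transformation might be done in pandas. It rests on a few assumptions: the blank 'Name', 'ID', and 'Host' cells on continuation rows load as missing values, every row has a 'Step ID', and a newline is an acceptable separator inside the merged cell. The function name `collapse_steps` and the `sep` parameter are hypothetical names for illustration. Only the three identifying columns are forward-filled, so the step columns are not padded with duplicate data.

```python
import pandas as pd

KEY_COLS = ["Name", "ID", "Host"]
STEP_COLS = ["Install Step", "Expected step", "Extra"]


def collapse_steps(df, sep="\n"):
    """Collapse the step rows of each test into one row (hypothetical helper)."""
    df = df.copy()
    # Forward-fill ONLY the identifying columns so continuation rows
    # can be grouped with their parent row.
    df[KEY_COLS] = df[KEY_COLS].ffill()
    df["ID"] = df["ID"].astype(int)

    # Prefix each non-empty step cell with its step number, e.g. "2 - text".
    for col in STEP_COLS:
        df[col] = [
            None if pd.isna(text) else f"{int(step)} - {text}"
            for step, text in zip(df["Step ID"], df[col])
        ]

    # Join the labelled cells of each group, skipping the empty ones.
    return (
        df.groupby(KEY_COLS, sort=False)[STEP_COLS]
        .agg(lambda s: sep.join(s.dropna()))
        .reset_index()
    )


# Sample rows mirroring the table above; with a real workbook this frame
# would come from pd.read_excel(...) instead.
sample = pd.DataFrame({
    "Name": ["Test1", None, "Test2", None, None, "Test3"],
    "ID": [4, None, 5, None, None, 6],
    "Host": ["Cat", None, "Dog", None, None, "Dog"],
    "Step ID": [1, 2, 1, 2, 3, 1],
    "Install Step": [
        "Move x to y",
        "move x away from y",
        "remove x from tank",
        "Drop duplicate of y, y2 in tank",
        "Drop more duplicates of y into tank, y3 and y4",
        "empty tank",
    ],
    "Expected step": [
        "x is with y",
        "x and y are not together",
        "y is alone",
        "y1 is not alone",
        None,
        "nothing is in tank",
    ],
    "Extra": [
        "x will protest",
        "y will protest",
        None,
        "y1 will protest",
        "y1 and y2 will protest",
        None,
    ],
})

result = collapse_steps(sample)
print(result.to_string(index=False))
```

The result could then be written back out with `result.to_excel(...)`. If merged cells in the real workbook load differently (for example as empty strings rather than missing values), they would need to be converted to missing values before the forward fill.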