I have a pandas DataFrame with a number of columns, some of which can be grouped hierarchically. I would like to use this grouping to turn the flat column structure into a nested structure that can be used in a machine learning environment.

Example:

My DataFrame has the columns run, obj_id, data1 and data2, and can look as follows:

Index    run    obj_id    data1    data2
0        0      0         1.3134   3.4943
1        0      0         2.3311   5.4434
2        1      0         1.3345   6.9942
3        1      0         3.4422   3.5353
4        0      1         4.2233   0.3112

and so on. First, I would like to train a separate model for each obj_id. Second, each run should be treated as a batch. Finally, the data columns should be the features.

The result would probably look like this:

X = [ # obj_id: model
      [ # run: batch
        [ # data_: features
          [1.3134, 3.4943], 
          [2.3311, 5.4434]
        ], 
        [
          [1.3345, 6.9942], 
          [3.4422, 3.5353]
        ]
      ],
      # ... one entry per remaining obj_id
    ]

Is there an easy way to do that transformation?

1 Answer

Not the best solution, but you can nest two groupby calls:

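(df.groupby('obj_id')
   # select the feature columns with a list; tuple-style ['data1','data2'] fails on recent pandas
   .apply(lambda x: x.groupby('run')[['data1', 'data2']]
                     .apply(lambda y: y.values.tolist())  # rows of one run -> list of feature rows
                     .to_list()                           # one entry per run
         )
   .to_list()                                             # one entry per obj_id
)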

Output:

[
    [
        [
            [1.3134, 3.4943], 
            [2.3311, 5.4434]
        ], 
        [
            [1.3345, 6.9942], 
            [3.4422, 3.5353]
        ]
    ],
    [
        [
            [4.2233, 0.3112]
        ]
    ]
]
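
For reference, the same nesting can be sketched without apply by iterating over the groups directly. This is only a minimal sketch: the DataFrame below is rebuilt from the five sample rows in the question, and the name feature_cols is an assumption introduced for illustration.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "run":    [0, 0, 1, 1, 0],
    "obj_id": [0, 0, 0, 0, 1],
    "data1":  [1.3134, 2.3311, 1.3345, 3.4422, 4.2233],
    "data2":  [3.4943, 5.4434, 6.9942, 3.5353, 0.3112],
})

# Hypothetical name for the feature columns.
feature_cols = ["data1", "data2"]

# Outer list: one entry per obj_id (model).
# Middle list: one entry per run (batch) within that obj_id.
# Inner lists: one feature row per original DataFrame row.
X = [
    [run_df[feature_cols].values.tolist() for _, run_df in obj_df.groupby("run")]
    for _, obj_df in df.groupby("obj_id")
]

print(X)
```

Because groupby sorts its keys by default, the entries come out ordered by obj_id and then by run. The batches can have different lengths, so the result stays a ragged nested list rather than a regular array.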

6 Comments

Is that different from df.groupby(["obj_id", "run"]).aggregate(list).values.tolist()?
I didn’t try that, but that does look much better, if it gives the desired output.
I think the problem with my suggestion is that it combines the grouping of obj_id and run into one hierarchical layer instead of handling them separately...
Then my solution, which gives the correct hierarchy, is different.
I'll give it a shot. Though maybe there's also a more elegant solution. Thanks
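
To make the difference discussed in these comments concrete, here is a small sketch that runs the single two-key groupby on the sample rows from the question. The variable name flat is an assumption introduced for illustration.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "run":    [0, 0, 1, 1, 0],
    "obj_id": [0, 0, 0, 0, 1],
    "data1":  [1.3134, 2.3311, 1.3345, 3.4422, 4.2233],
    "data2":  [3.4943, 5.4434, 6.9942, 3.5353, 0.3112],
})

# Grouping on both keys at once yields one entry per (obj_id, run) pair,
# so obj_id and run share a single level of nesting.
flat = df.groupby(["obj_id", "run"]).aggregate(list).values.tolist()

# Each entry holds one list per column (all data1 values, then all data2
# values), not one feature row per original DataFrame row.
print(len(flat))
print(flat[0])
```

On this sample there are three (obj_id, run) pairs, so flat has three top-level entries, whereas the nested approach from the answer has two, one per obj_id.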