I have a DataFrame whose columns are a MultiIndex, like the following (image: "Multi-index dataframe by columns").

I would like to get three DataFrames, one named after each second-level column (compass, accel, gyro), each keeping the time index untouched and containing three columns (df1, df2, df3).

I've tried for index, row in df.iterrows(): but couldn't get it to work. I was also thinking of something with stack() and unstack(), but I don't really know how.

2 Answers

groupby lets you split the DataFrame along a MultiIndex level, putting the columns that share a level value into the same group. We then use DataFrame.xs to drop the grouping level from each group, leaving only the columns you care about. The separate DataFrames are stored in a dictionary, keyed by the unique level-1 values of the original column MultiIndex.

Sample Data

import pandas as pd
import numpy as np
np.random.seed(123)
df = pd.DataFrame(np.random.randint(1, 10, (4, 9)),
                  columns=pd.MultiIndex.from_product([['df1', 'df2', 'df3'],
                                                      ['compass', 'gyro', 'accel']]))
#      df1                df2                df3           
#  compass gyro accel compass gyro accel compass gyro accel
#0       3    3     7       2    4     7       2    1     2
#1       1    1     4       5    1     1       5    2     8
#2       4    3     5       8    3     5       9    1     8
#3       4    5     7       2    6     7       3    2     9

Code

d = {idx: gp.xs(idx, level=1, axis=1) for idx,gp in df.groupby(level=1, axis=1)}
d['gyro']
#   df1  df2  df3
#0    3    4    1
#1    1    1    2
#2    3    3    1
#3    5    6    2

As such splits are readily available with a groupby you may not even need to store the separate DataFrames; you can manipulate each of them separately with GroupBy.apply.
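In recent pandas releases, passing axis=1 to groupby is deprecated, so the dictionary above can also be built without groupby. The following is a minimal sketch under that assumption, reusing the sample data from this answer; the names sensors and frames are illustrative and not from the original code.

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer above.
np.random.seed(123)
df = pd.DataFrame(
    np.random.randint(1, 10, (4, 9)),
    columns=pd.MultiIndex.from_product(
        [['df1', 'df2', 'df3'], ['compass', 'gyro', 'accel']]
    ),
)

# Unique labels on the second column level (level=1).
sensors = df.columns.get_level_values(1).unique()

# One cross-section per label; xs drops the selected level,
# so each frame keeps the row index and has columns df1, df2, df3.
frames = {name: df.xs(name, level=1, axis=1) for name in sensors}

print(frames['gyro'])
```

Each value in frames has the same row index as df, so a time index would be carried over unchanged.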

2 Comments

Thank you! I was trying to get them to separate csv files. I'll see if I can use the groupby.apply
@MrAce2C In that case, just do something like:
    for idx, gp in df.groupby(level=1, axis=1):
        gp.xs(idx, level=1, axis=1).to_csv(f'{idx}.csv')
Instead of storing them in a dict, this creates 3 separate CSVs: 'gyro.csv', 'accel.csv' and 'compass.csv'.
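For writing the CSVs, a version that avoids groupby with axis=1 (deprecated in recent pandas) might look like the sketch below. It assumes the sample data from the answer, and it writes into a temporary directory (out_dir, a hypothetical location chosen only for this example) rather than the working directory.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Sample data with the same column layout as in the answer.
np.random.seed(123)
df = pd.DataFrame(
    np.random.randint(1, 10, (4, 9)),
    columns=pd.MultiIndex.from_product(
        [['df1', 'df2', 'df3'], ['compass', 'gyro', 'accel']]
    ),
)

# Hypothetical output location; replace with your own directory.
out_dir = Path(tempfile.mkdtemp())

# Write one CSV per second-level label, keeping the row index.
for name in df.columns.get_level_values(1).unique():
    df.xs(name, level=1, axis=1).to_csv(out_dir / f'{name}.csv')

written = sorted(p.name for p in out_dir.glob('*.csv'))
print(written)
```

Reading a file back with pd.read_csv(path, index_col=0) should restore the row index as the first column of the CSV.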

You can save the first 3 columns to a CSV file, then repeat the process two more times for the other CSV files.

You can select the 3 columns for your DataFrame like this:

x = 0
# line_header (rows to skip) is not defined in this answer; set it for your file
data = pd.read_csv('file.csv', keep_default_na=False, skiprows=line_header, na_filter=False, usecols=[x, x+1, x+2])[['compass', 'accel', 'gyro']]

where x is the position of the first column to read from the "big dataframe".

The usecols parameter is really useful in this case.

You can read more about it in: Pandas.read_csv
