Split a multi-index dataframe in dataframes by column names

Question

I have a dataframe like the following: [image: Multi-index dataframe by columns]

I would like to get 3 dataframes, one named after each column (compass, accel, gyro), each with the time index untouched and three columns (df1, df2, df3).

I've tried for index, row in df.iterrows(): but couldn't really get it to work. I was also thinking of something with stack() and unstack(), but I don't really know how.

Welcome to Stack Overflow. Please take the time to read this post on how to provide a great pandas example.
– anky, Commented Jan 22, 2020 at 17:17

ALollz · Accepted Answer · 2020-01-22 18:31:57Z

2

groupby allows you to split the DataFrame along a MultiIndex level with the same level_values. We will use DataFrame.xs to remove the grouping Index level, leaving you with only the columns you care about. Separate DataFrames are stored in a dictionary, keyed by the unique level-1 values of the original column MultiIndex.

Sample Data

import pandas as pd
import numpy as np
np.random.seed(123)
df = pd.DataFrame(np.random.randint(1, 10, (4, 9)),
                  columns=pd.MultiIndex.from_product([['df1', 'df2', 'df3'],
                                                      ['compass', 'gyro', 'accel']]))
#      df1                df2                df3           
#  compass gyro accel compass gyro accel compass gyro accel
#0       3    3     7       2    4     7       2    1     2
#1       1    1     4       5    1     1       5    2     8
#2       4    3     5       8    3     5       9    1     8
#3       4    5     7       2    6     7       3    2     9

Code

d = {idx: gp.xs(idx, level=1, axis=1) for idx,gp in df.groupby(level=1, axis=1)}
d['gyro']
#   df1  df2  df3
#0    3    4    1
#1    1    1    2
#2    3    3    1
#3    5    6    2
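
In newer pandas releases the axis=1 argument to groupby is, as far as I know, deprecated, so the comprehension above may emit a warning or fail depending on the installed version. Below is a minimal sketch of the same split that avoids groupby altogether, assuming the same two-level column layout as the sample data. The small frame and the name frames are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative frame with the same column layout as the sample data.
columns = pd.MultiIndex.from_product([['df1', 'df2', 'df3'],
                                      ['compass', 'gyro', 'accel']])
df = pd.DataFrame(np.arange(36).reshape(4, 9), columns=columns)

# One DataFrame per unique level-1 label. xs drops that level, so each
# result keeps the original index and has the columns df1, df2, df3.
frames = {key: df.xs(key, level=1, axis=1)
          for key in df.columns.get_level_values(1).unique()}

print(frames['gyro'])
```

The dictionary has the same shape as d in the answer: frames['gyro'], frames['accel'] and frames['compass'] each hold one sensor across df1, df2 and df3.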

Because such splits are readily available from a groupby, you may not even need to store the separate DataFrames; you can manipulate each of them separately with GroupBy.apply.
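
As an illustration of working on each group without storing it, the sketch below computes the row-wise mean of each sensor across df1, df2 and df3. It uses a plain aggregation rather than GroupBy.apply, and it transposes first so the grouping runs along the index; both are choices made here for the example, not part of the answer. The data is made up.

```python
import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_product([['df1', 'df2', 'df3'],
                                      ['compass', 'gyro', 'accel']])
df = pd.DataFrame(np.arange(36).reshape(4, 9), columns=columns)

# Transpose so the column MultiIndex becomes the row index, group on
# level 1 (the sensor name), aggregate, then transpose back.
sensor_means = df.T.groupby(level=1).mean().T

print(sensor_means)
```

The result keeps the original row index and has one column per sensor, so no intermediate dictionary is needed for this kind of summary.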

edited Jan 22, 2020 at 18:31

answered Jan 22, 2020 at 18:26

ALollz


2 Comments

MrAce2C Over a year ago

Thank you! I was trying to get them to separate csv files. I'll see if I can use the groupby.apply

ALollz Over a year ago

@MrAce2C then in that case just do something like:

for idx, gp in df.groupby(level=1, axis=1):
    gp.xs(idx, level=1, axis=1).to_csv(f'{idx}.csv')

Instead of storing them in a dict, it will create 3 separate CSVs: 'gyro.csv', 'accel.csv' and 'compass.csv'.
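
A self-contained sketch of the same CSV export, written without groupby(axis=1) in case the installed pandas version no longer accepts it. The temporary output directory and the sample frame are assumptions made only for this example.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_product([['df1', 'df2', 'df3'],
                                      ['compass', 'gyro', 'accel']])
df = pd.DataFrame(np.arange(36).reshape(4, 9), columns=columns)

# Temporary directory chosen for illustration; point this at your own
# output folder instead.
out_dir = Path(tempfile.mkdtemp())

for key in df.columns.get_level_values(1).unique():
    # Each file keeps the original index and has columns df1, df2, df3.
    df.xs(key, level=1, axis=1).to_csv(out_dir / f'{key}.csv')

written = sorted(p.name for p in out_dir.iterdir())
print(written)
```

Reading a file back with pd.read_csv(path, index_col=0) should restore the index that to_csv wrote as the first column.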

João Victor Fernandes · Accepted Answer · 2020-01-22 17:25:49Z

1

You can save the first 3 columns to a CSV file, and repeat the process 2 more times for the other CSV files.

You can select the 3 columns for your dataframe like this:

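x = 0  # position of the first of the three columns to read
# line_header is the number of header rows to skip; set it for your file
data = pd.read_csv('file.csv', keep_default_na=False, skiprows=line_header, na_filter=False, usecols=[x, x+1, x+2])[['compass', 'accel', 'gyro']]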

where x is the position of the first column to read from the "big dataframe".

The usecols parameter is really useful in this case.

You can read more about it in the pandas.read_csv documentation.
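
A minimal runnable sketch of the usecols idea, assuming a CSV with two header rows and nine data columns laid out like the sample data in the other answer. The CSV text, the header=None choice and the manual column labels are assumptions for illustration; the real file may differ.

```python
import io

import pandas as pd

# Made-up CSV with two header rows and nine data columns.
csv_text = (
    "df1,df1,df1,df2,df2,df2,df3,df3,df3\n"
    "compass,gyro,accel,compass,gyro,accel,compass,gyro,accel\n"
    "3,3,7,2,4,7,2,1,2\n"
    "1,1,4,5,1,1,5,2,8\n"
)

x = 3  # position of the first column of the block to read (here: df2)

# Skip both header rows, read only three adjacent columns by position,
# then label them by hand.
block = pd.read_csv(io.StringIO(csv_text), header=None, skiprows=2,
                    usecols=[x, x + 1, x + 2])
block.columns = ['compass', 'gyro', 'accel']

print(block)
```

Note that three adjacent positions select one top-level block (df1, df2 or df3) in this layout. To collect a single sensor across the three blocks, as the question asks, the positions would be spread out instead, for example usecols=[1, 4, 7] for gyro.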

answered Jan 22, 2020 at 17:25

João Victor Fernandes
