I have a pandas DataFrame and I want to group some of its columns under higher-level column labels:

Example: I have

    Index       A       B       C       D
    1        0.25     0.3    0.25    0.66
    2        0.25     0.3    0.25    0.66
    3        0.25     0.3    0.25    0.66

and I want

    Index              AB        ||           CD
    Subindex       A   |      B  ||      C    |      D 
    1            0.25  |    0.3  ||   0.25    |    0.66
    2            0.25  |    0.3  ||   0.25    |    0.66
    3            0.25  |    0.3  ||   0.25    |    0.66

Thank you for your help...

  • "i have ","i want "...and you've tried? Commented Dec 10, 2018 at 21:52
  • Check multiple index Commented Dec 10, 2018 at 21:53

2 Answers

Create a dictionary that maps each column to its group and build the new columns with pd.MultiIndex.from_tuples. If needed, you can also pass names=['level_0', 'level_1'] to name the two levels.

import pandas as pd

d = {'A': 'AB', 'B': 'AB', 'C': 'CD', 'D': 'CD'}
df.columns = pd.MultiIndex.from_tuples([*zip(map(d.get, df), df)])
# Equivalently
# df.columns = pd.MultiIndex.from_tuples([(d[col], col) for col in df.columns])

Output:

         AB         CD      
          A    B     C     D
Index                       
1      0.25  0.3  0.25  0.66
2      0.25  0.3  0.25  0.66
3      0.25  0.3  0.25  0.66
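
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the question, and the level names 'group' and 'column' are placeholders chosen for illustration, not something the question specifies.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame(
    {'A': [0.25] * 3, 'B': [0.3] * 3, 'C': [0.25] * 3, 'D': [0.66] * 3},
    index=pd.Index([1, 2, 3], name='Index'),
)

# Map each existing column to the higher-level label it belongs to
d = {'A': 'AB', 'B': 'AB', 'C': 'CD', 'D': 'CD'}

# Replace the flat columns with (group, column) pairs.
# The level names are hypothetical; pick whatever suits your data.
df.columns = pd.MultiIndex.from_tuples(
    [(d[col], col) for col in df.columns],
    names=['group', 'column'],
)

ab = df['AB']         # sub-frame holding columns A and B
b = df[('AB', 'B')]   # a single column as a Series
```

Only the columns object is replaced here, so the underlying data is not copied.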

4 Comments

Hi, thank you for your answer, but I've already written a double for loop that fills a new DataFrame as you described, and it doesn't seem to be the fastest option. Is there no way to make the changes in place?
@Tbertin your question/comment doesn't make a lot of sense. This answer does alter the dataframe in place and should be pretty fast as it is only altering the columns object.
@ALollz pd.MultiIndex.from_tuples([*zip(map(d.get, df), df)]) as a fun alternative. Yours is more readable of course (-:
Thanks :D. Definitely need to just commit that syntax to memory very soon.

groupby / concat hack

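m = {'A': 'AB', 'B': 'AB', 'C': 'CD', 'D': 'CD'}
# Group the columns by their mapped label, then concat the pieces under those labels.
# Note: grouping along axis=1 is deprecated since pandas 2.1, so this runs cleanly only on older versions.
pd.concat(dict((*df.groupby(m, axis=1),)), axis=1)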

         AB         CD      
          A    B     C     D
Index                       
1      0.25  0.3  0.25  0.66
2      0.25  0.3  0.25  0.66
3      0.25  0.3  0.25  0.66

Note that with this method you can select an arbitrary subset of the columns of the original DataFrame, whereas the other answer appears to require a dictionary entry for every column in the DataFrame.
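
Since grouping along the column axis is deprecated in recent pandas, the same subset idea can be sketched without groupby. This is a swapped-in variant, not the code from this answer: it collects the mapped columns per label in a plain dict and passes that to pd.concat. The extra column 'F' is made up here to show an unmapped column being left out.

```python
import pandas as pd

# Sample data from the question plus a hypothetical unmapped column 'F'
df = pd.DataFrame(
    {'A': [0.25] * 3, 'B': [0.3] * 3, 'C': [0.25] * 3,
     'D': [0.66] * 3, 'F': [1.0] * 3},
    index=pd.Index([1, 2, 3], name='Index'),
)

m = {'A': 'AB', 'B': 'AB', 'C': 'CD', 'D': 'CD'}

# Collect the columns that belong to each label; unmapped columns are skipped
groups = {}
for col in df.columns:
    if col in m:
        groups.setdefault(m[col], []).append(col)

# The dict keys become the outer column level
result = pd.concat({key: df[cols] for key, cols in groups.items()}, axis=1)
```

Unlike assigning to df.columns, this builds a new DataFrame and leaves df untouched.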

3 Comments

Sorry to ask again, but how can I now group AB and CD under a higher level? In the end I would like to be able to write DataFrame['ABCD']['AB']['A'], for example. The same logic doesn't seem to work...
What if I have a column "F" that I don't want to group?
pd.concat(dict((*df.drop(cols2skip, axis=1).groupby(m, 1),)), axis=1), where cols2skip is a list of columns to exclude. If there is only one such column: pd.concat(dict((*df.drop('F', axis=1).groupby(m, 1),)), axis=1)
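
On the three-level question in the comments above, one possible approach is to build three-element tuples instead of pairs. This is a rough sketch under assumptions: the second mapping (named top here) and the label 'ABCD' are taken from the comment, and it uses pd.MultiIndex.from_tuples rather than the groupby / concat route.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame(
    {'A': [0.25] * 3, 'B': [0.3] * 3, 'C': [0.25] * 3, 'D': [0.66] * 3},
    index=pd.Index([1, 2, 3], name='Index'),
)

mid = {'A': 'AB', 'B': 'AB', 'C': 'CD', 'D': 'CD'}  # column -> middle level
top = {'AB': 'ABCD', 'CD': 'ABCD'}                  # middle level -> top level (assumed)

# One (top, middle, column) tuple per original column
df.columns = pd.MultiIndex.from_tuples(
    [(top[mid[col]], mid[col], col) for col in df.columns]
)

a_chained = df['ABCD']['AB']['A']    # chained selection, as asked in the comment
a_direct = df[('ABCD', 'AB', 'A')]   # same column selected with a single tuple
```

Selecting with a single tuple avoids creating the intermediate sub-frames that chained indexing produces.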
