
I have a MultiIndex dataframe with the top level columns named:

Col1_1 | Col1_2 | Col2_1 | Col2_2 | ... |

I'm looking to combine Col1_1 with Col1_2 as Col1. I could also do this before creating the MultiIndex, but the original data is laid out flat, with one column per sample/aspect pair:

Col1_1.aspect1 | Col1_1.aspect2 | Col1_2.aspect1 | Col1_2.aspect2 | ... |

where 'aspect1' and 'aspect2' become subcolumns in the MultiIndex.
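For illustration, here is a minimal sketch of that layout. The values are made up, and splitting the flat names on the first "." is only an assumption about how the MultiIndex gets built:

```python
import pandas as pd

# Hypothetical flat data using the column naming described above.
flat = pd.DataFrame(
    [[1, 2, 3, 4]],
    columns=[
        "Col1_1.aspect1", "Col1_1.aspect2",
        "Col1_2.aspect1", "Col1_2.aspect2",
    ],
)

# Split each name on the first "." so the part before it becomes the
# top level and the part after it becomes the sub-column.
df = flat.copy()
df.columns = pd.MultiIndex.from_tuples(
    [tuple(name.split(".", 1)) for name in flat.columns]
)
print(df)
```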

Please let me know if I can clarify anything, and many thanks in advance.

Current df: [image of the df snippet not included]

The expected result combines the two into a single Sample1. Any approach is fine: stacking/concatenating the data, outputting a summary statistic such as mean(), etc.

  • share df.head() Commented Jan 26, 2017 at 17:18
  • I've previously found similar questions, e.g. stackoverflow.com/questions/41221079/… , but I don't believe this is quite right for this problem. Commented Jan 26, 2017 at 17:19
  • again, share a sample and share an example of your expected result Commented Jan 26, 2017 at 17:20
  • :) we commented ~simultaneously as you can see by the time stamps, I didn't ignore your share request. I've uploaded a snip of the df (it contains hundreds of cols, thousands of rows). Many outputs would work here, as noted above. Thanks. Commented Jan 26, 2017 at 17:33
  • thanks, so what does that mean for your actual columns? Are you trying to merge the gtype, score, etc. columns from sample11 and sample12 into one column each, or something else? Commented Jan 26, 2017 at 17:36

1 Answer


You can use groupby and apply an aggregation function such as mean. Group along axis 1 (the columns) on level 1 (the lower level of the column MultiIndex). This applies the grouping across all samples. Then take the mean, if that is what you want to achieve:

df.groupby(level=1, axis=1).mean()
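A minimal sketch of this on made-up data, in case it helps. As far as I know, newer pandas releases deprecate the axis=1 argument of groupby, so the sketch transposes, groups the rows, and transposes back, which should be equivalent. The column names, the values, and the "_1"/"_2" suffix convention are assumptions taken from the question. The second variant only merges the replicates of each parent column (Col1_1 with Col1_2) instead of all samples:

```python
import pandas as pd

# Hypothetical frame: four top-level columns, two sub-columns each.
columns = pd.MultiIndex.from_product(
    [["Col1_1", "Col1_2", "Col2_1", "Col2_2"], ["aspect1", "aspect2"]]
)
df = pd.DataFrame(
    [[1.0, 10.0, 3.0, 30.0, 5.0, 50.0, 7.0, 70.0],
     [2.0, 20.0, 4.0, 40.0, 6.0, 60.0, 8.0, 80.0]],
    columns=columns,
)

# Same grouping as above: average each sub-column across all samples.
# Transposing first avoids passing axis=1 to groupby.
across_all = df.T.groupby(level=1).mean().T

# Variant: strip the trailing "_<n>" from the top level so Col1_1 and
# Col1_2 both become Col1, then group on both column levels.
renamed = df.rename(columns=lambda name: name.rsplit("_", 1)[0], level=0)
per_parent = renamed.T.groupby(level=[0, 1]).mean().T

print(across_all)
print(per_parent)
```

The second result keeps a two-level column index (Col1/Col2 on top, aspect1/aspect2 below), which may be closer to the "combine Col1_1 with Col1_2 as Col1" output described in the question.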

2 Comments

Thanks! Knowing 'levels' is really helpful for a Python newbie like myself. This is a great start, though I receive: 'DataError: No numeric types to aggregate'. I'll see if I can find the issue then add it here and mark accepted to close it out.
That's likely because of your text columns: filter them out of the dataframe before grouping.
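One possible way to do that filtering, sketched on made-up data: select_dtypes keeps only the numeric columns, so the mean has nothing non-numeric to choke on. The gtype and score sub-column names come from the comments under the question; the top-level names and values are hypothetical:

```python
import pandas as pd

# Hypothetical frame mixing a text sub-column (gtype) with a numeric one (score).
columns = pd.MultiIndex.from_product(
    [["Col1_1", "Col1_2"], ["gtype", "score"]]
)
df = pd.DataFrame(
    [["AA", 1.0, "AB", 3.0],
     ["BB", 2.0, "BB", 4.0]],
    columns=columns,
)

# Drop the text columns first, then group the remaining columns by
# their lower level and average them.
numeric = df.select_dtypes(include="number")
combined = numeric.T.groupby(level=1).mean().T
print(combined)
```

The text columns would need a different way of combining (for example, concatenation), since a mean is not defined for them.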
