
I have a DataFrame with MultiIndex columns (about 200 columns in total), and I would like to select specific columns from it. Suppose df is part of my DataFrame:

df=
                       a                             b
                       l       h     l       h       l       h      l    
                      cold    hot    hot    cold    cold     hot   hot
2009-01-01 01:00:00   0.1     0.9    0.4    0.29    0.15     0.6    0.3
2009-01-01 02:00:00   0.1     0.8    0.35   0.2     0.15     0.6    0.4
2009-01-01 03:00:00   0.12    0.7    0.3    0.23    0.23     0.8    0.3
2009-01-01 04:00:00   0.1     0.9    0.33   0.24    0.15     0.6    0.4
2009-01-01 05:00:00   0.17    0.9    0.41   0.23    0.18     0.75   0.4

I would like to select the values for the columns labelled [h, hot].

My output should be:

df['h','hot']=
                       a      b
2009-01-01 01:00:00   0.9   0.6
2009-01-01 02:00:00   0.8   0.6
2009-01-01 03:00:00   0.7   0.8
2009-01-01 04:00:00   0.9   0.6
2009-01-01 05:00:00   0.9   0.75

I would appreciate any guidance on how I could select that.

  • I think that df['b','h','hot'] should just work here for hierarchical columns Commented Oct 14, 2016 at 13:27
  • Please post list(df.columns). This will help us see if there are errant spaces... Commented Oct 14, 2016 at 13:29
  • df['b','h','hot'] works for me. If it doesn't work for you, post the output of df.info() and print(df.columns.tolist()) so we can see what the real column names are. Commented Oct 14, 2016 at 13:32
  • Thank you for your answers. I had left out a minor point in my question, so I have modified my post a bit. I would appreciate any guidance on the modified version of my question. Thanks. Commented Oct 14, 2016 at 13:49

2 Answers


For the multi-index slicing you want, the columns need to be sorted first using sort_index(axis=1). You can then select the columns of interest without error:

In [12]:
df = df.sort_index(axis=1)
df['a','h','hot']

Out[12]:
0
2009-01-01 01:00:00    0.9
2009-01-01 02:00:00    0.8
2009-01-01 03:00:00    0.7
2009-01-01 04:00:00    0.9
2009-01-01 05:00:00    0.9
Name: (a, h, hot), dtype: float64
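
To make the step above reproducible, here is a minimal sketch that rebuilds the sample from the question and applies the same sort_index(axis=1) call. The column layout (four columns under a, three under b) is an assumption read off the alignment of the printed table, and the names df_sorted and a_h_hot are only illustrative. It also checks that sort_index moves each column's values together with its labels.

```python
import pandas as pd

# Assumed layout, reconstructed from the table in the question.
columns = pd.MultiIndex.from_tuples([
    ("a", "l", "cold"), ("a", "h", "hot"), ("a", "l", "hot"), ("a", "h", "cold"),
    ("b", "l", "cold"), ("b", "h", "hot"), ("b", "l", "hot"),
])
index = pd.to_datetime([
    "2009-01-01 01:00:00", "2009-01-01 02:00:00", "2009-01-01 03:00:00",
    "2009-01-01 04:00:00", "2009-01-01 05:00:00",
])
data = [
    [0.10, 0.9, 0.40, 0.29, 0.15, 0.60, 0.3],
    [0.10, 0.8, 0.35, 0.20, 0.15, 0.60, 0.4],
    [0.12, 0.7, 0.30, 0.23, 0.23, 0.80, 0.3],
    [0.10, 0.9, 0.33, 0.24, 0.15, 0.60, 0.4],
    [0.17, 0.9, 0.41, 0.23, 0.18, 0.75, 0.4],
]
df = pd.DataFrame(data, index=index, columns=columns)

# Sort the column labels lexicographically; the data moves with its labels.
df_sorted = df.sort_index(axis=1)

# Select one column with a full (level 0, level 1, level 2) key.
a_h_hot = df_sorted[("a", "h", "hot")]
print(a_h_hot)

# The labels are unsorted before the call and sorted after it.
print(df.columns.is_monotonic_increasing, df_sorted.columns.is_monotonic_increasing)
```

Because sort_index reorders whole columns, df_sorted[("a", "l", "cold")] still holds the values printed under a / l / cold in the question.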

4 Comments

Thank you very much. Yes, that's exactly what I was looking for. I had tried these commands without the "sort" step and, as you said, I got that error. Could you please explain what exactly the "sort" command does? Thank you.
sort_values just sorts the columns in this case, it's a method on Index, Series, DataFrame etc
Thank you. But "sort_values" sorts the column names without moving their values, so the labels end up wrong. That means when I look up the values of df['a','h','hot'], I get the values of another column, for example df['a','l','cold']. I would appreciate any guidance on how to solve this issue.
does df = df.sort_index(axis=1) work prior to the column selection?

Try this:

import pandas as pd

dataframe = pd.DataFrame()
# Select level by level: "b", then "h", then "hot"
dataframe["temp"] = df["b"]["h"]["hot"]

Here df is your DataFrame.
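
The chained lookup above goes through the column levels one at a time. As a hedged sketch, the same column can be read with a single tuple key, and the output asked for in the edited question (the [h, hot] column under both a and b) can be approximated with DataFrame.xs or pd.IndexSlice. The sample data and its column layout are assumptions reconstructed from the question, and the names h_hot and h_hot_full are illustrative only.

```python
import pandas as pd

# Assumed layout, reconstructed from the table in the question.
columns = pd.MultiIndex.from_tuples([
    ("a", "l", "cold"), ("a", "h", "hot"), ("a", "l", "hot"), ("a", "h", "cold"),
    ("b", "l", "cold"), ("b", "h", "hot"), ("b", "l", "hot"),
])
index = pd.to_datetime([
    "2009-01-01 01:00:00", "2009-01-01 02:00:00", "2009-01-01 03:00:00",
    "2009-01-01 04:00:00", "2009-01-01 05:00:00",
])
data = [
    [0.10, 0.9, 0.40, 0.29, 0.15, 0.60, 0.3],
    [0.10, 0.8, 0.35, 0.20, 0.15, 0.60, 0.4],
    [0.12, 0.7, 0.30, 0.23, 0.23, 0.80, 0.3],
    [0.10, 0.9, 0.33, 0.24, 0.15, 0.60, 0.4],
    [0.17, 0.9, 0.41, 0.23, 0.18, 0.75, 0.4],
]
# Sort the columns up front so label-based slicing is allowed.
df = pd.DataFrame(data, index=index, columns=columns).sort_index(axis=1)

# Chained lookup (three selections) versus one tuple key (one selection).
chained = df["b"]["h"]["hot"]
direct = df[("b", "h", "hot")]

# Cross-section on levels 1 and 2: keeps only the top level (a, b) as columns.
h_hot = df.xs(("h", "hot"), level=[1, 2], axis=1)
print(h_hot)

# Same selection with IndexSlice; here all three column levels are kept.
h_hot_full = df.loc[:, pd.IndexSlice[:, "h", "hot"]]
print(h_hot_full.columns.tolist())
```

The tuple key avoids building two intermediate DataFrames, and xs drops the selected levels, which matches the a / b layout of the desired output in the question.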
