417

If I've got a multi-level column index:

>>> cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
>>> pd.DataFrame([[1,2], [3,4]], columns=cols)
    a
   ---+--
    b | c
--+---+--
0 | 1 | 2
1 | 3 | 4

How can I drop the "a" level of that index, so I end up with:

    b | c
--+---+--
0 | 1 | 2
1 | 3 | 4
2
  • 6
    It would be nice to have a DataFrame method that does that for both index and columns. Either of dropping or selecting index levels. Commented May 24, 2018 at 17:56
  • 1
    @Sören Check out stackoverflow.com/a/56080234/3198568. droplevel can work on either a multi-level index or multi-level columns through the axis parameter. Commented Apr 23, 2020 at 7:35

8 Answers

487

You can use MultiIndex.droplevel:

>>> cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
>>> df = pd.DataFrame([[1,2], [3,4]], columns=cols)
>>> df
   a   
   b  c
0  1  2
1  3  4

[2 rows x 2 columns]
>>> df.columns = df.columns.droplevel()
>>> df
   b  c
0  1  2
1  3  4

[2 rows x 2 columns]

5 Comments

It's probably best to explicitly say which level is being dropped. Levels are 0-indexed beginning from the top. >>> df.columns = df.columns.droplevel(0)
If the index you are trying to drop is on the left (row) side and not the top (column) side, you can change "columns" to "index" and use the same method: >>> df.index = df.index.droplevel(1)
In pandas version 0.23.4, df.columns.droplevel() is no longer available.
@yoonghm It is there, you are probably just calling it on columns that don't have a multi-index
My columns were three levels deep and I wanted to keep just the middle level. I found that dropping the lowest (level 2) and then the highest (level 0) worked best. >>> df.columns = df.columns.droplevel(2) >>> df.columns = df.columns.droplevel(0)
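
The two-step drop in the last comment can also be written as one call, since MultiIndex.droplevel accepts a list of levels. A minimal sketch, assuming a made-up three-level column index (the labels "top", "x" and "y" are illustrative only). The nlevels check also covers the case raised above, where droplevel is called on columns that are not multi-level:

```python
import pandas as pd

# Hypothetical three-level columns, used only for illustration.
cols = pd.MultiIndex.from_tuples([("top", "b", "x"), ("top", "c", "y")])
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# Guard: dropping a level only makes sense with more than one level.
if df.columns.nlevels > 1:
    # Drop the outermost (0) and innermost (2) levels in one call,
    # keeping only the middle level.
    df.columns = df.columns.droplevel([0, 2])

print(list(df.columns))  # ['b', 'c']
```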
140

As of Pandas 0.24.0, we can now use DataFrame.droplevel():

cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
df = pd.DataFrame([[1,2], [3,4]], columns=cols)

df.droplevel(0, axis=1) 

#   b  c
#0  1  2
#1  3  4

This is very useful if you want to keep your DataFrame method-chain rolling.
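
To illustrate the method-chaining point, here is a small sketch using the same frame as above. The assign step and the "total" column are my own additions for the example, not part of the question:

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# droplevel returns a new DataFrame, so it slots into a chain;
# the original df keeps its two-level columns.
result = (
    df
    .droplevel(0, axis="columns")
    .assign(total=lambda d: d["b"] + d["c"])  # hypothetical follow-up step
)

print(list(result.columns))  # ['b', 'c', 'total']
print(df.columns.nlevels)    # 2
```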

3 Comments

This is the "purest" solution in that a new DataFrame is returned rather than have it modified "in place".
df.droplevel(0, axis='columns') is even more explicit and easy to understand
I will come here forever, because I always forget to set axis=1.
111

Another way to drop the level is to use a list comprehension:

df.columns = [col[1] for col in df.columns]

   b  c
0  1  2
1  3  4

This strategy is also useful if you want to combine the names from both levels like in the example below where the bottom level contains two 'y's:

cols = pd.MultiIndex.from_tuples([("A", "x"), ("A", "y"), ("B", "y")])
df = pd.DataFrame([[1,2, 8 ], [3,4, 9]], columns=cols)

   A     B
   x  y  y
0  1  2  8
1  3  4  9

Dropping the top level would leave two columns with the same label 'y'. That can be avoided by joining the names with the list comprehension.

df.columns = ['_'.join(col) for col in df.columns]

    A_x A_y B_y
0   1   2   8
1   3   4   9

That's a problem I had after doing a groupby and it took a while to find this other question that solved it. I adapted that solution to the specific case here.
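
One caveat with the join: '_'.join(col) only works when every label is a string. A minimal sketch, assuming a hypothetical frame whose second level holds integers, converts each label with str first:

```python
import pandas as pd

# Hypothetical columns whose second level holds integers.
cols = pd.MultiIndex.from_tuples([("A", 1), ("A", 2), ("B", 2)])
df = pd.DataFrame([[1, 2, 8], [3, 4, 9]], columns=cols)

# '_'.join(col) would raise TypeError here because 1 and 2 are not
# strings, so convert every label to str before joining.
df.columns = ["_".join(map(str, col)) for col in df.columns]

print(list(df.columns))  # ['A_1', 'A_2', 'B_2']
```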

4 Comments

[col[1] for col in df.columns] is more directly df.columns.get_level_values(1).
Had a similar need wherein some columns had empty level values. Used the following: [col[0] if col[1] == '' else col[1] for col in df.columns]
That's awesome. I was needing an easy way to bind level + columns. Thank you.
In reply to Logan's comment: df.columns = [col[0] if col[1] == '' else '_'.join(col) for col in df.columns]
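
Empty level values like the ones in the last two comments typically show up after a groupby with several aggregations followed by reset_index. A small sketch under that assumption (the frame and the names "k" and "v" are made up for illustration):

```python
import pandas as pd

# Hypothetical input data.
raw = pd.DataFrame({"k": ["x", "x", "y"], "v": [1, 2, 5]})

# Two aggregations on "v" give two-level columns; reset_index then
# adds the key as ('k', ''), i.e. with an empty second level.
agg = raw.groupby("k").agg({"v": ["min", "max"]}).reset_index()

# Join both levels, but keep the bare name where the second level is empty.
agg.columns = [col[0] if col[1] == "" else "_".join(col) for col in agg.columns]

print(list(agg.columns))  # ['k', 'v_min', 'v_max']
```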
54

Another way to do this is to reassign df based on a cross section of df, using the .xs method.

>>> df

    a
    b   c
0   1   2
1   3   4

>>> df = df.xs('a', axis=1, drop_level=True)

    # 'a' : key on which to get cross section
    # axis=1 : get cross section of column
    # drop_level=True : returns cross section without the multilevel index

>>> df

    b   c
0   1   2
1   3   4

3 Comments

This only works whenever there is a single label for an entire column level.
Does not work when you want to drop the second level.
This is a nice solution if you want to slice and drop for the same level. If you wanted to slice on the second level (say b) then drop that level and be left with the first level (a), the following would work: df = df.xs('b', axis=1, level=1, drop_level=True)
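
A short sketch of the second-level variant from the last comment. The extra ("d", "b") column is my own addition so that the filtering effect of xs is visible: columns that do not match the key are dropped along with the level.

```python
import pandas as pd

# Same shape as the question, plus a hypothetical ("d", "b") column.
cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c"), ("d", "b")])
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=cols)

# Select label "b" on the second level and drop that level;
# the ("a", "c") column does not match and is left out.
sub = df.xs("b", axis=1, level=1, drop_level=True)

print(list(sub.columns))  # ['a', 'd']
```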
20

A small trick is to use sum with level=1 (this works when the labels in level 1 are all unique):

df.sum(level=1,axis=1)
Out[202]: 
   b  c
0  1  2
1  3  4

A more common solution is get_level_values:

df.columns=df.columns.get_level_values(1)
df
Out[206]: 
   b  c
0  1  2
1  3  4
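
As far as I know, the level argument of DataFrame.sum was deprecated in pandas 1.3 and removed in 2.0, so the sum trick fails on current versions. A rough equivalent, offered as a sketch rather than a recommended idiom, groups the transposed frame by the level instead:

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# Transpose so the column levels become row levels, group by the
# second level, sum, and transpose back.
out = df.T.groupby(level=1).sum().T

print(list(out.columns))  # ['b', 'c']
```

Like the original trick, this sums columns that share a label in level 1, so it only reproduces the input when those labels are unique.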

18

You could also achieve that by renaming the columns:

df.columns = ['b', 'c']

This involves a manual step but could be an option especially if you would eventually rename your data frame.

1 Comment

This is essentially what Mint's first answer does. Now, there is also no need to specify the list of names (which is generally tedious), as it is given to you by df.columns.get_level_values(1).
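
The comment's suggestion can be combined with DataFrame.set_axis to rename without typing the labels and without modifying the original frame. A minimal sketch on the question's data:

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# Take the labels from the second level and return a relabelled copy;
# df itself keeps its two-level columns.
out = df.set_axis(df.columns.get_level_values(1), axis=1)

print(list(out.columns))  # ['b', 'c']
```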
8

I struggled with this problem because I could not see why droplevel() did not work for me. After some trial and error I learned that 'a' in your table is the name of the columns and 'b', 'c' are the index. Doing the following will help:

df.columns.name = None
df.reset_index()  # move the index into a regular column (returns a new DataFrame)

2 Comments

This does not reproduce the desired output at all.
Based on the date this was posted, DataFrame.droplevel might not have been included in your version of pandas (it was added in the stable release 0.24.0, in January 2019).
0
# For each column tuple, keep the last label that does not contain "unnamed".
new_columns_cdnr = []
for column in df.columns:
    new = [x for x in column if 'unnamed' not in x.lower()]
    new_columns_cdnr.append(new[-1])
df.columns = new_columns_cdnr
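
This answer comes without an explanation, so here is my reading of it, as a sketch. When a file with a multi-row header is read, pandas fills blank header cells with labels such as "Unnamed: 1_level_1". The loop keeps, for each column, the last label that is not such a placeholder. The frame below is hypothetical and built by hand rather than read from a file:

```python
import pandas as pd

# Hypothetical labels of the kind pandas generates for blank header cells.
cols = pd.MultiIndex.from_tuples(
    [("Unnamed: 0_level_0", "id"), ("sales", "Unnamed: 1_level_1"), ("sales", "q2")]
)
df = pd.DataFrame([[1, 10, 20]], columns=cols)

flat = []
for column in df.columns:
    # Discard placeholder labels, then keep the deepest remaining one.
    kept = [part for part in column if "unnamed" not in str(part).lower()]
    flat.append(kept[-1])
df.columns = flat

print(list(df.columns))  # ['id', 'sales', 'q2']
```

Note that a column whose labels are all placeholders would leave kept empty and raise IndexError, so real data may need a fallback.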
