Pandas: drop a level from a multi-level column index?

Question

If I've got a multi-level column index:

>>> cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
>>> pd.DataFrame([[1,2], [3,4]], columns=cols)

    a
   ---+--
    b | c
--+---+--
0 | 1 | 2
1 | 3 | 4

How can I drop the "a" level of that index, so I end up with:

    b | c
--+---+--
0 | 1 | 2
1 | 3 | 4

It would be nice to have a DataFrame method that does that for both index and columns, either by dropping or by selecting index levels.
– Soerendip, Commented May 24, 2018 at 17:56
@Sören Check out stackoverflow.com/a/56080234/3198568. droplevel can work on either multilevel indexes or columns through the axis parameter.
– irene, Commented Apr 23, 2020 at 7:35

President James K. Polk · Accepted Answer · 2024-10-29 14:59:43Z

487

You can use MultiIndex.droplevel:

>>> cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
>>> df = pd.DataFrame([[1,2], [3,4]], columns=cols)
>>> df
   a   
   b  c
0  1  2
1  3  4

[2 rows x 2 columns]
>>> df.columns = df.columns.droplevel()
>>> df
   b  c
0  1  2
1  3  4

[2 rows x 2 columns]

edited Oct 29, 2024 at 14:59

President James K. Polk

answered Mar 6, 2014 at 19:08

DSM

5 Comments

Ted Petrou Over a year ago

It's probably best to explicitly say which level is being dropped. Levels are 0-indexed beginning from the top. >>> df.columns = df.columns.droplevel(0)

Idodo Over a year ago

If the index you are trying to drop is on the left (row) side and not the top (column) side, you can change "columns" to "index" and use the same method: >>> df.index = df.index.droplevel(1)

yoonghm Over a year ago

In pandas version 0.23.4, df.columns.droplevel() is no longer available.

Matt Harrison Over a year ago

@yoonghm It is there, you are probably just calling it on columns that don't have a multi-index
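
One way to avoid the situation described in this comment is to check how many column levels there are before dropping one. The following is a minimal sketch; drop_top_level is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

def drop_top_level(frame):
    """Hypothetical helper: drop the outermost column level only if there is one to spare."""
    if frame.columns.nlevels < 2:
        return frame                      # plain columns: nothing to drop
    out = frame.copy()
    out.columns = out.columns.droplevel(0)
    return out

plain = pd.DataFrame([[1, 2]], columns=["b", "c"])
multi = pd.DataFrame(
    [[1, 2]], columns=pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
)
```

Both drop_top_level(plain) and drop_top_level(multi) should come back with the single-level columns b and c.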

Kyle C Over a year ago

My columns were three levels deep and I wanted to drop down to just the middle level. I found that dropping the lowest (level [2]) and then the highest (level [0]) worked best.
>>> df.columns = df.columns.droplevel(2)
>>> df.columns = df.columns.droplevel(0)
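
The two-step approach in this comment can probably be collapsed into one call, since MultiIndex.droplevel also accepts a list of levels. A minimal sketch, assuming three-level columns; the labels and the level names outer, middle, and inner are made up for illustration:

```python
import pandas as pd

# Hypothetical three-level columns, used only for illustration.
cols = pd.MultiIndex.from_tuples(
    [("top", "b", "x"), ("top", "c", "x")],
    names=["outer", "middle", "inner"],
)
df3 = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# Drop the outermost and innermost levels in one call, by position...
by_position = df3.columns.droplevel([0, 2])
# ...or by level name, which does not depend on the order of the levels.
by_name = df3.columns.droplevel(["outer", "inner"])

df3.columns = by_position
```

Passing names rather than positions avoids having to think about how the positions shift after each drop.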

jxc · Accepted Answer · 2019-05-10 15:24:12Z

140

As of Pandas 0.24.0, we can now use DataFrame.droplevel():

cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
df = pd.DataFrame([[1,2], [3,4]], columns=cols)

df.droplevel(0, axis=1) 

#   b  c
#0  1  2
#1  3  4

This is very useful if you want to keep your DataFrame method-chain rolling.
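
As a rough illustration of the method-chain point, the sketch below aggregates and then drops the outer column level without leaving the chain. The sales frame and its column names are hypothetical, and pandas 0.24.0 or later is assumed.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
sales = pd.DataFrame({"shop": ["n", "n", "s"], "qty": [1, 2, 3]})

summary = (
    sales
    .groupby("shop")[["qty"]]
    .agg(["sum", "max"])            # columns: ("qty", "sum"), ("qty", "max")
    .droplevel(0, axis="columns")   # drop the outer "qty" level
    .reset_index()
)
```

Here summary ends up with the flat columns shop, sum, and max, and sales itself is left untouched.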

edited May 10, 2019 at 15:24

answered May 10, 2019 at 15:02

jxc

3 Comments

EliadL Over a year ago

This is the "purest" solution in that a new DataFrame is returned rather than have it modified "in place".

Guy Over a year ago

df.droplevel(0, axis='columns') is even more explicit and easy to understand

igorkf Over a year ago

I will come here forever, because I always forget to set axis=1.

Mint · Accepted Answer · 2017-07-25 00:24:33Z

111

Another way to drop the level is to use a list comprehension:

df.columns = [col[1] for col in df.columns]

   b  c
0  1  2
1  3  4

This strategy is also useful if you want to combine the names from both levels like in the example below where the bottom level contains two 'y's:

cols = pd.MultiIndex.from_tuples([("A", "x"), ("A", "y"), ("B", "y")])
df = pd.DataFrame([[1,2, 8 ], [3,4, 9]], columns=cols)

   A     B
   x  y  y
0  1  2  8
1  3  4  9

Dropping the top level would leave two columns with the label 'y'. That can be avoided by joining the names with the list comprehension.

df.columns = ['_'.join(col) for col in df.columns]

    A_x A_y B_y
0   1   2   8
1   3   4   9

That's a problem I had after doing a groupby and it took a while to find this other question that solved it. I adapted that solution to the specific case here.
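
For the groupby case mentioned above, a minimal sketch might look like the following. The frame raw and its column names are hypothetical; the str() calls and the check for empty parts are additions that guard against non-string or blank level labels.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
raw = pd.DataFrame({"key": ["k1", "k1", "k2"], "x": [1, 2, 3], "y": [4, 5, 6]})

agg = raw.groupby("key").agg({"x": ["min", "max"], "y": ["max"]})
# agg.columns is a MultiIndex: ("x", "min"), ("x", "max"), ("y", "max")

# str() guards against non-string labels; empty parts are skipped so that
# a label such as ("key", "") becomes "key" rather than "key_".
agg.columns = [
    "_".join(str(part) for part in col if str(part) != "")
    for col in agg.columns
]
```

After this, agg has the flat columns x_min, x_max, and y_max.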

edited Jul 25, 2017 at 0:24

answered Jun 28, 2017 at 21:22

Mint

4 Comments

Eric O. Lebigot Over a year ago

[col[1] for col in df.columns] is more directly df.columns.get_level_values(1).
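
A short sketch of the get_level_values variant suggested here, using the frame from the question. Note that it keeps one level rather than dropping one, which only amounts to the same thing when there are exactly two levels.

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# Keep only the labels of the second level (position 1).
df.columns = df.columns.get_level_values(1)
```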

Logan Over a year ago

Had a similar need wherein some columns had empty level values. Used the following: [col[0] if col[1] == '' else col[1] for col in df.columns]

igorkf Over a year ago

That's awesome. I was needing an easy way to bind level + columns. Thank you.

Songhua Hu Over a year ago

In reply to Logan's comment: df.columns = [col[0] if col[1] == '' else '_'.join(col) for col in df.columns]

spacetyper · Accepted Answer · 2016-04-17 21:57:21Z

54

Another way to do this is to reassign df based on a cross section of df, using the .xs method.

>>> df

    a
    b   c
0   1   2
1   3   4

>>> df = df.xs('a', axis=1, drop_level=True)

    # 'a' : key on which to get cross section
    # axis=1 : get cross section of column
    # drop_level=True : returns cross section without the multilevel index

>>> df

    b   c
0   1   2
1   3   4

answered Apr 17, 2016 at 21:57

spacetyper

3 Comments

Ted Petrou Over a year ago

This only works whenever there is a single label for an entire column level.

Soerendip Over a year ago

Does not work when you want to drop the second level.

Tiffany G. Wilson Over a year ago

This is a nice solution if you want to slice and drop for the same level. If you wanted to slice on the second level (say b) then drop that level and be left with the first level (a), the following would work: df = df.xs('b', axis=1, level=1, drop_level=True)
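
The limitations raised in these comments can be seen on a frame with more than one top-level label. A minimal sketch, reusing the three-column example from Mint's answer:

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([("A", "x"), ("A", "y"), ("B", "y")])
df = pd.DataFrame([[1, 2, 8], [3, 4, 9]], columns=cols)

# Selecting on the top level keeps only the columns under "A";
# column ("B", "y") is gone from the result.
under_a = df.xs("A", axis=1)

# Selecting on the second level keeps the columns labelled "y"
# and leaves the first level as the column labels.
all_y = df.xs("y", axis=1, level=1)
```

So .xs both filters and drops the level; it is only equivalent to droplevel when the selected label covers every column.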

BENY · Accepted Answer · 2018-11-23 15:20:14Z

20

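A small trick using sum with level=1 (this works only when the labels in level 1 are all unique):

# The level argument of DataFrame.sum was removed in pandas 2.0;
# on newer versions, df.T.groupby(level=1).sum().T gives the same result.
df.sum(level=1, axis=1)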
Out[202]: 
   b  c
0  1  2
1  3  4
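
To see why the uniqueness caveat matters, here is a sketch on a frame where the second level repeats a label. It uses a transpose plus groupby in place of sum(level=1), which is an assumption about how to express the same grouping on recent pandas versions.

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([("A", "x"), ("A", "y"), ("B", "y")])
df = pd.DataFrame([[1, 2, 8], [3, 4, 9]], columns=cols)

# Group the columns by their second-level label and sum each group.
# Transposing first avoids relying on the axis argument of groupby.
collapsed = df.T.groupby(level=1).sum().T
```

The two y columns are added together, so collapsed has only the columns x and y, with y holding 10 and 13 rather than the original values.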

sedeh · Accepted Answer · 2015-06-23 00:29:18Z

18

You could also achieve that by renaming the columns:

df.columns = ['b', 'c']

This involves a manual step, but it could be an option, especially if you would eventually rename the columns of your data frame anyway.

answered Jun 23, 2015 at 0:29

sedeh

1 Comment

Eric O. Lebigot Over a year ago

This is essentially what Mint's first answer does. Now, there is also no need to specify the list of names (which is generally tedious), as it is given to you by df.columns.get_level_values(1).

dhFrank · Accepted Answer · 2018-02-17 17:58:52Z

8

I struggled with this problem because I did not know why my droplevel() call did not work. After working through several attempts, I learned that 'a' in your table is the columns name and 'b', 'c' are the index. Doing the following will help:

df.columns.name = None
df = df.reset_index()  # turn the index into a regular column

answered Feb 17, 2018 at 17:58

dhFrank

2 Comments

Eric O. Lebigot Over a year ago

This does not reproduce the desired output at all.

LinkBerest - SO sold our work Over a year ago

Based on the date this was posted, DataFrame.droplevel might not have been included in your version of pandas (it was added in the stable release 0.24.0, in January 2019).

Amol kale · Accepted Answer · 2023-03-21 06:00:07Z

0

# For each column, keep the last level label that is not an "Unnamed: ..." placeholder.
new_columns_cdnr = []
for column in df.columns:
    new = [x for x in column if 'unnamed' not in str(x).lower()]
    new_columns_cdnr.append(new[-1])
df.columns = new_columns_cdnr
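
The answer gives no example input, so the following is a sketch under an assumption: that the columns come from a multi-row header in which blank cells were read in as "Unnamed: ..." labels. The labels below are made up for illustration.

```python
import pandas as pd

# Hypothetical header of the kind a two-row spreadsheet header can produce,
# where blank header cells show up as "Unnamed: ..." labels.
cols = pd.MultiIndex.from_tuples(
    [("id", "Unnamed: 0_level_1"), ("price", "min"), ("price", "max")]
)
df = pd.DataFrame([[7, 1, 2], [8, 3, 4]], columns=cols)

flat = []
for column in df.columns:
    kept = [part for part in column if "unnamed" not in str(part).lower()]
    flat.append(kept[-1])   # last label that is not a placeholder
df.columns = flat
```

This yields the columns id, min, and max. It will produce duplicate names if two different top-level groups share a bottom-level label.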

answered Mar 21, 2023 at 6:00

Amol kale
