Sort DataFrame index that has a string and number

Question

My df DataFrame index looks like this:

Com_Lag_01
Com_Lag_02
Com_Lag_03
Com_Lag_04
Com_Lag_05
Com_Lag_06
Com_Lag_07
Com_Lag_08
Com_Lag_09
Com_Lag_10
Com_Lag_101
Com_Lag_102
Com_Lag_103
...
Com_Lag_11
Com_Lag_111
Com_Lag_112
Com_Lag_113
Com_Lag_114
...
Com_Lag_12
Com_Lag_120
...
Com_Lag_13
Com_Lag_14
Com_Lag_15

I want to sort this index so that the numbers go from Com_Lag_1 to Com_Lag_120. If I use df.sort_index() I will get the same thing as above. Any suggestion on how to sort this index properly?

You'd have to do a reverse find of the last '_', then cast to an int and order by this number.
– EdChum, Commented May 6, 2014 at 11:52
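
The approach suggested in the comment above could be sketched roughly as follows. This is only an illustration under assumptions: the `value` column, the sample labels, and the names `suffix`, `order` and `df_sorted` are made up for the example and are not from the question.

```python
import pandas as pd

# Hypothetical sample data with the same label pattern as in the question
df = pd.DataFrame(
    {"value": range(5)},
    index=["Com_Lag_01", "Com_Lag_10", "Com_Lag_101", "Com_Lag_11", "Com_Lag_02"],
)

# Split once from the right at the last '_' and cast the suffix to int
suffix = df.index.str.rsplit("_", n=1).str[-1].astype(int)

# argsort returns the row positions that put the suffixes in ascending order
order = suffix.argsort()
df_sorted = df.iloc[order]

print(df_sorted.index.tolist())
```

Because the rows are selected by position with `iloc`, the original labels (including any zero padding) are kept unchanged.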

Guillaume Jacquenot · Accepted Answer · 2017-11-27 16:09:57Z

9

One approach is to sort on a numeric version of the index, stored temporarily in a helper column:

import pandas as pd
# Create an example DataFrame
df = pd.DataFrame(
    {'Year': [1991, 2004, 2001, 2009, 1997],
     'Age': [27, 25, 22, 34, 31]},
    index=['Com_Lag_1', 'Com_Lag_12', 'Com_Lag_3', 'Com_Lag_24', 'Com_Lag_5'])

# Add a column containing the numeric part of the index
df['indexNumber'] = [int(i.split('_')[-1]) for i in df.index]
# Sort the rows (DataFrame.sort was removed from pandas; use sort_values)
df.sort_values(['indexNumber'], ascending=[True], inplace=True)
# Delete the helper column
df.drop('indexNumber', axis=1, inplace=True)

Edit 2017 - V1:

To avoid SettingWithCopyWarning:

df = df.assign(indexNumber=[int(i.split('_')[-1]) for i in df.index])

Edit 2017 - V2 for Pandas Version 0.21.0

import pandas as pd
print(pd.__version__)
# Create an example DataFrame
df = pd.DataFrame(
    {'Year': [1991, 2004, 2001, 2009, 1997],
     'Age': [27, 25, 22, 34, 31]},
    index=['Com_Lag_1', 'Com_Lag_12', 'Com_Lag_3', 'Com_Lag_24', 'Com_Lag_5'])

# reindex returns a new DataFrame, so assign the result back
df = df.reindex(index=df.index.to_series().str.rsplit('_').str[-1].astype(int).sort_values().index)

edited Nov 27, 2017 at 16:09

answered May 6, 2014 at 12:05

Guillaume Jacquenot


1 Comment

ic_fl2 Over a year ago

This no longer works, as .sort has been deprecated: stackoverflow.com/questions/44123874/… . Use the answer with .sort_index instead. It is also only one line! pandas.pydata.org/docs/reference/api/…

jezrael · Accepted Answer · 2017-07-27 08:40:22Z

A solution without a new column: use DataFrame.reindex with the index of a sorted Series:

a = df.index.to_series().str.rsplit('_').str[-1].astype(int).sort_values()
print (a)
Com_Lag_1      1
Com_Lag_3      3
Com_Lag_5      5
Com_Lag_12    12
Com_Lag_24    24
dtype: int32

df = df.reindex(index=a.index)
print (df)
            Age  Year
Com_Lag_1    27  1991
Com_Lag_3    22  2001
Com_Lag_5    31  1997
Com_Lag_12   25  2004
Com_Lag_24   34  2009

But if the index contains duplicate values, reindex raises an error, so add a helper column instead:

df = pd.DataFrame(
    {'Year': [1991, 2004, 2001, 2009, 1997],
     'Age': [27, 25, 22, 34, 31]},
    index=['Com_Lag_1', 'Com_Lag_12', 'Com_Lag_3', 'Com_Lag_24', 'Com_Lag_12'])

print (df)
            Age  Year
Com_Lag_1    27  1991
Com_Lag_12   25  2004
Com_Lag_3    22  2001
Com_Lag_24   34  2009
Com_Lag_12   31  1997

df['indexNumber'] = df.index.str.rsplit('_').str[-1].astype(int)
df = df.sort_values(['indexNumber']).drop('indexNumber', axis=1)
print (df)
            Age  Year
Com_Lag_1    27  1991
Com_Lag_3    22  2001
Com_Lag_12   25  2004
Com_Lag_12   31  1997
Com_Lag_24   34  2009
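
For an index with repeated labels, another option is to order the rows by position with a stable argsort on the numeric suffix, which avoids the temporary column. This is a minimal sketch and not part of the original answer; the names `suffix`, `order` and `df_sorted` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"Year": [1991, 2004, 2001, 2009, 1997], "Age": [27, 25, 22, 34, 31]},
    index=["Com_Lag_1", "Com_Lag_12", "Com_Lag_3", "Com_Lag_24", "Com_Lag_12"],
)

# Numeric part after the last '_' as a plain NumPy integer array
suffix = df.index.str.rsplit("_", n=1).str[-1].astype(int).to_numpy()

# A stable sort keeps rows with equal suffixes in their original relative order
order = np.argsort(suffix, kind="stable")

# Positional selection works even when index labels repeat
df_sorted = df.iloc[order]
print(df_sorted)
```

Selecting with `iloc` sidesteps label lookups entirely, so duplicate labels are not a problem here.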

KarenJG · Accepted Answer · 2021-04-23 12:52:35Z

5

Another solution, which needs pandas 1.1.0 or later for the key argument of sort_index, is:

    df.sort_index(key=lambda x: (x.to_series().str[8:].astype(int)), inplace=True)

The 8 is the position where the numeric part starts: the prefix 'Com_Lag_' is 8 characters long.
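
If the prefix length can vary, the key function could split on the last underscore rather than slicing at a fixed position. The following is a sketch under that assumption; the helper name `numeric_suffix`, the `value` column and the sample labels are hypothetical and not from the answer.

```python
import pandas as pd


def numeric_suffix(index):
    """Return the integer after the last '_' of each label, as an Index."""
    return index.str.rsplit("_", n=1).str[-1].astype(int)


# Hypothetical sample data mixing zero-padded and unpadded suffixes
df = pd.DataFrame(
    {"value": [10, 20, 30, 40]},
    index=["Com_Lag_10", "Com_Lag_02", "Com_Lag_101", "Com_Lag_1"],
)

# sort_index passes the Index to the key and sorts by the values it returns
df_sorted = df.sort_index(key=numeric_suffix)
print(df_sorted)
```

The key only affects the sort order; the labels in the result are the original strings.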

answered Apr 23, 2021 at 12:52

KarenJG


2 Comments

ic_fl2 Over a year ago

This is the correct approach and should be accepted, as .sort is deprecated in pandas! pandas.pydata.org/docs/reference/api/…

farid Over a year ago

This correct answer should be accepted
