Add a ranking ordered column to a Pandas Dataframe

Question

I know this question may seem trivial, but I can't find the solution anywhere. I have a really large pandas dataframe df that looks something like this:

                                            conference     IF2013  AR2013
0                                            HOTMOBILE  16.333333   31.50
1                                                 FOGA  13.772727   60.00
2                                              IEA/AIE  10.433735   28.20
3    IEEE Real-Time and Embedded Technology and App...  10.250000   29.00
4                  Symposium on Computational Geometry   9.880342   35.00
5                                                 WISA   9.693878   43.60
6                                                 ICMT   8.750000   22.00
7                                              Haskell   8.703704   39.00

I would like to add an extra column at the end that orders it 1,2,3,4, etc. So it looks like this:

                                            conference     IF2013  AR2013  Ranking
0                                            HOTMOBILE  16.333333   31.50        1
1                                                 FOGA  13.772727   60.00        2
2                                              IEA/AIE  10.433735   28.20        3
3    IEEE Real-Time and Embedded Technology and App...  10.250000   29.00        4

I can't figure out how to add an extra column that is simply filled with a Series of consecutive numbers.

jezrael · Accepted Answer · 2015-12-07 08:16:51Z

You can try:

df['rank'] = df.index + 1

print(df)
#                                          conference     IF2013  AR2013  rank
#0                                          HOTMOBILE  16.333333    31.5     1
#1                                               FOGA  13.772727    60.0     2
#2                                            IEA/AIE  10.433735    28.2     3
#3  IEEE Real-Time and Embedded Technology and App...  10.250000    29.0     4
#4                Symposium on Computational Geometry   9.880342    35.0     5
#5                                               WISA   9.693878    43.6     6
#6                                               ICMT   8.750000    22.0     7
#7                                            Haskell   8.703704    39.0     8
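Note that `df.index + 1` only yields 1, 2, 3, ... when the frame still has its default 0-based index. The sketch below uses a small made-up frame (the values and their order are assumptions for illustration) to show what happens after a sort, and a position-based alternative using `numpy.arange`:

```python
import numpy as np
import pandas as pd

# Hypothetical data, deliberately not sorted by IF2013.
df = pd.DataFrame({
    "conference": ["FOGA", "HOTMOBILE", "IEA/AIE"],
    "IF2013": [13.772727, 16.333333, 10.433735],
})

# Sorting keeps the original index labels, so the index is now 1, 0, 2
# and df.index + 1 no longer matches the row order.
df = df.sort_values("IF2013", ascending=False)
index_based = (df.index + 1).tolist()

# Position-based numbering ignores the index labels entirely.
df["rank"] = np.arange(1, len(df) + 1)
print(df)
```

If the index has been reordered or filtered, calling `df.reset_index(drop=True)` first would also restore the 0-based index that `df.index + 1` relies on.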

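Or use rank on the column you want to rank by, with parameter ascending=False:

# rank() returns floats; cast to int (assumes no NaN in IF2013)
df['rank'] = df['IF2013'].rank(ascending=False).astype(int)
print(df)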
#                                          conference     IF2013  AR2013  rank
#0                                          HOTMOBILE  16.333333    31.5     1
#1                                               FOGA  13.772727    60.0     2
#2                                            IEA/AIE  10.433735    28.2     3
#3  IEEE Real-Time and Embedded Technology and App...  10.250000    29.0     4
#4                Symposium on Computational Geometry   9.880342    35.0     5
#5                                               WISA   9.693878    43.6     6
#6                                               ICMT   8.750000    22.0     7
#7                                            Haskell   8.703704    39.0     8

Colonel Beauvel · Accepted Answer · 2015-12-07 08:08:16Z


I guess you are looking for the rank function:

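df['rank'] = df['IF2013'].rank(ascending=False)

This way your result will be independent of the index. Pass ascending=False so that the highest IF2013 gets rank 1; by default rank() assigns rank 1 to the lowest value.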
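One thing to watch with `rank` is how it treats ties. The following is a minimal sketch on made-up values (the data and the string index are assumptions, chosen only to show that the index plays no role) comparing the default tie handling with the `method="min"` and `method="first"` options:

```python
import pandas as pd

# Hypothetical scores with one tie and a non-numeric, unordered index.
df = pd.DataFrame(
    {"IF2013": [9.0, 16.0, 9.0, 13.0]},
    index=["c", "a", "d", "b"],
)

# Default method="average": tied values share the mean of their ranks.
default_rank = df["IF2013"].rank(ascending=False)

# method="min": tied values all get the lowest rank of the group.
min_rank = df["IF2013"].rank(ascending=False, method="min").astype(int)

# method="first": ties are broken by order of appearance, so every rank is unique.
first_rank = df["IF2013"].rank(ascending=False, method="first").astype(int)

print(pd.DataFrame({"default": default_rank, "min": min_rank, "first": first_rank}))
```

If the goal is a strict 1, 2, 3, ... numbering with no repeated or fractional ranks, `method="first"` is probably the closest match to what the question asks for.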


Anton Protopopov · Accepted Answer · 2015-12-07 08:25:28Z

You could add a column with range:

df['Ranking'] = range(1, len(df) + 1)

Example:

import pandas as pd
from io import StringIO

data = """
                                        conference     IF2013  AR2013
                                        HOTMOBILE  16.333333   31.50
                                             FOGA  13.772727   60.00
                                          IEA/AIE  10.433735   28.20
IEEE Real-Time and Embedded Technology and App...  10.250000   29.00
              Symposium on Computational Geometry   9.880342   35.00
                                             WISA   9.693878   43.60
                                             ICMT   8.750000   22.00
                                          Haskell   8.703704   39.00

"""

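# Columns are separated by two or more whitespace characters; a regex
# separator requires the Python parsing engine.
df = pd.read_csv(StringIO(data), sep=r'\s{2,}', engine='python')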

df['Ranking'] = range(1, len(df) + 1)

In [183]: df
Out[183]:
                                          conference     IF2013  AR2013  Ranking
0                                          HOTMOBILE  16.333333    31.5        1
1                                               FOGA  13.772727    60.0        2
2                                            IEA/AIE  10.433735    28.2        3
3  IEEE Real-Time and Embedded Technology and App...  10.250000    29.0        4
4                Symposium on Computational Geometry   9.880342    35.0        5
5                                               WISA   9.693878    43.6        6
6                                               ICMT   8.750000    22.0        7
7                                            Haskell   8.703704    39.0        8

EDIT

Benchmarking:

In [202]: %timeit df['rank'] = range(1, len(df) + 1)
10000 loops, best of 3: 127 us per loop

In [203]: %timeit df['rank'] = df.AR2013.rank(ascending=False)
1000 loops, best of 3: 248 us per loop
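Timings on an 8-row frame say little about a really large one. The sketch below repeats the comparison with the standard `timeit` module on a synthetic 100,000-row frame; the size, the random values, and the helper names `with_range` and `with_rank` are all assumptions made for illustration. Keep in mind that the two statements are not equivalent: `range` only produces a ranking if the frame is already sorted, while `rank` works on unsorted data.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic stand-in for a large frame; size and values are made up.
rng = np.random.default_rng(0)
df = pd.DataFrame({"IF2013": rng.random(100_000)})


def with_range():
    # Consecutive numbers by row position (assumes the frame is pre-sorted).
    df["rank"] = range(1, len(df) + 1)


def with_rank():
    # Rank computed from the values, independent of row order.
    df["rank"] = df["IF2013"].rank(ascending=False)


repeats = 10
t_range = timeit.timeit(with_range, number=repeats)
t_rank = timeit.timeit(with_rank, number=repeats)
print(f"range: {t_range / repeats:.6f} s per call")
print(f"rank:  {t_rank / repeats:.6f} s per call")
```

The absolute numbers depend on the machine and the pandas version, so no expected output is shown.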
