Working with Pandas in Python 3.8.

Given an Index of string values that looks like this:

import pandas as pd

foo = pd.Index(['score_1', 'score_10', 'score_11', 'score_12', 'score_13', 'score_14',
       'score_15', 'score_16', 'score_17', 'score_18', 'score_19', 'score_2',
       'score_20', 'score_21', 'score_22', 'score_23', 'score_24', 'score_25',
       'score_26', 'score_27', 'score_3', 'score_4', 'score_5', 'score_6',
       'score_7', 'score_8', 'score_9'],
      dtype='object', name='score_field')

What's the "right" way to sort it so that the values are in numerical order, e.g. 'score_1', 'score_2', ..., 'score_9', 'score_10', etc.?

This doesn't work:

foo.sort_values(key=lambda x: int(x.split('_')[1]))
AttributeError: 'Index' object has no attribute 'split'

And this doesn't work:

foo.sort_values(key=lambda val: val.str.split('_').str[1].astype(int))
AttributeError: Can only use .str accessor with string values!

This does work, but feels ugly:

foo = pd.Index(sorted(foo.to_list(), key=lambda x: int(x.split('_')[1])),
      dtype=foo.dtype, name=foo.name)

1 Answer

Honestly, what you have makes sense to me. However, if you want a pure pandas approach, use Index.str.split and argsort:

foo[foo.str.split('_').str[1].astype(int).argsort()]

Index(['score_1', 'score_2', 'score_3', 'score_4', 'score_5', 'score_6',
   'score_7', 'score_8', 'score_9', 'score_10', 'score_11', 'score_12',
   'score_13', 'score_14', 'score_15', 'score_16', 'score_17', 'score_18',
   'score_19', 'score_20', 'score_21', 'score_22', 'score_23', 'score_24',
   'score_25', 'score_26', 'score_27'],
  dtype='object', name='score_field')
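
To make the one-liner above easier to follow, here is the same idea broken into steps. This is only a sketch on a shortened Index (the five labels and the intermediate variable names are made up for illustration); it assumes every label has the form 'prefix_number'.

```python
import pandas as pd

# Shortened, illustrative version of the Index from the question.
foo = pd.Index(['score_1', 'score_10', 'score_11', 'score_2', 'score_3'],
               name='score_field')

# 1. Split each label on '_' and keep the second piece (still strings).
suffix = foo.str.split('_').str[1]

# 2. Convert the pieces to integers so they compare numerically.
numbers = suffix.astype(int)

# 3. argsort() gives the positions that would sort `numbers` ascending.
order = numbers.argsort()

# 4. Indexing the original Index with those positions reorders the labels
#    and keeps the Index name.
result = foo[order]
print(result.tolist())
```

Because the reordering is done by positional indexing, the original string labels are untouched; only their order changes.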

Or, if you are okay with a third-party library:

import natsort as ns
pd.Index(ns.natsorted(foo),name=foo.name)
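
If adding a dependency is not an option, a rough standard-library stand-in for the natural sort can be sketched with re. Note that natural_key is a hypothetical helper written for this example, not part of natsort or pandas, and it only covers simple labels made of text and unsigned integers.

```python
import re

import pandas as pd


def natural_key(label):
    # Hypothetical helper: split the label into digit and non-digit runs
    # and turn the digit runs into ints, so 'score_2' sorts before 'score_10'.
    return [int(part) if part.isdigit() else part
            for part in re.split(r'(\d+)', label)]


# Shortened, illustrative version of the Index from the question.
foo = pd.Index(['score_1', 'score_10', 'score_2'], name='score_field')

# Sort the labels with the custom key and rebuild the Index with its name.
result = pd.Index(sorted(foo, key=natural_key), name=foo.name)
print(result.tolist())
```

natsort handles many more cases (signs, floats, locale-aware ordering), so this sketch is only a fallback for simple labels like the ones here.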

2 Comments

Interesting. That argsort() method worked. I'm sure there must be an answer with the .sort_values() method, but I guess I'll let it go for now. Thanks!
@sql_knievel As far as I know, sort_values shouldn't work on strings unless you split and convert to int, which is a long process compared to argsort. If you find something, please let me know. TIA :)
