8

If I have a list of strings, e.g. ["a143.txt", "a9.txt"], how can I sort it in ascending order by the numbers in the strings, rather than by the strings themselves? I.e. I want "a9.txt" to appear before "a143.txt", since 9 < 143.

Thanks.

4

5 Answers

14

This is called "natural sort order". The code below is from http://www.codinghorror.com/blog/2007/12/sorting-for-humans-natural-sort-order.html

Try this:

import re 

def sort_nicely( l ): 
  """ Sort the given list in the way that humans expect. 
  """ 
  convert = lambda text: int(text) if text.isdigit() else text 
  alphanum_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ] 
  l.sort( key=alphanum_key ) 

4 Comments

I'd use text.lower() at the end of the convert = line to make it case-insensitive.
+1. You might want to replace the lambda with a proper function definition, for readability. Incidentally, Debian package version numbers are compared more or less like this. debian.org/doc/debian-policy/ch-controlfields.html#s-f-Version
+1 Nice answer. The only thing I didn't like is the extra whitespace. I mean here: [ convert(c) for c in re.split('([0-9]+)', key) ] and l.sort( key=alphanum_key ) and sort_nicely( l )
+1, nicely done! I redid alphanum_key as alphanum_key = lambda key: map(convert, re.split('([0-9]+)', key)).
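
The suggestions in the comments above (a named function instead of lambdas, lower-casing the text chunks) could be combined roughly as follows. This is only a sketch: the name natural_key and the sample file names are made up for illustration, and it uses a list comprehension rather than map() because in Python 3 map() returns an iterator, which cannot be compared when used as a sort key.

```python
import re

def natural_key(text):
    """Return a sort key: lowercase text chunks alternating with ints."""
    parts = re.split(r'([0-9]+)', text)
    # With one capturing group, re.split puts the digit runs at odd
    # positions and the surrounding text at even positions.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

names = ["a143.txt", "A9.txt", "a10.txt"]
sorted_names = sorted(names, key=natural_key)
print(sorted_names)  # ['A9.txt', 'a10.txt', 'a143.txt']
```

Because text and number chunks always alternate in the same positions, the keys for any two strings compare str with str and int with int, so the comparison also works in Python 3.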
0

Use list.sort() and provide your own function for the key argument. Your function will be called once for each item in the list (and passed the item), and is expected to return the value that the item should be sorted by.

See http://wiki.python.org/moin/HowTo/Sorting/#Key_Functions for more information.
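
A minimal sketch of such a key function for the file names in the question. The name number_in_name is hypothetical, and returning -1 for names without digits is an assumption made here so that the function never fails; adjust it to whatever ordering you need.

```python
import re

def number_in_name(name):
    """Key function: return the first run of digits in name as an int."""
    match = re.search(r'\d+', name)
    # Assumption: names without any digits get key -1 and so sort first.
    return int(match.group()) if match else -1

files = ["a143.txt", "a9.txt"]
files.sort(key=number_in_name)  # the list is ordered by 9 and 143, not by the strings
print(files)  # ['a9.txt', 'a143.txt']
```

The list still contains the original strings; the key values are only used for the comparisons.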


0

If you want to ignore the non-numeric part of each string completely, you can do this:

import re
numre = re.compile('[0-9]+')
def extractNum(s):
    return int(numre.search(s).group())

myList = ["a143.txt", "a9.txt", ]
myList.sort(key=extractNum)


0
>>> import re
>>> paths = ["a143.txt", "a9.txt"]
>>> sorted(paths, key=lambda s: int(re.search(r"\d+", s).group()))
['a9.txt', 'a143.txt']

More generally, if you also want it to work for names like a100_32_12 (sorting by every numeric group in turn):

>>> paths = ["a143_2.txt", "a143_1.txt"]
>>> sorted(paths, key=lambda s: [int(n) for n in re.findall(r"\d+", s)])
['a143_1.txt', 'a143_2.txt']


0

list.sort() is deprecated (see the Python.org How-To). sorted(list, key=keyfunc) is better.

import re

def sortFunc(item):
  return int(re.search(r'[a-zA-Z](\d+)', item).group(1))

myList = ["a143.txt", "a9.txt"]

print(sorted(myList, key=sortFunc))

5 Comments

list.sort() is deprecated? "Usually it's less convenient than sorted()" is the only thing in this direction I found. I have to say, though, that I'd be more than happy to see the in-place sorting go away, but it seems unlikely.
It may not be technically deprecated, but it is considered the "old" method and is labeled as such on Python.org.
True, list.sort() is certainly slightly more memory-efficient, but the difference is negligible for sensibly sized lists. I can't find a good explanation of why, but as far as I am aware the preferred and more 'pythonic' way of sorting is to use sorted().
It says sorted() returns a NEW list, not that it's NEW. It's definitely less efficient if you are really interested in modifying your list. sorted() is good for tuples.
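
The difference discussed in these comments can be seen in a short sketch. The helper number_of is hypothetical and assumes every name has exactly the form "a<digits>.txt".

```python
def number_of(name):
    # Hypothetical helper: strips the leading "a" and the trailing ".txt".
    return int(name[1:-4])

names = ["a143.txt", "a9.txt", "a20.txt"]

# sorted() builds and returns a new list; the original is left untouched.
new_list = sorted(names, key=number_of)
print(new_list)  # ['a9.txt', 'a20.txt', 'a143.txt']
print(names)     # ['a143.txt', 'a9.txt', 'a20.txt']

# list.sort() reorders the existing list in place and returns None.
result = names.sort(key=number_of)
print(result)    # None
print(names)     # ['a9.txt', 'a20.txt', 'a143.txt']

# sorted() accepts any iterable, e.g. a tuple, which has no sort() method.
from_tuple = sorted(("a143.txt", "a9.txt"), key=number_of)
print(from_tuple)  # ['a9.txt', 'a143.txt']
```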
