Weird behavior of Python's built-in max() method

Question

While using Python's built-in max() function, I found something interesting.

input_one = u'A測試測試;B測試;測試;D測試測試測試;E測試測試測試測試測試測試測試測試測試'
input_two = u'測試測試;測試;測試;測試測試測試;測試測試測試測試測試測試測試測試測試'
input_en = u'test;test,test,test;testtesttest;testtesttesttest'
input_ja = u'ああああああ;あああ;あああああああ;ああああああああああああ'
input_ja_mixed = u'aああああああ;bあああ;cあああああああ;dああああああああああああ'
input_ascii = u'egfwergreger;@#@$fgdfdfdfdsfsdfsdf;sdfsdfsfsdfs233'


def test_length(input):
    lengths = []
    for i in input:
        lengths.append(len(i))
    index = find_index(input, max(lengths))
    return input[index]


def find_index(input, to_find):
    for index, value in enumerate(input):
        print('index: %s, length: %s, value: %s' % (index, len(value), value))
        if len(value) == to_find:
            return index

def test_one(input):
    input = input.split(';')
    print('input:', input)
    print('using test_length: ', test_length(input))
    print('using max():', max(input))

If I use max() to find the largest element in a list that contains only English letters, it works fine.

But if the elements contain symbols (like @ # $), it behaves differently.

For example,

In [80]: test_one(input_ascii)
input: ['egfwergreger', '@#@$fgdfdfdfdsfsdfsdf', 'sdfsdfsfsdfs233']
index: 0, length: 12, value: egfwergreger
index: 1, length: 21, value: @#@$fgdfdfdfdsfsdfsdf
using test_length:  @#@$fgdfdfdfdsfsdfsdf
using max(): sdfsdfsfsdfs233

The special case is Chinese mixed with English letters:

In [82]: test_one(input_one)
input: ['A測試測試', 'B測試', '測試', 'D測試測試測試', 'E測試測試測試測試測試測試測試測試測試']
index: 0, length: 5, value: A測試測試
index: 1, length: 3, value: B測試
index: 2, length: 2, value: 測試
index: 3, length: 7, value: D測試測試測試
index: 4, length: 19, value: E測試測試測試測試測試測試測試測試測試
using test_length:  E測試測試測試測試測試測試測試測試測試
using max(): 測試

The documentation doesn't mention any special behavior of max().

Python version is Python 3.4.

Is this my mistake, or is there some behavior I don't know about?

Anand S Kumar · Accepted Answer · 2015-08-23 04:45:12Z

5

Well, your test_length() function does not do the same thing as max(). When the inputs are strings, max() returns the lexicographically largest element, not the longest one.

A simple example to show this:

>>> a = 'aaaaaaaaaa'
>>> b = 'b'
>>> max(a,b)
'b'

Your test_length() function works based on the length of the string, which is different from what max() does.
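As a rough illustration of that difference (the variable names here are mine, not from the question): strings are compared character by character using Unicode code points, so the first differing character decides the result. In the question's input, '測' has a higher code point than any of the ASCII letters 'A', 'B', 'D' or 'E', which is why max() returns '測試':

```python
# Sample data copied from the question.
input_one = u'A測試測試;B測試;測試;D測試測試測試;E測試測試測試測試測試測試測試測試測試'
input_ascii = u'egfwergreger;@#@$fgdfdfdfdsfsdfsdf;sdfsdfsfsdfs233'

words = input_one.split(';')

# Show the code point of each first character; the largest one wins.
for word in words:
    print('%s starts with code point %d' % (word, ord(word[0])))

lexicographic_max = max(words)            # compares by code points, not length
ascii_max = max(input_ascii.split(';'))   # 's' sorts after 'e' and '@'
print(lexicographic_max)
print(ascii_max)
```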

max() also supports a key argument, to which you can pass a function that is used to determine the maximum element. In your case you can pass len to make max() compare the strings by length. Example:

>>> a = 'aaaaaaaaaa'
>>> b = 'b'
>>> max(a,b,key=len)
'aaaaaaaaaa'
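Applied to the question's data, this could replace test_length() and find_index() entirely. A minimal sketch (the names longest, longest_ascii and first_of_ties are just for illustration). When several strings share the maximum length, max() returns the first one it encounters:

```python
# Sample data copied from the question.
input_one = u'A測試測試;B測試;測試;D測試測試測試;E測試測試測試測試測試測試測試測試測試'
input_ascii = u'egfwergreger;@#@$fgdfdfdfdsfsdfsdf;sdfsdfsfsdfs233'

# key=len makes max() compare the lengths instead of the strings themselves.
longest = max(input_one.split(';'), key=len)
longest_ascii = max(input_ascii.split(';'), key=len)
print(longest)
print(longest_ascii)

# Ties: both strings have length 2, so the first one is returned.
first_of_ties = max(['ab', 'cd'], key=len)
print(first_of_ties)
```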


1 Comment

esfy Over a year ago

I see. I went back and re-read the docstring, and found that I had skipped a couple of lines :(

dawg · Accepted Answer · 2015-08-23 04:47:25Z

2

Consider:

>>> max(['aaa','b','cc'])
'cc'

vs:

>>> max(['aaa','b','cc'], key=len)
'aaa'

If you want max() to use the length of the string instead of the default lexicographic comparison (by character code points, starting with the first character), use a key function, in this case the built-in len function.
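One detail worth adding: the default comparison is not limited to the first character. If the leading characters are equal, Python keeps comparing until it finds a difference, and a string that is a prefix of another sorts before it. A small sketch with made-up sample strings:

```python
# First characters are equal, so the second character decides.
by_second_char = max(['ab', 'ac'])

# 'ab' is a prefix of 'abc', so the longer string is considered larger.
by_prefix = max(['ab', 'abc'])

# With key=len only the lengths are compared.
by_length = max(['zz', 'abc'], key=len)

print(by_second_char, by_prefix, by_length)
```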
