While using Python's built-in max() function, I found something interesting.

input_one = u'A測試測試;B測試;測試;D測試測試測試;E測試測試測試測試測試測試測試測試測試'
input_two = u'測試測試;測試;測試;測試測試測試;測試測試測試測試測試測試測試測試測試'
input_en = u'test;test,test,test;testtesttest;testtesttesttest'
input_ja = u'ああああああ;あああ;あああああああ;ああああああああああああ'
input_ja_mixed = u'aああああああ;bあああ;cあああああああ;dああああああああああああ'
input_ascii = u'egfwergreger;@#@$fgdfdfdfdsfsdfsdf;sdfsdfsfsdfs233'


def test_length(input):
    lengths = []
    for i in input:
        lengths.append(len(i))
    index = find_index(input, max(lengths))
    return input[index]


def find_index(input, to_find):
    for index, value in enumerate(input):
        print('index: %s, length: %s, value: %s' % (index, len(value), value))
        if len(value) == to_find:
            return index

def test_one(input):
    input = input.split(';')
    print('input:', input)
    print('using test_length: ', test_length(input))
    print('using max():', max(input))

If I use max() to find the largest element in a list that contains only English letters, it seems to work fine.

But if the elements are mixed with symbols (like @ # $), it behaves differently.

For example,

In [80]: test_one(input_ascii)
input: ['egfwergreger', '@#@$fgdfdfdfdsfsdfsdf', 'sdfsdfsfsdfs233']
index: 0, length: 12, value: egfwergreger
index: 1, length: 21, value: @#@$fgdfdfdfdsfsdfsdf
using test_length:  @#@$fgdfdfdfdsfsdfsdf
using max(): sdfsdfsfsdfs233

Another special case is Chinese mixed with English letters:

In [82]: test_one(input_one)
input: ['A測試測試', 'B測試', '測試', 'D測試測試測試', 'E測試測試測試測試測試測試測試測試測試']
index: 0, length: 5, value: A測試測試
index: 1, length: 3, value: B測試
index: 2, length: 2, value: 測試
index: 3, length: 7, value: D測試測試測試
index: 4, length: 19, value: E測試測試測試測試測試測試測試測試測試
using test_length:  E測試測試測試測試測試測試測試測試測試
using max(): 測試

The documentation doesn't mention any special behavior for max().

I'm using Python 3.4.

Is this a mistake on my part, or is there some behavior of max() that I don't know about?

2 Answers

Your test_length() function does not do the same thing as max(). When the inputs are strings, max() returns the lexicographically largest element, not the one with the greatest length.

A simple example to show this:

>>> a = 'aaaaaaaaaa'
>>> b = 'b'
>>> max(a,b)
'b'

Your test_length() function works based on the length of the string, which is different from what max() does.
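To see why the inputs from the question give the results they do, here is a small sketch that looks at the code point of the first character of each element. The variable names are only illustrative. Since every element starts with a different character, the first character alone decides the comparison in these two lists.

```python
# Strings are compared code point by code point, like words in a dictionary.
ascii_items = ['egfwergreger', '@#@$fgdfdfdfdsfsdfsdf', 'sdfsdfsfsdfs233']
mixed_items = u'A測試測試;B測試;測試;D測試測試測試;E測試測試測試測試測試測試測試測試測試'.split(';')

# Code point of the first character of each element.
first_code_points_ascii = [ord(s[0]) for s in ascii_items]
first_code_points_mixed = [ord(s[0]) for s in mixed_items]

print(first_code_points_ascii)  # 'e' is 101, '@' is 64, 's' is 115
print(first_code_points_mixed)  # the CJK character sits far above 'A'..'E'

# max() picks the element whose first character has the highest code point here.
print(max(ascii_items))
print(max(mixed_items))
```

So 's' beats 'e' and '@' in the first list, and the CJK character beats every ASCII letter in the second, regardless of how long each string is.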

max() also accepts a key argument: a function that is applied to each element, with the results used to decide which element is the maximum. In your case, you can pass len so that max() compares the strings by length. For example:

>>> a = 'aaaaaaaaaa'
>>> b = 'b'
>>> max(a,b,key=len)
'aaaaaaaaaa'
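
Applied to the inputs from the question, this could look like the sketch below. The helper name longest() and its sep parameter are made up for illustration; the only built-ins involved are str.split(), max() and len().

```python
input_one = u'A測試測試;B測試;測試;D測試測試測試;E測試測試測試測試測試測試測試測試測試'
input_ascii = u'egfwergreger;@#@$fgdfdfdfdsfsdfsdf;sdfsdfsfsdfs233'


def longest(text, sep=';'):
    """Return the longest piece of text after splitting on sep (hypothetical helper)."""
    return max(text.split(sep), key=len)


print(longest(input_one))    # the element starting with 'E' (19 characters)
print(longest(input_ascii))  # the element starting with '@' (21 characters)
```

This gives the same results as test_length() in the question, without building a separate list of lengths.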

1 Comment

I see. I went back and re-read the docstring and found that I had skipped a couple of lines :(

Consider:

>>> max(['aaa','b','cc'])
'cc'

vs:

>>> max(['aaa','b','cc'], key=len)
'aaa'

If you want max() to compare by the length of each string rather than lexicographically (by the Unicode code points of the characters, starting with the first), use a key function. In this case the built-in len function does the job.
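
One detail to be aware of with key=len: when several strings share the greatest length, max() returns the first one it encounters. If you would rather break ties lexicographically, one option is a tuple key, as in this small sketch (the list and variable names are only for illustration):

```python
words = ['bb', 'aa', 'c', 'dd']

# Three items tie at length 2; max() returns the first one encountered.
first_longest = max(words, key=len)

# A (length, string) tuple key compares by length first,
# then breaks ties using ordinary string comparison.
longest_then_lexicographic = max(words, key=lambda s: (len(s), s))

print(first_longest)               # 'bb'
print(longest_then_lexicographic)  # 'dd'
```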
