While using Python's built-in max() function, I found something interesting.
input_one = u'A測試測試;B測試;測試;D測試測試測試;E測試測試測試測試測試測試測試測試測試'
input_two = u'測試測試;測試;測試;測試測試測試;測試測試測試測試測試測試測試測試測試'
input_en = u'test;test,test,test;testtesttest;testtesttesttest'
input_ja = u'ああああああ;あああ;あああああああ;ああああああああああああ'
input_ja_mixed = u'aああああああ;bあああ;cあああああああ;dああああああああああああ'
input_ascii = u'egfwergreger;@#@$fgdfdfdfdsfsdfsdf;sdfsdfsfsdfs233'
def test_length(input):
    lengths = []
    for i in input:
        lengths.append(len(i))
    index = find_index(input, max(lengths))
    return input[index]

def find_index(input, to_find):
    for index, value in enumerate(input):
        print('index: %s, length: %s, value: %s' % (index, len(value), value))
        if len(value) == to_find:
            return index

def test_one(input):
    input = input.split(';')
    print('input:', input)
    print('using test_length: ', test_length(input))
    print('using max():', max(input))
If I use max() to find the maximum element of a list that contains only English letters, it works fine.
But if the elements are mixed with symbols (like @ # $), it behaves differently.
For example:
In [80]: test_one(input_ascii)
input: ['egfwergreger', '@#@$fgdfdfdfdsfsdfsdf', 'sdfsdfsfsdfs233']
index: 0, length: 12, value: egfwergreger
index: 1, length: 21, value: @#@$fgdfdfdfdsfsdfsdf
using test_length: @#@$fgdfdfdfdsfsdfsdf
using max(): sdfsdfsfsdfs233
A special case is Chinese mixed with English letters:
In [82]: test_one(input_one)
input: ['A測試測試', 'B測試', '測試', 'D測試測試測試', 'E測試測試測試測試測試測試測試測試測試']
index: 0, length: 5, value: A測試測試
index: 1, length: 3, value: B測試
index: 2, length: 2, value: 測試
index: 3, length: 7, value: D測試測試測試
index: 4, length: 19, value: E測試測試測試測試測試測試測試測試測試
using test_length: E測試測試測試測試測試測試測試測試測試
using max(): 測試
The documentation doesn't mention any special behavior for max().
The Python version is 3.4.
Am I doing something wrong, or is this behavior I don't know about?
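
For comparison, here is a minimal sketch that reproduces both outputs above. It assumes the goal is the longest element, and that max() without a key compares strings lexicographically by code point rather than by length. The variable names (items_ascii, items_mixed, and so on) are only for illustration and are not part of the code above.

```python
# Sample data copied from input_ascii and input_one above.
items_ascii = u'egfwergreger;@#@$fgdfdfdfdsfsdfsdf;sdfsdfsfsdfs233'.split(';')
items_mixed = u'A測試測試;B測試;測試;D測試測試測試;E測試測試測試測試測試測試測試測試測試'.split(';')

# Without a key, max() compares the strings themselves, character by
# character, using code point order. Length is not considered.
default_ascii = max(items_ascii)
default_mixed = max(items_mixed)

# With key=len, max() compares the lengths and returns the longest string.
longest_ascii = max(items_ascii, key=len)
longest_mixed = max(items_mixed, key=len)

print('default:', default_ascii, default_mixed)
print('key=len:', longest_ascii, longest_mixed)
```

With key=len the results match what test_length returns, so the loop plus find_index could probably be replaced by a single max(input, key=len) call.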