As others have pointed out, len("CCC-A-H") == 7, and Python uses short-circuit evaluation for boolean operations. As a result, the expression:
(len("CCC-A-H") == 5 or len("CCC-A-H") == 7 or "CCC-A-H"[6] != "H")
evaluates to True, because len("CCC-A-H") == 7 is True and so "CCC-A-H"[6] != "H" is never evaluated.
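To watch the short circuit happen, here is a minimal sketch. The helper last_char_is_not_h and the calls list are made up for illustration (they are not from your code); the helper just records that it ran before doing the index check. The single-argument print(...) form should work on both Python 2 and 3.

```python
calls = []  # records every item the helper was actually called with


def last_char_is_not_h(item):
    # Hypothetical helper: note that we ran, then do the index check.
    calls.append(item)
    return item[6] != "H"


item = "CCC-A-H"
# Same shape as the original condition, with the last operand wrapped.
result = len(item) == 5 or len(item) == 7 or last_char_is_not_h(item)

print(result)  # True
print(calls)   # [] because the helper was never called
```

Since the second operand is already True, the third operand is skipped entirely, so the "H" check has no say in the result.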
This may be easier to see by using the filter(...) function instead of a list comprehension:
list1 = ["CCC-C", "CCC-P", "CCC-A-P", "CCC-A-H", "CCC-J", "CCC-S-X"]
def len57notHWrong(item):
    return len(item) == 5 or len(item) == 7 or item[6] != "H"
print "Wrong : ", filter(len57notHWrong, list1)
This is a direct translation of your list comprehension into a call to filter(...).
If we were to rewrite this using if ... elif ... else constructs, it would look something like this:
def len57notHWrongExpanded(item):
    if len(item) == 5: # first check if length is 5
        return True
    elif len(item) == 7: # now check if length is 7
        return True # it's 7? Short-circuit, return True
    elif item[6] != "H": # This will never get seen (on this particular dataset)
        return True
    return False
print "Wrong (Expanded): ", filter(len57notHWrongExpanded, list1)
A correct expression would look like:
def len57notH(item):
    return len(item) == 5 or (len(item) == 7 and item[6] != "H")
print "Correct : ", filter(len57notH, list1)
Expanded:
def len57notHExpanded(item):
    if len(item) == 5:
        return True
    elif len(item) == 7:
        if item[6] != "H":
            return True
    return False
print "Correct (Expand): ", filter(len57notHExpanded, list1)
This would make the list comprehension look like:
new_list = [i for i in list1 if (len(i) == 5 or (len(i) == 7 and i[6] != "H"))]
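As a quick side-by-side check, the sketch below runs both conditions over the same list1 from above. The names wrong and correct are only for illustration. It uses list comprehensions and single-argument print(...) so it should behave the same on Python 2 and 3.

```python
list1 = ["CCC-C", "CCC-P", "CCC-A-P", "CCC-A-H", "CCC-J", "CCC-S-X"]

# Original condition: the length checks alone decide the result.
wrong = [i for i in list1 if (len(i) == 5 or len(i) == 7 or i[6] != "H")]

# Corrected condition: the "H" check is tied to the length-7 case.
correct = [i for i in list1 if (len(i) == 5 or (len(i) == 7 and i[6] != "H"))]

print(wrong)    # all six items, including "CCC-A-H"
print(correct)  # ['CCC-C', 'CCC-P', 'CCC-A-P', 'CCC-J', 'CCC-S-X']
```

The only difference between the two results is "CCC-A-H", which is exactly the item the original condition was meant to exclude.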
Your code doesn't raise an IndexError because every item in your data is either 5 or 7 characters long, so the expression always short-circuits before reaching i[6] != "H". If you run it on a list containing an item that is shorter than 7 characters and not exactly 5 characters long, an IndexError is raised:
list2 = ["CCC-C", "CCC-P", "CCC", "CCC-A-P", "CCC-A-H", "CCC-J", "CCC-S-X"]
new_list = [i for i in list2 if (len(i) == 5 or len(i) == 7 or i[6] != "H")]
Traceback (most recent call last):
  File "C:/Users/xxxxxxxx/Desktop/t.py", line 44, in <module>
    new_list = [i for i in list2 if (len(i) == 5 or len(i) == 7 or i[6] != "H")]
IndexError: string index out of range
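For comparison, here is a small sketch that runs both conditions against the same list2. The variable names raised and safe_list are just for illustration. The corrected condition doesn't hit the error, because and short-circuits too: i[6] is only looked at once len(i) == 7 is known to be True.

```python
list2 = ["CCC-C", "CCC-P", "CCC", "CCC-A-P", "CCC-A-H", "CCC-J", "CCC-S-X"]

# Original condition: "CCC" is neither 5 nor 7 long, so i[6] gets evaluated.
raised = False
try:
    [i for i in list2 if (len(i) == 5 or len(i) == 7 or i[6] != "H")]
except IndexError:
    raised = True

# Corrected condition: the index is guarded by the length check.
safe_list = [i for i in list2 if (len(i) == 5 or (len(i) == 7 and i[6] != "H"))]

print(raised)     # True
print(safe_list)  # ['CCC-C', 'CCC-P', 'CCC-A-P', 'CCC-J', 'CCC-S-X']
```

So the grouping fix gets rid of both problems at once: "CCC-A-H" is filtered out, and short items like "CCC" are simply rejected instead of blowing up.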
Sorry, it's a bit of a long answer...