How to check strings in list

Question

I have a string:

a = "sky high"

and a CSV-style file that I have opened and converted to a list:

mylist = [["sky high",'77'],["sky high and high",'88']]

I want to check whether the string matches the first element of each inner list. But if I do this:

for row in mylist:
    if a in row[0]:
       print row[1]

it prints both 77 and 88 instead of just 77. I don't know why I can't use if a == row[0]; that produces no result at all. Any idea what to do?

EDIT:

So actually my code looks like this:

data = open("text.qrel",'rb')
new = []
for row in data:
    d = row[:-1].split(',')

    if a == d[0]:
       new.append(d[1])

and it doesn't work!

Please show us the code where you try to use if a == row[0]. Because that will work, unless you do something else wrong.
– abarnert, Commented May 15, 2013 at 0:27
OK, your problem is most likely that you're not parsing the CSV file correctly. You have to either show us (the relevant lines of) the CSV file, or have your code print out each row and show us what it says. But my guess is that either the columns are quoted, or that there are extra spaces. See my answer for how to deal with that.
– abarnert, Commented May 15, 2013 at 0:58
And this is why, in the future, you should show us a sample of real code (with relevant data) that actually doesn't work, so we don't have to spend 40 minutes going back and forth, frustrating you with followup questions and incorrect answers, before figuring things out. See SSCCE for some tips.
– abarnert, Commented May 15, 2013 at 1:03

abarnert · Accepted Answer · 2013-05-15 01:31:02Z

2

Try running this through an interactive visualizer, like this one. When you can't do that for some reason, at least try experimenting in the normal interactive interpreter, or printing out intermediate results in your program.

When a is "sky high", and row is ["sky high and high",'88'], that means row[0] is "sky high and high", so a in row[0] is True.

That's why (if you fix it to use [1] instead of [2]) it will print both 77 and 88.

Try this at the interactive interpreter (or the visualizer):

>>> a = "sky high"
>>> mylist = [["sky high",'77'],["sky high and high",'88']]
>>> row = mylist[1]
>>> row[0]
'sky high and high'
>>> a in row[0]
True
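
To put the three possible checks side by side, here is a small sketch (written with Python 3 syntax, reusing the names from the question). On a string, in is a substring test; on a list, in is an element test; == is an exact match:

```python
a = "sky high"
row = ["sky high and high", "88"]

substring_match = a in row[0]   # substring test against the string row[0]
element_match = a in row        # membership test against the list's elements
exact_match = a == row[0]       # exact equality with the first column

print(substring_match, element_match, exact_match)  # True False False
```

Only the first check is true for this row, which is why the in version also prints 88.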

Meanwhile, you say "I don't know why I can't use if a == row[0] as it will produce no result."

But if you use a == row[0] it won't produce no result; it will produce 77.

Try this at the interactive interpreter (or in the online visualizer):

>>> a = "sky high"
>>> mylist = [["sky high",'77'],["sky high and high",'88']]
>>> for row in mylist:
...     if a == row[0]:
...         print row[1]
...
77

So, you must have a bug in some other part of the code. Show us the version that you claim isn't working, and we can find the bug.

Most likely, the problem with your real code is that row (or, actually, d) is not actually ["sky high", '77'], but something with extra characters in it:

data = open("text.qrel",'rb')
new = []
for row in data:
    d = row[:-1].split(',')

Let's say text.qrel looked like this:

sky high , 77

This would make d[0] be "sky high " (with a space), not "sky high".

Or:

"sky high",'77'

Then d[0] would be '"sky high"' (with extra quotes), not "sky high".

You could show us an extract of that CSV file, or have your code print out each row and show us what it prints; otherwise, we're just guessing.
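
One way to do that printing, as a rough sketch: use repr() on the column so that quotes and stray spaces become visible. The file contents below are hypothetical (the real text.qrel wasn't shown), and io.StringIO stands in for the opened file; the syntax is Python 3.

```python
import io

# Hypothetical file contents; the real text.qrel was not shown in the question.
sample = io.StringIO('"sky high",77\nsky high , 88\n')

rows = []
for line in sample:
    d = line[:-1].split(',')
    rows.append(d)
    print(repr(d[0]))  # repr() makes embedded quotes and stray spaces visible
```

With this sample, the first column comes out as '"sky high"' for the first line and 'sky high ' for the second, so neither compares equal to "sky high".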

You can try to fix things manually. For example, to handle both of the above cases, instead of this:

d = row[:-1].split(',')

… you'd do:

def remove_quotes(x):
    # Strip one matched pair of single or double quotes, if present.
    # The length check avoids an IndexError on empty columns.
    if len(x) >= 2 and x[0] == x[-1] and x[0] in ('"', "'"):
        return x[1:-1]
    return x

for row in data:
    d = [remove_quotes(col.strip()) for col in row[:-1].split(',')]

If you don't understand list comprehensions, this line:

d = [remove_quotes(col.strip()) for col in row[:-1].split(',')]

… is a shortcut for:

d = []
for col in row[:-1].split(','):
    d.append(remove_quotes(col.strip()))

You already have the [:-1] to remove the trailing \n and the split(',') to split into two columns. But instead of just using the columns as-is, on each one, I call strip() to remove any extra whitespace at the edges (which turns out not to matter in your specific case, but it is a common problem in CSVs), and then call remove_quotes on the result to remove any matched pairs of quotes, and use that for the column value.

As you can see, that's tedious and complicated.

And there are still plenty of common cases it won't handle.
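
As one illustration of such a case (a sketch with a made-up row, not something from your file): a quoted field that itself contains a comma. The manual approach splits it in the wrong place, and the quote-stripping helper can't repair that afterwards:

```python
def remove_quotes(x):
    # Same idea as the helper above: strip one matched pair of quotes.
    if len(x) >= 2 and x[0] == x[-1] and x[0] in ('"', "'"):
        return x[1:-1]
    return x

line = '"high, sky",77\n'  # hypothetical row with a comma inside the quoted field
d = [remove_quotes(col.strip()) for col in line[:-1].split(',')]
print(d)  # ['"high', 'sky"', '77'] -- three columns instead of two
```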

This is exactly why you usually want to use the csv module instead of trying to parse CSV files yourself:

import csv
for d in csv.reader(data):

Now, d[0] will be "sky high".
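
Put together with the loop from the question, a minimal sketch might look like this. It uses Python 3 syntax, and io.StringIO with assumed contents stands in for the real text.qrel:

```python
import csv
import io

a = "sky high"
# Stand-in for the opened file; the contents are assumed from the comments.
data = io.StringIO('"sky high",77\n"sky high and high",88\n')

new = []
for d in csv.reader(data):
    # csv.reader has already removed the quotes and the trailing newline.
    if a == d[0]:
        new.append(d[1])

print(new)  # ['77']
```

With a real file in Python 3, the csv documentation recommends opening it in text mode with newline='', for example open("text.qrel", newline='').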

If your CSV files aren't quite "standard" enough for the csv module to handle out of the box, you can give a dialect object, or just some format parameters, to the reader, and it's still usually easier than trying to build it from scratch yourself.
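
For instance, here is a sketch using two of the reader's format parameters, quotechar and skipinitialspace, on hypothetical data that uses single quotes and puts a space after each comma:

```python
import csv
import io

# Hypothetical non-standard data: single-quoted fields, space after the comma.
data = io.StringIO("'sky high', '77'\n'sky high and high', '88'\n")

rows = list(csv.reader(data, quotechar="'", skipinitialspace=True))
print(rows)  # [['sky high', '77'], ['sky high and high', '88']]
```

Note that skipinitialspace only ignores spaces right after the delimiter. Spaces before the delimiter (as in sky high , 77) stay in the field, so you may still want a strip() on each column in that case.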

edited May 15, 2013 at 1:31

answered May 15, 2013 at 0:24

abarnert


5 Comments

Bennett Brown Over a year ago

a="sky high"; row=["sky high and high",'88']; a in row is False because although a is in row[0], it is not an element of row.

abarnert Over a year ago

@BBrown: Copied and pasted from the wrong place; fixed. (But you're quoting from my sample code, which was already correct, not from the place I had it wrong, which is odd…)

Fynn Mahoney Over a year ago

@abarnert so now I know the problem: when I print the list, it's actually ['"sky high"','77'], with double quotes, because the original data is "sky high",77. The reason I'm doing d = row[:-1].split(',') is that without the row[:-1] it produces the list ['"sky high"','77\n'], with a newline at the end. But I don't really understand your solution for removing the quotes.

abarnert Over a year ago

@FynnMahoney: Do you not understand the remove_quotes function, or the list comprehension that uses it? I edited the answer to try to explain what it's doing. But meanwhile, have you tried using csv.reader instead of trying to do it the hard way?

Bennett Brown Over a year ago

@abarnert: I was quoting from your prose, which you've fixed. :-)

Linuxios · Accepted Answer · 2013-05-15 00:24:03Z

1

You are checking whether the string sky high is contained in the first string of each row, not whether it is equal to it. This code should do what you want:

for row in mylist:
    if a == row[0]:
       print row[1]

This only outputs 77.

answered May 15, 2013 at 0:24

Linuxios


timss · Accepted Answer · 2013-05-15 00:30:01Z

0

For your example, you would need to use row[1]. Then it should work with ==.
The in operator checks whether one string is contained in another, and "sky high and high" contains "sky high", so that result is correct.

>>> a = "sky high"
>>> mylist = [["sky high",'77'],["sky high and high",'88']]
>>> for row in mylist:
...     if a == row[0]:
...         print row[1]
...
77

You could also use list comprehension for something as simple as this, if you like one-liners:

>>> [row[1] for row in mylist if a == row[0]][0]
'77'
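
One caveat with the trailing [0]: it raises IndexError when no row matches. A possible variation (a sketch, not part of the original answer) is the built-in next() with a default, which also stops at the first match:

```python
a = "sky high"
mylist = [["sky high", "77"], ["sky high and high", "88"]]

# First matching value, or None when nothing matches (no IndexError).
first = next((row[1] for row in mylist if a == row[0]), None)
missing = next((row[1] for row in mylist if row[0] == "no such key"), None)

print(first, missing)  # 77 None
```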

edited May 15, 2013 at 0:30

answered May 15, 2013 at 0:24

timss
