0

I have a string:

a = "sky high"

and a file in csv style that I have opened and converted to list:

mylist = [["sky high",'77'],["sky high and high",'88']]

I want to check whether the string exists in the first position of each inner list. But if I do this:

for row in mylist:
    if a in row[0]:
       print row[1]

it prints 77 and 88 instead of just 77. I don't know why I can't use if a == row[0]; it produces no result. Any idea what to do?

EDIT:

So actually my code looks like this:

data = open("text.qrel",'rb')
new = []
for row in data:
    d = row[:-1].split(',')

    if a == d[0]:
       new.append(d[1])

and it doesn't work!

  • Please show us the code where you try to use if a == row[0]. Because that will work, unless you do something else wrong. Commented May 15, 2013 at 0:27
  • @abarnert I have edited the question to show my actual code. Commented May 15, 2013 at 0:44
  • OK, your problem is most likely that you're not parsing the CSV file correctly. You have to either show us (the relevant lines of) the CSV file, or have your code print out each row and show us what it says. But my guess is that either the columns are quoted, or that there's extra spaces. See my answer for how to deal with that. Commented May 15, 2013 at 0:58
  • And this is why, in the future, you should show us a sample of real code (with relevant data) that actually doesn't work, so we don't have to spend 40 minutes going back and forth, frustrating you with followup questions and incorrect answers, before figuring things out. See SSCCE for some tips. Commented May 15, 2013 at 1:03
  • Sorry for the inconvenience and thanks for the help :) Commented May 15, 2013 at 1:05

3 Answers

2

Try running this through an interactive visualizer, like this one. When you can't do that for some reason, at least try experimenting in the normal interactive interpreter, or printing out intermediate results in your program.


When a is "sky high", and row is ["sky high and high",'88'], that means row[0] is "sky high and high", so a in row[0] is True.

That's why (if you fix it to use [1] instead of [2]) it will print both 77 and 88.

Try this at the interactive interpreter (or the visualizer):

>>> a = "sky high"
>>> mylist = [["sky high",'77'],["sky high and high",'88']]
>>> row = mylist[1]
>>> row[0]
'sky high and high'
>>> a in row[0]
True
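To make the distinction concrete, here is a small sketch comparing three tests that are easy to confuse. The values of a and row come from the question; the three result names are made up for illustration.

```python
a = "sky high"
row = ["sky high and high", '88']

# Substring test: does a occur anywhere inside the string row[0]?
substring_match = a in row[0]   # True

# Membership test: is a equal to one of the elements of the list row?
element_match = a in row        # False

# Equality test: is a exactly the same string as row[0]?
exact_match = a == row[0]       # False
```

Only the first test matches the longer string, which is why the original loop also reports 88.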

Meanwhile, you say "I don't know why I can't use if a == row[0] as it will produce no result."

But if you use a == row[0] it won't produce no result; it will produce 77.

Try this at the interactive interpreter (or in the online visualizer):

>>> a = "sky high"
>>> mylist = [["sky high",'77'],["sky high and high",'88']]
>>> for row in mylist:
...     if a == row[0]:
...         print row[1]
...
77

So, you must have a bug in some other part of the code. Show us the version that you claim isn't working, and we can find the bug.


Most likely, the problem with your real code is that row (or, actually, d) is not actually ["sky high", '77'], but something with extra characters in it:

data = open("text.qrel",'rb')
new = []
for row in data:
    d = row[:-1].split(',')

Let's say text.qrel looked like this:

sky high , 77

This would make d[0] be "sky high " (with a space), not "sky high".

Or:

"sky high",'77'

Then d[0] would be '"sky high"' (with extra quotes), not "sky high".

You could show us an extract of that CSV file, or have your code print out each row and show us what it prints; otherwise, we're just guessing.
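One way to see what is really in each column is to print its repr(), which makes stray spaces and quote characters visible. The sketch below is only an illustration: sample_lines is a hypothetical stand-in for the contents of text.qrel, and first_columns is a helper name that is not in the original code.

```python
# Hypothetical lines standing in for the real contents of text.qrel.
sample_lines = ['sky high , 77\n', '"sky high",77\n']

a = "sky high"
first_columns = []
for row in sample_lines:
    d = row[:-1].split(',')
    first_columns.append(d[0])
    # repr() shows whitespace and quotes that a plain print would hide.
    print(repr(d[0]), a == d[0])
```

In both sample lines the first column differs from a by a character that is easy to miss in ordinary output, so the comparison is False.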

You can try to fix things manually. For example, to handle both of the above cases, instead of this:

d = row[:-1].split(',')

… you'd do:

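def remove_quotes(x):
    # Strip one matched pair of surrounding quotes; the length check
    # keeps empty columns from raising IndexError.
    if len(x) >= 2 and x[0] == '"' and x[-1] == '"': return x[1:-1]
    elif len(x) >= 2 and x[0] == "'" and x[-1] == "'": return x[1:-1]
    else: return x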
for row in data:
    d = [remove_quotes(col.strip()) for col in row[:-1].split(',')]

If you don't understand list comprehensions, this line:

d = [remove_quotes(col.strip()) for col in row[:-1].split(',')]

… is a shortcut for:

d = []
for col in row[:-1].split(','):
    d.append(remove_quotes(col.strip()))

You already have the [:-1] to remove the trailing \n and the split(',') to split into two columns. But instead of just using the columns as-is, on each one, I call strip() to remove any extra whitespace at the edges (which turns out not to matter in your specific case, but it is a common problem in CSVs), and then call remove_quotes on the result to remove any matched pairs of quotes, and use that for the column value.
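A side note on the [:-1] slice: it removes the last character whatever it is, so it can misbehave on a final line with no newline, or on lines ending in \r\n. The sketch below uses two hypothetical input lines to show this, along with rstrip('\r\n') as one possible alternative; it is an illustration, not a description of the asker's file.

```python
last_line = 'sky high,77'          # hypothetical final line with no trailing newline
windows_line = 'sky high,77\r\n'   # hypothetical line with a Windows-style ending

# Slicing always drops exactly one character, wanted or not.
chopped = last_line[:-1].split(',')         # ['sky high', '7']
leftover_cr = windows_line[:-1].split(',')  # ['sky high', '77\r']

# rstrip('\r\n') removes only trailing newline characters, if there are any.
safe = [line.rstrip('\r\n').split(',') for line in (last_line, windows_line)]
```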

As you can see, that's tedious and complicated.

And there are still plenty of common cases it won't handle.
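For instance, a quoted column that itself contains a comma defeats any approach built on split(','). The sketch below uses a hypothetical line to compare the naive split with the standard library's csv.reader, which accepts any iterable of strings.

```python
import csv

# Hypothetical row whose first column contains a comma inside the quotes.
line = '"high, sky high",77\n'

naive = line[:-1].split(',')        # ['"high', ' sky high"', '77']
parsed = next(csv.reader([line]))   # ['high, sky high', '77']

print(naive)
print(parsed)
```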

This is exactly why you usually want to use the csv module instead of trying to parse CSV files yourself:

import csv
for d in csv.reader(data):

Now, d[0] will be "sky high".

If your CSV files aren't quite "standard" enough for the csv module to handle out of the box, you can pass a dialect object, or just some format parameters, to the reader. That is still usually easier than trying to build a parser from scratch yourself.
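As a rough sketch of the format-parameter route, assume a file that uses single quotes and puts a space after each comma (the text below is made up, not the asker's data). The quotechar and skipinitialspace parameters of csv.reader cover that case. The example is written for Python 3 and uses io.StringIO in place of a real file.

```python
import csv
import io

# Hypothetical contents: single-quoted fields with a space after each comma.
text = "'sky high', '77'\n'sky high and high', '88'\n"

reader = csv.reader(io.StringIO(text), quotechar="'", skipinitialspace=True)
rows = list(reader)
print(rows)
```

Each row comes back with the quotes and the leading spaces already removed, so a == d[0] can be used directly.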


5 Comments

a="sky high"; row=["sky high and high",'88']; a in row is False because although a is in row[0], it is not an element of row.
@BBrown: Copied and pasted from the wrong place; fixed. (But you're quoting from my sample code, which was already correct, not from the place I had it wrong, which is odd…)
@abarnert so now I know the problem: when I print the list, it's actually ['"sky high"','77'] with double quotes, because the original data is "sky high",77. The reason I'm doing d = row[:-1].split(',') is that without the [:-1] it produces the list ['"sky high"','77\n'] with a newline at the end. But I don't really understand your solution for removing the quotes.
@FynnMahoney: Do you not understand the remove_quotes function, or the list comprehension that uses it? I edited the answer to try to explain what it's doing. But meanwhile, have you tried using csv.reader instead of trying to do it the hard way?
@abarnert: I was quoting from your prose, which you've fixed. :-)
1

Your test a in row[0] asks whether the string sky high occurs inside the first string of each row, not whether it is the first string. This code should do what you want:

for row in mylist:
    if a == row[0]:
       print row[1]

This only outputs 77.


0

For your example, you would need to use row[1]. Then it should work with ==.
in checks if a string is in another string, and "sky high and high" contains "sky high", so that's correct.

>>> a = "sky high"
>>> mylist = [["sky high",'77'],["sky high and high",'88']]
>>> for row in mylist:
...     if a == row[0]:
...         print row[1]
...
77

You could also use list comprehension for something as simple as this, if you like one-liners:

>>> [row[1] for row in mylist if a == row[0]][0]
'77'
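One caveat with the trailing [0]: if no row matches, the list is empty and the index raises IndexError. A possible variation, sketched here with made-up result names, is to pass a generator expression to next() with a default value.

```python
a = "sky high"
mylist = [["sky high", '77'], ["sky high and high", '88']]

# next() returns the first match, or the default (None here) if nothing matches.
first_match = next((row[1] for row in mylist if a == row[0]), None)
no_match = next((row[1] for row in mylist if "missing" == row[0]), None)

print(first_match)
print(no_match)
```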
