I'm trying to iterate through a search list. I've written it the way I would in C, but I want to rewrite it in a more Pythonic way.

I've been trying with enumerate but can't get it to work. The code searches lines of data for keywords, which are stored in an array called strings. Can someone show me or explain the correct Python syntax, please?

Thanks

for line in f:
    jd = json.loads(line)
    N=0
    while N<=(len(strings)-1):
        if findWholeWord(strings[N])(line) != None:
            print (jd['user_id'], jd['text'])
            break
        N=N+1
  • Please provide example input and expected output for your problem. You could also use a for loop instead of while. Commented Jun 1, 2015 at 9:34

2 Answers

1

There seems to be no need to use enumerate here. Just iterate over strings directly:

for s in strings:
    if findWholeWord(s)(line) is not None:
        print(jd['user_id'], jd['text'])
        break

If you also need the index variable n, then use enumerate:

for n, s in enumerate(strings):
    if findWholeWord(s)(line) is not None:
        # do something with n here?
        print(jd['user_id'], jd['text'])
        break

But since you break after the first match anyway, you could probably also use the any builtin:

if any(findWholeWord(s)(line) is not None for s in strings):
    jd = json.loads(line)
    print(jd['user_id'], jd['text'])
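
To make the any version easy to try out, here is a minimal self-contained sketch. The findWholeWord below is only a hypothetical stand-in (a regex search with word boundaries), since the question does not show the real one; the sample lines, the keyword list, and the helper name matching_records are made up for illustration too.

```python
import json
import re


def findWholeWord(word):
    # Hypothetical stand-in for the asker's function: returns a callable that
    # searches a string for `word` as a whole word, ignoring case.
    pattern = re.compile(r"\b{}\b".format(re.escape(word)), flags=re.IGNORECASE)
    return pattern.search


def matching_records(lines, strings):
    # Collect (user_id, text) for every line that contains at least one keyword.
    matches = []
    for line in lines:
        if any(findWholeWord(s)(line) is not None for s in strings):
            jd = json.loads(line)  # parse only the lines that matched
            matches.append((jd["user_id"], jd["text"]))
    return matches


# Made-up sample data: one JSON document per line, as in the question.
strings = ["python", "loop"]
lines = [
    '{"user_id": 1, "text": "I like Python"}',
    '{"user_id": 2, "text": "nothing here"}',
]

for user_id, text in matching_records(lines, strings):
    print(user_id, text)
```

Note that any stops at the first keyword that matches, just like the break in the loop versions.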

Also, as pointed out in @Ben's answer, you can probably improve the performance of the check by turning either strings or line into a set of words and then using the in operator to check whether any word from one set is in the other. Whether that works here is hard to tell without knowing exactly what findWholeWord does.

1

Make strings a set instead of an array (this speeds up membership checks and won't change functionality):

strings = set(strings)

I don't know what findWholeWord(strings[N])(line) is meant to do. But I'm guessing it's something like this:

jd = json.loads(line)
# json.loads (not json.load) is needed here because line is a string

if any(w in strings for w in tokenize(line)):
    print(jd['user_id'], jd['text'])

I'm guessing findWholeWord gets whole words from the line and checks them against your set of strings. If so, you could use a proper tokenizer (look at NLTK) or just use:

def tokenize(line):
    return line.split(' ')
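
One caveat with splitting on spaces is that punctuation stays attached to the words, so "Python," would not match "python". A slightly more forgiving sketch, assuming the keywords are lowercase and that matching on runs of word characters is good enough for your data, could use a regular expression instead (the sample text and keyword set are made up):

```python
import re


def tokenize(line):
    # Lowercase the text and extract runs of word characters,
    # which drops surrounding punctuation and whitespace.
    return re.findall(r"\w+", line.lower())


# Made-up sample data; the keywords are assumed to be lowercase.
strings = {"python", "loop"}
text = "I like Python, really."

found = any(w in strings for w in tokenize(text))
print(tokenize(text), found)
```

This is still much cruder than a real tokenizer such as the ones in NLTK, but it avoids the most common misses.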
