
Hi, I'm new to Python, so this may be a simple problem, but I've searched Google many times and can't find a way around it. I have a list of strings taken from a CSV file, and another list of strings in a text file. My job is to check whether the words from the text file appear in the CSV file.

Let's say this is what the CSV file looks like (it's made up):

  name,author,genre,year
  Private Series,Kate Brian,Romance,2003
  Mockingbird,George Orwell,Romance,1956
  Goosebumps,Mary Door,Horror,1990
  Geisha,Mary Door,Romance,2003

And let's say the text file looks like this: Romance 2003

What I'm trying to do is create a function that returns the names of the books whose rows contain both "Romance" and "2003". In this case it should return "Private Series" and "Geisha" but not "Mockingbird". My problem is that it doesn't return them. However, when I change my input to "Romance", it returns all three books with Romance in them. I assume that's because "Romance" and "2003" aren't next to each other in a row, since if I change my input to "Mary Door", both "Goosebumps" and "Geisha" show up. How can I overcome this?

Also, how do I make my function case insensitive?

Any help would be much appreciated :)

1 Answer

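import csv

def read_input(filename):
    # Yield each CSV row as a dict keyed by the header line.
    # The with block closes the file once all rows have been read.
    with open(filename, newline='') as f:
        for row in csv.DictReader(f):
            yield row

def search_filter(src, term):
    # Keep only rows in which some field equals the term, ignoring case.
    term = term.lower()
    for s in src:
        # Short rows are padded with None, so skip non-string values.
        if term in (v.lower() for v in s.values() if isinstance(v, str)):
            yield s

def query(src, terms):
    # Chain one filter per whitespace-separated term; a row must pass all of them.
    for t in terms.split():
        src = search_filter(src, t)
    return src

def print_query(q):
    for row in q:
        print(row)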

I tried to split the logic into small, re-usable functions.

First, read_input takes a filename and returns the rows of a CSV file as an iterable of dicts, keyed by the header line.

search_filter filters a stream of rows by the given term. Both the search term and the row values are lowercased before the comparison, which makes the matching case-insensitive.
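One thing to watch with the lowercase comparison: csv.DictReader fills missing fields with None by default, so a row with fewer columns than the header has a value with no .lower() method. The sketch below shows this on an in-memory row and one way to guard against it. The lowered_values helper is a hypothetical name used only for illustration, and the short row is made up.

```python
import csv
import io

# A made-up row that is missing its last column ("year").
data = "name,author,genre,year\nGeisha,Mary Door,Romance\n"
row = next(csv.DictReader(io.StringIO(data)))
print(row["year"])  # None

def lowered_values(row):
    # Hypothetical helper: lowercase only the string values, skipping None.
    return [v.lower() for v in row.values() if isinstance(v, str)]

found = "romance" in lowered_values(row)
print(found)  # True
```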

The query function takes a query string, splits it into search terms and then makes a chain of filters based on the terms and returns the final, filtered iterable.
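To see the chaining on the sample data from the question without touching the file system, here is a minimal sketch that reads the CSV from a string via io.StringIO and returns only the book names. The matching_names helper and the SAMPLE_CSV constant are assumptions made for this example; they are not part of the code above.

```python
import csv
import io

# Sample data copied from the question; io.StringIO stands in for the file.
SAMPLE_CSV = """name,author,genre,year
Private Series,Kate Brian,Romance,2003
Mockingbird,George Orwell,Romance,1956
Goosebumps,Mary Door,Horror,1990
Geisha,Mary Door,Romance,2003
"""

def search_filter(rows, term):
    # Keep rows in which some field equals the term, ignoring case.
    term = term.lower()
    for row in rows:
        if term in (v.lower() for v in row.values() if isinstance(v, str)):
            yield row

def matching_names(rows, terms):
    # Hypothetical helper: apply one filter per term, then collect the names.
    for t in terms.split():
        rows = search_filter(rows, t)
    return [row["name"] for row in rows]

rows = csv.DictReader(io.StringIO(SAMPLE_CSV))
names = matching_names(rows, "romance 2003")
print(names)  # ['Private Series', 'Geisha']
```

Because each term must equal a whole field, a query such as "Mary Door" finds nothing here: it is split into "mary" and "door", and neither is a complete field value.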

>>> src = read_input("input.csv")
>>> q = query(src, "Romance 2003")
>>> print_query(q)
{'genre': 'Romance', 'year': '2003', 'name': 'Private Series', 'author': 'Kate Brian'}
{'genre': 'Romance', 'year': '2003', 'name': 'Geisha', 'author': 'Mary Door'}

Note that the above solution only matches whole field values. If you also want, for example, the search query "Roman 2003" to return the rows above, you can use this alternative version of search_filter, which matches substrings:

def search_filter(src, term):
    term = term.lower()
    for s in src:
        # Skip non-string values: csv.DictReader pads short rows with None.
        if any(term in v.lower() for v in s.values() if isinstance(v, str)):
            yield s
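
The question mentions that the search words live in a text file. A minimal sketch of reading them, assuming the file holds whitespace-separated terms such as "Romance 2003": read_terms is a hypothetical helper name, and the temporary file exists only so the example runs on its own.

```python
import os
import tempfile

def read_terms(path):
    # Read the whole file and split on any whitespace, so the terms may be
    # separated by spaces or by newlines.
    with open(path) as f:
        return f.read().split()

# Demo only: create a throwaway query file like the one in the question.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "input.txt")
    with open(path, "w") as f:
        f.write("Romance 2003\n")
    terms = read_terms(path)

print(terms)  # ['Romance', '2003']
```

Since query expects a single string, the list can be joined back together before the call, for example query(src, " ".join(terms)).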

2 Comments

Hi, thanks so much for the help, but I have a few problems. One is that I need to be able to read the query "Romance 2003" from a stored text file (input.txt). Also, when I tried running your solution, this came up: line 10, in <genexpr> if any(term in v.lower() for v in s.values()): AttributeError: 'NoneType' object has no attribute 'lower'
Actually never mind, I've gotten my original code to work from looking at yours. So thank you so so much, couldn't have done it without your help, really appreciate it :)
