I have a list of strings containing descriptions taken from a body of text. It looks like this:
stringlist = ['I have a dog and cat and the dog is seven years old', 'that dog is old']
I need to filter these strings by a list of keywords stored in another list:
keywords = ['dog', 'cat', 'old']
Each keyword should be appended to a row once for every time it occurs in the string, giving one row per string. The desired output is:
filteredlist = [['dog', 'dog', 'cat', 'old'], ['dog', 'old']]
I am splitting each string in stringlist and using a list comprehension to check whether each word contains the keyword, but the output is wrong when I loop through the keywords.
The code works when I search for one specific keyword, as follows:
filteritem = 'dog'
filteredlist = []
for string in stringlist:
    string = string.split()
    res = [x for x in string if filteritem in x]
    filteredlist.append(res)
The resulting filteredlist is as follows:
filteredlist = [['dog', 'dog'], ['dog']]
which contains the keyword once for each time it occurs in the string.
When I try looping through the keyword list with a for loop, as follows, the output loses its structure:
filteredlist = []
for string in stringlist:
    string = string.split()
    for keyword in keywords:
        res = [x for x in string if keyword in x]
        filteredlist.append(res)
Here is the output:
filteredlist = [['dog', 'dog'], ['cat'], ['old'], [], ['dog'], [], ['old'], []]
I think I'm approaching this problem completely wrong so any other method or solution would be helpful.
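For reference, here is a minimal sketch of one way the desired structure could be produced. The structure seems to be lost because the loop appends a separate list for every keyword, instead of one combined row per string. Note the assumptions: the names `words` and `row` are only illustrative, and the sketch compares whole words with `==` rather than using the substring test `keyword in x` from the code above, so 'old' would not match a word like 'older'.

```python
stringlist = ['I have a dog and cat and the dog is seven years old', 'that dog is old']
keywords = ['dog', 'cat', 'old']

filteredlist = []
for string in stringlist:
    words = string.split()
    # Build a single row per string: for each keyword in turn,
    # keep every word that is exactly equal to that keyword.
    # (Assumption: exact word matches are wanted, not substring matches.)
    row = [word for keyword in keywords for word in words if word == keyword]
    # Append once per string, not once per keyword.
    filteredlist.append(row)

print(filteredlist)
# [['dog', 'dog', 'cat', 'old'], ['dog', 'old']]
```

Putting the keyword loop on the outside of the comprehension groups the matches by keyword order, which is what the desired output shows.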
Is the filteredlist above what you want the output to look like?