I have a list of strings and I want to remove the stop words inside each string. The problem is that the stop word list is much longer than the strings, and I don't want to repeat the comparison against the stop word list for every string. Is there a way in Python to process these multiple strings at the same time?
lis = ['aka', 'this is a good day', 'a pretty dog']
stopwords = []  # pretty long list of words
for i, phrase in enumerate(lis):
    words = phrase.split(' ')  # get list of words
    # keep only the words that are not stop words, then rebuild the phrase
    lis[i] = ' '.join(word for word in words if word not in stopwords)
This is my current method. But this means I have to go through all the phrases in the list. Is there a way to process these phrases with only a single comparison?
Thanks.
Convert `stopwords` into a set, as `x in set` checking is very fast: O(1).

`x in stopwords` is linear in time if `stopwords` is a list and constant in time if it is a set (as Kevin said). In other words, with a set, you (almost) wouldn't feel the difference between a small one and a huge one (it's fast in both cases).
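A minimal sketch of the set-based approach, for illustration only. The helper name `remove_stopwords` and the three placeholder stop words are assumptions, not part of the original question; the real stop word list would be substituted in.

```python
def remove_stopwords(phrases, stopwords):
    """Return a new list with stop words removed from each phrase."""
    # Build the set once; each membership test is then O(1) on average.
    stopword_set = set(stopwords)
    cleaned = []
    for phrase in phrases:
        # Keep only the words that are not in the stop word set.
        kept = [word for word in phrase.split() if word not in stopword_set]
        cleaned.append(' '.join(kept))
    return cleaned


lis = ['aka', 'this is a good day', 'a pretty dog']
stopwords = ['this', 'is', 'a']  # placeholder stop words, assumed for illustration
print(remove_stopwords(lis, stopwords))  # ['aka', 'good day', 'pretty dog']
```

Every phrase still has to be visited once, but each word lookup no longer scans the whole stop word list, so the cost depends on the total number of words in the phrases rather than on the size of `stopwords`.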