I have a list of German stopwords that I want to filter out of an input text. It looks like this:

stopwortlist = ['ab', 'aber', 'abgesehen', 'alle', 'allein', 'aller', 'alles']
text = input('Please enter a text: ')
# I found a way to filter them online, but it doesn't quite work:
# it returns a list, and all I want is a text with the words from
# the list filtered out.

def filterStopwords(eingabeText, stopwords):
    out = [word for word in eingabeText if word not in stopwords]
    return out

How should I modify the function to get a text back instead of a list? Thanks a lot in advance.

2 Answers

Split your incoming text into words (otherwise you are iterating over characters), filter the stop words and then rejoin the resulting list.

stopwortlist = ['ab', 'aber','abgesehen', 'alle', 'allein', 'aller', 'alles']
text = 'Some text ab aber with stopwords allein in'

def filterStopwords(eingabeText, stopwords):
    out = [word for word in eingabeText.split() if word not in stopwords]
    return ' '.join(out)

filterStopwords(text, stopwortlist) # => 'Some text with stopwords in'
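The split/filter/join approach above matches words exactly, so a capitalized 'Aber' or a token with attached punctuation such as 'alles,' would slip through. A minimal sketch of one way to handle that, assuming case-insensitive matching and stripping of ASCII punctuation are wanted (the name filter_stopwords_normalized is hypothetical and not part of the answer above):

```python
import string

def filter_stopwords_normalized(text, stopwords):
    # Hypothetical helper: a set gives fast membership tests,
    # casefold() makes the comparison case-insensitive.
    stop_set = {w.casefold() for w in stopwords}
    kept = []
    for token in text.split():
        # Compare a normalized copy, but keep the original token in the output.
        core = token.strip(string.punctuation).casefold()
        if core not in stop_set:
            kept.append(token)
    return ' '.join(kept)

stopwortlist = ['ab', 'aber', 'abgesehen', 'alle', 'allein', 'aller', 'alles']
result = filter_stopwords_normalized(
    'Aber alles, was zählt, ist allein das Ergebnis.', stopwortlist)
print(result)  # was zählt, ist das Ergebnis.
```

Note that a removed token takes its attached punctuation with it, which may or may not be what you want.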

Here's a one-liner using the built-in filter function and the str.join method.

stopwortlist = ['ab', 'aber','abgesehen', 'alle', 'allein', 'aller', 'alles']
text = 'There are ab aber multiple allein abgesehen words in alles this ab list'

print(" ".join(filter(lambda x: x not in stopwortlist, text.split())))

#Output
There are multiple words in this list

The lambda returns True for every word that is not in stopwortlist, so filter keeps only those words, and join puts them back together into a single string.
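For comparison, the same filtering can be written with a generator expression instead of filter and a lambda. The sketch below also converts the list to a set first, which is an optional change of mine (stop_set is a name introduced here) that speeds up lookups for long stopword lists:

```python
stopwortlist = ['ab', 'aber', 'abgesehen', 'alle', 'allein', 'aller', 'alles']
text = 'There are ab aber multiple allein abgesehen words in alles this ab list'

# Assumption: a set is used only for faster "in" checks; the result is unchanged.
stop_set = set(stopwortlist)

via_filter = ' '.join(filter(lambda w: w not in stop_set, text.split()))
via_comprehension = ' '.join(w for w in text.split() if w not in stop_set)

print(via_filter)                       # There are multiple words in this list
print(via_filter == via_comprehension)  # True
```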
