I have a data frame with 20 columns, one of which is named "name". Some values in that column are real names, but others are junk (random strings of letters, adjectives, adverbs, prepositions, etc.). So there is a whole list of such "not names", like the one below:
not_names = ['such', 'a', 'not', 'one', 'an', 'actually', 'this', 'the', 'by']
(the list is long, but no longer than 20 variations)
I want to replace all not names with "NaN". If there were only three not_names, I could simply create a list of replacements, e.g. nan_list = ['NaN', 'NaN', 'NaN'] (the length of nan_list must match the length of not_names).
Then I would proceed with a replacement like this:
df['name'].replace(not_names, nan_list, inplace=True)
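To make this concrete, here is a minimal runnable sketch of the three-item version. The sample data is made up for illustration, and I assign the result back to the column rather than relying on inplace=True:

```python
import pandas as pd

# Hypothetical sample data; the real frame has 20 columns.
df = pd.DataFrame({"name": ["Alice", "such", "Bob", "a", "not"]})

not_names = ["such", "a", "not"]
nan_list = ["NaN", "NaN", "NaN"]  # one replacement per entry in not_names

# Element i of not_names is replaced by element i of nan_list.
df["name"] = df["name"].replace(not_names, nan_list)

print(df["name"].tolist())
```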
But if I have a list of 20+ not_names, building nan_list by hand looks odd, since I would need to repeat 'NaN' 20+ times, which does not seem optimal.
I'm fairly new to Python, so understanding all concepts is not always easy, but I feel like my task can be simplified with for loops or isin() or map() or user-defined functions.
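For example, I imagine an isin()-based version might look roughly like the sketch below. The sample data is made up, and I am not sure this is the idiomatic way to do it:

```python
import pandas as pd

# Hypothetical sample data standing in for the real "name" column.
df = pd.DataFrame({"name": ["Alice", "such", "Bob", "the", "by"]})

not_names = ['such', 'a', 'not', 'one', 'an', 'actually', 'this', 'the', 'by']

# Boolean mask: True where the value is one of the not_names.
mask = df["name"].isin(not_names)

# Overwrite only the masked rows with the string "NaN"; no nan_list needed.
df.loc[mask, "name"] = "NaN"

print(df["name"].tolist())
```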
Any suggestions? Please advise.