Python: how to optimize replacing with for loop or a function or if statements, or all together, or

Question

I have a data frame where one of 20 columns is named "name". Some values in the column are real names, but others are trash (set of letters, adjectives, adverbs, prepositions, etc.). So, there is a whole list of such not names (like below):

not_names = ['such', 'a', 'not', 'one', 'an', 'actually', 'this', 'the', 'by']

(the list is long, but no longer than 20 variations)

I want to replace all not names with "NaN". If there were only three not_names, I could simply create a list for the replacement, e.g. nan_list = ['NaN', 'NaN', 'NaN'] (the length of nan_list must match the length of not_names).

So, I would proceed with a replacement like this:

df['name'].replace(not_names, nan_list, inplace=True)

But if I have a list of 20+ not_names, creation of nan_list looks odd as I need to repeat 'NaN' 20+ times, which does not seem to be optimal.

I'm fairly new to Python, so understanding all concepts is not always easy, but I feel like my task can be simplified with for loops or isin() or map() or user-defined functions.

Any suggestions? Please advise.

Can you put the code in code blocks for easy reading?

– Gerardo Zinno, Commented Aug 15, 2020 at 15:59

Anthony · Accepted Answer · 2020-08-16 16:40:22Z

Instead of doing:

df['name'].replace(not_names, nan_list, inplace=True)

you can just do:

df['name'].replace(not_names, 'NaN', inplace=True)

and it will replace anything that matches an element of not_names with 'NaN'.

Here's an example:

In [32]: df = pd.DataFrame(np.arange(0,9).reshape(3,3))

In [33]: df
Out[33]: 
   0  1  2
0  0  1  2
1  3  4  5
2  6  7  8

In [34]: df[2].replace([2,5], 0, inplace=True)

In [35]: df
Out[35]: 
   0  1  2
0  0  1  0
1  3  4  0
2  6  7  8
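
Applied to the question's own column, the same idea can be sketched as follows. The sample names in df are made up for illustration, and the result is assigned back to the column instead of using inplace=True, since in-place changes on a selected column may not propagate to the frame in newer pandas versions. An isin()-based variant is included because the question mentions it.

```python
import pandas as pd

not_names = ['such', 'a', 'not', 'one', 'an', 'actually', 'this', 'the', 'by']

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({'name': ['Rex', 'such', 'Bella', 'a', 'the', 'Max']})
original = df['name'].copy()

# A single scalar is used as the replacement for every value in not_names,
# so no list of repeated 'NaN' strings is needed.
df['name'] = df['name'].replace(not_names, 'NaN')

# Variant: build a boolean mask with isin() and overwrite the matching rows.
via_isin = original.mask(original.isin(not_names), 'NaN')

print(df['name'].tolist())
print(via_isin.tolist())
```

If real missing values are wanted instead of the string 'NaN', numpy.nan can be passed as the replacement value in the same way.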

edited Aug 16, 2020 at 16:40

answered Aug 15, 2020 at 16:02


2 Comments

Maria P Over a year ago

Unfortunately, this one is not working either. I tried this before, no replacement is happening.

Anthony Over a year ago

Your issue is likely elsewhere then, as the above line definitely works. You can edit your original post with more of your code so we can take a look.
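
The thread never establishes why the replacement did nothing. One possible cause, offered only as a guess, is that the values do not match the list exactly (stray whitespace or different capitalization), because replace() and isin() compare whole values. A minimal sketch with made-up data that checks for this and matches on a normalized copy:

```python
import pandas as pd

not_names = ['such', 'a', 'not', 'one', 'an', 'actually', 'this', 'the', 'by']

# Hypothetical data with padding and capitalization differences.
df = pd.DataFrame({'name': [' such', 'The', 'Bella', 'a ']})

# Exact comparison finds no matches, so a plain replace() would change nothing.
exact_matches = df['name'].isin(not_names)
print(exact_matches.sum())

# Compare on a stripped, lower-cased copy, then overwrite the original rows.
normalized = df['name'].str.strip().str.lower()
is_not_name = normalized.isin(not_names)
df.loc[is_not_name, 'name'] = 'NaN'

print(df['name'].tolist())
```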
