Pandas: Change column based on its value

Question

When cluster_name contains "demo", I want to change it to "unknown".

This is the best I've managed:

df["cluster_name"] = "unknown" if "demo" is in df["cluster_name"] else df["cluster_name"]

But I'm getting:

SyntaxError: invalid syntax

Mayank Porwal · Accepted Answer · 2020-09-27 11:04:35Z

You can use numpy.where:

import numpy as np
df["cluster_name"] = np.where(df["cluster_name"].str.contains("demo"), "unknown", df["cluster_name"])

See the example below:

In [814]: df1
Out[814]: 
        State  Year  Incident  new     nn
0           a  1980       513    1    0.0
1  demo is in  1981       453    0    1.0
2           b  1982       424    1  100.0
3     my demo  1983       372  100    NaN

In [816]: df1.State = np.where(df1.State.str.contains('demo'), 'unknown', df1.State)

In [817]: df1
Out[817]: 
     State  Year  Incident  new     nn
0        a  1980       513    1    0.0
1  unknown  1981       453    0    1.0
2        b  1982       424    1  100.0
3  unknown  1983       372  100    NaN
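
One caveat with the approach above: if the column can hold missing values, str.contains has no True/False answer for those rows unless you pass na=False, and regex=False makes 'demo' a literal substring rather than a pattern. A minimal sketch, assuming a made-up frame (the data below is illustrative, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data; the None stands in for a missing cluster name.
df = pd.DataFrame({"cluster_name": ["demo-east", "prod", None, "my demo"]})

# na=False: missing values count as "no match".
# regex=False: treat "demo" as a literal substring, not a regular expression.
has_demo = df["cluster_name"].str.contains("demo", na=False, regex=False)

# Rows where has_demo is True become "unknown"; all others keep their value.
df["cluster_name"] = np.where(has_demo, "unknown", df["cluster_name"])
print(df)
```

With na=False the missing row is left alone instead of depending on how a missing match result gets interpreted as a condition.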

answered Sep 27, 2020 at 10:58

Michael Szczesny · Accepted Answer · 2020-09-27 11:13:52Z

If you only need to replace values that are exactly 'demo', rather than values that contain 'demo' as a substring, you can use Series.replace. ('Contains' in the question is ambiguous.)

df['cluster_name'] = df['cluster_name'].replace('demo','unknown')

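Or replace in place. Calling replace with inplace=True on df['cluster_name'] itself is chained assignment, which recent pandas versions warn about, so this goes through the DataFrame instead:

df.replace({'cluster_name': {'demo': 'unknown'}}, inplace=True)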
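
To make the exact-match vs. substring distinction concrete, here is a small sketch on an illustrative Series (the values are made up). It compares Series.replace with a substring-based alternative built from Series.mask and str.contains:

```python
import pandas as pd

s = pd.Series(["demo", "my demo", "prod"])  # illustrative values

# Series.replace with a plain string only matches whole values.
exact = s.replace("demo", "unknown")

# Series.mask overwrites every value where the condition is True,
# so this replaces anything that contains "demo" as a substring.
substring = s.mask(s.str.contains("demo", regex=False), "unknown")

print(exact.tolist())      # ['unknown', 'my demo', 'prod']
print(substring.tolist())  # ['unknown', 'unknown', 'prod']
```

Which one you want depends on whether "my demo" should count as a demo cluster.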

answered Sep 27, 2020 at 11:02

Grayrigel · Accepted Answer · 2020-09-27 11:21:10Z

One potential solution is to use map with a lambda function, which is syntactically similar to what you were trying to do:

Simple map solution:

# replace the value with 'unknown' if it is exactly 'demo'
df['cluster_name'] = df['cluster_name'].map(lambda x: 'unknown' if x == 'demo' else x)

More generalized map solution:

# replace the value with 'unknown' if it contains 'demo' (assumes every value is a string)
df['cluster_name'] = df['cluster_name'].map(lambda x: 'unknown' if 'demo' in x else x)

Examples:

>>> #simple map solution
>>> df
  cluster_name
0         demo
1         demo
2            1

>>> df['cluster_name'] = df['cluster_name'].map(lambda x : 'unknown' if x=='demo' else x)
>>> df
  cluster_name
0      unknown
1      unknown
2            1

>>> #more generalized map solution
>>> df1
  cluster_name
0    demo is a
1         demo
2            1
>>> df1['cluster_name'] = df1['cluster_name'].map(lambda x : 'unknown' if 'demo' in x else x)
>>> df1
  cluster_name
0      unknown
1      unknown
2            1
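
Note that 'demo' in x raises a TypeError when x is not a string (an int or a missing value, for example). If the column can be mixed, a guarded version is one option. This is only a sketch: demo_to_unknown is a hypothetical helper name and the data is made up.

```python
import pandas as pd

# Hypothetical mixed column: strings, an int, and a missing value.
df = pd.DataFrame({"cluster_name": ["demo is a", "demo", 1, None]})

def demo_to_unknown(value):
    """Return 'unknown' for strings containing 'demo'; pass anything else through."""
    if isinstance(value, str) and "demo" in value:
        return "unknown"
    return value

df["cluster_name"] = df["cluster_name"].map(demo_to_unknown)
print(df)
```

The isinstance check runs first, so the substring test is only ever applied to actual strings.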
