python3 - apply a regex map to column

Question

How can I apply a dict of regex patterns to a DataFrame column?

import pandas as pd

df = pd.DataFrame({'col1': ['negative', 'positive', 'neutral', 'neutral', 'positive']})
cdict = {'n.*': -1, 'p.*': 0}
df['col2'] = df['col1'].map(cdict)

print(df.head())

Current output is:

:        col1  col2
: 0  negative   NaN
: 1  positive   NaN
: 2   neutral   NaN
: 3   neutral   NaN
: 4  positive   NaN

But the expected result is:

:        col1  col2
: 0  negative    -1
: 1  positive     1
: 2   neutral    -1
: 3   neutral    -1
: 4  positive     1

Note that your dict should be cdict = {'n.*': -1, 'p.*': 1} for your expected output; I assume it's a typo.
– anky, Commented Apr 8, 2021 at 17:49

anky · Accepted Answer · 2021-04-08 17:44:54Z

4

Series.map looks each value up as an exact dict key, so the regex patterns never match and every row becomes NaN. Use Series.replace with regex=True instead:

df['col2'] = df['col1'].replace(cdict, regex=True)
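For a self-contained comparison of the two calls, here is a minimal sketch. It assumes the corrected mapping from the comment above (1 for positive); the '^' anchors and the explicit dtype=object are my additions, not part of the original answer.

```python
import pandas as pd

# dtype=object is set explicitly (an assumption) so the column can hold both
# the original strings and the integer replacements.
df = pd.DataFrame({
    "col1": pd.Series(
        ["negative", "positive", "neutral", "neutral", "positive"],
        dtype=object,
    )
})

# Corrected mapping; the '^' anchors tie each pattern to the start of the value.
cdict = {r"^n.*": -1, r"^p.*": 1}

# map() treats the keys as literal values, so no row matches.
mapped = df["col1"].map(cdict)

# replace() with regex=True treats the keys as patterns.
df["col2"] = df["col1"].replace(cdict, regex=True)

print(mapped.isna().all())
print(df)
```

Values that match none of the patterns are left unchanged by replace, so a column with unexpected strings may end up with mixed types.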

edited Apr 8, 2021 at 17:44

answered Apr 8, 2021 at 17:44


Mayank Porwal · Accepted Answer · 2021-04-08 17:54:57Z

2

You don't need a dict for this at all, which also saves some space.

Use numpy.select with Series.str.startswith:

In [1927]: import numpy as np

In [1928]: conds = [df.col1.str.startswith('n'), df.col1.str.startswith('p')]

In [1929]: choices = [-1, 0]

In [1930]: df['col2'] = np.select(conds, choices)

In [1931]: df
Out[1931]: 
       col1  col2
0  negative    -1
1  positive     0
2   neutral    -1
3   neutral    -1
4  positive     0
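Note that np.select fills rows that match no condition with its default, which is 0 unless you pass one. A hedged sketch of the same approach with an explicit default, using 1 for positive as in the question's expected output; the extra 'unknown' row is a made-up value added only to show the fallback:

```python
import numpy as np
import pandas as pd

# 'unknown' is a hypothetical extra value that matches neither condition.
df = pd.DataFrame({"col1": ["negative", "positive", "neutral", "unknown"]})

# Conditions are checked in order; the first True one picks the choice.
conds = [df["col1"].str.startswith("n"), df["col1"].str.startswith("p")]
choices = [-1, 1]

# default is the value for rows where every condition is False.
df["col2"] = np.select(conds, choices, default=0)

print(df)
```

If 0 is also a meaningful label in your data, pick a default that cannot be confused with it.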

edited Apr 8, 2021 at 17:54

answered Apr 8, 2021 at 17:43
