How to add a column in pandas based on a partial string in another column

Question

I'm quite new to Python and pandas. I'm trying to add a new column to a data frame (group column) with values based on a partial string in another column (user column). Users are coded like this: AA1, AA2, BB1, BB2 and so on. What I want is for the group column to have the value 'AA' for all the AA users. After looking for a way to do this, I came up with the following line:

df['group'] = ['AA' if x x.startswith('AA') else 'other' for x in df['user']]

Well, it doesn't work: 1) I get an invalid syntax error and a line too long error. 2) However, it does work if I replace x.startswith('AA') with x == 'AA1', so is the problem in the startswith part? 3) I don't know how to add a 'BB' if x.startswith('BB') case to the same line, or should I write one line for each category of user? Thank you so much.

MaThMaX · Accepted Answer · 2016-06-06 16:33:13Z

2

df['group'] = ['AA' if x.startswith('AA') else 'other' for x in df['user']]

You just have an extra x before x.startswith('AA'); that stray name is what causes the invalid syntax error.
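For the third part of the question (handling 'BB' and further prefixes), one option is a small helper function rather than chaining conditionals in a single line. This is only a sketch, assuming the groups of interest are 'AA' and 'BB'; the names `PREFIXES` and `assign_group` and the extra 'CC1' row are hypothetical and not from the original posts.

```python
import pandas as pd

# Hypothetical tuple of known group prefixes (assumption for illustration).
PREFIXES = ('AA', 'BB')

def assign_group(user, prefixes=PREFIXES, default='other'):
    """Return the first prefix that `user` starts with, else `default`."""
    for prefix in prefixes:
        if user.startswith(prefix):
            return prefix
    return default

# 'CC1' is added only to show the fallback value.
df = pd.DataFrame({'user': ['AA1', 'AA2', 'BB1', 'BB2', 'CC1']})
df['group'] = [assign_group(x) for x in df['user']]
print(df)
```

Adding another category then only means extending `PREFIXES`, which also keeps the line short.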

answered Jun 6, 2016 at 16:33

jezrael · Accepted Answer · 2016-06-06 16:34:49Z

1

I think you can use numpy.where with str.startswith or str.contains:

import pandas as pd
import numpy as np

df = pd.DataFrame({'user':['AA1','AA2','BB1','BB2']})
print (df)
  user
0  AA1
1  AA2
2  BB1
3  BB2

df['group'] = np.where(df.user.str.startswith('AA'), 'AA', 'other')
df['group1'] = np.where(df.user.str.contains('AA'), 'AA', 'other')
# if you need to extract the first 2 characters of each user
df['g1'] = df.user.str[:2]
print (df)
  user  group group1  g1
0  AA1     AA     AA  AA
1  AA2     AA     AA  AA
2  BB1  other  other  BB
3  BB2  other  other  BB

To extract a substring, see the pandas documentation on indexing with .str.
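If several prefixes each need their own label, the same idea can be extended with numpy.select, or by slicing the prefix and falling back to 'other' for unknown values. A minimal sketch, assuming the known groups are 'AA' and 'BB'; the 'CC1' row and the column names `group` and `group2` are only illustrative.

```python
import numpy as np
import pandas as pd

# 'CC1' is added only to show the fallback value.
df = pd.DataFrame({'user': ['AA1', 'AA2', 'BB1', 'BB2', 'CC1']})

# Variant 1: one boolean condition per group, checked in order.
conditions = [
    df['user'].str.startswith('AA'),
    df['user'].str.startswith('BB'),
]
choices = ['AA', 'BB']
df['group'] = np.select(conditions, choices, default='other')

# Variant 2: take the first two characters and keep them only
# when they belong to a known group; everything else becomes 'other'.
prefix = df['user'].str[:2]
df['group2'] = prefix.where(prefix.isin(choices), 'other')

print(df)
```

The second variant only works when every prefix has the same length; numpy.select is the more general of the two.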

edited Jun 6, 2016 at 16:34

answered Jun 6, 2016 at 16:26
