I have the following dataframe named df.

id letter
1 x,y
2 z
3 a

The mapping condition is {'x' : 1, 'z' : 2, 'ELSE' : 0}

My desired output dataframe should look like this:

id letter map
1 x,y 1
2 z 2
3 a 0

In other words, if any of the letters in the letter column is x, the map column should be 1.

Without iterating through each row of the dataframe, is there any way to do that?

  • what if you have 'x,z'? Commented Nov 18, 2022 at 9:08
  • Assume that x and z cannot be together Commented Nov 18, 2022 at 9:10
  • If 'xb' (a single token, not 'x,b') appears in letter, should map be 1 or 0? If 1, use my answer. Commented Nov 18, 2022 at 9:41

2 Answers

You can use one of the following approaches.

pure pandas

cond = {'x' : 1, 'z' : 2, 'ELSE' : 0}

df['map'] = (df['letter']
 .str.split(',').explode()
 .map(lambda x: cond.get(x, cond['ELSE']))
 .groupby(level=0).max()
)

If several letters in a row match, this keeps the maximum mapped value.
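To make the max behaviour concrete, here is a minimal self-contained sketch of the same split/explode/map/groupby chain. The extra row `'x,z'` and the intermediate names `exploded` and `mapped` are not from the question; they are added only for illustration.

```python
import pandas as pd

cond = {'x': 1, 'z': 2, 'ELSE': 0}

# Sample frame from the question, plus a hypothetical 'x,z' row.
df = pd.DataFrame({'id': [1, 2, 3, 4],
                   'letter': ['x,y', 'z', 'a', 'x,z']})

# One row per letter; the original row index is repeated for each piece.
exploded = df['letter'].str.split(',').explode()

# Unknown letters fall back to the 'ELSE' value.
mapped = exploded.map(lambda token: cond.get(token, cond['ELSE']))

# Collapse back to one value per original row, keeping the largest.
df['map'] = mapped.groupby(level=0).max()

print(df)
```

With this input, the `'x,z'` row ends up with 2, because 2 is the larger of the two mapped values.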

Alternative for the first valid match:

df['map'] = (df['letter']
 .str.split(',').explode()
 .map(cond)
 .groupby(level=0).first()
 .fillna(cond['ELSE'])
 .astype(int)  # the downcast= argument of fillna is deprecated in recent pandas
)
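The first-match variant relies on `GroupBy.first()` skipping missing values, so unmapped letters are ignored and rows with no match at all stay NaN until they are filled. A small sketch under those assumptions follows; the `'y,z'` row and the `lookup` and `first_match` names are hypothetical, and `lookup` drops the `'ELSE'` key so that it is never treated as a letter.

```python
import pandas as pd

cond = {'x': 1, 'z': 2, 'ELSE': 0}

# Sample frame from the question, plus a hypothetical 'y,z' row.
df = pd.DataFrame({'id': [1, 2, 3, 4],
                   'letter': ['x,y', 'z', 'a', 'y,z']})

# Only real letters take part in the lookup.
lookup = {key: value for key, value in cond.items() if key != 'ELSE'}

first_match = (df['letter']
               .str.split(',').explode()
               .map(lookup)              # letters without a mapping become NaN
               .groupby(level=0).first() # first non-NaN value per original row
               )

# Rows with no matching letter are still NaN here; fill them with the default.
df['map'] = first_match.fillna(cond['ELSE']).astype(int)

print(df)
```

Here `'y,z'` maps to 2: `y` has no mapping and is skipped, so `z` is the first valid match.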

list comprehension

Or using a list comprehension, here the first valid match would be used:

cond = {'x' : 1, 'z' : 2, 'ELSE' : 0}

df['map'] = [next((cond[x] for x in s.split(',') if x in cond),
                  cond['ELSE'])  for s in df['letter']]

output:

   id letter  map
0   1    x,y    1
1   2      z    2
2   3      a    0

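Use np.select. Note that df.assign returns a new DataFrame, so assign the result to a variable if you want to keep it.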

import numpy as np

cond1 = df['letter'].str.contains('x')
cond2 = df['letter'].str.contains('z')
df.assign(map=np.select([cond1, cond2], [1, 2], 0))

output:

    id  letter  map
0   1   x,y     1
1   2   z       2
2   3   a       0
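One caveat, raised in the comments on the question: `str.contains` matches substrings, so a token such as `'xb'` also counts as containing `x`. The sketch below contrasts that with an exact match on the comma-separated tokens. The `'xb'` row and the names `substring_map` and `token_map` are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, plus a hypothetical 'xb' row.
df = pd.DataFrame({'id': [1, 2, 3, 4],
                   'letter': ['x,y', 'z', 'a', 'xb']})

# Substring matching: 'xb' counts as containing 'x'.
contains_x = df['letter'].str.contains('x', regex=False)
contains_z = df['letter'].str.contains('z', regex=False)
substring_map = np.select([contains_x, contains_z], [1, 2], default=0)

# Exact token matching: split on commas and test list membership.
tokens = df['letter'].str.split(',')
has_x = tokens.map(lambda parts: 'x' in parts)
has_z = tokens.map(lambda parts: 'z' in parts)
token_map = np.select([has_x, has_z], [1, 2], default=0)

print(df.assign(substring_map=substring_map, token_map=token_map))
```

Which of the two is right depends on whether `'xb'` should map to 1 or 0. In both cases `np.select` uses the first condition that is true, so a row matching both `x` and `z` gets 1.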
