Mapping missing values in one column of a pandas DataFrame using a dictionary, with reference to another column's values

Question

I have a dataframe like this:

> print(df)
[Out:]
activity-code    activity
-------------------------
0                unknown
99               NaN
84               sports
72;99            NaN
57               recreational
57;99;11         NaN
11               NaN

and a dictionary with activity-codes as keys,

> print(act_dict)
[Out:]
{10: 'unknown',
11: 'cultural',
57: 'recreational',
72: 'social service',
84: 'sports',
99: 'education'}

All the values in the dataframe are stored as strings, including those in activity-code, whereas the dictionary keys are integers. I want to fill the missing values in activity by looking up the codes stored in the activity-code column in the dictionary. The desired output dataframe should look something like this:

> print(df)
[Out:]
activity-code    activity
-------------------------
0                unknown
99               education
84               sports
72;99            social service;education
57               recreational
57;99;11         recreational;education;cultural
11               cultural

This is what I've tried so far,

df['new-activity'] = df['activity-code'].str.split(';').apply(lambda x: ';'.join([act_dict[int(i)] for i in x]))

but I get a KeyError on rows that hold a single activity-code rather than several. The error says KeyError: 0

How do I map the dictionary values onto the missing values in the activity column of the dataframe?

U13-Forward · Accepted Answer · 2019-04-02 04:16:36Z

2

Use str.split and apply; then, inside apply, use a list comprehension and join the results with ';':

df['activity'] = df['activity-code'].str.split(';').apply(lambda x: ';'.join([act_dict[int(i)] for i in x]))

And now:

print(df)

Output:

  activity-code                         activity
0             0                          unknown
1            99                        education
2            84                           sports
3         72;99         social service;education
4            57                     recreational
5      57;99;11  recreational;education;cultural
6            11                         cultural
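The question asks to fill only the missing values, so one possible variation is to apply the same lookup just to rows where activity is NaN. The following is a minimal sketch under that assumption; the frame and dictionary are retyped from the question, and the helper name codes_to_names and the mask name missing are made up for illustration.

```python
import numpy as np
import pandas as pd

# Data retyped from the question; every value is a string, missing ones are NaN.
df = pd.DataFrame({
    "activity-code": ["0", "99", "84", "72;99", "57", "57;99;11", "11"],
    "activity": ["unknown", np.nan, "sports", np.nan,
                 "recreational", np.nan, np.nan],
})

# Dictionary exactly as printed in the question (integer keys, no key 0).
act_dict = {10: "unknown", 11: "cultural", 57: "recreational",
            72: "social service", 84: "sports", 99: "education"}


def codes_to_names(codes):
    """Translate a list of code strings into one ';'-joined label string."""
    return ";".join(act_dict[int(code)] for code in codes)


# Only rows whose activity is missing are looked up, so the code "0" in the
# first row is never passed to the dictionary in this particular frame.
missing = df["activity"].isna()
df.loc[missing, "activity"] = (
    df.loc[missing, "activity-code"].str.split(";").apply(codes_to_names)
)
print(df)
```

Existing labels such as 'unknown' and 'sports' are left untouched. A missing row that contained a code absent from the dictionary would still raise KeyError.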


2 Comments

Aman Singh Over a year ago

Hi @U9-Forward, please see the description in the question; I already tried this approach. I'm getting a KeyError for records that have a single activity-code. This line of code works for records with multiple activity-codes but not for those with only one.

Aman Singh Over a year ago

I just rechecked; it was probably because the dictionary had no key-value pair for '0'. I'm sorry, it was a mistake on my end: I never validated the keys in the dictionary. But thanks. The moment you said it worked for you, I cross-checked and found out I didn't have any key for zero. Thanks for the help :)
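The check described in this comment, validating the codes against the dictionary keys before mapping, could be sketched as follows. The data is retyped from the question; the variable names codes, all_codes and unknown_codes are illustrative only.

```python
import pandas as pd

# The activity-code column from the question, stored as strings.
codes = pd.Series(["0", "99", "84", "72;99", "57", "57;99;11", "11"],
                  name="activity-code")

# Dictionary exactly as printed in the question (integer keys).
act_dict = {10: "unknown", 11: "cultural", 57: "recreational",
            72: "social service", 84: "sports", 99: "education"}

# Split on ';', put each code on its own row, and convert to int so the
# values are comparable with the dictionary keys.
all_codes = codes.str.split(";").explode().astype(int)

# Codes that occur in the column but have no entry in the dictionary.
unknown_codes = sorted(set(all_codes.tolist()) - set(act_dict))
print(unknown_codes)  # [0]
```

An empty list would mean every code can be looked up without a KeyError.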

Loochie · Accepted Answer · 2019-04-02 06:35:44Z

0

Well, in case there is no value for 0 in your dictionary, you can use filter():

df['activity'] = df['activity-code'].apply(lambda x: ';'.join(filter(None, map(act_dict.get, map(int, x.split(';'))))))
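To make the behaviour of the filter() approach easier to see, here is the same idea unpacked into a small function. This is a sketch only: the name lookup and its sep parameter are hypothetical, the dictionary is retyped from the question, and ';' is used as the separator to match the question's desired output.

```python
# Dictionary exactly as printed in the question (integer keys, no key 0).
act_dict = {10: "unknown", 11: "cultural", 57: "recreational",
            72: "social service", 84: "sports", 99: "education"}


def lookup(code_string, sep=";"):
    """Join the labels of all known codes; unknown codes are dropped."""
    # dict.get returns None for a missing key instead of raising KeyError.
    labels = map(act_dict.get, map(int, code_string.split(sep)))
    # filter(None, ...) discards every falsy item, which removes the Nones
    # (and would also remove an empty-string label, if one existed).
    return sep.join(filter(None, labels))


print(lookup("72;99"))    # social service;education
print(lookup("0;11"))     # cultural
print(repr(lookup("0")))  # ''
```

Note that a row whose only code is unknown ends up as an empty string rather than NaN, so the missing key is hidden rather than reported.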
