I have a dataframe like this:

> print(df)
[Out:]
activity-code    activity
-------------------------
0                unknown
99               NaN
84               sports
72;99            NaN
57               recreational
57;99;11         NaN
11               NaN

and a dictionary with activity-codes as keys,

> print(act_dict)
[Out:]
{10: 'unknown',
11: 'cultural',
57: 'recreational',
72: 'social service',
84: 'sports',
99: 'education'}

All values in the dataframe are stored as strings, including the ones in activity-code, whereas the dictionary keys are integers. I want to fill the missing values in activity by looking up the codes from the activity-code column in the dictionary. The desired output dataframe should look something like this:

> print(df)
[Out:]
activity-code    activity
-------------------------
0                unknown
99               education
84               sports
72;99            social service;education
57               recreational
57;99;11         recreational;education;cultural
11               cultural

This is what I've tried so far,

df['new-activity'] = df['activity-code'].str.split(';').apply(lambda x: ';'.join([act_dict[int(i)] for i in x]))

but I'm getting a KeyError for records that contain only a single activity code. The error says KeyError: 0

How do I map the dictionary values to the missing values in the activity column of the dataframe?

2 Answers

Use str.split and apply; then, inside apply, use a list comprehension and join the results with ';':

df['activity'] = df['activity-code'].str.split(';').apply(lambda x: ';'.join([act_dict[int(i)] for i in x]))

And now:

print(df)

Output:

  activity-code                         activity
0             0                          unknown
1            99                        education
2            84                           sports
3         72;99         social service;education
4            57                     recreational
5      57;99;11  recreational;education;cultural
6            11                         cultural

2 Comments

Hi @U9-Forward, please see the description in the question; I already tried this approach. I'm getting a KeyError for records where there's a single activity-code. This line of code works for records with multiple activity-codes, but not where there's only a single activity-code.
I just rechecked, and it was probably because the dictionary had no key-value pair for 0. I'm sorry, it was a mistake on my end; I never validated the keys in the dictionary. The moment you said it worked for you, I cross-checked and found out I didn't have any key for zero. Thanks for the help :)
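A quick way to run the validation mentioned in the comment above is to compare the codes that actually occur in the column with the dictionary keys. This is only a minimal sketch: it rebuilds the sample data from the question, and the names `codes` and `missing_codes` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {"activity-code": ["0", "99", "84", "72;99", "57", "57;99;11", "11"]}
)
act_dict = {
    10: "unknown",
    11: "cultural",
    57: "recreational",
    72: "social service",
    84: "sports",
    99: "education",
}

# Split each cell on ';', put one code per row, and convert to int
codes = df["activity-code"].str.split(";").explode().astype(int)

# Codes that appear in the column but have no key in the dictionary
missing_codes = sorted(set(codes.tolist()) - set(act_dict))
print(missing_codes)
```

With the data as posted, this should report 0 as the only code without a dictionary entry, which matches the KeyError: 0 from the question.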

If your dictionary has no entry for a code such as 0, you can use filter() to drop the failed lookups:

df['activity'] = df['activity-code'].apply(lambda x: '; '.join(filter(None, map(act_dict.get, map(int, x.split(';'))))))
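Note that the filter() line above overwrites the whole activity column, so a row whose code is not in the dictionary ends up as an empty string even if it already had a value. If the goal is to fill only the missing values, as the question describes, something like the following might work. It is a sketch under the assumption that the missing entries are real NaN values; the helper name `codes_to_names` is hypothetical.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; missing activities assumed to be NaN
df = pd.DataFrame({
    "activity-code": ["0", "99", "84", "72;99", "57", "57;99;11", "11"],
    "activity": ["unknown", np.nan, "sports", np.nan, "recreational", np.nan, np.nan],
})
act_dict = {
    10: "unknown",
    11: "cultural",
    57: "recreational",
    72: "social service",
    84: "sports",
    99: "education",
}

def codes_to_names(code_string):
    """Translate '72;99' into 'social service;education', skipping unknown codes."""
    names = (act_dict.get(int(code)) for code in code_string.split(";"))
    return ";".join(name for name in names if name is not None)

# Build the mapped column, then use it only where activity is missing
mapped = df["activity-code"].apply(codes_to_names)
df["activity"] = df["activity"].fillna(mapped)
print(df)
```

Because fillna only touches the NaN rows, the existing 'unknown' for code 0 is kept even though 0 is not a key in the dictionary.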
