How to replace pandas dataframe values based on lookup values in another dataframe?

Question

I have a large pandas dataframe with numerical values structured like this:

I need to replace each of the above cell values with a 'description' that is looked up by field name and cell value in another dataframe structured like this:

>>> df2
  field_name  code description
0          A     1          NO
1          A     2         YES
2          A     3       MAYBE
3          B     1           x
4          B     2           y
5          B     3           z
6          C     1        GOOD
7          C     2         BAD
8          C     3        FINE

The desired output would be like:

>>> df3
     A  B     C
0  YES  x   BAD
1   NO  y  FINE
2  YES  z  GOOD

I could figure out a way to do this on a small scale using something like .map or .replace - however the actual datasets contain thousands of records with hundreds of different combinations to replace. Any help would be really appreciated.

Thanks.

Please copy and paste your dataframes so we can load them with pd.read_clipboard. I think you could also show the expected output (for example, as a dataframe).
– ansev, Commented Apr 15, 2020 at 16:01
Sorry all - I've now pasted the dataframes in and included the desired output - thanks!
– J.James, Commented Apr 15, 2020 at 16:19
Please provide the data in a convenient format. See stackoverflow.com/questions/20109391/….
– AMC, Commented Apr 15, 2020 at 17:43

ansev · Accepted Answer · 2020-04-15 16:21:15Z

1

Use DataFrame.replace with DataFrame.pivot:

df1 = df1.replace(df2.pivot(columns='field_name', index='code', values='description')
                     .to_dict())

You may need to select the relevant columns first:

df1[cols] = df1[cols].replace(df2.pivot(columns='field_name',
                                        index='code', values='description')
                                 .to_dict())

Output

print(df1)
     A  B     C
0  YES  x   BAD
1   NO  y  FINE
2  YES  z  GOOD
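
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the approach above. The question does not show df1, so the frame below is an assumption reconstructed from the desired output; the intermediate name `mapping` is also just for illustration.

```python
import pandas as pd

# Assumed input: df1 is not shown in the question; these codes are
# reconstructed from the desired output and df2.
df1 = pd.DataFrame({"A": [2, 1, 2], "B": [1, 2, 3], "C": [2, 3, 1]})

df2 = pd.DataFrame({
    "field_name": list("AAABBBCCC"),
    "code": [1, 2, 3] * 3,
    "description": ["NO", "YES", "MAYBE", "x", "y", "z", "GOOD", "BAD", "FINE"],
})

# Pivot to one column per field and one row per code, then convert to a
# nested dict of the form {column: {code: description}}.
mapping = df2.pivot(index="code", columns="field_name",
                    values="description").to_dict()

# A nested dict tells replace() which lookup to apply to which column.
df3 = df1.replace(mapping)
print(df3)
```

Building the nested dict once and passing it to a single replace call keeps the lookup per column, so code 1 can mean 'NO' in A and 'x' in B.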

edited Apr 15, 2020 at 16:21

answered Apr 15, 2020 at 16:09

ansev


1 Comment

J.James Over a year ago

Ah looks like this might've worked perfectly! Will try it on the full dataset

Serge Ballesta · Answer · 2020-04-15 16:43:58Z

0

You can stack df1, merge with df2 and pivot the result:

# Long form -> look up each (field_name, code) pair in df2 -> back to wide form
df3 = (df1.stack()
          .reset_index()
          .rename(columns={'level_1': 'field_name', 0: 'code'})
          .merge(df2, how='left', on=['field_name', 'code'])
          .pivot(index='level_0', columns='field_name', values='description')
          .rename_axis(None)
          .rename_axis(None, axis=1))
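
To make the reshaping easier to follow, here is a sketch of the same chain split into named steps. As before, df1 is an assumption reconstructed from the desired output, and the names `long_form` and `merged` are hypothetical.

```python
import pandas as pd

# Assumed input: df1 is not shown in the question.
df1 = pd.DataFrame({"A": [2, 1, 2], "B": [1, 2, 3], "C": [2, 3, 1]})

df2 = pd.DataFrame({
    "field_name": list("AAABBBCCC"),
    "code": [1, 2, 3] * 3,
    "description": ["NO", "YES", "MAYBE", "x", "y", "z", "GOOD", "BAD", "FINE"],
})

# Step 1: wide -> long, one row per (original row, field_name, code).
long_form = (
    df1.stack()
       .reset_index()
       .rename(columns={"level_1": "field_name", 0: "code"})
)

# Step 2: attach the description for each (field_name, code) pair.
merged = long_form.merge(df2, how="left", on=["field_name", "code"])

# Step 3: long -> wide again, then drop the leftover axis names.
df3 = (
    merged.pivot(index="level_0", columns="field_name", values="description")
          .rename_axis(None)
          .rename_axis(None, axis=1)
)
print(df3)
```

One difference from the replace approach: with a left merge, a code that has no entry in df2 ends up as NaN, whereas replace leaves the original code in place.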

answered Apr 15, 2020 at 16:43

Serge Ballesta

1 Comment

ansev Over a year ago

Isn't this clearly slower? I think there are too many changes of format.
