I have a problem to solve with pandas dataframes in Python 3. I have two dataframes. The first one is:

    ID Name  Linked Model 1  Linked Model 2  Linked Model 3
0  100    A          1111.0          1112.0             NaN
1  101    B          1112.0          1113.0          1115.0
2  102    C             NaN             NaN             NaN
3  103    D          1114.0             NaN             NaN
4  104    E          1114.0          1111.0          1112.0

The second one is:

   Model ID Name
0      1111    A
1      1112  A,B
2      1113    C
3      1114    D
4      1115    Q
5      1116    Z
6      1117    E
7      1118    W

So the code should look up each value in, for instance, the Linked Model 1 column, find the matching Model ID in the second dataframe, and replace the ID with the corresponding Name, as shown in the result:

(image of the expected result)

As you can see in the result, None stays as None (it could also be replaced with NumPy NaN), and the Model IDs in the first dataframe are replaced with the corresponding names from the second dataframe.

I am looking forward to hearing your solutions!

Thanks

  • This is a replace problem. Commented Dec 17, 2018 at 15:41
  • The question is good, but attaching the data as a picture does not help anyone reproduce the problem and post an answer. Posting the code that creates the dataframes would really help, since recreating the dataframes before devising a solution is time-consuming. Commented Dec 17, 2018 at 15:53
  • @coldspeed, thanks, this indeed helps a lot, because showing pictures doesn't help at all! Commented Dec 17, 2018 at 16:00
  • Yes, right. Sorry guys for the picture-based data, I appreciate your feedback! Lesson learned. Commented Dec 18, 2018 at 8:47
  • Does this answer your question? Python - replace values in dataframe based on another dataframe match Commented Jul 22, 2020 at 17:18

2 Answers

Initialise a replacement dictionary and use df.replace to map those IDs to Names.

m = df2.set_index('Model ID')['Name'].to_dict()
v = df.filter(like='Linked Model')
df[v.columns] = v.replace(m)

df
    ID Name Linked Model 1 Linked Model 2 Linked Model 3
0  100    A              A            A,B            NaN
1  101    B            A,B              C              Q
2  102    C            NaN            NaN            NaN
3  103    D              D            NaN            NaN
4  104    E              D              A            A,B
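
If `replace` turns out to be slow on a large frame, one alternative that may be worth timing is a per-column `Series.map`. The sketch below is a variant under stated assumptions, not a measured improvement: the names `lookup` and `mapped` are illustrative, and the sample frames are rebuilt from the tables in the question. One behavioural difference to keep in mind: `map` turns any ID that is missing from `df2` into NaN, whereas `replace` leaves such values unchanged.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's tables.
df = pd.DataFrame({
    "ID": [100, 101, 102, 103, 104],
    "Name": ["A", "B", "C", "D", "E"],
    "Linked Model 1": [1111, 1112, np.nan, 1114, 1114],
    "Linked Model 2": [1112, 1113, np.nan, np.nan, 1111],
    "Linked Model 3": [np.nan, 1115, np.nan, np.nan, 1112],
})
df2 = pd.DataFrame({
    "Model ID": [1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118],
    "Name": ["A", "A,B", "C", "D", "Q", "Z", "E", "W"],
})

# Lookup Series: index is the Model ID, values are the names.
lookup = df2.set_index("Model ID")["Name"]

# Map each "Linked Model" column separately; NaN inputs stay NaN.
mapped = df.copy()
for col in df.filter(like="Linked Model").columns:
    mapped[col] = df[col].map(lookup)

print(mapped)
```

Whether this is actually faster depends on the data and the pandas version, so it is worth timing both approaches on a sample of the real frame.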

I wonder whether there is a faster, parallelized way, similar to Stata. On my data (1 million rows x 15 columns) pandas takes more than an hour (I stopped it), while a simple Stata foreach loop takes about 3-4 minutes: foreach var of varlist var1-var15{ vlookup `var', gen(q`var') key(id) value(name)}

This is my first attempt at answering a Python question. While it is certainly longer than coldspeed's answer, using the melt, merge, and pivot functions makes more sense to me.

import pandas as pd
import numpy as np

# make an object from the first dataset

df_1 = pd.DataFrame(
  {"ID" : [100, 101, 102, 103, 104],
  "Name" : ["A", "B", "C", "D", "E"],
  "Linked Model 1" : [1111, 1112, np.nan, 1114, 1114],
  "Linked Model 2" : [1112, 1113, np.nan, np.nan, 1111],
  "Linked Model 3" : [np.nan, 1115, np.nan, np.nan, 1112]})

# make an object for the second data set

df_2 = pd.DataFrame(
  {"Model ID" : [1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118],
  "Name" : ["A", "A,B", "C", "D", "Q", "Z", "E", "W"]})

# tidy the data
df_1 = pd.melt(df_1, ["ID", "Name"]) 

# left join the second data set
df_1 = pd.merge(df_1, df_2, how='left', left_on='value', right_on='Model ID').reset_index()

# pivot the data back out to achieve the desired format
# (Name_y is the name column that the merge brought in from df_2)
df_1 = df_1.pivot_table(index='ID', 
                        columns='variable', 
                        values='Name_y', 
                        aggfunc='first', 
                        dropna=False)

variable Linked Model 1 Linked Model 2 Linked Model 3
ID                                                   
100                   A            A,B            NaN
101                 A,B              C              Q
102                 NaN            NaN            NaN
103                   D            NaN            NaN
104                   D              A            A,B
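
The pivoted result above is indexed by ID and no longer carries the original Name column. Here is a minimal sketch of one way to keep it, assuming pandas 1.1 or later (where `DataFrame.pivot` accepts a list for `index`). The names `names`, `tidy`, `wide`, and the `Model Name` column are illustrative only and not part of the answer above.

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame({
    "ID": [100, 101, 102, 103, 104],
    "Name": ["A", "B", "C", "D", "E"],
    "Linked Model 1": [1111, 1112, np.nan, 1114, 1114],
    "Linked Model 2": [1112, 1113, np.nan, np.nan, 1111],
    "Linked Model 3": [np.nan, 1115, np.nan, np.nan, 1112],
})
df_2 = pd.DataFrame({
    "Model ID": [1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118],
    "Name": ["A", "A,B", "C", "D", "Q", "Z", "E", "W"],
})

# Rename the lookup column to avoid a Name_x / Name_y clash, and cast the key
# to float so it has the same dtype as the NaN-bearing link columns.
names = df_2.rename(columns={"Name": "Model Name"}).astype({"Model ID": float})

# Long format: one row per (ID, Name, link column).
tidy = df_1.melt(id_vars=["ID", "Name"], value_name="Model ID")
tidy = tidy.merge(names, how="left", on="Model ID")

# Pivot back to wide format, keeping both ID and Name as identifier columns.
wide = tidy.pivot(index=["ID", "Name"], columns="variable",
                  values="Model Name").reset_index()
wide.columns.name = None

print(wide)
```

Using `pivot` rather than `pivot_table` avoids aggregation, which is safe here only because each (ID, Name, link column) combination occurs exactly once.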
