I have two DataFrames. The second one holds column names of the first, and I want to replace each of those names with the value found in that column of the first DataFrame on the same date. The data below shows what I mean.

DF1:

             A  B   C   D   E
Date
01/01/2019  1   2   3   4   5
02/01/2019  1   2   3   4   5
03/01/2019  1   2   3   4   5

DF2:

          name1 name2   name3
Date
01/01/2019  A       B       D
02/01/2019  B       C       E
03/01/2019  A       D       E

THE RESULT I WANT:

          name1 name2   name3   
Date
01/01/2019  1       2        4  
02/01/2019  2       3        5  
03/01/2019  1       4        5  

2 Answers

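Try the following. It assumes the dates are in a regular column named index in both DataFrames (for example after calling reset_index() on an unnamed index):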

result = df2.melt(id_vars="index").merge(
    df1.melt(id_vars="index"),
    left_on=["index", "value"],
    right_on=["index", "variable"],
).drop(columns=["value_x", "variable_y"]).pivot(
    index="index", columns="variable_x", values="value_y"
)

print(result)

The two melt calls transform your DataFrames into long format: the values end up in a single column, with an additional column holding the original column names:

df1.melt(id_vars='index')

         index variable  value
0   01/01/2019        A      1
1   02/01/2019        A      1
2   03/01/2019        A      1
3   01/01/2019        B      2
4   02/01/2019        B      2
5   03/01/2019        B      2
...

You can now join the two melted frames on index plus value (from df2) and variable (from df1). The last part just drops a couple of columns and reshapes the table back to the desired form.

The result is

variable_x  name1  name2  name3
index                          
01/01/2019      1      2      4
02/01/2019      2      3      5
03/01/2019      1      4      5
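
For readers whose dates live in an index named Date, as in the question, the same melt/merge/pivot idea can be sketched as below. This is only an illustration under that assumption; the column names letter, number and name passed to melt are made up for readability and are not part of the answer above.

```python
import pandas as pd

dates = ["01/01/2019", "02/01/2019", "03/01/2019"]
df1 = pd.DataFrame(
    {"A": [1, 1, 1], "B": [2, 2, 2], "C": [3, 3, 3], "D": [4, 4, 4], "E": [5, 5, 5]},
    index=pd.Index(dates, name="Date"),
)
df2 = pd.DataFrame(
    {"name1": ["A", "B", "A"], "name2": ["B", "C", "D"], "name3": ["D", "E", "E"]},
    index=pd.Index(dates, name="Date"),
)

# Long format: one row per (Date, column label) pair.
# reset_index() turns the Date index into a column that melt can keep.
long1 = df1.reset_index().melt(id_vars="Date", var_name="letter", value_name="number")
long2 = df2.reset_index().melt(id_vars="Date", var_name="name", value_name="letter")

# Match each (Date, letter) from df2 with its number in df1,
# then reshape back to one column per name.
merged = long2.merge(long1, on=["Date", "letter"], how="left")
result = merged.pivot(index="Date", columns="name", values="number")
result.columns.name = None

print(result)
```

Using how="left" keeps every cell of df2 even when its letter has no match in df1; such cells would come out as NaN rather than being dropped.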

2 Comments

consider adding 1-2 line description explaining your code
Sorry, I was running to a meeting and planned to come back later with the explanation :-)

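Use DataFrame.lookup for each column separately (note that lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so this requires an older version):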

for c in df2.columns:
    df2[c] = df1.lookup(df1.index, df2[c])
print (df2)
            name1  name2  name3
01/01/2019      1      2      4
02/01/2019      2      3      5
03/01/2019      1      4      5
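
On pandas versions without DataFrame.lookup, the same per-column lookup can be approximated with positional NumPy indexing. The sketch below is not the answer's original method; it assumes df1 and df2 have their rows in the same order and that every letter in df2 exists as a column of df1.

```python
import numpy as np
import pandas as pd

dates = ["01/01/2019", "02/01/2019", "03/01/2019"]
df1 = pd.DataFrame(
    {"A": [1, 1, 1], "B": [2, 2, 2], "C": [3, 3, 3], "D": [4, 4, 4], "E": [5, 5, 5]},
    index=dates,
)
df2 = pd.DataFrame(
    {"name1": ["A", "B", "A"], "name2": ["B", "C", "D"], "name3": ["D", "E", "E"]},
    index=dates,
)

values = df1.to_numpy()
rows = np.arange(len(df1))  # row i of df2 is looked up in row i of df1

result = pd.DataFrame(
    {
        # get_indexer maps each letter to its column position in df1
        c: values[rows, df1.columns.get_indexer(df2[c])]
        for c in df2.columns
    },
    index=df2.index,
)

print(result)
```

Be aware that get_indexer returns -1 for a label it cannot find, and NumPy would then silently read the last column, so this short form is only safe when all labels are known to exist.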

A more general solution also handles index and column labels that do not fully match:

print (df1)
            A  B  C  D  G
01/01/2019  1  2  3  4  5
02/01/2019  1  2  3  4  5
05/01/2019  1  2  3  4  5

print (df2)
           name1 name2 name3
01/01/2019     A     B     D
02/01/2019     B     C     E
08/01/2019     A     D     E

df1.index = pd.to_datetime(df1.index, dayfirst=True)
df2.index = pd.to_datetime(df2.index, dayfirst=True)

cols = df2.stack().unique()
idx = df2.index
df11 = df1.reindex(columns=cols, index=idx)
print (df11)
              A    B    D    C   E
2019-01-01  1.0  2.0  4.0  3.0 NaN
2019-01-02  1.0  2.0  4.0  3.0 NaN
2019-01-08  NaN  NaN  NaN  NaN NaN

for c in df2.columns:
    df2[c] = df11.lookup(df11.index, df2[c])
print (df2)
            name1  name2  name3
2019-01-01    1.0    2.0    4.0
2019-01-02    2.0    3.0    NaN
2019-01-08    NaN    NaN    NaN
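
The general case can also be written without DataFrame.lookup. The following is a sketch, not the answer's own code: it looks up row and column positions with get_indexer and fills NaN wherever a date or a letter is missing from df1. The variable names row_pos, col_pos and found are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    [[1, 2, 3, 4, 5]] * 3,
    columns=list("ABCDG"),
    index=pd.to_datetime(["01/01/2019", "02/01/2019", "05/01/2019"], dayfirst=True),
)
df2 = pd.DataFrame(
    {"name1": ["A", "B", "A"], "name2": ["B", "C", "D"], "name3": ["D", "E", "E"]},
    index=pd.to_datetime(["01/01/2019", "02/01/2019", "08/01/2019"], dayfirst=True),
)

values = df1.to_numpy(dtype=float)          # float so that NaN can be stored
row_pos = df1.index.get_indexer(df2.index)  # -1 where a date is not in df1

out = {}
for c in df2.columns:
    col_pos = df1.columns.get_indexer(df2[c])  # -1 where a letter is not in df1
    found = (row_pos >= 0) & (col_pos >= 0)
    picked = np.full(len(df2), np.nan)
    # Only read cells whose date and letter both exist in df1.
    picked[found] = values[row_pos[found], col_pos[found]]
    out[c] = picked

result = pd.DataFrame(out, index=df2.index)
print(result)
```

Because missing positions are masked out before indexing, rows and letters that exist only in df2 produce NaN instead of raising an error or reading the wrong cell.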

1 Comment

I tried both and get this error: Row labels must have same size as column labels
