1

I have the following 2 dataframes with the same structure (Same row names and column names) but different values:

DF1

date A B C D
2005-01-01 5 10 15 20
2005-01-02 1 2 3 4
2005-01-03 5 10 15 20
2005-01-04 5 10 15 20
... ... ... ... ...

DF2

date A B C D
2005-01-01 0,1 0,2 0,3 0,4
2005-01-02 0,2 0,4 0,6 0,8
2005-01-03 0,3 0,6 0,9 1
2005-01-04 0,4 0,5 0,6 0,7
... ... ... ... ...

My goal is to combine them as follows:

date Name DF1_value DF2_value
2005-01-01 A 5 0,1
2005-01-01 B 10 0,2
2005-01-01 C 15 0,3
2005-01-01 D 20 0,4
2005-01-02 A 1 0,2
2005-01-02 B 2 0,4
2005-01-02 C 3 0,6
2005-01-02 D 4 0,8
... ... ... ...

I tried to use melt() function for each dataframe and then concatenate (concat()) them but it creates unnecessary rows for the column "Name".

Any help would be most appreciated!

1 Answer 1

3

Solution with DataFrame.melt for both DataFrames, then use concat only by DF2_value:

df11 = df1.melt('date', var_name='Name', value_name='DF1_value')
df22 = df2.melt('date', var_name='Name', value_name='DF2_value')

df = pd.concat([df11, df22['DF2_value']], axis=1)
print (df)
          date Name  DF1_value DF2_value
0   2005-01-01    A          5       0,1
1   2005-01-02    A          1       0,2
2   2005-01-03    A          5       0,3
3   2005-01-04    A          5       0,4
4   2005-01-01    B         10       0,2
5   2005-01-02    B          2       0,4
6   2005-01-03    B         10       0,6
7   2005-01-04    B         10       0,5
8   2005-01-01    C         15       0,3
9   2005-01-02    C          3       0,6
10  2005-01-03    C         15       0,9
11  2005-01-04    C         15       0,6
12  2005-01-01    D         20       0,4
13  2005-01-02    D          4       0,8
14  2005-01-03    D         20         1
15  2005-01-04    D         20       0,7

Or use DataFrame.set_index with DataFrame.stack for MultiIndex and join together:

df = (pd.concat([df1.set_index('date').stack().rename('DF1_value'),
                df2.set_index('date').stack().rename('DF1_value')], axis=1)
        .rename_axis(['date','Name'])
        .reset_index())
print (df)
          date Name DF1_value DF1_value
0   2005-01-01    A         5       0,1
1   2005-01-01    B        10       0,2
2   2005-01-01    C        15       0,3
3   2005-01-01    D        20       0,4
4   2005-01-02    A         1       0,2
5   2005-01-02    B         2       0,4
6   2005-01-02    C         3       0,6
7   2005-01-02    D         4       0,8
8   2005-01-03    A         5       0,3
9   2005-01-03    B        10       0,6
10  2005-01-03    C        15       0,9
11  2005-01-03    D        20         1
12  2005-01-04    A         5       0,4
13  2005-01-04    B        10       0,5
14  2005-01-04    C        15       0,6
15  2005-01-04    D        20       0,7
Sign up to request clarification or add additional context in comments.

4 Comments

Thank you for the answers. The second method with stack() gives the following error: ValueError: cannot handle a non-unique multi-index!. Any idea why it could be?
@complog - It means there are duplicated datetimes or duplicated columns names. You can test it by print (df1[df1['date'].duplicated(keep=False)]) or df1.columns[df1.columns.duplicated(keep=False)], similar for df2
Thank you for the comment. One last question, what if I want to make only one column (for example only A in the column "Name")? What would you suggest for that? Thanks.
It depends. If need filter before my solution use df1=df1[['date', 'A']].copy() and df2=df2[['date', 'A']].copy(), if need test after use df = df[df['Name'].eq('A')]

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.