0

I have two dataframes that look like this:

df1
Date    ScoreMean 06 Jul
0   2020-07-06  6.818316
1   2020-07-07  7.644018
2   2020-07-08  9.076809
3   2020-07-09  8.409158
4   2020-07-10  6.788774
5   2020-07-11  7.302806
6   2020-07-12  7.051699
7   2020-07-13  7.485968
8   2020-07-14  7.759232
9   2020-07-15  7.908419
10  2020-07-16  8.316100
11  2020-07-17  8.830133
12  2020-07-18  8.310192
13  2020-07-19  8.135894

df2
Date    ScoreMean 08 Jul
0   2020-07-08  7.844904
1   2020-07-09  6.565731
2   2020-07-10  6.791728
3   2020-07-11  6.706056
4   2020-07-12  6.545052
5   2020-07-13  7.163959
6   2020-07-14  8.199409
7   2020-07-15  8.115214
8   2020-07-16  8.234860
9   2020-07-17  8.370753
10  2020-07-18  8.036928
11  2020-07-19  7.927622
12  2020-07-20  7.890694
13  2020-07-21  7.989660

I tried pd.merge(df1, df2, on="Date"), but this produces:

    Date    ScoreMean 06 Jul    ScoreMean 08 Jul
0   2020-07-08  9.076809    7.844904
1   2020-07-09  8.409158    6.565731
2   2020-07-10  6.788774    6.791728
3   2020-07-11  7.302806    6.706056
4   2020-07-12  7.051699    6.545052
5   2020-07-13  7.485968    7.163959
6   2020-07-14  7.759232    8.199409
7   2020-07-15  7.908419    8.115214
8   2020-07-16  8.316100    8.234860
9   2020-07-17  8.830133    8.370753
10  2020-07-18  8.310192    8.036928
11  2020-07-19  8.135894    7.927622

However, what I want is for the tail ends of the dates to show up in the new dataframe along with the associated data. For example, in this case the new dataframe should contain 2020-07-06 and 2020-07-07 at the top, and 2020-07-20 and 2020-07-21 at the bottom. In days where the dates do not overlap, I would like the value to be NaN or zero. How would I accomplish this using pandas?

0

2 Answers 2

1

use the how argument of merge set to "outer":

pd.merge(df1, df2, on="Date", how="outer")

To understand the how arugment, read up about types of SQL joins.

Sign up to request clarification or add additional context in comments.

Comments

1

Here you go:

df1.merge(df2, on='Date', how='outer')                                  

Output:

          Date  ScoreMean 06 Jul  ScoreMean 08 Jul
0   2020-07-06               6.8               NaN
1   2020-07-07               7.6               NaN
2   2020-07-08               9.1               7.8
3   2020-07-09               8.4               6.6
4   2020-07-10               6.8               6.8
5   2020-07-11               7.3               6.7
6   2020-07-12               7.1               6.5
7   2020-07-13               7.5               7.2
8   2020-07-14               7.8               8.2
9   2020-07-15               7.9               8.1
10  2020-07-16               8.3               8.2
11  2020-07-17               8.8               8.4
12  2020-07-18               8.3               8.0
13  2020-07-19               8.1               7.9
14  2020-07-20               NaN               7.9
15  2020-07-21               NaN               8.0

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.