I have data in the following format:
df1:
| date1 | animal | animal cages sold |
|---|---|---|
| 1/1/19 10:00:00 | dog | 3 |
| 1/1/19 11:00:00 | horse | 6 |
| 1/5/19 11:00:00 | ferret | 5 |
| 1/12/19 10:00:00 | bird | 2 |
| 1/12/19 11:00:00 | hamster | 3 |
and I want to merge it with the following dataframe, df2:
| event date | event type | people attended |
|---|---|---|
| 1/1/19 | charity | 7 |
| 1/4/19 | food drive | 10 |
| 1/12/19 | raffle | 15 |
with the desired output:
(the dates from df2 can also have 1/1/19 00:00:00 format, it doesn't matter. But the dates from df1 MUST have the time)
| date | animal | animal cages sold | event type | people attended |
|---|---|---|---|---|
| 1/1/19 | | | charity | 7 |
| 1/1/19 10:00:00 | dog | 3 | | |
| 1/1/19 11:00:00 | horse | 6 | | |
| 1/4/19 | | | food drive | 10 |
| 1/5/19 11:00:00 | ferret | 5 | | |
| 1/12/19 | | | raffle | 15 |
| 1/12/19 10:00:00 | bird | 2 | | |
| 1/12/19 11:00:00 | hamster | 3 | | |
I have tried output_df = pd.merge(df1, df2, left_on='date1', right_on='event date') but that leads to repeated matches. I only need the rows from df2 to be there once, and be on their own separate row.
I was thinking maybe use df1.append(df2) and then somehow make date1 and event date in the same column so I could then sort by that column.
Please please help!!!
Maybe pd.concat([df1, df2.rename(columns={'event date': 'date1'})]) and then sort the result by date1?
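To make the concat idea concrete, here is a minimal sketch built from the sample data in the question. It assumes the dates are month/day/year strings (the question does not say), and the shared column name `date` is taken from the desired output. Each frame is parsed with its own format so the two can be sorted together as real timestamps; the df2 dates then display as midnight, which the question says is acceptable.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "date1": ["1/1/19 10:00:00", "1/1/19 11:00:00", "1/5/19 11:00:00",
              "1/12/19 10:00:00", "1/12/19 11:00:00"],
    "animal": ["dog", "horse", "ferret", "bird", "hamster"],
    "animal cages sold": [3, 6, 5, 2, 3],
})
df2 = pd.DataFrame({
    "event date": ["1/1/19", "1/4/19", "1/12/19"],
    "event type": ["charity", "food drive", "raffle"],
    "people attended": [7, 10, 15],
})

# Assumption: month/day/year order. Parse each frame with its own format
# so that sorting is chronological rather than alphabetical.
df1["date1"] = pd.to_datetime(df1["date1"], format="%m/%d/%y %H:%M:%S")
df2["event date"] = pd.to_datetime(df2["event date"], format="%m/%d/%y")

# Give both frames the same date column, stack them, then sort.
# Columns missing from one frame are filled with NaN in its rows.
output_df = (
    pd.concat([
        df1.rename(columns={"date1": "date"}),
        df2.rename(columns={"event date": "date"}),
    ])
    .sort_values("date", kind="stable")
    .reset_index(drop=True)
)

print(output_df)
```

Because the df2 rows are stacked rather than joined, each event appears exactly once on its own row, and it sorts ahead of the same-day df1 rows since its time is 00:00:00.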