I have the following two dataframes that I want to merge.
df1:
id time station
0 a 22.08.2017 12:00:00 A1
1 b 22.08.2017 12:00:00 A3
2 a 22.08.2017 13:00:00 A2
...
pivot:
station A1 A2 A3
0 time
1 22.08.2017 12:00:00 10 12 11
2 22.08.2017 13:00:00 9 7 3
3 22.08.2017 14:00:00 2 3 4
4 22.08.2017 15:00:00 3 2 7
...
The result should look like this:
merge:
id time station value
0 a 22.08.2017 12:00:00 A1 10
1 b 22.08.2017 12:00:00 A3 11
2 a 22.08.2017 13:00:00 A2 7
...
I want to add a column to df1 that holds the matching value from the pivot table, i.e. the cell at the row's time and the column named by its station. I could not work out how to use the pivot's column labels as a merge key. I tried something like the following, but it does not work:
merge = pd.merge(df1, pivot, how="left", left_on=["time", "station"], right_on=["station", pivot.columns])
Any help?
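For reference, one way this kind of lookup is often done is to reshape the wide table into long form (one row per time/station pair) and then do an ordinary merge on both keys. The following is only a minimal sketch using the sample rows above; it assumes pivot has time as its index and one column per station, and the names long_df and merged are made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question.
df1 = pd.DataFrame({
    "id": ["a", "b", "a"],
    "time": ["22.08.2017 12:00:00", "22.08.2017 12:00:00", "22.08.2017 13:00:00"],
    "station": ["A1", "A3", "A2"],
})

# Assumed layout of the pivot table: index = time, columns = stations.
pivot = pd.DataFrame(
    {"A1": [10, 9, 2, 3], "A2": [12, 7, 3, 2], "A3": [11, 3, 4, 7]},
    index=pd.Index(
        ["22.08.2017 12:00:00", "22.08.2017 13:00:00",
         "22.08.2017 14:00:00", "22.08.2017 15:00:00"],
        name="time",
    ),
)

# Wide -> long: every (time, station) cell becomes its own row.
long_df = pivot.reset_index().melt(
    id_vars="time", var_name="station", value_name="value"
)

# Ordinary left merge on both keys.
merged = df1.merge(long_df, how="left", on=["time", "station"])
print(merged)
```

With the real data the long table has one row per timestamp and station, so it is much larger than the pivot itself.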
EDIT:
As advised, instead of the pivot table I tried to use the following data:
df2:
time station value
22.08.2017 12:00:00 A1 10
22.08.2017 12:00:00 A2 12
22.08.2017 12:00:00 A3 11
...
22.08.2017 13:00:00 A1 9
22.08.2017 13:00:00 A2 7
22.08.2017 13:00:00 A3 3
The table contains about 1300 different stations for every timestamp. All in all it has more than 115,000,000 rows. My df1 has 5,000,000 rows.
I then tried to merge df1.head(100) with df2, but all values in the result are NaN. This is the call I used:
merge = pd.merge(df1.head(100), df2, how="left", on=["time", "station"])
Another problem is that this merge already takes a few minutes, so I expect that merging the whole df1 would take several days.
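A possible reason for an all-NaN result is that the key columns do not compare equal between the two frames, for example because of stray whitespace or differently formatted time strings. The sketch below is only an assumption about what might be going on, not a diagnosis of the real data: it normalizes both keys the same way and uses the indicator option of merge to show which rows found a match. The helper normalize_keys and the trailing space in "A3 " are invented for illustration.

```python
import pandas as pd

# Small samples based on the question; the trailing space in "A3 " is a
# deliberately introduced mismatch (hypothetical).
df1 = pd.DataFrame({
    "id": ["a", "b", "a"],
    "time": ["22.08.2017 12:00:00", "22.08.2017 12:00:00", "22.08.2017 13:00:00"],
    "station": ["A1", "A3 ", "A2"],
})
df2 = pd.DataFrame({
    "time": ["22.08.2017 12:00:00"] * 3 + ["22.08.2017 13:00:00"] * 3,
    "station": ["A1", "A2", "A3"] * 2,
    "value": [10, 12, 11, 9, 7, 3],
})

def normalize_keys(df):
    """Return a copy with both merge keys in one canonical form."""
    out = df.copy()
    # Parse the day-first timestamps explicitly instead of comparing strings.
    out["time"] = pd.to_datetime(out["time"], format="%d.%m.%Y %H:%M:%S")
    # Remove leading/trailing whitespace from the station labels.
    out["station"] = out["station"].str.strip()
    return out

left = normalize_keys(df1)
right = normalize_keys(df2)

# indicator=True adds a "_merge" column: "both" means the key was found in df2.
merged = left.merge(right, how="left", on=["time", "station"], indicator=True)
print(merged["_merge"].value_counts())
```

Printing df1.dtypes and df2.dtypes before merging is a quick way to see whether the key columns even have the same type. As for the runtime, the few minutes measured for head(100) are probably spent mostly on processing the 115 million rows of df2, so the time for the full df1 may not grow in proportion to its row count.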
Do you have sample data to recreate this? I'm wondering if there is a better/easier way to pivot this.