Given three dataframes df1, df2 and df3, how can I join them so that each df3 timestamp falls between the start and end times in df1 or df2?

I have to merge the JobID into df3 based on whether the df3 'Timestamp' falls between the 'Start Time' and 'End Time' of a row in df1 or df2, and the Node (No.) must match as well.

df1(1230rows*3 columns)

Node      Start Time      End Time      JobID
A         00:03:50        00:05:45      12345
A         00:06:10        00:07:39      56789
A         00:08:30        00:10:45      34567
.
.
.

df2(1130rows*3 columns)

Node      Start Time      End Time      JobID
B         00:02:30        00:07:35      13579
B         00:08:56        00:09:39      24680
B         00:10:32        00:13:47      14680
.
.
.

df3(4002rows*3 columns)

Node      Timestamp     
A         00:05:42       
A         00:09:50       
A         00:11:27       
B         00:04:48
B         00:09:59
B         00:10:32
.
.
.
.

Expected Output: df3(4002rows*3 columns)

No.       Timestamp       Job ID
A         00:05:42        12345              
A         00:09:50        34567       
A         00:11:27        NaN
B         00:04:48        13579
B         00:09:59        NaN
B         00:10:32        14680
.
.
.
.

2 Answers

You can use .merge() and filter with .between(), as follows:

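import pandas as pd

# Pair every job with every timestamp on the same node, then keep the in-range rows
df1_3 = df1.merge(df3, on='Node')
df1_3_filtered = df1_3[df1_3['Timestamp'].between(df1_3['Start Time'], df1_3['End Time'])]

df2_3 = df2.merge(df3, on='Node')
df2_3_filtered = df2_3[df2_3['Timestamp'].between(df2_3['Start Time'], df2_3['End Time'])]

# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
df_out = pd.concat([df1_3_filtered, df2_3_filtered])[['Node', 'JobID', 'Timestamp']]
# The left merge keeps every df3 row; timestamps outside all job windows get NaN
df_out = df3.merge(df_out, how='left')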

Result:

print(df_out)


  Node Timestamp    JobID
0    A  00:05:42  12345.0
1    A  00:09:50  34567.0
2    A  00:11:27      NaN
3    B  00:04:48  13579.0
4    B  00:09:59      NaN
5    B  00:10:32  14680.0
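
One caveat: .between() here compares the times as plain strings, which only orders them correctly while every value is zero-padded HH:MM:SS (for example, '9:05:00' sorts after '10:00:00' as a string). If that is not guaranteed for your data, a minimal sketch of the same merge-and-filter idea on timedeltas could look like the following. The sample frames are rebuilt from the question, and the names as_timedelta, jobs, stamps, hits and df_td are only illustrative:

```python
import pandas as pd

# Sample data copied from the question (first three rows of each frame)
df1 = pd.DataFrame({'Node': ['A', 'A', 'A'],
                    'Start Time': ['00:03:50', '00:06:10', '00:08:30'],
                    'End Time': ['00:05:45', '00:07:39', '00:10:45'],
                    'JobID': [12345, 56789, 34567]})
df2 = pd.DataFrame({'Node': ['B', 'B', 'B'],
                    'Start Time': ['00:02:30', '00:08:56', '00:10:32'],
                    'End Time': ['00:07:35', '00:09:39', '00:13:47'],
                    'JobID': [13579, 24680, 14680]})
df3 = pd.DataFrame({'Node': ['A', 'A', 'A', 'B', 'B', 'B'],
                    'Timestamp': ['00:05:42', '00:09:50', '00:11:27',
                                  '00:04:48', '00:09:59', '00:10:32']})


def as_timedelta(df, cols):
    """Return a copy of df with the given HH:MM:SS string columns as timedeltas."""
    out = df.copy()
    for col in cols:
        out[col] = pd.to_timedelta(out[col])
    return out


jobs = as_timedelta(pd.concat([df1, df2], ignore_index=True),
                    ['Start Time', 'End Time'])
stamps = as_timedelta(df3, ['Timestamp'])

# Same idea as above: pair rows per node, then keep timestamps inside a job window
merged = jobs.merge(stamps, on='Node')
inside = merged['Timestamp'].between(merged['Start Time'], merged['End Time'])
hits = merged.loc[inside, ['Node', 'Timestamp', 'JobID']]

# Left merge so that timestamps without a matching job keep a NaN JobID
df_td = stamps.merge(hits, on=['Node', 'Timestamp'], how='left')
```

Note that the Timestamp column in df_td holds timedeltas rather than the original strings.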

Edit

If you have multiple dataframes with the same structure as df1 and df2 and want to merge them with df3, put them all into the list List_dfs below:

List_dfs = [df1, df2]              # put all your dataframes of same structure here

Then run the code below. The merged and filtered results of all these dataframes end up in df_out:

import pandas as pd

filtered_parts = []                # collects the filtered result for each dataframe
for df in List_dfs:
    dfx_3 = df.merge(df3, on='Node')
    dfx_3_filtered = dfx_3[dfx_3['Timestamp'].between(dfx_3['Start Time'], dfx_3['End Time'])]
    filtered_parts.append(dfx_3_filtered)

# DataFrame.append was removed in pandas 2.0, so combine the parts with pd.concat
df_all_filtered = pd.concat(filtered_parts)
df_out = df_all_filtered[['Node', 'JobID', 'Timestamp']]
df_out = df3.merge(df_out, how='left')
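
A note on size: merging on 'Node' alone builds every job/timestamp pair per node before filtering, which can get large with many rows. A possible alternative, sketched here under the assumption that job windows on the same node never overlap, is pd.merge_asof: it picks the latest job start at or before each timestamp, and the end time is checked afterwards. This is a different technique from the merge-and-filter above, and the helper to_time and the columns row, ts, start and end are names made up for this sketch:

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({'Node': ['A', 'A', 'A'],
                    'Start Time': ['00:03:50', '00:06:10', '00:08:30'],
                    'End Time': ['00:05:45', '00:07:39', '00:10:45'],
                    'JobID': [12345, 56789, 34567]})
df2 = pd.DataFrame({'Node': ['B', 'B', 'B'],
                    'Start Time': ['00:02:30', '00:08:56', '00:10:32'],
                    'End Time': ['00:07:35', '00:09:39', '00:13:47'],
                    'JobID': [13579, 24680, 14680]})
df3 = pd.DataFrame({'Node': ['A', 'A', 'A', 'B', 'B', 'B'],
                    'Timestamp': ['00:05:42', '00:09:50', '00:11:27',
                                  '00:04:48', '00:09:59', '00:10:32']})
List_dfs = [df1, df2]


def to_time(series):
    """Parse HH:MM:SS strings into datetimes on a dummy date (hypothetical helper)."""
    return pd.to_datetime(series, format='%H:%M:%S')


# All job windows in one frame, sorted by start time as merge_asof requires
jobs = pd.concat(List_dfs, ignore_index=True)
jobs['start'] = to_time(jobs['Start Time'])
jobs['end'] = to_time(jobs['End Time'])
jobs = jobs.sort_values('start')

# Remember the original row order of df3, then sort by the parsed timestamp
left = df3.reset_index().rename(columns={'index': 'row'})
left['ts'] = to_time(left['Timestamp'])
left = left.sort_values('ts')

# For each timestamp, take the latest job on the same node that started at or before it
matched = pd.merge_asof(left, jobs[['Node', 'start', 'end', 'JobID']],
                        left_on='ts', right_on='start', by='Node')

# Drop the JobID again when the timestamp lies after that job's end time
matched['JobID'] = matched['JobID'].where(matched['ts'] <= matched['end'])

# Restore the original df3 order
df_asof = (matched.sort_values('row')[['Node', 'Timestamp', 'JobID']]
                  .reset_index(drop=True))
```

If windows on one node can overlap, this only ever considers the most recently started job, so the merge-and-filter version above is the safer choice in that case.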

1 Comment

Thanks a lot, this works for me. One more question: if I have more dataframes similar to df1 and df2 and still have to merge them into df5 based on time and node, what do I have to add?

Another method would be to resample your shift data (each job's start/end range) into one row per second, then merge on the resampled data.

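import pandas as pd

def resample_shifts(dataframe: pd.DataFrame, indices: list,
                    start_col: str, end_col: str) -> pd.DataFrame:
    # One row per second between start and end, keyed by the index columns
    out = (dataframe.set_index(indices)
                    .apply(lambda x: pd.date_range(x[start_col], x[end_col], freq='s'),
                           axis=1)
                    .explode().rename('Timestamp').reset_index())
    # Make sure the key is datetime64 so it matches df3['Timestamp'] in the merge
    out['Timestamp'] = pd.to_datetime(out['Timestamp'])
    return out

# Column names follow the question's data ('Start Time', 'End Time')
df1a = resample_shifts(df1, ['Node', 'JobID'], 'Start Time', 'End Time')
df2a = resample_shifts(df2, ['Node', 'JobID'], 'Start Time', 'End Time')

df3['Timestamp'] = pd.to_datetime(df3['Timestamp'])

df3a = pd.merge(pd.concat([df1a, df2a]), df3, on=['Node', 'Timestamp'], how='right')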

print(df3a)
  Node    JobID           Timestamp
0    A  12345.0 2021-06-28 00:05:42
1    A  34567.0 2021-06-28 00:09:50
2    A      NaN 2021-06-28 00:11:27
3    B  13579.0 2021-06-28 00:04:48
4    B      NaN 2021-06-28 00:09:59
5    B  14680.0 2021-06-28 00:10:32
