I want to select the rows of df1 whose Timestamp falls between StartTime and EndTime in df2, handling each row of df2 separately (matched on the Group_Id column). The resulting slices should then be concatenated, since they all share the same format.

So this is df1:

      Timestamp           Group_Id      Data
2013-10-20 20:00:05.143    11           14
2013-10-21 00:05:10.377    11           15
2013-10-22 14:22:15.501    11           19
                   ...
2019-02-05 00:00:05.743    101          21
2019-02-10 00:00:10.407    101          33

and df2:

EndTime          StartTime             Group_Id
27/10/13 16:08   20/10/13 16:08          11
03/12/16 16:11   26/11/16 16:11          2
24/10/14 12:08   17/10/14 12:08          11
04/07/17 08:00   27/06/17 08:00          100
03/04/13 14:10   27/03/13 14:10          26
15/11/18 17:00   08/11/18 17:00          46
11/02/19 00:20   04/02/19 00:20          101

Step 1: Start from the first row of df2, where Group_Id is 11.

Step 2: Copy the rows of df1 with Group_Id == 11 whose Timestamp lies between StartTime and EndTime.

Step 3: Repeat for every row of df2 and concatenate all the sliced subsets.

The final dataset df3 should look like this:

Group_Id EndTime         StartTime      Timestamp                 Data
11       27/10/13 16:08  20/10/13 16:08 2013-10-20 20:00:05.143   14
11       27/10/13 16:08  20/10/13 16:08 2013-10-21 00:05:10.377   15
11       27/10/13 16:08  20/10/13 16:08 2013-10-22 14:22:15.501   19
                             ...
101      11/02/19 00:20  04/02/19 00:20 2019-02-05 00:00:05.743   21
101      11/02/19 00:20  04/02/19 00:20 2019-02-10 00:00:10.407   33
                             ...

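A rough attempt at the loop:

import datetime as dt
import pandas as pd

# Assumes Timestamp and EndTime have already been parsed to datetime.
slices = []
for row in df2.itertuples(index=False):
    same_group = df1['Group_Id'] == row.Group_Id
    in_window = (df1['Timestamp'] <= row.EndTime) & (df1['Timestamp'] > row.EndTime - dt.timedelta(days=7))
    slices.append(df1[same_group & in_window])
df3 = pd.concat(slices)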

Hope this helps to better illustrate the problem.

  • df1.Timestamp 2013-10-20 00:00:05.143 is outside of 27/10/13 16:08 20/10/13 16:08. Why is it in the output? Commented Oct 24, 2019 at 2:37
  • @Andy L. thanks that's a typo I just fixed it Commented Oct 24, 2019 at 2:41

2 Answers

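Convert df1.Timestamp to datetime, then merge df2 with df1 on Group_Id. Build an IntervalIndex from the StartTime and EndTime columns of the merged frame df3. Finally, use a list comprehension to create a boolean mask m that checks each Timestamp against its own row's interval, and use m to slice df3.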

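import pandas as pd

# Parse the timestamps so they can be compared with the interval bounds
df1.Timestamp = pd.to_datetime(df1.Timestamp)
# Pair every df2 interval with the df1 rows of the same Group_Id
df3 = df2.merge(df1, on='Group_Id')
# One closed interval [StartTime, EndTime] per merged row (day-first dates)
iix = pd.IntervalIndex.from_tuples([*df3[['StartTime','EndTime']].apply(pd.to_datetime, dayfirst=True).to_records(index=False)],
                                   closed='both')
# True where a row's Timestamp lies inside that row's own interval
m = [x in iix[i] for i, x in enumerate(df3.Timestamp)]

df3.loc[m]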

Out[494]:
          EndTime       StartTime  Group_Id               Timestamp  Data
0  27/10/13 16:08  20/10/13 16:08        11 2013-10-20 20:00:05.143    14
1  27/10/13 16:08  20/10/13 16:08        11 2013-10-21 00:05:10.377    15
2  27/10/13 16:08  20/10/13 16:08        11 2013-10-22 14:22:15.501    19
6  11/02/19 00:20  04/02/19 00:20       101 2019-02-05 00:00:05.743    21
7  11/02/19 00:20  04/02/19 00:20       101 2019-02-10 00:00:10.407    33
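
The list comprehension above checks one row at a time. As a possible alternative, the same filter can be sketched with a vectorized Series.between call on the merged frame. The frames below are rebuilt from a few rows of the question's sample data, and the explicit format string "%d/%m/%y %H:%M" is an assumption about how StartTime and EndTime are written. Treat this as a sketch under those assumptions, not a drop-in replacement for a real dataset.

```python
import pandas as pd

# A few rows copied from the question's sample data (not the full dataset).
df1 = pd.DataFrame({
    "Timestamp": ["2013-10-20 20:00:05.143", "2013-10-21 00:05:10.377",
                  "2013-10-22 14:22:15.501", "2019-02-05 00:00:05.743",
                  "2019-02-10 00:00:10.407"],
    "Group_Id": [11, 11, 11, 101, 101],
    "Data": [14, 15, 19, 21, 33],
})
df2 = pd.DataFrame({
    "EndTime": ["27/10/13 16:08", "24/10/14 12:08", "11/02/19 00:20"],
    "StartTime": ["20/10/13 16:08", "17/10/14 12:08", "04/02/19 00:20"],
    "Group_Id": [11, 11, 101],
})

# Parse every time column up front; the format string is an assumption.
df1["Timestamp"] = pd.to_datetime(df1["Timestamp"])
df2["StartTime"] = pd.to_datetime(df2["StartTime"], format="%d/%m/%y %H:%M")
df2["EndTime"] = pd.to_datetime(df2["EndTime"], format="%d/%m/%y %H:%M")

# Group_Id is not unique in df2, so the merge yields one row per
# (interval, df1 row) pair within the same group.
merged = df2.merge(df1, on="Group_Id")

# Keep only the pairs whose Timestamp lies in [StartTime, EndTime].
mask = merged["Timestamp"].between(merged["StartTime"], merged["EndTime"])
df3 = merged.loc[mask].reset_index(drop=True)
print(df3)
```

Because each merged row carries its own StartTime and EndTime, a repeated Group_Id in df2 only produces extra candidate rows, and the mask discards the ones that fall outside their interval.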

4 Comments

thank you but I don't know why the output is empty with only headings
@nilsinelabore: there is something unusual in your real dataset that is not in the sample data you provided. Try running the commands above one line at a time and check the result of each to see where it fails on your real dataset.
thanks I can run it now but it seems Timestamp is not filtered by the StartTime and EndTime
@nilsinelabore: after creating iix on your real dataset, check it to see whether it is dtype IntervalIndex with values from df3.StartTime, df3.EndTime (note: df3 is the result from merge) and check df1.Timestamp is dtype datetime
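
The checks suggested in these comments can be sketched roughly as follows. The one-row frames are a hypothetical stand-in for a real dataset whose time columns were read as text, the format string is an assumption, and IntervalIndex.from_arrays is used here in place of from_tuples purely for brevity.

```python
import pandas as pd

# Hypothetical one-row stand-ins, using values from the question's sample data.
df1 = pd.DataFrame({"Timestamp": ["2013-10-20 20:00:05.143"],
                    "Group_Id": [11], "Data": [14]})
df2 = pd.DataFrame({"EndTime": ["27/10/13 16:08"],
                    "StartTime": ["20/10/13 16:08"], "Group_Id": [11]})

# Check 1: is Timestamp a datetime column? Strings cannot be tested
# against datetime intervals, so this should be True before filtering.
before = pd.api.types.is_datetime64_any_dtype(df1["Timestamp"])
df1["Timestamp"] = pd.to_datetime(df1["Timestamp"])
after = pd.api.types.is_datetime64_any_dtype(df1["Timestamp"])

# Check 2: is iix an IntervalIndex with one interval per merged row?
df3 = df2.merge(df1, on="Group_Id")
start = pd.to_datetime(df3["StartTime"], format="%d/%m/%y %H:%M")
end = pd.to_datetime(df3["EndTime"], format="%d/%m/%y %H:%M")
iix = pd.IntervalIndex.from_arrays(start, end, closed="both")

print("Timestamp is datetime (before, after):", before, after)
print("iix type:", type(iix).__name__, "| matches df3 length:", len(iix) == len(df3))
print("every StartTime <= EndTime:", bool((start <= end).all()))
```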
You should be able to accomplish this with a merge based on your example.

df1.merge(df2,on='Group_Id',how='left')

1 Comment

thanks I don't think it'll work as Group_Id is not unique..
