
I am trying to find the min() value between two points in time using between_time. I have created two columns to use as the start and end times, and I would like to find the minimum value between them and add the result to a new column:

This is a snippet of the df:

df[['Date_Time','d1_idx_last','Low']]



                                   Date_Time    d1_idx_last              Low
Timestamp           
2015-09-01 09:30:00.000 2015-09-01 09:30:00.000 2015-09-01 16:14:51.000 1887.750
2015-09-01 09:30:01.000 2015-09-01 09:30:01.000 2015-09-01 16:14:51.000 1888.250
2015-09-01 09:30:01.200 2015-09-01 09:30:01.200 2015-09-01 16:14:51.000 1888.000
2015-09-01 09:30:10.100 2015-09-01 09:30:10.100 2015-09-01 16:14:51.000 1889.250
2015-09-01 09:30:11.100 2015-09-01 09:30:11.100 2015-09-01 16:14:51.000 1889.500

I am trying to use this code:

df.Low.between_time(df.Date_Time, df.d1_idx_last, include_start=True, include_end=True)

and get this error:

Cannot convert arg [1441099800000000000 1441099801000000000 1441099801200000000 ...,
 1470924200100000000 1470924369230000000 1470924793157000000] to a time

The columns 'Date_Time' & 'd1_idx_last' are both type pandas.tslib.Timestamp.

Update to clarify:

So if we look at the first row it shows

'Date_Time' 2015-09-01 09:30:00.000 'd1_idx_last' 2015-09-01 16:14:51.000

On this row the time between 'Date_Time' & 'd1_idx_last' captures a full trading day (09:30-16:15) and I want the low of the time between these two points.

On this day the market went as low as 1863.500 so that would be the min value for (09:30-16:15).

df[['Low']]['2015-09-01'].min()

Low   1863.500
dtype: float64

If the low of 1863.500 came at 13:00 the rolling low would be higher after this point.

I want a new column, df['subset_low'], that for each row takes 'Date_Time' and 'd1_idx_last', finds the low within that period, and stores it. In other words, it looks from the current time to the last point of the day and shows what the low will be over that span.

Another example for @Maxu using fake data in the Low and (desired) subset_low columns: [image not included]

  • You need to pass between_time scalar values - not a Series object... Commented Aug 14, 2016 at 17:16
  • Would I do this by using apply to implement on a per row basis? Can anyone help fix my code please? Commented Aug 14, 2016 at 19:11

1 Answer


UPDATE: using an ugly method, apply(..., axis=1), which runs one query per row:

In [170]: df['subset_low'] = df.apply(lambda r: df.query('@r.Date_Time <= index <= @r.d1_idx_last').Low.min(), axis=1)

In [171]: df
Out[171]:
                                      Date_Time         d1_idx_last  Low  subset_low
idx
2015-09-01 09:30:00.000 2015-09-01 09:30:00.000 2015-09-01 16:14:51    2           1
2015-09-01 09:30:01.000 2015-09-01 09:30:01.000 2015-09-01 16:14:51    1           1
2015-09-01 09:30:01.200 2015-09-01 09:30:01.200 2015-09-01 16:14:51    3           3
2015-09-01 09:30:10.100 2015-09-01 09:30:10.100 2015-09-01 16:14:51    4           3
2015-09-01 09:30:11.100 2015-09-01 09:30:11.100 2015-09-01 16:14:51    3           3
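
If d1_idx_last is always the last timestamp of the same calendar day and the index is sorted (both are assumptions based on the sample shown, not something the question confirms), then the low from the current row to the end of the day is just a reverse cumulative minimum within each day. A minimal sketch using the fake Low values from the output above; it avoids running one query per row, which may matter on a frame with ~150k rows:

```python
import pandas as pd

# Rebuild the small example frame (fake Low values, as in the output above)
idx = pd.to_datetime([
    "2015-09-01 09:30:00.000",
    "2015-09-01 09:30:01.000",
    "2015-09-01 09:30:01.200",
    "2015-09-01 09:30:10.100",
    "2015-09-01 09:30:11.100",
])
df = pd.DataFrame({"Low": [2, 1, 3, 4, 3]}, index=idx)
df["Date_Time"] = df.index
df["d1_idx_last"] = pd.Timestamp("2015-09-01 16:14:51")

# Assumption: the index is sorted and each row's window ends at the
# last row of its own calendar day.
day = df.index.normalize()

# Within each day: reverse the rows, take the running minimum,
# then restore the original order.
df["subset_low"] = df.groupby(day)["Low"].transform(
    lambda s: s.iloc[::-1].cummin().iloc[::-1]
)
print(df[["Low", "subset_low"]])
```

If the end of the window can differ from the end of the day, this shortcut no longer applies and the row-wise version is the safer choice.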

OLD answer:

As @JonClements already said, the between_time() method expects scalar values for its first two arguments (start_time, end_time), and it compares only the time-of-day part, ignoring the date.

Demo:

In [72]: df.between_time('09:30:10','09:30:15')
Out[72]:
                                      Date_Time         d1_idx_last      Low
idx
2015-09-01 09:30:10.100 2015-09-01 09:30:10.100 2015-09-01 16:14:51  1889.25
2015-09-01 09:30:11.100 2015-09-01 09:30:11.100 2015-09-01 16:14:51  1889.50
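
To illustrate the "time part only" behaviour, here is a small sketch on a two-day Series. The values and the second date are made up for illustration and are not from the question's data:

```python
import pandas as pd

# Hypothetical two-day series (values are illustrative only)
idx = pd.to_datetime([
    "2015-09-01 09:30:10.100",
    "2015-09-01 12:00:00.000",
    "2015-09-02 09:30:11.100",
])
s = pd.Series([1889.25, 1890.00, 1875.50], index=idx)

# between_time compares only the time of day, so it returns
# matching rows from BOTH dates.
by_time = s.between_time("09:30:10", "09:30:15")

# Slicing with full timestamps restricts the result to a single
# date-time range instead.
by_range = s.loc["2015-09-01 09:30:00":"2015-09-01 16:14:51"]

print(by_time)
print(by_range)
```

This is why between_time on its own cannot express "from this row's timestamp to the end of that specific day".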

You can use the query() method instead:

In [70]: df.query('Date_Time <= index <= d1_idx_last')
Out[70]:
                                      Date_Time         d1_idx_last      Low
idx
2015-09-01 09:30:00.000 2015-09-01 09:30:00.000 2015-09-01 16:14:51  1887.75
2015-09-01 09:30:01.000 2015-09-01 09:30:01.000 2015-09-01 16:14:51  1888.25
2015-09-01 09:30:01.200 2015-09-01 09:30:01.200 2015-09-01 16:14:51  1888.00
2015-09-01 09:30:10.100 2015-09-01 09:30:10.100 2015-09-01 16:14:51  1889.25
2015-09-01 09:30:11.100 2015-09-01 09:30:11.100 2015-09-01 16:14:51  1889.50

How do I get the min() of df.Low between Date_Time and d1_idx_last using df.query?

In [74]: df.query('Date_Time <= index <= d1_idx_last').Low.min()
Out[74]: 1887.75
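
Note that this query compares each row's index with that same row's Date_Time and d1_idx_last, so on the sample every row passes and the result is a single overall minimum rather than one value per row. A sketch of the equivalent boolean mask, rebuilding the five sample rows from the question:

```python
import pandas as pd

# The five sample rows shown in the question
idx = pd.to_datetime([
    "2015-09-01 09:30:00.000",
    "2015-09-01 09:30:01.000",
    "2015-09-01 09:30:01.200",
    "2015-09-01 09:30:10.100",
    "2015-09-01 09:30:11.100",
])
df = pd.DataFrame(
    {
        "Date_Time": idx,
        "d1_idx_last": pd.Timestamp("2015-09-01 16:14:51"),
        "Low": [1887.75, 1888.25, 1888.00, 1889.25, 1889.50],
    },
    index=idx,
)

# Row-by-row comparison: each index value against its own row's bounds
mask = (df["Date_Time"] <= df.index) & (df["d1_idx_last"] >= df.index)

# Every row satisfies the condition, so this is the minimum of all rows
overall_low = df.loc[mask, "Low"].min()
print(mask.all(), overall_low)
```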

13 Comments

Thanks Maxu but how do I get the min() of df.Low between Date_Time and d1_idx_last using df.query? The output from your code does not give this.
@adele, I've updated my answer - is that what you want?
The df has 148566 rows and I want to add a new column showing the min() value of df.Low between Date_Time and d1_idx_last for each row. I will be using this new column for further analysis later. Do I need to use apply or map to do this? If I use the code you have, it gives me one value instead of looking at each row. Thanks for the help.
@adele, it's not quite clear - could you post a desired data set?
What I am trying to get is essentially a rolling min between Date_Time and d1_idx_last on each row, with a new column called df['subset_low'] capturing the output.