
I need to select the rows whose Time falls within ±1 second of specific rows (those marked as to_pick).

I can do it with loops, but I am looking for a more elegant way.

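import pandas as pd
import numpy as np

N = 1000
# Note: newer pandas versions prefer the lowercase alias 's' over 'S'
tidx = pd.date_range('2019-07-01 09:30:00', periods=N, freq='S')
np.random.seed(3)
data = np.random.randn(N)
# Jitter each timestamp by a random offset (pd.np was removed from pandas; use np directly)
ts = tidx + pd.to_timedelta(np.random.randn(N), unit='s')

df = pd.DataFrame({'Time': ts, 'to_pick': np.random.randn(N) > 0.98, 'start': ts - pd.to_timedelta(1, 's'), 'end': ts + pd.to_timedelta(1, 's')})
# Only the picked rows keep their window; NaT is the missing value for datetimes
df.loc[~df['to_pick'], 'start'] = pd.NaT
df.loc[~df['to_pick'], 'end'] = pd.NaT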

I am looking for something like

 df['Time'].between(df['start'], df['end'])

that would combine the conditions for all the marked rows.


1 Answer


If I understand you correctly, you want to select the rows whose Time is between start and end of any row where to_pick is True:

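# Window bounds of the picked rows only, each with shape (M,)
start = df.query('to_pick')['start'].values
end = df.query('to_pick')['end'].values
# All timestamps as a column vector, shape (N, 1)
t = df['Time'].values[:, None]

# Broadcasting gives an (N, M) boolean matrix; keep rows that fall in any window
df.loc[np.any((start <= t) & (t <= end), axis=1)]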

2 Comments

This is great! I'd like to understand how these lines work: t = df['Time'].values[:, None] and (start <= t).
It relies on NumPy array broadcasting. Let me know if you want me to elaborate.
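
For readers with the same question, here is a minimal sketch of the broadcasting step. It uses made-up plain numbers instead of timestamps (the toy values and the names inside and mask are illustrative assumptions, not part of the original answer), but the shapes behave the same way as in the answer's code.

```python
import numpy as np

# Hypothetical toy values standing in for df['Time'], start and end
t = np.array([1.0, 2.5, 4.0, 7.0])[:, None]  # [:, None] adds an axis: shape (4, 1)
start = np.array([2.0, 6.5])                 # shape (2,), one entry per picked row
end = np.array([3.0, 7.5])                   # shape (2,)

# Comparing shape (4, 1) with shape (2,) broadcasts both to (4, 2):
# inside[i, j] is True when t[i] lies in the window [start[j], end[j]]
inside = (start <= t) & (t <= end)

# A row is selected if it falls inside at least one window
mask = inside.any(axis=1)
print(inside.shape)
print(mask)
```

Note that the intermediate matrix has one cell per (row, window) pair, so memory use grows with N times the number of picked rows.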
