Let's say you have a dataframe like this:

>>> import random
>>> import pandas as pd
>>> df = pd.DataFrame({
...         'epoch_minute': list(reversed(range(25090627, 25635267))),
...         'count': [random.randint(11, 35) for _ in range(25090627, 25635267)]})
>>> df.head()
   epoch_minute  count
0      25635266     12
1      25635265     20
2      25635264     33
3      25635263     11
4      25635262     35

and some relative epoch minute deltas like this:

day = 1440
week = 10080
month = 302400

How do I accomplish the equivalent of this code block:

for i,r in df.iterrows():
    if r['epoch_minute'] - day in df['epoch_minute'].values and \
            r['epoch_minute'] - week in df['epoch_minute'].values and \
            r['epoch_minute'] - month in df['epoch_minute'].values:
        pass  # do stuff

using this syntax:

valid_rows = df.loc[(df['epoch_minute'] == df['epoch_minute'] - day) &
                    (df['epoch_minute'] == df['epoch_minute'] - week) &
                    (df['epoch_minute'] == df['epoch_minute'] - month]

I understand why the loc select doesn't work, but I'm just asking if there exists a more elegant way to select the valid rows without iterating through the rows of the dataframe.

1 Answer

Wrap each condition in parentheses, combine them with & (element-wise AND), and use isin to check membership:

valid_rows = df[(df['epoch_minute'].isin(df['epoch_minute'] - day)) &
                (df['epoch_minute'].isin(df['epoch_minute'] - week)) &
                (df['epoch_minute'].isin(df['epoch_minute'] - month))]

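Note that the version above tests whether epoch_minute + delta exists in the column. To match the loop in the question, which looks for epoch_minute - delta, switch the operands as suggested in the comments:

valid_rows = df[((df['epoch_minute'] - day).isin(df['epoch_minute'])) &
                ((df['epoch_minute'] - week).isin(df['epoch_minute'])) &
                ((df['epoch_minute'] - month).isin(df['epoch_minute']))]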
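
To check that the isin selection agrees with the row-by-row loop, here is a minimal sketch on a small stand-in frame. The toy range (100 to 149), the small deltas, and the helper name rows_with_earlier_minutes are assumptions made for illustration; they are not part of the question.

```python
import random

import pandas as pd

# Small stand-in for the question's frame: 50 consecutive epoch minutes, descending.
start, stop = 100, 150
df = pd.DataFrame({
    'epoch_minute': list(reversed(range(start, stop))),
    'count': [random.randint(11, 35) for _ in range(start, stop)],
})

# Hypothetical small deltas so the example fits the toy range.
deltas = [2, 5, 20]


def rows_with_earlier_minutes(frame, deltas):
    """Keep rows whose epoch_minute - delta is present in the column for every delta."""
    minutes = frame['epoch_minute']
    mask = pd.Series(True, index=frame.index)
    for delta in deltas:
        # Same operand order as the loop: subtract first, then test membership.
        mask &= (minutes - delta).isin(minutes)
    return frame[mask]


valid_rows = rows_with_earlier_minutes(df, deltas)

# Reference result from a plain loop, using a set for fast lookups.
seen = set(df['epoch_minute'])
expected = [m for m in df['epoch_minute'] if all(m - d in seen for d in deltas)]

print(valid_rows['epoch_minute'].tolist() == expected)
```

Building the mask in a loop keeps the selection readable when the list of deltas grows, and it stays equivalent to chaining the three conditions with &.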
3 Comments

Ah yes, you're right about the syntax -- but valid_rows is still empty after executing either of these loc selections, whereas the for loop correctly identifies 242240 valid rows.
Shouldn't the arguments of isin and the value it's called on be switched like this: (df['epoch_minute'] - day).isin(df['epoch_minute']) & ... ?
@aweeeezy - I tested it and got the same output, but I added it to the answer.