Let's say I have a data frame with two columns: the first contains activities such as work, home, sleep, etc., and the second contains the duration of each activity.

While iterating through the rows, I want to find the duration of the most recent 'sleep' activity as of the row I am currently on.

Is there an easy way to do that?

My data:

import numpy as np
import pandas as pd

duration = np.random.randint(20, size = 30)
activities = ['work', 'home', 'sleep', 'home','work', 'sleep','work', 'home','sleep', 'home','work', 'sleep','work', 'home','work', 'sleep','work', 'home','work', 'sleep','work', 'home','work', 'sleep','work', 'home','work', 'home', 'work', 'sleep']
activity_df = pd.DataFrame({'activities':activities, 'duration':duration})
    What is the expected output? Commented Jun 14, 2018 at 10:41

2 Answers

I believe you need to first filter by boolean indexing and then select the last value with iloc:

print (activity_df.loc[activity_df['activities'] == 'sleep', 'duration'].iloc[-1])
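A minimal sketch of the same lookup on a small hand-made frame, so the result is reproducible. The helper name `last_duration` and the sample values are assumptions for illustration, not part of the question. It also guards against the case where the activity never occurs, since `.iloc[-1]` raises an IndexError on an empty Series.

```python
import pandas as pd

# Small hand-made frame (illustrative values, not the question's random data).
activity_df = pd.DataFrame({
    "activities": ["work", "home", "sleep", "home", "work", "sleep", "work"],
    "duration": [1, 5, 11, 8, 11, 8, 9],
})


def last_duration(df, activity):
    """Return the duration of the last row matching `activity`, or None."""
    matches = df.loc[df["activities"] == activity, "duration"]
    # .iloc[-1] raises IndexError on an empty Series, so check first.
    return None if matches.empty else matches.iloc[-1]


last_sleep = last_duration(activity_df, "sleep")
last_gym = last_duration(activity_df, "gym")  # activity that never occurs
print(last_sleep, last_gym)
```

Note that this returns a single value for the whole frame, not one value per row.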

Or use where to create NaNs by condition and then forward-fill the missing values with ffill:

activity_df['new'] = activity_df['duration'].where(activity_df['activities']=='sleep').ffill()
print (activity_df)
   activities  duration   new
0        work         1   NaN
1        home         5   NaN
2       sleep        11  11.0
3        home         8  11.0
4        work        11  11.0
5       sleep         8   8.0
6        work         9   8.0
7        home        13   8.0
8       sleep        19  19.0
9        home         6  19.0
10       work        19  19.0
11      sleep        16  16.0
12       work        16  16.0
13       home         1  16.0
14       work         5  16.0
15      sleep        10  10.0
16       work         1  10.0
17       home         5  10.0
18       work         0  10.0
19      sleep         4   4.0
20       work        12   4.0
21       home         4   4.0
22       work        10   4.0
23      sleep         6   6.0
24       work        17   6.0
25       home        14   6.0
26       work         7   6.0
27       home         5   6.0
28       work        10   6.0
29      sleep         8   8.0
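Since the question mentions iterating through the rows, here is a rough sketch of how the forward-filled column might be read inside a loop. It uses a small hand-made frame (assumed values) rather than the random data above, and rows before the first 'sleep' stay NaN, so the loop checks for that.

```python
import pandas as pd

# Small hand-made frame (illustrative values, not the question's random data).
activity_df = pd.DataFrame({
    "activities": ["work", "home", "sleep", "home", "work", "sleep", "work"],
    "duration": [1, 5, 11, 8, 11, 8, 9],
})

# Keep durations only on 'sleep' rows, then carry the last seen value forward.
is_sleep = activity_df["activities"] == "sleep"
activity_df["new"] = activity_df["duration"].where(is_sleep).ffill()

# While iterating, each row already carries the most recent sleep duration.
for row in activity_df.itertuples(index=False):
    if pd.isna(row.new):
        print(f"{row.activities}: no sleep seen yet")
    else:
        print(f"{row.activities}: last sleep lasted {row.new:.0f}")
```

On a 'sleep' row the value is that row's own duration, matching the output shown above.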

6 Comments

Thank you, I think I can complete my function with this.
You are a god of Stack Overflow.
@MarvieDemit - I think not; there are many better coders, but maybe they do not have enough time for answering.
I am sure there are, but this is just my personal impression: you are just seconds away from answering my questions, which are mostly dumb but important to me as a new Python coder! I just saw that you are from Bratislava! I am from the Philippines and currently live in Vienna, and I want to treat you to a beer some time xD
I will email you later today :D Thanks again!

You can also try this; it is somewhat similar to jezrael's answer.

activity_df[activity_df['activities'] == 'sleep']['duration'].iloc[-1]

2 Comments

Except chained indexing is explicitly discouraged in the docs. loc is clearer and less error-prone.
Agreed. It is more effective to use .loc.
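To illustrate the point about chained indexing, here is a small sketch on assumed sample data. For reading, both forms return the same value; the difference shows up when assigning, where a single .loc call addresses the original frame directly, while the chained form assigns to an intermediate object.

```python
import pandas as pd

# Small hand-made frame (illustrative values, not the question's random data).
activity_df = pd.DataFrame({
    "activities": ["work", "home", "sleep", "home", "work", "sleep", "work"],
    "duration": [1, 5, 11, 8, 11, 8, 9],
})
mask = activity_df["activities"] == "sleep"

# Reading: both forms return the same value here.
chained = activity_df[mask]["duration"].iloc[-1]
with_loc = activity_df.loc[mask, "duration"].iloc[-1]

# Writing: one .loc call selects rows and column on the original frame.
# The chained form, activity_df[mask]["duration"] = 0, would assign to a
# temporary object and may leave activity_df unchanged.
activity_df.loc[mask, "duration"] = 0

print(chained, with_loc, activity_df["duration"].tolist())
```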
