
I want to create a new column in a Pandas DataFrame by evaluating multiple conditions in an if-then-else block.

if events.hour <= 6:
    events['time_slice'] = 'night'
elif events.hour <= 12:
    events['time_slice'] = 'morning'
elif events.hour <= 18:
    events['time_slice'] = 'afternoon'
elif events.hour <= 23:
    events['time_slice'] = 'evening'

When I run this, I get the error below:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

So I tried to solve this by adding .any() to each condition, as shown below:

if (events.hour <= 6).any():
    events['time_slice'] = 'night'
elif (events.hour <= 12).any():
    events['time_slice'] = 'morning'
elif (events.hour <= 18).any():
    events['time_slice'] = 'afternoon'
elif (events.hour <= 23).any():
    events['time_slice'] = 'evening'

Now I do not get an error, but when I check the unique values of time_slice, it only shows 'night':

np.unique(events.time_slice)

array(['night'], dtype=object)

How can I solve this? My data contains samples that should get 'morning', 'afternoon' or 'evening'. Thanks!

3 Answers


You can use the pd.cut() method to categorize your data:

Demo:

In [66]: events = pd.DataFrame(np.random.randint(0, 23, 10), columns=['hour'])

In [67]: events
Out[67]:
   hour
0     5
1    17
2    12
3     2
4    20
5    22
6    20
7    11
8    14
9     8

In [71]: events['time_slice'] = pd.cut(events.hour, bins=[-1, 6, 12, 18, 23], labels=['night','morning','afternoon','evening'])

In [72]: events
Out[72]:
   hour time_slice
0     5      night
1    17  afternoon
2    12    morning
3     2      night
4    20    evening
5    22    evening
6    20    evening
7    11    morning
8    14  afternoon
9     8    morning
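
The first bin edge of -1 matters here: pd.cut builds right-closed intervals by default, so bins starting at 0 would leave an hour of 0 outside every interval. Below is a minimal sketch of the difference. The sample hours and the names from_zero and from_zero_inclusive are made up for illustration and are not data from the question.

```python
import pandas as pd

# Hypothetical sample hours covering the edge cases 0, 6, 12, 18 and 23.
events = pd.DataFrame({'hour': [0, 6, 7, 12, 13, 18, 19, 23]})
labels = ['night', 'morning', 'afternoon', 'evening']

# Default intervals are right-closed: (-1, 6], (6, 12], (12, 18], (18, 23]
events['time_slice'] = pd.cut(events.hour, bins=[-1, 6, 12, 18, 23], labels=labels)

# With 0 as the first edge the first interval is (0, 6], so hour 0 becomes NaN.
from_zero = pd.cut(events.hour, bins=[0, 6, 12, 18, 23], labels=labels)

# include_lowest=True closes the first interval on the left, so hour 0 is kept.
from_zero_inclusive = pd.cut(
    events.hour, bins=[0, 6, 12, 18, 23], labels=labels, include_lowest=True
)

print(events)
print(from_zero.isna().tolist())
```

Note that pd.cut returns a categorical column rather than plain strings.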

1 Comment

The first bin edge has to be -1 rather than 0: with 0 as the first edge, an entry of 0 would fall outside the first bin and become NaN.

You could create a function:

def time_slice(hour):
    if hour <= 6:
        return 'night'
    elif hour <= 12:
        return 'morning'
    elif hour <= 18:
        return 'afternoon'
    elif hour <= 23:
        return 'evening'

Then events['time_slice'] = events.hour.apply(time_slice) should do the trick.
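
Put together, this might look as follows. The sample hours are made up for illustration. One thing worth knowing: the function as written has no final else branch, so any hour above 23 yields None instead of raising an error.

```python
import pandas as pd

def time_slice(hour):
    # Map a single hour (0-23) to a label; the first matching branch wins.
    if hour <= 6:
        return 'night'
    elif hour <= 12:
        return 'morning'
    elif hour <= 18:
        return 'afternoon'
    elif hour <= 23:
        return 'evening'
    # Falls through and returns None for hours above 23.

# Hypothetical sample data, not taken from the question.
events = pd.DataFrame({'hour': [0, 6, 7, 12, 13, 18, 19, 23]})

# apply() calls time_slice once per element, so each comparison sees a scalar
# and the "truth value of a Series is ambiguous" error does not occur.
events['time_slice'] = events.hour.apply(time_slice)

print(events.time_slice.unique())
```

Since apply() runs a Python function call per row, it is likely to be slower on large frames than the vectorized alternatives.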



Here's a NumPy approach to it -

tags = ['night','morning','afternoon','evening']
events['time_slice'] = np.take(tags,((events.hour.values-1)//6).clip(min=0))

Sample run -

In [130]: events
Out[130]: 
   hour time_slice
0     0      night
1     8    morning
2    16  afternoon
3    20    evening
4     2      night
5    14  afternoon
6     7    morning
7    18  afternoon
8     8    morning
9    22    evening

2 Comments

((events.hour.values-1)//6).clip(min=0) - this is pretty smart
@MaxU I guess so, but it works for regular intervals only.
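
For intervals that are not evenly spaced, one possible variation is to swap the integer division for np.searchsorted, which looks each hour up in a sorted array of bin edges. This is a sketch of a different technique, not the code from the answer above. The upper_edges array and the sample hours are assumptions for illustration.

```python
import numpy as np
import pandas as pd

tags = np.array(['night', 'morning', 'afternoon', 'evening'])

# Inclusive upper bound of every slice except the last one.
# The edges do not need to be evenly spaced, only sorted.
upper_edges = np.array([6, 12, 18])

# Hypothetical sample data.
events = pd.DataFrame({'hour': [0, 6, 7, 12, 13, 18, 19, 23]})

# side='left' gives the count of edges strictly below each hour,
# so hour 6 maps to index 0 ('night') and hour 7 to index 1 ('morning').
idx = np.searchsorted(upper_edges, events.hour.values, side='left')
events['time_slice'] = tags[idx]

print(events)
```

Anything above the last edge, including hours beyond 23, ends up in the final tag, so out-of-range values would need a separate check.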
