I am trying to add a new column containing a label based on this condition:

  • Label 1 if the time difference between the value in 'time' and dt is less than 2 hours
  • Label 0 otherwise

My current idea:

from datetime import datetime, timedelta
import pandas as pd

df = pd.read_csv('./datetimecek.csv')
df['time'] = pd.to_datetime(df['datetime'])

dt = datetime.strptime("19/02/18 19:00", "%d/%m/%y %H:%M")

datetime            time
2018/02/19 16:00    2018-02-19 16:00:00
2018/02/19 17:00    2018-02-19 17:00:00
2018/02/19 18:00    2018-02-19 18:00:00
2018/02/19 19:00    2018-02-19 19:00:00

Then I defined a timedelta and a labelling function:

a = timedelta(hours=2)

def label(c):
    if dt - df['time'] < a:
        return '1'
    else:
        return '0'

Then I applied it row by row:

df['label'] = df.apply(label, axis=1)

But I got this error: 'The truth value of a Series is ambiguous. Use a.empty, a.bool()...'

Is there any way I can fix this?

I think you meant to use c in the function definition of label instead of the entire df existing in global scope. Commented Jan 29, 2019 at 6:15

1 Answer

If you want the labels as the strings '0' and '1' (with numpy imported as np):

df['label'] = np.where(dt - df['time'] < a, '1','0')

Or an alternative by @Dark:

df['label'] = (dt - df['time'] < a).astype(int).astype(str)
print (df)
           datetime                time label
0  2018/02/19 16:00 2018-02-19 16:00:00     0
1  2018/02/19 17:00 2018-02-19 17:00:00     0
2  2018/02/19 18:00 2018-02-19 18:00:00     1
3  2018/02/19 19:00 2018-02-19 19:00:00     1

print (type(df.loc[0, 'label']))
<class 'str'>

If you want the labels as the integers 0 and 1:

df['label'] = (dt - df['time'] < a).astype(int)

Alternative:

df['label'] = np.where(dt - df['time'] < a, 1,0)
print (df)
           datetime                time label
0  2018/02/19 16:00 2018-02-19 16:00:00     0
1  2018/02/19 17:00 2018-02-19 17:00:00     0
2  2018/02/19 18:00 2018-02-19 18:00:00     1
3  2018/02/19 19:00 2018-02-19 19:00:00     1

print (type(df.loc[0, 'label']))
<class 'numpy.int32'>
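
For reference, the vectorized variants above can be collected into one self-contained sketch. It builds the four sample rows inline instead of reading datetimecek.csv, and the column names label_int, label_str and label_abs are made up for illustration. The last variant is an assumption about intent: if rows later than dt can occur, dt - df['time'] becomes negative and would always count as "less than 2 hours", so taking the absolute difference may be closer to what is wanted.

```python
from datetime import datetime, timedelta

import numpy as np
import pandas as pd

# Hypothetical stand-in for datetimecek.csv: the four rows shown in the question.
df = pd.DataFrame({
    "datetime": [
        "2018/02/19 16:00",
        "2018/02/19 17:00",
        "2018/02/19 18:00",
        "2018/02/19 19:00",
    ]
})
df["time"] = pd.to_datetime(df["datetime"], format="%Y/%m/%d %H:%M")

dt = datetime.strptime("19/02/18 19:00", "%d/%m/%y %H:%M")
a = timedelta(hours=2)

# Element-wise comparison: yields one boolean per row, no Python-level loop.
mask = (dt - df["time"]) < a

df["label_int"] = mask.astype(int)           # integers 0 / 1
df["label_str"] = np.where(mask, "1", "0")   # strings '0' / '1'

# Assumption, not part of the original answer: use the absolute difference
# so that rows later than dt are not labelled 1 just because the
# difference is negative.
abs_mask = (dt - df["time"]).abs() < a
df["label_abs"] = abs_mask.astype(int)

print(df)
```

On the sample data all three columns agree, because no row is later than dt.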

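Is there any way I can fix this?

Yes. With apply(..., axis=1) the function receives one row at a time as c, so change df to c inside the function to compare scalars instead of the whole column: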

def label(c):
    if dt - c['time'] < a:
        return '1'
    else:
        return '0'
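
As a rough sketch of why this change matters: with df['time'] inside the function, the comparison produces a whole boolean Series, and `if` cannot reduce that to a single True or False, which is what raises the error. The names label_broken and raised below are made up for this illustration, and the sample rows are built inline rather than read from the CSV.

```python
from datetime import datetime, timedelta

import pandas as pd

# Inline copy of the sample data from the question.
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2018-02-19 16:00",
        "2018-02-19 17:00",
        "2018-02-19 18:00",
        "2018-02-19 19:00",
    ])
})
dt = datetime(2018, 2, 19, 19, 0)
a = timedelta(hours=2)


def label_broken(c):
    # Bug from the question: df['time'] is the whole column, so the
    # comparison is a boolean Series and `if` cannot evaluate it.
    if dt - df["time"] < a:
        return "1"
    return "0"


def label(c):
    # c is a single row, so c['time'] is one timestamp and the
    # comparison yields a single boolean.
    if dt - c["time"] < a:
        return "1"
    return "0"


try:
    df.apply(label_broken, axis=1)
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

df["label"] = df.apply(label, axis=1)
print(df)
```

The row-wise version gives the same labels as the vectorized expressions, but it calls the Python function once per row, so the vectorized forms are usually preferable on larger frames.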

1 Comment

I was close np.where(dt-df['time']<a).astype(int).astype(str) :)
