I want to create a new column based on another column whose dtype is datetime. The details are below:

 df['finish']

0   2019-01-28 15:53:48
1   2019-01-28 17:11:15
2   2019-01-28 17:12:14
3   2019-01-28 17:12:15
4   2019-01-28 17:12:41
Name: finish, dtype: datetime64[ns]

df['finish'].map(lambda x: 30 if x<='2019-02-01 21:00:00' else 5)

TypeError: Cannot compare type 'Timestamp' with type 'str'

1 Answer

If you compare in the vectorized pandas way, the whole column against a single value, there is no need to convert the string to a datetime, because pandas handles this comparison:

df['new'] = np.where(df['finish'] <='2019-02-01 21:00:00', 30, 5)
print (df)
               finish  new
0 2019-01-28 15:53:48   30
1 2019-01-28 17:11:15   30
2 2019-01-28 17:12:14   30
3 2019-01-28 17:12:15   30
4 2019-01-28 17:12:41   30

Your solution failed because the lambda function is called for each value in a loop, so it compares scalars: a Timestamp against a plain string. In that case the string has to be converted to a datetime before comparing.
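To make the difference concrete, here is a minimal sketch contrasting the scalar comparison with the column comparison. The names `ts`, `cutoff_str`, `scalar_failed` and `series_result` are made up for illustration, and the exact wording of the error message may differ between pandas versions:

```python
import pandas as pd

ts = pd.Timestamp("2019-01-28 15:53:48")
cutoff_str = "2019-02-01 21:00:00"

# Scalar case: a single Timestamp compared with a plain string raises TypeError.
try:
    ts <= cutoff_str
    scalar_failed = False
except TypeError:
    scalar_failed = True

# Column case: a datetime64 Series compared with the same string works,
# because pandas parses the string before comparing.
series_result = pd.Series([ts]) <= cutoff_str

print(scalar_failed)
print(series_result.tolist())
```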

Calling a lambda for each value is also not recommended, because it is slow. If you still want to use it, convert the string to a Timestamp or datetime:

df['new'] = df['finish'].map(lambda x: 30 if x<=pd.Timestamp('2019-02-01 21:00:00') else 5)
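For reference, a small self-contained sketch of both approaches on toy data. The sample rows and the column name `new_map` are assumptions added for illustration; the cutoff is converted to a Timestamp once, outside the lambda, so it is not re-parsed for every row:

```python
import numpy as np
import pandas as pd

# Toy data: one row before the cutoff, one equal to it, one after it.
df = pd.DataFrame({"finish": pd.to_datetime([
    "2019-01-28 15:53:48",
    "2019-02-01 21:00:00",
    "2019-02-03 08:00:00",
])})

# Parse the cutoff once instead of inside the lambda.
cutoff = pd.Timestamp("2019-02-01 21:00:00")

# Vectorized: compares the whole column in one operation.
df["new"] = np.where(df["finish"] <= cutoff, 30, 5)

# Element-wise: calls the lambda once per row, so it is slower on large frames.
df["new_map"] = df["finish"].map(lambda x: 30 if x <= cutoff else 5)

print(df)
```

Both columns should hold the same values; the vectorized version is the one to prefer unless the per-row logic cannot be expressed as a column operation.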

Performance:

#[5000 rows x 1 columns]
df = pd.concat([df] * 1000, ignore_index=True)

In [165]: %timeit df['new1'] = np.where(df['finish'] <='2019-02-01 21:00:00', 30, 5)
465 µs ± 64.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [166]: %timeit df['new2'] = df['finish'].map(lambda x: 30 if x<=pd.Timestamp('2019-02-01 21:00:00') else 5)
22.4 ms ± 228 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

1 Comment

Great, but why does my solution fail? Could you explain?
