Pandas faster implementation of date calculation on every row

Question

I have a dataframe with multiple columns, including analysis_date (datetime), and forecast_hour (int). I want to add a new column called total_hours, which is the sum of the hour component of analysis_date plus the corresponding forecast_hour in that row. Here's a visual example:

original dataframe:

analysis_date | forecast_hour
12-2-19-05    | 3
12-2-19-06    | 3
12-2-19-07    | 3
12-2-19-08    | 3

dataframe after calculation:

analysis_date | forecast_hour | total_hours
12-2-19-05    | 3             | 8
12-2-19-06    | 3             | 9
12-2-19-07    | 3             | 10
12-2-19-08    | 3             | 11

Here is the current logic that does what I want:

df['total_hours'] = df.apply(lambda row: row.analysis_date.hour + row.forecast_hour, axis=1)

Unfortunately, this is too slow for my application: it takes around 15 seconds for a dataframe with a few hundred thousand rows. I tried the swifter library, but it took about as long as my current implementation, if not longer.

0x5453 · Accepted Answer · 2020-01-08 21:37:40Z

3

apply with axis=1 is slow because it is not vectorized: it calls the Python lambda once per row instead of operating on whole columns at once. This should do what you want (assuming df['analysis_date'] is a datetime64):

df['total_hours'] = df['analysis_date'].dt.hour + df['forecast_hour']
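As a rough, self-contained sketch of the line above: the sample frame below mirrors the table in the question, with the assumption that "12-2-19-05" means 2019-12-02 at 05:00. The row-wise result is computed only to cross-check the vectorized one.

```python
import pandas as pd

# Sample data mirroring the question's table; the exact dates are an assumption.
df = pd.DataFrame({
    "analysis_date": pd.to_datetime([
        "2019-12-02 05:00", "2019-12-02 06:00",
        "2019-12-02 07:00", "2019-12-02 08:00",
    ]),
    "forecast_hour": [3, 3, 3, 3],
})

# Vectorized: .dt.hour extracts the hour for the whole column in one call,
# and the addition is applied column-to-column.
df["total_hours"] = df["analysis_date"].dt.hour + df["forecast_hour"]

# Row-wise version from the question, kept here only as a cross-check.
row_wise = df.apply(lambda row: row.analysis_date.hour + row.forecast_hour, axis=1)

print(df)
print("results match:", row_wise.tolist() == df["total_hours"].tolist())
```

If analysis_date is stored as strings rather than datetime64, the .dt accessor will raise an error; converting the column with pd.to_datetime first is the usual fix.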

edited Jan 8, 2020 at 21:37

answered Jan 8, 2020 at 21:28

0x5453


2 Comments

Preethi Vaidyanathan Over a year ago

Similar/related question: is it possible to add the hours to analysis_date itself? Something along the lines of df['analysis_date'].dt + timedelta(hours=df['forecast_hour']), where the output is a datetime64?

0x5453 Over a year ago

@P.V. Yep, pd.to_timedelta is what you are looking for: df['analysis_date'] + pd.to_timedelta(df['forecast_hour'], unit='hours')
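A minimal sketch of the pd.to_timedelta suggestion, under some assumptions: the sample dates are made up, the column name valid_time is hypothetical, and unit="h" is used as the short spelling of hours. The last row is chosen so that the addition rolls over midnight.

```python
import pandas as pd

# Illustrative data; the dates and the 'valid_time' column name are assumptions.
df = pd.DataFrame({
    "analysis_date": pd.to_datetime([
        "2019-12-02 05:00", "2019-12-02 06:00", "2019-12-02 23:00",
    ]),
    "forecast_hour": [3, 3, 3],
})

# Convert the integer hours to timedeltas for the whole column, then add
# them to the datetime column; the result is still a datetime column.
df["valid_time"] = df["analysis_date"] + pd.to_timedelta(df["forecast_hour"], unit="h")

print(df)
```

Unlike adding the hour component as an integer, this keeps the date, so 23:00 plus 3 hours lands on 02:00 of the following day.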
