I have my time stored in the format YYYYMMDDhhmm in my dataframe, e.g. 200902110403.

Pandas can automatically convert this into a datetime object and I'm doing that like this:

temp_date=(pd.to_datetime(indexed_data.index.str[0:12], infer_datetime_format=True)).to_pydatetime()

(I don't fully understand the difference between a datetime object and a datetimeindex but I don't think that's the source of my problems)

I then use the date2num function from the netCDF4 library to convert this to days since my reference time, like this:

days=date2num(temp_date, 'days since 2009-01-01')

This works and returns the days as I want:

array([ 212.03333333,  212.03333333,  212.03472222, ...,  242.95416667,
    242.95416667,  242.99583333])

The problem is that it doesn't work when I do it all in one go, assigning the result to the dataframe directly, and I don't understand why.

Why doesn't this work?

indexed_data['date']=(pd.to_datetime(indexed_data.index.str[0:12], infer_datetime_format=True)).to_pydatetime()
indexed_data['days']=date2num(indexed_data['date'], 'days since 2009-01-01')

TypeError: ufunc subtract cannot use operands with types dtype('

but this does:

temp_date=(pd.to_datetime(indexed_data.index.str[0:12],infer_datetime_format=True)).to_pydatetime()
indexed_data['date']=temp_date
indexed_data['fdays']=date2num(temp_date, 'days since 2009-01-01')

Thanks!

1 Answer

I'm not familiar with netcdf4, but you should be able to accomplish what you want without it:

date_strs = ['200902110403', '200902120403', '200902130403', '200902140403', '200902150403']
df = pd.DataFrame(date_strs, columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'], infer_datetime_format=True)
df['Date']

0   2009-02-11 04:03:00
1   2009-02-12 04:03:00
2   2009-02-13 04:03:00
3   2009-02-14 04:03:00
4   2009-02-15 04:03:00
Name: Date, dtype: datetime64[ns]

To get the time elapsed since your reference date, subtract the reference timestamp from the datetime column, which returns a series of timedeltas:

(df['Date'] - pd.to_datetime('2009-01-01'))

0   41 days 04:03:00
1   42 days 04:03:00
2   43 days 04:03:00
3   44 days 04:03:00
4   45 days 04:03:00
Name: Date, dtype: timedelta64[ns]

And if you just want the whole number of days as an integer (the time-of-day part is dropped), use the .dt.days accessor on the above series:

df['Days'] = (df['Date'] - pd.to_datetime('2009-01-01')).dt.days
df['Days']

0    41
1    42
2    43
3    44
4    45
Name: Days, dtype: int64
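
If you want fractional days instead, like the values date2num gave you in the question, one option is to divide the timedelta series by a one-day Timedelta. This is only a sketch: the two sample strings are made up, and I pass an explicit format='%Y%m%d%H%M' instead of infer_datetime_format, which is deprecated in pandas 2.0.

```python
import pandas as pd

# Hypothetical sample data in the YYYYMMDDhhmm layout from the question.
date_strs = ['200902110403', '200902120403']
df = pd.DataFrame(date_strs, columns=['Date'])

# An explicit format avoids relying on format inference.
df['Date'] = pd.to_datetime(df['Date'], format='%Y%m%d%H%M')

# Subtracting a Timestamp gives a timedelta64 series.
elapsed = df['Date'] - pd.Timestamp('2009-01-01')

# Dividing by one day turns the timedeltas into float days.
df['Days'] = elapsed / pd.Timedelta(days=1)
```

Unlike .dt.days, this keeps the time of day, so 04:03 on day 41 comes out as 41.16875 rather than 41.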

A DatetimeIndex is a pandas index whose values are datetimes (stored as datetime64), so it holds many timestamps, whereas a datetime object is a single timestamp.
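
This difference is probably also behind the error in the question, though I can't confirm it without netCDF4 at hand: when you assign an array of Python datetime objects to a dataframe column, pandas converts it back to datetime64, so reading the column later does not give you Python datetimes any more. A small sketch with made-up sample values showing the round trip:

```python
import datetime as dt
import pandas as pd

# Hypothetical sample values in the YYYYMMDDhhmm layout.
idx = pd.to_datetime(['200902110403', '200902120403'], format='%Y%m%d%H%M')

# to_pydatetime() returns a NumPy object array of datetime.datetime values.
py_dates = idx.to_pydatetime()

# Storing them in a column makes pandas infer a datetime64 dtype again.
df = pd.DataFrame({'date': py_dates})
col_dtype = df['date'].dtype

# Wrapping the column in a DatetimeIndex lets us get Python datetimes back.
back = pd.DatetimeIndex(df['date']).to_pydatetime()
```

So if a function insists on Python datetime objects, pass it something like back (or the original temp_date from the question) rather than the dataframe column itself.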
