I am facing an issue while converting one of my datetime columns to int. My code is:

df['datetime_column'].astype(np.int64)

The error which I am getting is:

invalid literal for int() with base 10: '2018-02-25 09:31:15'

I am not sure what is happening, as the same conversion works fine for some of my other datetime columns. Is there a limit on the range of dates that can be converted to int?

  • What integer do you expect '2018-02-25 09:31:15' to be converted to? Commented Dec 22, 2018 at 5:40
  • It seemed to me as well that the values were out of range. But when I tried to convert them to float, I got the following error: could not convert string to float: '2018-02-15 14:28:08'. I guess I forgot to convert the column to datetime before converting it to int. I updated the code to pd.to_datetime(df['datetime_column']).astype(np.int64) and the conversion is working fine. Commented Dec 22, 2018 at 5:50
  • Why would you convert datetime to int by casting? Don’t you want df['date_time'].map(dt.datetime.toordinal) instead? Commented Dec 22, 2018 at 6:02
  • Thanks a lot for the suggestion. I am building a decision tree model and need to extract some features from the datetime field. I have extracted month, year, day, day of the week, hour of the day, weekend indicator, offset from current date etc. I am just experimenting by adding new features and hence doing a conversion simply by casting. By the way, datetime.toordinal and offset from the current date should be correlated, I guess? Commented Dec 22, 2018 at 6:13

2 Answers

5

You can use

df['datetime_column'].apply(lambda x: x.toordinal())

If it fails, the likely cause is that your column has dtype object (i.e. it holds strings) rather than datetime. In that case you need:

df['datetime_column'] = pd.to_datetime(df['datetime_column'])

before converting it to ordinals.
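Putting the two steps together, here is a minimal sketch. The sample values are taken from the error messages in the question; the variable names (parsed, ordinals, as_int) are only for illustration. Note that the integer produced by astype(np.int64) counts time units since the Unix epoch in the column's resolution (nanoseconds for datetime64[ns]), so check the dtype of the parsed column in your pandas version before interpreting it.

```python
import numpy as np
import pandas as pd

# Sample data: the column holds strings, as in the question.
df = pd.DataFrame(
    {"datetime_column": ["2018-02-25 09:31:15", "2018-02-15 14:28:08"]}
)

# Step 1: parse the strings into a real datetime column.
parsed = pd.to_datetime(df["datetime_column"])

# Step 2a: day-level integer (proleptic Gregorian ordinal), one per row.
ordinals = parsed.apply(lambda ts: ts.toordinal())

# Step 2b: raw integer cast. The unit follows the datetime resolution
# of the parsed column (nanoseconds for datetime64[ns]).
as_int = parsed.astype(np.int64)

print(parsed.dtype)
print(ordinals.tolist())
print(as_int.tolist())
```

The ordinal loses the time of day, while the raw cast keeps it, so which one is useful depends on the granularity you need.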

If you are doing feature engineering, you can try creating features such as the number of days between date1 and date2, booleans for whether the date falls in winter, spring, summer, or autumn (based on the month), and, if you have a time component, booleans for whether it is morning, noon, or night. Which of these are useful depends on your machine learning problem.
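A rough sketch of a few such features follows. Everything specific here is an assumption for illustration: the reference date, the column names, the Northern Hemisphere winter months (December to February), and the 06:00 to 11:59 definition of "morning". Adjust them to your data.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "datetime_column": pd.to_datetime(
            ["2018-02-25 09:31:15", "2018-07-04 21:05:00"]
        )
    }
)

# Hypothetical reference date to measure the offset against.
reference_date = pd.Timestamp("2018-12-22")

col = df["datetime_column"]

# Whole days between each row and the reference date.
df["days_to_reference"] = (reference_date - col).dt.days

# Season flag from the month (assumes Northern Hemisphere winter).
df["is_winter"] = col.dt.month.isin([12, 1, 2])

# Time-of-day flag (assumes "morning" means hours 6 through 11).
df["is_morning"] = col.dt.hour.between(6, 11)

# Weekend flag: dayofweek is 0 for Monday through 6 for Sunday.
df["is_weekend"] = col.dt.dayofweek >= 5

print(df)
```

For a decision tree, boolean and integer columns like these can be used directly without scaling.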


0

It seems you solved the problem yourself, judging from your comment. My guess is that you created the data frame without specifying that the column should be parsed as a datetime, so it was read as strings. If I'm right and you check the column's dtype, it should show as object. If you check an individual entry in the column, it should show as a str.
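A quick way to run that check, as a sketch with a made-up one-row frame (the names was_datetime, first_value and is_datetime_now are just for illustration). The exact dtype label printed for a string column can differ between pandas versions, so the helper pd.api.types.is_datetime64_any_dtype is used for the actual test.

```python
import pandas as pd

# A column built from plain strings, as suspected in the question.
df = pd.DataFrame({"datetime_column": ["2018-02-25 09:31:15"]})

# Inspect the column dtype and the type of a single entry.
print(df["datetime_column"].dtype)
first_value = df["datetime_column"].iloc[0]
print(type(first_value))

# True only for real datetime columns, False for strings.
was_datetime = pd.api.types.is_datetime64_any_dtype(df["datetime_column"])

# After parsing, the same check passes.
df["datetime_column"] = pd.to_datetime(df["datetime_column"])
is_datetime_now = pd.api.types.is_datetime64_any_dtype(df["datetime_column"])

print(was_datetime, is_datetime_now)
```

If the data comes from a CSV file, passing parse_dates=['datetime_column'] to pd.read_csv is one way to get a datetime column from the start.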

If the issue is something else, please follow up.

1 Comment

Thanks a lot. It was exactly the same issue.
