What is the proper way of converting integer dates to datetime64 in numpy? I tried:

import numpy
a = numpy.array([20090913, 20101020, 20110125])
numpy.datetime64(a.astype("S8"))

but get an incorrect conversion. How about reading them in correctly as numpy.datetime64 objects using numpy.loadtxt (they are coming from a csv file)?

2 Answers

Your problem is that datetime64 expects a string in the format yyyy-mm-dd, while the type conversion produces strings in the format yyyymmdd. I would suggest something like this:

conversion = lambda x: "%s-%s-%s" % (x[:4], x[4:6], x[6:])
np_conversion = numpy.frompyfunc(conversion,1,1)
b = np_conversion(a.astype('S10'))
numpy.datetime64(b)

However, this does not work for me (I have numpy 1.6.1): it fails with the message "NotImplementedError: Not implemented for this type". Unless that is implemented in 1.7, I can only suggest a pure Python solution:

numpy.datetime64(numpy.array([conversion(str(x)) for x in a], dtype="S10"))

...or pre-processing your input, to deliver the dates in the expected format.
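As a minimal sketch of the pure Python route on a more recent NumPy (an assumption; this was not tested on 1.6.1), the dashed strings can be passed to numpy.array with an explicit datetime64[D] dtype instead of calling numpy.datetime64 on the whole array. The names iso and dates are only illustrative:

```python
import numpy as np

a = np.array([20090913, 20101020, 20110125])

# Format each integer as an ISO 8601 date string (yyyy-mm-dd).
iso = ["%s-%s-%s" % (s[:4], s[4:6], s[6:]) for s in a.astype(str)]

# Build the array with an explicit day-resolution datetime64 dtype.
dates = np.array(iso, dtype="datetime64[D]")
print(dates)
```

The string formatting still runs in a Python loop, so this is about convenience rather than speed.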

Edit: I can also offer an alternative using numpy.vectorize, but I don't understand it well enough to say what is going wrong:

>>> conversion = numpy.vectorize(lambda x: "%s-%s-%s" % (x[:4], x[4:6], x[6:]), otypes=['S10'])
>>> conversion(a.astype('S10'))
array(['2009', '2010', '2011'],
      dtype='|S4')

For some reason it's ignoring the otypes and outputting |S4 instead of |S10. Sorry I can't help more, but this should provide a starting point for searching other solutions.

Update: Thanks to OP feedback, I thought of a new possibility. This should work as expected:

>>> conversion = lambda x: numpy.datetime64(str(x))
>>> np_conversion = numpy.frompyfunc(conversion, 1, 1)
>>> np_conversion(a)
array([2009-09-13 00:00:00, 2010-10-20 00:00:00, 2011-01-25 00:00:00], dtype=object)

# Works too:
>>> conversion = lambda x: numpy.datetime64("%04d-%02d-%02d" % (x // 10000, x // 100 % 100, x % 100))

Weird how, in this case, datetime64 works fine with or without the dashes...

6 Comments

+1 from me; your first solution seems to work just fine for me (NumPy 2.0 dev).
I'm wondering why numpy.datetime64('20090921') works, if the required format is with dashes?
@Benjamin really strange... But your comment gave me a new idea, try my updated answer, worked fine for me.
So basically, numpy.array([numpy.datetime64(str(i)) for i in a]). I was wondering if there was a more direct way than this, however.
@Benjamin: What do you mean by a more direct way? Directly while reading your data from a csv/text file?

Oddly, this works: numpy.datetime64(a.astype("S8").tolist()), while this does not: numpy.datetime64(a.astype("S8")). The first method is still a bit less convoluted than: numpy.array([numpy.datetime64(str(i)) for i in a]). I asked why in this question.
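For the CSV part of the question, one possible sketch (not the approach from either answer) is to let numpy.loadtxt read the date column as plain integers and then convert the whole array with integer arithmetic, avoiding strings entirely. The helper name yyyymmdd_to_datetime64 and the sample data are hypothetical, and a reasonably recent NumPy is assumed:

```python
import io

import numpy as np


def yyyymmdd_to_datetime64(values):
    """Convert integers such as 20090913 to datetime64[D] (hypothetical helper)."""
    values = np.asarray(values, dtype=np.int64)
    years = values // 10000
    months = values // 100 % 100
    days = values % 100
    # Integers cast to datetime64[Y] are read as years since 1970.
    month_start = (years - 1970).astype("datetime64[Y]").astype("datetime64[M]")
    # Add whole months, then switch to day resolution and add the day offset.
    month_start = month_start + (months - 1).astype("timedelta64[M]")
    return month_start.astype("datetime64[D]") + (days - 1).astype("timedelta64[D]")


# Stand-in for the real file: first column is the integer date.
csv_text = "20090913,1.5\n20101020,2.5\n20110125,3.5\n"
raw = np.loadtxt(io.StringIO(csv_text), delimiter=",", usecols=0, dtype=np.int64)
dates = yyyymmdd_to_datetime64(raw)
print(dates)
```

Note that this sketch does no validation: an impossible date such as 20090231 would roll over into the following month instead of raising an error.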
