I want to convert a structured NumPy array with datetime64[m] and timedelta64[m] fields to an equivalent structured array that stores seconds since the epoch.

The field size in an np.array is important for converting an unstructured array into a structured array. (Convert a numpy array to a structured array)

Since the current np.datetime64 field is longer than an int field for seconds since the epoch, converting the array in place is not possible, right? (I would prefer an in-place conversion.)

The simple and wrong approach would be this:

import numpy as np
import numpy.lib.recfunctions as rf

datetime_t = np.dtype([("start", "datetime64[m]"),
                       ("duration", "timedelta64[m]"),
                       ("score", float)])

seconds_t = np.dtype([("start", "int"),
                      ("duration", "int"),
                      ("score", float)])


unstructured = np.arange(9).reshape((3, 3))
print(unstructured)

datetime_structure = rf.unstructured_to_structured(unstructured, dtype=datetime_t)
print(datetime_structure)

seconds_structure = datetime_structure.astype(seconds_t)
print(seconds_structure)

giving me this output:

[[0 1 2]
 [3 4 5]
 [6 7 8]]
[('1970-01-01T00:00', 1, 2.) ('1970-01-01T00:03', 4, 5.)
 ('1970-01-01T00:06', 7, 8.)]
[(0, 1, 2.) (3, 4, 5.) (6, 7, 8.)]

Since I specified minutes, I should get multiples of 60 seconds, not single digits.

Sidenote: I am confused by the first conversion TO the DateTime format, as the DateTime is not in minutes but in seconds. I specified datetime64[m] and converted 3 (and 0 and 6) into that format, and I would have expected 3 minutes ('1970-01-01T03:00'), not 3 seconds ('1970-01-01T00:03'). Oh well. Perhaps someone could explain?

How do I convert a structured array like this elegantly and efficiently? Do I need to iterate over the array manually (my real array has a few more fields than this example), copy the columns one by one, and convert the time fields? Given that I want to convert multiple different structures containing these time formats, a generalized approach to converting these fields in structured arrays would be welcome without needing to specify the fields individually.

1 Answer

This is the way I am doing it now. I define two alternative data types:

  1. one with datetime64[s] and
  2. one with int,

and convert first to seconds and then to int like this:

import numpy as np

minutes_dt = np.dtype([("time", "datetime64[m]"),
                       ("duration", "timedelta64[m]")])

seconds_dt = np.dtype([("time", "datetime64[s]"),
                       ("duration", "timedelta64[s]")])

basic_dt = np.dtype([("time", int),
                     ("duration", int)])


minutes = np.array([(10, 8)], dtype=minutes_dt)
print(minutes)

seconds = minutes.astype(seconds_dt)
print(seconds)

print(minutes.astype(basic_dt), seconds.astype(basic_dt))

which gives me this output:

[('1970-01-01T00:10', 8)]
[('1970-01-01T00:10:00', 480)]
[(10, 8)] [(600, 480)]
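
To avoid writing the intermediate dtypes by hand for every structure, the same two-step cast (first to seconds, then to int) could be generalized by inspecting the fields of the input dtype. The following is only a sketch under a few assumptions: `to_epoch_seconds` is a hypothetical helper name, not a NumPy function; it returns a converted copy rather than working in place; and it treats every datetime64 or timedelta64 field as a time field, recognized by the dtype kinds 'M' and 'm'.

```python
import numpy as np


def to_epoch_seconds(arr):
    """Return a copy of `arr` with datetime64/timedelta64 fields as int64 seconds.

    Hypothetical helper for illustration, not part of NumPy.
    """
    # Build the output dtype: time-like fields become int64, others are kept.
    fields = []
    for name in arr.dtype.names:
        field_dtype = arr.dtype[name]
        if field_dtype.kind in "mM":  # 'M' = datetime64, 'm' = timedelta64
            fields.append((name, np.int64))
        else:
            fields.append((name, field_dtype))
    out = np.empty(arr.shape, dtype=fields)

    # Copy column by column, rescaling time fields to seconds before the int cast.
    for name in arr.dtype.names:
        kind = arr.dtype[name].kind
        if kind == "M":
            out[name] = arr[name].astype("datetime64[s]").astype(np.int64)
        elif kind == "m":
            out[name] = arr[name].astype("timedelta64[s]").astype(np.int64)
        else:
            out[name] = arr[name]
    return out


# Example dtype taken from the question; the row values are made up.
datetime_t = np.dtype([("start", "datetime64[m]"),
                       ("duration", "timedelta64[m]"),
                       ("score", float)])
example = np.array([(np.datetime64("1970-01-01T00:10"),
                     np.timedelta64(8, "m"), 2.0)], dtype=datetime_t)
converted = to_epoch_seconds(example)
print(converted)
```

For the example row, the "start" field should come out as 600 and the "duration" field as 480, matching the manual conversion above. Field names are taken from the input dtype, so no field has to be listed individually. NaT values have no meaningful integer representation and would need separate handling.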