9

I have a 2d numpy.array, where the first column contains datetime.datetime objects, and the second column integers:

A = array([[2002-03-14 19:57:38, 197],
       [2002-03-17 16:31:33, 237],
       [2002-03-17 16:47:18, 238],
       [2002-03-17 18:29:31, 239],
       [2002-03-17 20:10:11, 240],
       [2002-03-18 16:18:08, 252],
       [2002-03-23 23:44:38, 327],
       [2002-03-24 09:52:26, 334],
       [2002-03-25 16:04:21, 352],
       [2002-03-25 18:53:48, 353]], dtype=object)

What I would like to do is select all rows for a specific date, something like

A[first_column.date()==datetime.date(2002,3,17)]
array([[2002-03-17 16:31:33, 237],
           [2002-03-17 16:47:18, 238],
           [2002-03-17 18:29:31, 239],
           [2002-03-17 20:10:11, 240]], dtype=object)

How can I achieve this?

Thanks for your insight :)

2 Answers

8

You could do this:

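import datetime

from_date=datetime.datetime(2002,3,17,0,0,0)
to_date=from_date+datetime.timedelta(days=1)
# Half-open interval: midnight on the 17th is included, midnight on the 18th is not
idx=(A[:,0]>=from_date) & (A[:,0]<to_date)
print(A[idx])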
# array([[2002-03-17 16:31:33, 237],
#        [2002-03-17 16:47:18, 238],
#        [2002-03-17 18:29:31, 239],
#        [2002-03-17 20:10:11, 240]], dtype=object)

A[:,0] is the first column of A.

Unfortunately, comparing A[:,0] with a datetime.date object raises a TypeError. However, comparison with a datetime.datetime object works:

In [63]: A[:,0]>datetime.datetime(2002,3,17,0,0,0)
Out[63]: array([False,  True,  True,  True,  True,  True,  True,  True,  True,  True], dtype=bool)
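If you would rather compare against a datetime.date directly, one option is to call .date() on each element yourself and build the boolean mask from that. This is only a minimal sketch: the array below is a shortened stand-in for the A in the question, and the names target, mask and selected are made up for illustration.

```python
import datetime
import numpy as np

# Shortened stand-in for the array in the question:
# datetime.datetime objects in column 0, integers in column 1.
A = np.array([
    [datetime.datetime(2002, 3, 14, 19, 57, 38), 197],
    [datetime.datetime(2002, 3, 17, 16, 31, 33), 237],
    [datetime.datetime(2002, 3, 17, 20, 10, 11), 240],
    [datetime.datetime(2002, 3, 18, 16, 18, 8), 252],
], dtype=object)

target = datetime.date(2002, 3, 17)

# Call .date() on every element of the first column in plain Python,
# then turn the resulting list of booleans into a NumPy mask.
mask = np.array([stamp.date() == target for stamp in A[:, 0]])

selected = A[mask]
print(selected)
```

The loop runs in Python rather than inside NumPy, so it is likely to be slower on large arrays than the range comparison, but it reads closest to what the question asked for.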

Also, unfortunately,

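datetime.datetime(2002,3,17,0,0,0)<=A[:,0]<datetime.datetime(2002,3,18,0,0,0)

fails too. Python evaluates a chained comparison as (a <= x) and (x < b), and the "and" needs a single truth value, which an array with more than one element cannot provide. Depending on your Python and NumPy versions, the error surfaces either as a TypeError from datetime.datetime's comparison method or as a ValueError about an ambiguous truth value. This is a limitation of chained comparisons on arrays rather than a bug.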

Anyway, it's not hard to work around; you can say

In [69]: (A[:,0]>=datetime.datetime(2002,3,17,0,0,0)) & (A[:,0]<datetime.datetime(2002,3,18,0,0,0))
Out[69]: array([False,  True,  True,  True,  True, False, False, False, False, False], dtype=bool)

Since this gives you a boolean array, you can use it as a "fancy index" to A, which yields the desired result.
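Putting the pieces together, the range comparison can be wrapped in a small helper. This is a sketch under assumptions, not a definitive implementation: select_day is a hypothetical name, and the sample rows are invented to show what happens at the two midnight boundaries.

```python
import datetime
import numpy as np

def select_day(arr, day):
    """Return the rows of arr whose first-column datetime falls on `day`.

    `select_day` is a hypothetical helper name; `day` is a datetime.date.
    """
    # Midnight at the start of the requested day ...
    start = datetime.datetime.combine(day, datetime.time.min)
    # ... and midnight at the start of the following day.
    end = start + datetime.timedelta(days=1)
    col = arr[:, 0]
    # Half-open interval [start, end): two elementwise comparisons
    # combined with &, because `and` does not work on arrays.
    mask = (col >= start) & (col < end)
    return arr[mask]

# Invented sample rows around the day boundaries.
sample = np.array([
    [datetime.datetime(2002, 3, 16, 23, 59, 59), 1],
    [datetime.datetime(2002, 3, 17, 0, 0, 0), 2],
    [datetime.datetime(2002, 3, 17, 16, 31, 33), 237],
    [datetime.datetime(2002, 3, 18, 0, 0, 0), 3],
], dtype=object)

result = select_day(sample, datetime.date(2002, 3, 17))
print(result[:, 1])  # second column of the matching rows: 2 and 237
```

Using >= on the lower bound and < on the upper bound keeps a timestamp at exactly 00:00:00 in the day it belongs to.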

2
from datetime import datetime as dt, timedelta as td
import numpy as np

# Create 2-d numpy array
d1 = dt.now()
d2 = dt.now()
d3 = dt.now() - td(1)
d4 = dt.now() - td(1)
d5 = d1 + td(1)
arr = np.array([[d1, 1], [d2, 2], [d3, 3], [d4, 4], [d5, 5]])

# Here we will extract all the data for today, so get date range in datetime
dtx = d1.replace(hour=0, minute=0, second=0, microsecond=0)
dty = dtx + td(hours=24)

# Condition 
cond = np.logical_and(arr[:, 0] >= dtx, arr[:, 0] < dty)

# Full array
print(arr)
# Extracted array for the range
print(arr[cond, :])
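As a possible alternative to computing a datetime range, the first column can be converted to NumPy's own datetime64 type and truncated to whole days, after which a plain equality test selects the date. This is a sketch that swaps in a different technique from the code above; the array A is a shortened stand-in for the one in the question, and the names stamps, days, mask and selected are illustrative.

```python
import datetime
import numpy as np

# Shortened stand-in for the array in the question.
A = np.array([
    [datetime.datetime(2002, 3, 14, 19, 57, 38), 197],
    [datetime.datetime(2002, 3, 17, 16, 31, 33), 237],
    [datetime.datetime(2002, 3, 17, 20, 10, 11), 240],
    [datetime.datetime(2002, 3, 18, 16, 18, 8), 252],
], dtype=object)

# Convert the object column to datetime64 at microsecond resolution,
# then cast down to day resolution, which drops the time of day.
stamps = A[:, 0].astype('datetime64[us]')
days = stamps.astype('datetime64[D]')

# Elementwise equality against a single day gives the boolean mask.
mask = days == np.datetime64('2002-03-17')

selected = A[mask]
print(selected)
```

The conversion goes through microseconds first because the stored values carry a time of day; whether a direct cast to day resolution is accepted may depend on the NumPy version, so the two-step cast is the more cautious choice here.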
