How to transform a 1D array of datetime strings into a 2D array

Question

Suppose I have an array containing datetime strings:

       new_time[index]
Out[9]: 
array(['2012-09-01_00:00:00', '2012-09-01_01:00:00',
    '2012-09-01_02:00:00', '2012-09-01_03:00:00',
    '2012-09-01_04:00:00', '2012-09-01_05:00:00',
    '2012-09-01_06:00:00', '2012-09-01_07:00:00',
    '2012-09-01_08:00:00', '2012-09-01_09:00:00',
    '2012-09-01_10:00:00', '2012-09-01_11:00:00',
    '2012-09-01_12:00:00', '2012-09-01_13:00:00',
    '2012-09-01_14:00:00', '2012-09-01_15:00:00',
    '2012-09-01_16:00:00', '2012-09-01_17:00:00',
    '2012-09-01_18:00:00', '2012-09-01_19:00:00',
    '2012-09-01_20:00:00', '2012-09-01_21:00:00',
    '2012-09-01_22:00:00', '2012-09-01_23:00:00'], dtype='<U19')

Its shape is (24,). How can I convert it to a (24, 19) array, where each row of the new array looks like the following?

 ## one row of new array 
Out[10]: 
array([[b'2', b'0', b'1', b'2', b'-', b'0', b'9', b'-', b'0', b'1', b'_',
    b'0', b'0', b':', b'0', b'0', b':', b'0', b'0']], dtype='|S1')

Thanks for your help.

Probably with a view where dtype was uint8 or U1 or something
– Mad Physicist, Commented Mar 13, 2018 at 7:15
@MadPhysicist indeed X.view('U1').reshape(X.size, -1).astype('S1').
– Paul Panzer, Commented Mar 13, 2018 at 7:17
@PaulPanzer why does dtype='U1' work, but dtype='S1' does not?
– Mad Physicist, Commented Mar 13, 2018 at 7:27
@MadPhysicist because the itemsizes don't match. If you view-cast from U* to S1 every character gets distributed across 4 bytes. To get from U to S you have to use something like astype, i.e. actually create a new data buffer with the unicode characters expressed as single bytes.
– Paul Panzer, Commented Mar 13, 2018 at 7:34
@MadPhysicist If you are ok with a non-contiguous result you could actually do X.view('S1').reshape(X.size, -1, 4)[..., 0]
– Paul Panzer, Commented Mar 13, 2018 at 7:37

Mike Müller · Accepted Answer · 2018-03-13 07:48:08Z

For your array:

import numpy as np

a = np.array(['2012-09-01_00:00:00', '2012-09-01_01:00:00',
    '2012-09-01_02:00:00', '2012-09-01_03:00:00',
    '2012-09-01_04:00:00', '2012-09-01_05:00:00',
    '2012-09-01_06:00:00', '2012-09-01_07:00:00',
    '2012-09-01_08:00:00', '2012-09-01_09:00:00',
    '2012-09-01_10:00:00', '2012-09-01_11:00:00',
    '2012-09-01_12:00:00', '2012-09-01_13:00:00',
    '2012-09-01_14:00:00', '2012-09-01_15:00:00',
    '2012-09-01_16:00:00', '2012-09-01_17:00:00',
    '2012-09-01_18:00:00', '2012-09-01_19:00:00',
    '2012-09-01_20:00:00', '2012-09-01_21:00:00',
    '2012-09-01_22:00:00', '2012-09-01_23:00:00'], dtype='<U19')

You need to get to S1 and reshape:

>>> a.view('U1').astype('S1').reshape(a.size, -1)
array([[b'2', b'0', b'1', b'2', b'-', b'0', b'9', b'-', b'0', b'1', b'_',
        b'0', b'0', b':', b'0', b'0', b':', b'0', b'0'],
       [b'2', b'0', b'1', b'2', b'-', b'0', b'9', b'-', b'0', b'1', b'_',
        b'0', b'1', b':', b'0', b'0', b':', b'0', b'0'],
       ...
       [b'2', b'0', b'1', b'2', b'-', b'0', b'9', b'-', b'0', b'1', b'_',
        b'2', b'3', b':', b'0', b'0', b':', b'0', b'0']], 
      dtype='|S1')

Viewing directly as S1 does not work, because there are 4 bytes per character:

>>> a.view('S1').shape
(1824,)
>>> a.view('U1').shape
(456,)
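To see where the factor of 4 comes from, here is a small sketch that looks at the raw bytes. It uses only the first two strings and the explicit little-endian dtype '<U19' from the question; the names `raw` and `first_bytes` are just for illustration. Each character occupies 4 bytes, and for ASCII text only the first of them is non-zero:

```python
import numpy as np

# Two of the strings from the question, stored as little-endian unicode
a = np.array(['2012-09-01_00:00:00', '2012-09-01_01:00:00'], dtype='<U19')

print(a.dtype.itemsize)   # 76 bytes per element: 19 characters * 4 bytes each

# Reinterpret the same buffer one byte at a time (no copy)
raw = a.view('S1')
print(raw.shape)          # (152,) == 2 * 19 * 4

# The first two characters: each real byte is followed by three null bytes
print(raw[:8])

# Keeping only the first byte of every 4-byte group recovers the characters
first_bytes = raw.reshape(a.size, -1, 4)[..., 0]
print(first_bytes.shape)  # (2, 19)
```

This only gives sensible characters for ASCII content; anything outside that range would not fit into a single byte.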

If you start with S19, you can view as S1 immediately:

>>> b = a.astype('S19')  # assumption: b is a bytes version of a
>>> b.dtype
dtype('S19')
>>> b.view('S1').reshape(b.size, -1)
array([[b'2', b'0', b'1', b'2', b'-', b'0', b'9', b'-', b'0', b'1', b'_',
        b'0', b'0', b':', b'0', b'0', b':', b'0', b'0'],
       ...
       [b'2', b'0', b'1', b'2', b'-', b'0', b'9', b'-', b'0', b'1', b'_',
        b'2', b'3', b':', b'0', b'0', b':', b'0', b'0']], 
      dtype='|S1')

Unicode is always good for a surprise. ;) NumPy stores each character of a U dtype in 4 bytes (UCS-4), which is where the factor of 4 comes from.

Paul Panzer · Accepted Answer · 2018-03-13 07:50:44Z

If you are ok with a non-contiguous view you can simply do:

X.view('S1').reshape(X.size, -1, 4)[..., 0]

or

X.view('S1').reshape(X.size, -1)[:, ::4]

Since this shares data with the original array it is very cheap, but you have to be aware that modifying this in-place will also change the original array. Of course, you can always make a copy.
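A minimal sketch of what "shares data" means here, assuming a two-element array with the explicit little-endian dtype '<U19' (the variable names `v` and `c` are only for illustration):

```python
import numpy as np

X = np.array(['2012-09-01_00:00:00', '2012-09-01_01:00:00'], dtype='<U19')

# Strided view onto X's buffer: first byte of every 4-byte character
v = X.view('S1').reshape(X.size, -1, 4)[..., 0]
print(v.shape)                     # (2, 19)
print(v.flags['C_CONTIGUOUS'])     # False: elements are 4 bytes apart
print(np.shares_memory(v, X))      # True: no data was copied

# Writing through the view changes the original array
v[0, 3] = b'3'
print(X[0])                        # 2013-09-01_00:00:00

# A copy is contiguous and independent of X
c = v.copy()
c[0, 3] = b'9'
print(X[0])                        # still 2013-09-01_00:00:00
```

If the result is going to be modified or passed to code that expects contiguous memory, taking the copy up front is probably the safer choice.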

edited Mar 13, 2018 at 7:50

answered Mar 13, 2018 at 7:43

Paul Panzer


JahKnows · Accepted Answer · 2018-03-13 07:30:54Z

You can split your strings into characters using a list comprehension. Then you can build the 2D array with np.asarray():

x = np.asarray(['2012-09-01_00:00:00', '2012-09-01_01:00:00',
    '2012-09-01_02:00:00', '2012-09-01_03:00:00',
    '2012-09-01_04:00:00', '2012-09-01_05:00:00',
    '2012-09-01_06:00:00', '2012-09-01_07:00:00',
    '2012-09-01_08:00:00', '2012-09-01_09:00:00',
    '2012-09-01_10:00:00', '2012-09-01_11:00:00',
    '2012-09-01_12:00:00', '2012-09-01_13:00:00',
    '2012-09-01_14:00:00', '2012-09-01_15:00:00',
    '2012-09-01_16:00:00', '2012-09-01_17:00:00',
    '2012-09-01_18:00:00', '2012-09-01_19:00:00',
    '2012-09-01_20:00:00', '2012-09-01_21:00:00',
    '2012-09-01_22:00:00', '2012-09-01_23:00:00'])

temp = []
for i in x:
    temp.append([j for j in i])
np.asarray(temp, dtype = 'S1')

Or in a very concise way you can do

temp = [[j for j in i] for i in x]   
temp = np.asarray(temp, dtype = 'S1')

edited Mar 13, 2018 at 7:30

answered Mar 13, 2018 at 7:22

JahKnows

6 Comments

JahKnows Over a year ago

Same time complexity as the other answer.

Mad Physicist Over a year ago

Not really. The other one is O(1) because it's making a view. You're copying the data, so O(n), and with a heavy load factor because you're using Python loops.

Mad Physicist Over a year ago

A view just creates another array structure with different metadata, but leaves the original underlying data untouched.

Mad Physicist Over a year ago

Well, nevermind, astype does make a copy. But there's a no-copy solution in the comments to the question.

JahKnows Over a year ago

Good call. Other solution is better. With an array size up to 1000 I still get the same execution time on my machine.
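For anyone who wants to check the timings themselves, here is a rough sketch comparing the three approaches from this thread. The function names and the test data (the 24 hourly strings repeated 100 times) are made up for illustration, and the numbers will depend on machine and array size, so no timings are quoted:

```python
import timeit
import numpy as np

# Hypothetical test data: 24 hourly timestamps repeated 100 times
strings = ['2012-09-01_%02d:00:00' % h for h in range(24)] * 100
x = np.array(strings).astype('<U19')   # force little-endian storage

def via_loop(arr):
    # Python-level loop over every string, then a copy into a new array
    return np.asarray([list(s) for s in arr], dtype='S1')

def via_astype(arr):
    # View as single unicode characters, then copy into single bytes
    return arr.view('U1').astype('S1').reshape(arr.size, -1)

def via_view(arr):
    # No copy: strided view on the first byte of each 4-byte character
    return arr.view('S1').reshape(arr.size, -1, 4)[..., 0]

# All three should agree for ASCII input
assert np.array_equal(via_loop(x), via_astype(x))
assert np.array_equal(via_loop(x), via_view(x))

for f in (via_loop, via_astype, via_view):
    t = timeit.timeit(lambda: f(x), number=20)
    print(f.__name__, t)
```

The view version does a fixed amount of work regardless of array size, while the other two touch every character.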

Shrey · Accepted Answer · 2018-03-13 08:53:43Z

Iterating through each string and turning it into a list of characters will solve this.

import numpy as np
array_24 = np.array(['2012-09-01_00:00:00', '2012-09-01_01:00:00',
    '2012-09-01_02:00:00', '2012-09-01_03:00:00',
    '2012-09-01_04:00:00', '2012-09-01_05:00:00',
    '2012-09-01_06:00:00', '2012-09-01_07:00:00',
    '2012-09-01_08:00:00', '2012-09-01_09:00:00',
    '2012-09-01_10:00:00', '2012-09-01_11:00:00',
    '2012-09-01_12:00:00', '2012-09-01_13:00:00',
    '2012-09-01_14:00:00', '2012-09-01_15:00:00',
    '2012-09-01_16:00:00', '2012-09-01_17:00:00',
    '2012-09-01_18:00:00', '2012-09-01_19:00:00',
    '2012-09-01_20:00:00', '2012-09-01_21:00:00',
    '2012-09-01_22:00:00', '2012-09-01_23:00:00'])
array_24.shape        # (24,)
# dtype='S1' is needed on Python 3; without it the result has dtype '<U1'
array_24_19 = np.asarray([list(s) for s in array_24], dtype='S1')
array_24_19.shape     # (24, 19)
array_24_19[0]        # array([b'2', b'0', b'1', b'2', b'-', b'0', b'9', b'-', b'0', b'1', b'_', b'0', b'0', b':', b'0', b'0', b':', b'0', b'0'], dtype='|S1')

I hope this helps
