Data = [day(1) day(2)...day(N)...day(2N)..day(K-N)...day(K)]

I am looking to create a numpy array with two arrays, N and K with shapes (120,) and (300,). The array needs to be of the form:

x1 = [day(1) day(2) day(3)...day(N)]
x2 = [day(2) day(3)...day(N) day(N+1)]
xN = [day(N) day(N+1) day(N+2)...day(2N)]
xK-N = [day(K-N) day(K-N+1)...day(K)]

X is basically of shape (K-N)xN, with the above x1, x2, ..., xK-N as rows. I have used iloc to get the two arrays N and K with the shapes given above, and that part works. But when I try to merge the arrays using X = np.array([np.concatenate((N[i:], K[:i])) for i in range(len(N))]), I get an NxN array of overlapping pieces, not the desired format.

2 Answers

Is this what you are trying to produce (with simpler data)?

In [253]: N,K=10,15
In [254]: data = np.arange(K)+10
In [255]: data
Out[255]: array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24])
In [256]: np.array([data[np.arange(N)+i] for i in range(K-N+1)])
Out[256]: 
array([[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
       [12, 13, 14, 15, 16, 17, 18, 19, 20, 21],
       [13, 14, 15, 16, 17, 18, 19, 20, 21, 22],
       [14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
       [15, 16, 17, 18, 19, 20, 21, 22, 23, 24]])

There's another way of generating this, using advanced ideas about strides:

np.lib.stride_tricks.as_strided(data, shape=(K-N+1,N), strides=(4,4))

In the first case, all values in the new array are copies of the original. The strided case is actually a view. So any changes to data appear in the 2d array. And without data copying, the 2nd is also faster. I can try to explain it if you are interested.
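For anyone curious about the strided version, here is a minimal sketch rather than a definitive implementation. The helper name sliding_windows is made up for illustration. It reads the step size from data.strides[0] instead of hard-coding a byte count, checks the input before calling as_strided, and passes writeable=False as an extra safeguard that I have added.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def sliding_windows(data, n):
    """Hypothetical helper: read-only (len(data)-n+1, n) view of overlapping windows."""
    data = np.asarray(data)
    if data.ndim != 1:
        raise ValueError("data must be 1-d")
    if not 1 <= n <= data.shape[0]:
        raise ValueError("n must be between 1 and len(data)")
    step = data.strides[0]  # bytes between consecutive elements (4 or 8 for ints)
    # Moving one row down and moving one column right both advance by one element.
    return as_strided(data, shape=(data.shape[0] - n + 1, n),
                      strides=(step, step), writeable=False)


N, K = 10, 15
data = np.arange(K) + 10
windows = sliding_windows(data, N)
print(windows.shape)     # (6, 10)

data[0] = -1             # windows is a view, so it sees the change
print(windows[0, 0])     # -1
```

Both strides are the same because row i starts one element after row i-1, and each column is one element after the previous one. Recent NumPy versions (1.20 and later, as far as I know) also offer np.lib.stride_tricks.sliding_window_view(data, N), which does the bounds checking for you.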


Warren suggests using hankel. That's a short function, which in our case does essentially:

a, b = np.ogrid[0:K-N+1, 0:N]
data[a+b]

a+b is an array like:

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
       [ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13],
       [ 5,  6,  7,  8,  9, 10, 11, 12, 13, 14]])

In this small example it is only a bit faster than the list comprehension solution, but I expect the gap to grow considerably for much larger cases.
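To see why a+b has that shape, here is a small self-contained sketch using the same N, K and data as the first example. The names idx, windows and expected are only for illustration.

```python
import numpy as np

N, K = 10, 15
data = np.arange(K) + 10

# ogrid returns a column of row offsets and a row of column offsets.
a, b = np.ogrid[0:K-N+1, 0:N]
print(a.shape, b.shape)      # (6, 1) (1, 10)

idx = a + b                  # broadcasts to (6, 10), with idx[i, j] == i + j
windows = data[idx]          # advanced indexing, so this is a copy, not a view

# Same result as slicing one window at a time.
expected = np.array([data[i:i+N] for i in range(K-N+1)])
print(np.array_equal(windows, expected))   # True
```

Unlike the strided version, the result here owns its data, so later changes to data do not show up in windows.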

3 Comments

Perfect, mate! That is exactly what I wanted, the first example that you provided. Thanks. I am interested in the strides concept too, it sounds interesting.
Instead of strides=(4,4), it would be safer to use strides=(data.strides[0],)*2. Even if you know the elements are integers, the size of each element is not necessarily 4 bytes. On my computer, the default size of the items in an integer array is 8 bytes. (And in "production code", you should carefully validate that data is, in fact, a 1-d array of sufficient length before using as_strided.)
Good point about the stride size. I've added a solution based on what hankel does - which is probably the best non-striding solution.

It is probably not worth adding a dependence on scipy for the following, but if you are already using scipy in your code, you could use the function scipy.linalg.hankel:

In [75]: from scipy.linalg import hankel

In [76]: K = 16

In [77]: x = np.arange(K)

In [78]: x
Out[78]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

In [79]: N = 8

In [80]: hankel(x[:K-N+1], x[K-N:])
Out[80]: 
array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 1,  2,  3,  4,  5,  6,  7,  8],
       [ 2,  3,  4,  5,  6,  7,  8,  9],
       [ 3,  4,  5,  6,  7,  8,  9, 10],
       [ 4,  5,  6,  7,  8,  9, 10, 11],
       [ 5,  6,  7,  8,  9, 10, 11, 12],
       [ 6,  7,  8,  9, 10, 11, 12, 13],
       [ 7,  8,  9, 10, 11, 12, 13, 14],
       [ 8,  9, 10, 11, 12, 13, 14, 15]])
