Split a large numpy array into separate arrays with a list of grouped indices

Question

I have two arrays: one holds a master dataset, and the second is a list of grouped indices that reference the master dataset. What is the fastest way to generate new arrays from the given index data?

Here's my current solution for generating 2 arrays from a list of double keys:

# Let's make a large point cloud with 1 million entries and a list of random paired indices
import numpy as np
COUNT = 1000000
POINT_CLOUD = np.random.rand(COUNT,3) * 100
INDICES = (np.random.rand(COUNT,2)*COUNT).astype(int)  # (1,10),(233,12),...

# Split into sublists; np.squeeze is needed here because I don't want arrays of single elements.
LIST1 = POINT_CLOUD[np.squeeze(INDICES[:,[0]])]
LIST2 = POINT_CLOUD[np.squeeze(INDICES[:,[1]])]

This works, but it's a little slow, and it only handles 2 lists. It would be great to have a solution that could tackle index groups of any size (e.g. ((1,2,3,4),(8,4,5,3),...)).

So something like:

# PSEUDO CODE using quadruple keys
INDICES = (np.random.rand(COUNT,4)*COUNT).astype(int)
SPLIT = POINT_CLOUD[<some pythonic magic>[INDICES]]
SPLIT[0] = np.array([points from INDEX #1])
SPLIT[1] = np.array([points from INDEX #2])
SPLIT[2] = np.array([points from INDEX #3])
SPLIT[3] = np.array([points from INDEX #4])

YXD · Accepted Answer · 2015-07-30 09:58:22Z

You just have to transpose the index array:

>>> result = POINT_CLOUD[INDICES.T]
>>> np.allclose(result[0], LIST1)
True
>>> np.allclose(result[1], LIST2)
True
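
To see why the transpose is enough, here is a minimal sketch on a small dataset. The names `points`, `pairs`, `first`, and `second` are stand-ins I made up for illustration; they are not from the question. Indexing a 2-D array with an integer array selects rows, and the result takes the shape of the index array followed by the remaining axes of the data.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.random((10, 3)) * 100          # small stand-in for POINT_CLOUD
pairs = rng.integers(0, 10, size=(5, 2))    # stand-in for INDICES, shape (5, 2)

# pairs.T has shape (2, 5), so the result has shape (2, 5, 3):
# the index array's shape followed by the remaining axis of `points`.
result = points[pairs.T]

# Indexing one column at a time gives the same arrays.
# pairs[:, 0] is already 1-D, so no np.squeeze is needed.
first = points[pairs[:, 0]]
second = points[pairs[:, 1]]

print(result.shape)
print(np.array_equal(result[0], first), np.array_equal(result[1], second))
```

Selecting a column with a plain integer (`pairs[:, 0]`) rather than a list (`pairs[:, [0]]`) drops the extra axis directly, which is why the squeeze in the question becomes unnecessary.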

If you know the number of sub-arrays, you can also unpack the result:

>>> result.shape
(2, 1000000, 3)
>>> L1, L2 = result
>>> np.allclose(L1, LIST1)
True
>>> # etc

This works for larger index groups. For the second example in your question:

>>> INDICES = (np.random.rand(COUNT,4)*COUNT).astype(int)
>>> SPLIT = POINT_CLOUD[INDICES.T]
>>> SPLIT.shape
(4, 1000000, 3)
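
As a rough check that this generalizes, the sketch below uses groups of four indices on a tiny array and compares three routes to the same data. The variable names are hypothetical, and `np.take` is shown only as an equivalent spelling, not as something the original answer relies on.

```python
import numpy as np

rng = np.random.default_rng(1)
points = rng.random((20, 3))                 # small stand-in for POINT_CLOUD
groups = rng.integers(0, 20, size=(6, 4))    # six groups of four indices

# One array per index column: shape (4, 6, 3)
split = points[groups.T]

# np.take along axis 0 is an equivalent way to write the same lookup
via_take = np.take(points, groups.T, axis=0)

# Without the transpose the layout is one block per group: shape (6, 4, 3)
per_group = points[groups]

# Iterating over the first axis yields the separate per-column arrays
columns = list(split)

print(split.shape, per_group.shape, len(columns))
```

Whether to index with `groups.T` or `groups` depends only on which axis you want first; the values are the same either way.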
