
I have a long list of xy coordinates and would like to convert it into a numpy array.

>>> import numpy as np
>>> xy = np.random.rand(1000000, 2).tolist()

The obvious way would be:

>>> a = np.array(xy) # Very slow...

However, the above code is unreasonably slow. Interestingly, transposing the long list first, converting it into a numpy array, and then transposing back is much faster (20x on my laptop).

>>> def longlist2array(longlist):
...     wide = [[row[c] for row in longlist] for c in range(len(longlist[0]))]
...     return np.array(wide).T
>>> a = longlist2array(xy) # 20x faster!

Is this a bug in numpy?

EDIT:

This is a list of points (with xy coordinates) generated on the fly, so instead of preallocating an array and enlarging it when necessary, or maintaining two 1D lists for x and y, I think the current representation is the most natural.

Why is looping through the 2nd index faster than looping through the 1st index, given that we are iterating through a Python list in both cases?

EDIT 2:

Based on @tiago's answer and this question, I found the following code twice as fast as my original version:

>>> from itertools import chain
>>> def longlist2array(longlist):
...     flat = np.fromiter(chain.from_iterable(longlist), np.array(longlist[0][0]).dtype, -1) # Without intermediate list:)
...     return flat.reshape((len(longlist), -1))
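For completeness, a quick check that the converted array has the expected shape (using the same xy list defined at the top of the question):

>>> a = longlist2array(xy)
>>> a.shape
(1000000, 2)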

  • It's not a bug, it's a feature! Commented Jul 31, 2013 at 15:07
  • Then what is this feature good for? The only thing I can think about it to check whether each of the inner lists is of the same length, but I don't think it would take so long... Commented Jul 31, 2013 at 15:22
  • @herrlich10 lists are not necessarily contiguous in memory so np.array is looping through the first index (the list index) and adding it to the array. This is why it takes longer when the first index is much larger than the second. Commented Jul 31, 2013 at 15:36
  • @tiago following similar logic, an inner list may not be contiguous in memory either. Why is looping through the second index so fast? Commented Jul 31, 2013 at 15:45

3 Answers


Implementing this in Cython without the extra checking involved to determine dimensionality, etc. nearly eliminates the time difference you are seeing. Here's the .pyx file I used to verify that.

from numpy cimport ndarray as ar
import numpy as np
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
def toarr(xy):
    cdef int i, j, h=len(xy), w=len(xy[0])
    cdef ar[double,ndim=2] new = np.empty((h,w))
    for i in range(h):
        for j in range(w):
            new[i,j] = xy[i][j]
    return new

I would assume that the extra time is spent in checking the length and content of each sublist in order to determine the datatype, dimension, and size of the desired array. When there are only two sublists, it only has to check two lengths to determine the number of columns in the array, instead of checking 1000000 of them.
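To illustrate the same point without Cython, here is a minimal pure-NumPy sketch (the helper name toarr_py is mine) that fixes the dtype and shape up front and assigns row by row, so none of that per-sublist inference happens; it skips all validation, and I am not claiming it matches the Cython timings:

import numpy as np

def toarr_py(xy):
    # Preallocate with a known shape and dtype instead of letting np.array()
    # infer them by inspecting every sublist.
    h, w = len(xy), len(xy[0])
    out = np.empty((h, w), dtype=float)  # assumes every value is a float
    for i, row in enumerate(xy):
        out[i] = row  # each short row is converted on assignment
    return out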


6 Comments

This makes a lot of sense. Thank you, IanH.
By the way, if you're looking for a faster implementation, the Cython I included here is a good bit faster than the built in version in either case since it bypasses the checking entirely. It isn't as general though.
If we keep boundscheck(True) and wraparound(True) and just use Cython to do the two for loops, will it be nearly as slow as the direct np.array(xy) method?
In this case I'm not sure why they would ever need to be set to True, the optimized indexing only applies to the array, not to the list, so an out of bounds memory access isn't going to happen. That being said, I ran some quick benchmarks and it didn't change much. Here they are, for 1000000 2D pts: original lists: Cython(as above) 98.5ms, Cython(without extra instructions) 103ms, pure Python loop 870ms, NumPy builtin 6.41s, transposed lists: Cython(as above) 85.3ms, Cython(without extra instructions) 92.5ms, Python 527ms, NumPy, 289ms. I didn't include the time taken in transposing the lists.
Just a way to verify whether this additional checking is really the cause of the NumPy builtin's bad performance, which is still hard to believe :)

This is because the fastest-varying index of your list is the last one, so np.array() has to traverse the outer list many more times because the first index is much larger. If your list were transposed, np.array() would be faster than your longlist2array:

In [65]: import numpy as np

In [66]: xy = np.random.rand(10000, 2).tolist()

In [67]: %timeit longlist2array(xy)
100 loops, best of 3: 3.38 ms per loop

In [68]: %timeit np.array(xy)
10 loops, best of 3: 55.8 ms per loop

In [69]: xy = np.random.rand(2, 10000).tolist()

In [70]: %timeit longlist2array(xy)
10 loops, best of 3: 59.8 ms per loop

In [71]: %timeit np.array(xy)
1000 loops, best of 3: 1.96 ms per loop

There is no magical solution for your problem. It's just how Python stores your list in memory. Do you really need to have a list with that shape? Can't you reverse it? (And do you really need a list, given that you're converting to numpy?)

If you must convert a list, this function is about 10% faster than your longlist2array:

from itertools import chain

def convertlist(longlist):
    tmp = list(chain.from_iterable(longlist))
    return np.array(tmp).reshape((len(longlist), len(longlist[0])))
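
As a quick sanity check (with xy as defined in the question), both routes give the same array:

assert np.array_equal(convertlist(xy), np.array(xy))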

2 Comments

Definitely related to the dimension order, but I'm wondering why the impact is so large given that numpy is implemented in C/C++. Thanks for the itertools solution!
@herrlich10: lists are high level objects, so the fact that numpy is written in C doesn't make anything faster: it still has to deal with the Python objects.

If you have pandas, you can use pandas.lib.to_object_array(); it's the fastest method:

import numpy as np
import pandas as pd
a = np.random.rand(100000, 2)
b = a.tolist()

%timeit np.array(b, dtype=float, ndmin=2)
%timeit np.array(b, dtype=object).astype(float)
%timeit np.array(zip(*b)).T
%timeit pd.lib.to_object_array(b).astype(float)

outputs:

1 loops, best of 3: 462 ms per loop
1 loops, best of 3: 192 ms per loop
10 loops, best of 3: 39.9 ms per loop
100 loops, best of 3: 13.7 ms per loop

2 Comments

Thank you. It is indeed ~30% faster than the flatten-generator method, though at the cost of requiring an additional package.
This solution seems to be deprecated as this attribute no longer exists in pandas. AttributeError: module 'pandas' has no attribute 'lib'. There is also a thread regarding this on github: github.com/Neurosim-lab/netpyne/issues/406
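
If you are on a recent pandas where pd.lib no longer exists, a rough stand-in (only a sketch, not benchmarked here, and assuming pandas >= 0.24 for DataFrame.to_numpy()) is to build a DataFrame and take its underlying array:

import numpy as np
import pandas as pd

b = np.random.rand(100000, 2).tolist()
# Construct a DataFrame from the list of rows and extract a float ndarray.
arr = pd.DataFrame(b).to_numpy(dtype=float)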
