Time optimization: performance of accessing values in a list of lists vs. a list of NumPy arrays

Question

I have been trying to optimize my code.

I compared four ways of reading the value in one cell of a list of lists (or of a list of arrays instead of a list of lists).

import time
import numpy as np

M = 1000
my_list = [[] for i in range(M)]
for i in range(M):
    for j in range(M):
        my_list[i].append(0)
my_numpy_list = [ np.full(M,1) for i in range(M) ]
time1 = time.time()
for j in range(1000):
    for i in range(10000):
        my_list[0][0]
print( "1  ", time.time() - time1)

time1 = time.time()
for j in range(1000):
    test_list = my_list[0]
    for i in range(10000):
        test_list[0]
print("2 ",time.time() - time1)

for j in range(1000):
    for i in range(10000):
        my_numpy_list[0][0]
print("3 ", time.time() - time1)


for j in range(1000):
    my_numpy_test_list = my_numpy_list[0]
    for i in range(10000):
        my_numpy_test_list[0]
print( "4  ", time.time() - time1)

On my computer, it gives the following times:

1   0.9008669853210449
2  0.7616724967956543
3  2.9174351692199707
4   4.883266925811768

The question is: why does it take longer to access values in a NumPy array? If it really is slower, should I convert the array to a list in order to access the data faster? In particular, I am very surprised that first storing the array taken from the list in a variable (case 4) is the slowest case. Shouldn't the times be ordered like this:

4 < 2 < 3 < 1 ?

Cheers

You have forgotten to reassign time1 = time.time() before the last two loops.
– Commissar Vasili Karlovic, Commented May 30, 2020 at 16:00
I didn't downvote. By the way, I don't know if this is intentional, but my_numpy_list is a list of np.arrays and not an np.array.
– Commissar Vasili Karlovic, Commented May 30, 2020 at 16:20
@CommissarVasiliKarlovic Yes, the difficulty is that I am dealing with lists of different sizes. I thought a list of arrays might be a better idea than a list of lists. That is why I am refactoring my code, and then I discovered that it ran much slower, which is the reason I asked for help. Your solution using 'map' is extremely effective though. Thank you.
– Please don't hit me, Commented May 30, 2020 at 16:26
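The timing issue raised in the comments (the timer is not reset before cases 3 and 4, so their printed times include the earlier cases) can be avoided by starting a fresh timer for each case. The following is only a sketch: the helper `timed`, the `case1` to `case4` functions, and the reduced iteration count `N_ACCESSES` are illustrative assumptions, not part of the original code.

```python
import time
import numpy as np

M = 1000
N_ACCESSES = 100_000  # assumption: far fewer accesses than the question, to keep the run short

my_list = [[0] * M for _ in range(M)]
my_numpy_list = [np.full(M, 1) for _ in range(M)]


def timed(label, func):
    """Run func once with its own fresh timer and return the elapsed seconds."""
    start = time.perf_counter()
    func()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.4f} s")
    return elapsed


def case1():
    # Index the outer list and the inner list on every access.
    for _ in range(N_ACCESSES):
        my_list[0][0]


def case2():
    # Look up the inner list once, then index it repeatedly.
    row = my_list[0]
    for _ in range(N_ACCESSES):
        row[0]


def case3():
    # Index the outer list and the NumPy array on every access.
    for _ in range(N_ACCESSES):
        my_numpy_list[0][0]


def case4():
    # Look up the NumPy array once, then index it repeatedly.
    row = my_numpy_list[0]
    for _ in range(N_ACCESSES):
        row[0]


results = {
    label: timed(label, func)
    for label, func in [("1", case1), ("2", case2), ("3", case3), ("4", case4)]
}
```

With independent timers, each number reflects only its own loop, so the four cases can be compared directly.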

riccardo nizzolo · Accepted Answer · 2020-05-30 16:03:59Z

1

Because the goal of NumPy is not to make access to individual elements faster. Instead, its goal is to let you write vectorized code and avoid Python loops.
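As a rough illustration of the per-element cost, the sketch below compares indexing an array with indexing the list produced by `tolist()`. The variable names and the iteration count `n` are assumptions chosen for the example, and the measured times will vary by machine.

```python
import timeit
import numpy as np

arr = np.full(1000, 1)
as_list = arr.tolist()  # plain Python list holding Python ints

# Indexing an ndarray builds a new NumPy scalar object on each access,
# whereas indexing a list returns a reference to an int that already exists.
print(type(arr[0]))      # a NumPy integer type
print(type(as_list[0]))  # the built-in int type

n = 200_000  # assumption: arbitrary repeat count for the measurement
t_array = timeit.timeit(lambda: arr[0], number=n)
t_list = timeit.timeit(lambda: as_list[0], number=n)
print(f"ndarray indexing: {t_array:.4f} s, list indexing: {t_list:.4f} s")
```

If the code really needs many single-element reads in a Python loop, converting once with `tolist()` and reading from the list is one option to measure.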

Let's modify your example so that the code adds 1 to the elements of your list/np.array:

import time
import numpy as np

M = 1000
my_list = [[] for i in range(M)]
for i in range(M):
    for j in range(M):
        my_list[i].append(0)
my_numpy_array = np.array([ np.full(M,1) for i in range(M) ])

# Pure-Python loop: one addition per iteration
time1 = time.time()
for j in range(1000):
    test_list = my_list[0]
    for i in range(10000):
        test_list[0]+1
print("list case addition",time.time() - time1)

# Vectorized NumPy: adds 1 to every element of the array in a single call
time2 = time.time()
my_numpy_list = my_numpy_array+1
print("numpy case addition",time.time() - time2)

The output is:

list case addition 0.7961978912353516
numpy case addition 0.0031096935272216797

The NumPy version is about 250 times faster.
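Note that the loop above performs 10 million additions on a single element, while the NumPy line adds 1 to each of the 1 million array elements once. A like-for-like variant, where both versions add 1 to every element exactly once, might look like the sketch below. The reduced size `M = 300` and the names `list_result` and `numpy_result` are assumptions for illustration, and no particular speedup figure is implied.

```python
import time
import numpy as np

M = 300  # assumption: smaller than the answer's 1000 so the sketch runs quickly
my_list = [[0] * M for _ in range(M)]
my_numpy_array = np.zeros((M, M), dtype=np.int64)

# Pure Python: visit every element once with a nested comprehension.
start = time.perf_counter()
list_result = [[value + 1 for value in row] for row in my_list]
list_time = time.perf_counter() - start

# NumPy: one vectorized operation over the whole array.
start = time.perf_counter()
numpy_result = my_numpy_array + 1
numpy_time = time.perf_counter() - start

print(f"nested comprehension: {list_time:.5f} s")
print(f"vectorized NumPy:     {numpy_time:.5f} s")
```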


5 Comments

Please don't hit me Over a year ago

Is there a way to access data faster than with a list of lists? I need to keep the notion of a "list", but I can store the lists I have in any form.

Please don't hit me Over a year ago

Also, I discovered that creating NumPy arrays is pretty slow. This could be part of the answer.

Commissar Vasili Karlovic Over a year ago

@Pleasedon'thitme The map, filter, and reduce functions are efficient for accessing your data if there's a pattern. In your case: map(lambda x: x[0], my_numpy_list)

Please don't hit me Over a year ago

@CommissarVasiliKarlovic Oh wonderful, I wouldn't have thought of using map like that. Thank you for the tip! Your solution is extremely effective; if you post it as an answer, I could accept it.

juanpa.arrivillaga Over a year ago

If speed is the problem, a list comprehension will be faster than map for that.
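The two suggestions from the comments can be put side by side as follows. This is a minimal sketch that rebuilds `my_numpy_list` as in the question; the names `firsts_map` and `firsts_comp` are made up for the example.

```python
import numpy as np

M = 1000
my_numpy_list = [np.full(M, 1) for _ in range(M)]

# map is lazy in Python 3, so wrap it in list() to actually do the work.
firsts_map = list(map(lambda row: row[0], my_numpy_list))

# Equivalent list comprehension, which avoids the per-item lambda call.
firsts_comp = [row[0] for row in my_numpy_list]

print(len(firsts_map), len(firsts_comp))
```

Both forms produce the first element of every array; which one is faster on a given machine is best checked with `timeit`.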
