which is more efficient initializing list with 0s and then populating vs directly appending items to list

Question

I saw two versions of the same code and wanted to know which one is efficiently better and which one is more frequently used in python.

output_list = [0 for _ in range(output_length)]
k=0
for i in range(input_length):
    output_list[k] = input[i]
    k += 1

And the second method is initializing a list with an empty list and appending item one by one. In C++ I think the former method will be method if you already know your output length. Does it matter for python stylistically?

output_list = []
   for i in range(input_length):
        output.append(input[i])

This is the not the actual problem I was concerned about but the problem was of the merge problem in merge sort where you have an auxiliary output array whose size will be the sum of the two halves. But I just gave a simple example above.

The simplest version is just output_list = list(input_list). Why move stuff around manually if there's a builtin for that? — ForceBru
– ForceBru, Commented Jun 9, 2019 at 21:43
Oh actually I was talking more of generic case like a merge sort recursive call where you have an output array. — vkaul11
– vkaul11, Commented Jun 9, 2019 at 21:48

Ecir Hana · Accepted Answer · 2019-06-09 22:23:31Z

3

First, some cleanups:

output_list = [0]*output_length
for i in range(input_length):
    output_list[i] = input[i]

and:

output_list = []
for i in range(input_length):
    output.append(input[i])

Now, Python over-allocates lists so append is O(1) amortised. But even then, it has to copy the underlaying array from time to time in the second case so it is better to use the first alternative which allocates the array of right size right away.

Reply to the comment:

[0]*n is simpler than list comprehension, both because of readability (no _ or loops) and in terms of bytecode:

from dis import dis

def init():
    output_list = [0 for _ in range(output_length)]
    # output_list = [0]*output_length

dis(init)

First:

4         0 LOAD_GLOBAL              0 (range)
          3 LOAD_GLOBAL              1 (output_length)
          6 CALL_FUNCTION            1
          9 BUILD_LIST_FROM_ARG      0
         12 GET_ITER            
    >>   13 FOR_ITER                12 (to 28)
         16 STORE_FAST               0 (_)
         19 LOAD_CONST               1 (0)
         22 LIST_APPEND              2
         25 JUMP_ABSOLUTE           13
    >>   28 STORE_FAST               1 (output_list)
         31 LOAD_CONST               0 (None)
         34 RETURN_VALUE

Second:

5         0 LOAD_CONST               1 (0)
          3 BUILD_LIST               1
          6 LOAD_GLOBAL              0 (output_length)
          9 BINARY_MULTIPLY     
         10 STORE_FAST               0 (output_list)
         13 LOAD_CONST               0 (None)
         16 RETURN_VALUE

edited Jun 9, 2019 at 22:23

answered Jun 9, 2019 at 21:50

Ecir Hana

11.6k13 gold badges74 silver badges133 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

vkaul11 Over a year ago

Thanks why is initializing using your method of [0]*output_array better than mine?

vkaul11 Over a year ago

Thanks what is the tool to see byte code for python?

Alain T. · Accepted Answer · 2019-06-10 00:52:58Z

You will usually get better performance if you can avoid manipulating indexes. List comprehension is fastest, multiple appends is slowest:

But if you're not changing the content of the elements, output_list = list(input_list) or output_list = input_list.copy() will be at least 10 times faster.

TESTS AND RESULTS

input_list = list(range(10000))
input_length = len(input_list)

def test1():
    output_list = [0]*input_length
    for i in range(input_length):
        output_list[i] = input_list[i] * 3

def test2():
    output_list = []
    for i in range(input_length):
        output_list.append(input_list[i] * 3)

def test3():
    output_list = [ value* 3 for value in input_list]

from timeit import timeit
count = 1000

t = timeit(lambda:test1(), number=count)
print("test1()",t)

t = timeit(lambda:test2(), number=count)
print("test2()",t)

t = timeit(lambda:test3(), number=count)
print("test3()",t)

...

test1() 0.7698256160000001   # [0]*size, assign at indexes
test2() 1.0704999219999998   # multiple appends
test3() 0.36982549099999984  # list comprehension

SnakeException · Accepted Answer · 2019-06-09 21:51:44Z

0

Well, it depends on your needs and if you want it to be time-efficient or memory efficient (though it probably wouldn't differ by much). If you need a list with an exact amount of elements then initializing first would be fine. Otherwise, appending items would give you more flexibility.

answered Jun 9, 2019 at 21:51

SnakeException

1,1942 gold badges9 silver badges27 bronze badges

Collectives™ on Stack Overflow

which is more efficient initializing list with 0s and then populating vs directly appending items to list

3 Answers 3

2 Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

2 Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related