How to create a fixed-size list in Python? [duplicate]

Question

In C++, I can create an array like this:

int* a = new int[10];

In Python, I only know that I can declare a list and then append items to it, or write something like:

l = [1,2,3,4]
l = range(10)

Can I initialize a list with a given size, as in C++, without assigning any values?

You do not need to declare a list in Python. Just initialize it when you want to use it.
– ronakg, Commented May 16, 2012 at 10:56
@WeaselFox: sometimes there is; for example, say you wanted to do the Sieve of Eratosthenes.
– ninjagecko, Commented May 16, 2012 at 11:02
Note that range(10) is a lazy range object in Python 3, not a list; you will not be able to mutate it. You will need to do list(range(10)).
– ninjagecko, Commented May 16, 2012 at 11:03
Actually, if you know the length of the list, it will be faster to create the "empty list" of length n first and then assign the values by index than to append each additional item.
– mtnpaul, Commented Feb 24, 2015 at 19:26

Community · Accepted Answer · 2017-05-23 12:34:34Z

131

(tl;dr: The exact answer to your question is numpy.empty or numpy.empty_like, but you likely don't care and can get away with using myList = [None]*10000.)

Simple methods

You can initialize your list with the same element in every slot. Whether it makes more sense semantically to use a non-numeric placeholder such as None (which will raise an error later if you use it by mistake, which is a good thing) or something like 0 (less common, but perhaps useful if you are writing a sparse matrix, or if the 'default' value should be 0 and you are not worried about bugs) is up to you:

>>> [None for _ in range(10)]
[None, None, None, None, None, None, None, None, None, None]

(Here _ is just a variable name; you could have used i.)

You can also do so like this:

>>> [None]*10
[None, None, None, None, None, None, None, None, None, None]

You probably don't need to optimize this. You can also append to the list every time you need to:

>>> x = []
>>> for i in range(10):
...     x.append(i)

Performance comparison of simple methods

Which is best?

>>> def initAndWrite_test():
...  x = [None]*10000
...  for i in range(10000):
...   x[i] = i
... 
>>> def initAndWrite2_test():
...  x = [None for _ in range(10000)]
...  for i in range(10000):
...   x[i] = i
... 
>>> def appendWrite_test():
...  x = []
...  for i in range(10000):
...   x.append(i)

Results in python2.7:

>>> import timeit
>>> for f in [initAndWrite_test, initAndWrite2_test, appendWrite_test]:
...  print('{} takes {} usec/loop'.format(f.__name__, timeit.timeit(f, number=1000)*1000))
... 
initAndWrite_test takes 714.596033096 usec/loop
initAndWrite2_test takes 981.526136398 usec/loop
appendWrite_test takes 908.597946167 usec/loop

Results in python 3.2:

initAndWrite_test takes 641.3581371307373 usec/loop
initAndWrite2_test takes 1033.6499214172363 usec/loop
appendWrite_test takes 895.9040641784668 usec/loop

As we can see, the [None]*10000 idiom is the fastest in both Python 2 and Python 3. However, if you are doing anything more complicated than assignment (such as anything costly to generate or process each element in the list), then this overhead becomes a negligible fraction of the total cost. In other words, this optimization is premature if you are doing anything substantial with the elements of your list.

Uninitialized memory

All of these are, however, inefficient in one respect: they walk through the memory and write something to every slot. In C this is different: an uninitialized array simply contains whatever garbage was already in that memory (side note: the memory has been reallocated from the system, which can be a security risk if you fail to mlock it and/or fail to clear it before the program exits). This is a deliberate design choice made for speed: the makers of the C language thought it was better not to initialize memory automatically, and that was the correct choice.

This is not an asymptotic speedup (because it's O(N)), but for example you wouldn't need to first initialize your entire memory block before you overwrite with stuff you actually care about. This, if it were possible, is equivalent to something like (pseudo-code) x = list(size=10000).

If you want something similar in Python, you can use NumPy, the numerical N-dimensional array package: specifically, numpy.empty or numpy.empty_like.

That is the real answer to your question.
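A minimal sketch of the NumPy approach, assuming NumPy is installed and imported under the conventional alias np; the names n, buf and like are illustrative only. Since numpy.empty does not initialize its contents, every slot should be written before it is read:

```python
import numpy as np

n = 10  # illustrative size

# Allocate n float64 slots without initializing them; the contents are
# arbitrary until something is assigned.
buf = np.empty(n)

# Write every slot before reading anything back.
for i in range(n):
    buf[i] = i * 0.5

# empty_like allocates an uninitialized array with the same shape and dtype.
like = np.empty_like(buf)
like[:] = 0.0

print(buf.shape, buf.dtype, like.shape)
```

The size of such an array is fixed at creation: there is no append method on the array itself, which is closer to the C++ new int[10] from the question than a list is.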

edited May 23, 2017 at 12:34

CommunityBot

answered May 16, 2012 at 11:05

ninjagecko

5 Comments

Ray Over a year ago

_ is just a "dumb" name for a variable not really needed when iterating a range? I wish just for range(10) could be written sometimes.

Amit Tripathi Over a year ago

x = [[None]]*10 is "wrong". Try x[0].append(1) and see the magic.

ninjagecko Over a year ago

@Death-Stalker: yes, I think that's what I was actually trying to point out and illustrate ("working with mutable objects"). But thank you, I think you've made me realize my answer is horribly worded. Fixed.

Nhu Trinh Over a year ago

how about xrange?

ninjagecko Over a year ago

David's comment is confusing because this answer does prominently suggest numpy.empty... as the first thing in fact.

jadkik94 · Answer · 2012-05-16 11:12:06Z

16

You can use this: [None] * 10. But this won't be "fixed size": you can still append, remove, and so on. This is how lists are made.

You could make it a tuple (tuple([None] * 10)) to fix its length, but then you won't be able to reassign its elements (although any mutable items stored in it can still be modified in place).
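As a rough illustration of the tuple behaviour described above (the names fixed, slots and error are made up for this sketch):

```python
fixed = tuple([None] * 3)

# The length is fixed and item assignment is rejected.
error = None
try:
    fixed[0] = 1
except TypeError as exc:
    error = type(exc).__name__

# A tuple of lists has a fixed number of slots, but each list is still mutable.
slots = tuple([] for _ in range(3))
slots[0].append("a")

print(error, slots)
```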

Another option, closer to your requirement, is not a list, but a collections.deque with a maximum length. It's the maximum size, but it could be smaller.

import collections
max_4_items = collections.deque([None] * 4, maxlen=4)
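A short sketch of how a bounded deque behaves once it is full; the variable name window is illustrative:

```python
import collections

window = collections.deque([None] * 4, maxlen=4)

# Appending to a full deque silently drops an item from the opposite end.
window.append(1)
window.append(2)
print(list(window))  # [None, None, 1, 2]

# Items can still be removed, so the length can fall below maxlen.
window.popleft()
print(len(window), window.maxlen)  # 3 4

# Index assignment works as with a list.
window[0] = "first"
```

So maxlen is an upper bound rather than a fixed size, which matches the caveat above.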

But, just use a list, and get used to the "pythonic" way of doing things.

answered May 16, 2012 at 11:12

jadkik94

1 Comment

Jonathan Over a year ago

Beware deque does NOT allow you to pop an element from the middle of the list.

Rolf of Saxony · Answer · 2019-02-01 16:52:42Z

14

This is more of a warning than an answer.
Having seen my_list = [None] * 10 in the other answers, I was tempted and set up an array like this: speakers = [['','']] * 10. I came to regret it immensely, as the resulting list did not behave as I thought it should.
I resorted to:

speakers = []
for i in range(10):
    speakers.append(['',''])

This is because [['','']] * 10 creates a list in which every element is a reference to the same inner list, not a separate copy.
For example:

>>> n=[['','']]*10
>>> n
[['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', '']]
>>> n[0][0] = "abc"
>>> n
[['abc', ''], ['abc', ''], ['abc', ''], ['abc', ''], ['abc', ''], ['abc', ''], ['abc', ''], ['abc', ''], ['abc', ''], ['abc', '']]
>>> n[0][1] = "True"
>>> n
[['abc', 'True'], ['abc', 'True'], ['abc', 'True'], ['abc', 'True'], ['abc', 'True'], ['abc', 'True'], ['abc', 'True'], ['abc', 'True'], ['abc', 'True'], ['abc', 'True']]

Whereas with the .append option:

>>> n=[]
>>> for i in range(10):
...  n.append(['',''])
... 
>>> n
[['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', '']]
>>> n[0][0] = "abc"
>>> n
[['abc', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', '']]
>>> n[0][1] = "True"
>>> n
[['abc', 'True'], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', '']]

I'm sure that the accepted answer by ninjagecko does attempt to mention this; sadly, I was too thick to understand.
Wrapping up, take care!

answered Feb 1, 2019 at 16:52

Rolf of Saxony

1 Comment

Kevin Over a year ago

[expr] * n will evaluate expr, and then create an n-element list using that value. Critically, expr is only evaluated once. This is fine if expr evaluates to an immutable value, but is definitely not what you want if it evaluates to a mutable value, because every element would point to the same object. What you actually want in that case is for expr to be evaluated once for every element. The Pythonic solution is [expr for _ in range(n)], so in this case [['',''] for _ in range(10)].
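The difference between the two forms can be checked directly with an identity test; the names shared and separate are illustrative:

```python
shared = [["", ""]] * 3                  # one inner list, referenced three times
separate = [["", ""] for _ in range(3)]  # three distinct inner lists

shared[0][0] = "abc"
separate[0][0] = "abc"

print(shared)    # [['abc', ''], ['abc', ''], ['abc', '']]
print(separate)  # [['abc', ''], ['', ''], ['', '']]
print(shared[0] is shared[1], separate[0] is separate[1])  # True False
```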

Vlad Bezden · Answer · 2019-01-18 14:57:56Z

9

You can do it using the array module, which is part of the Python standard library:

from array import array
from itertools import repeat

a = array("i", repeat(0, 10))
# or
a = array("i", [0]*10)

The repeat function yields the value 0 ten times. It is more memory efficient than [0]*10, since it does not build an intermediate list but simply returns the same value the requested number of times.
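A small sketch of what working with such an array looks like; the flag name rejected is made up for illustration:

```python
from array import array
from itertools import repeat

a = array("i", repeat(0, 10))  # ten C ints, all zero

a[3] = 42  # index assignment works as with a list

# Only values matching the typecode are accepted.
try:
    a[0] = 1.5
    rejected = False
except TypeError:
    rejected = True

print(a.tolist(), rejected)
```

Note that an array can still grow through append or extend, so it gives typed, compact storage rather than a truly fixed size.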

edited Jan 18, 2019 at 14:57

answered Jan 18, 2019 at 14:51

Vlad Bezden

Comments

BluePeppers · Answer · 2012-05-16 10:57:25Z

4

It's not really the python way to initialize lists like this. Anyway, you can initialize a list like this:

>>> l = [None] * 4
>>> l
[None, None, None, None]

answered May 16, 2012 at 10:57

BluePeppers

Comments

ulidtko · Answer · 2021-01-22 17:12:48Z

3

Note also that when you used arrays in C++ you might have had somewhat different needs, which are solved in different ways in Python:

You might have needed just a collection of items; Python lists handle this use case perfectly.
You might have needed a proper array of homogeneous items. Python lists are not a good way to store arrays.

Python meets the need for arrays with NumPy, which, among other neat things, has a way to create an array of known size:

from numpy import *

l = zeros(10)

edited Jan 22, 2021 at 17:12

answered May 16, 2012 at 11:14

ulidtko

2 Comments

Lauritz V. Thaulow Over a year ago

Using from numpy import * will hide the python builtins all, abs, min, max, sum, any and round with the numpy equivalents, which might not always be what you want.

ulidtko Over a year ago

Yes, be aware that the numpy module contains quite a lot of names (which are nevertheless convenient to have in your module namespace when you are writing array code). If name clashes cause trouble for you, use qualified imports.

Russell Dias · Answer · 2012-05-16 10:57:38Z

2

Python has nothing built-in to support this. Do you really need to optimize it this much? I don't think that appending will add that much overhead.

However, you can do something like l = [None] * 1000.

Alternatively, you could use a generator.

answered May 16, 2012 at 10:57

Russell Dias

1 Comment

wtm Over a year ago

Right, I'm not very familiar with Python's memory management. I will change my mind. Thank you~

cobie · Answer · 2012-05-16 11:01:38Z

2

your_list = [None]*size_required

answered May 16, 2012 at 11:01

cobie

Comments

Mishaa1 · Answer · 2017-06-29 14:00:06Z

2

fix_array = numpy.empty(n, dtype = object)

where n is the size of your array.

Though it works, it may not be the best idea, as you have to import a library for this purpose. Hope this helps!

answered Jun 29, 2017 at 14:00

Mishaa1

Comments

David · Answer · 2024-03-24 15:40:55Z

1

The accepted answer didn't consider that the OP asked purely about array initialization, without any assignment.

I ran a benchmark on Python 3.12.2 with all the proposed solutions. It shows that the built-in array module approach is faster than the classic [None]*10000 approach, and it should be the recommended way.

Additionally, using numpy without initialization is another 10X faster.

Using plain Python

def direct_none():
    return [None]*10000

def direct_zero():
    return [0]*10000

def inline_loop_none():
    return [None for _ in range(10000)]

def inline_loop_zero():
    return [0 for _ in range(10000)]

def loop_none():
    x = []
    for i in range(10000):
        x.append(None)
    return x

def loop_zero():
    x = []
    for i in range(10000):
        x.append(0)
    return x

Using Python's built-in array module:

import array
import itertools

def array_zero_simple_int():
    return array.array("i", (0,)) * 10000

def array_zero_simple_long():
    return array.array("l", (0,)) * 10000

def array_zero_simple_float():
    return array.array("f", (0,)) * 10000

def array_zero_simple_double():
    return array.array("d", (0,)) * 10000

def array_zero_itertools():
    return array.array("i", itertools.repeat(0, 10000))

Using Numpy

import numpy

def numpy_zero():
    return numpy.zeros(10000)

def numpy_empty():
    return numpy.empty(10000)

Results

import sys
import timeit

for fct in [direct_none, direct_zero, inline_loop_none, inline_loop_zero, loop_none, loop_zero, array_zero_simple_int, array_zero_simple_long, array_zero_simple_float, array_zero_simple_double, array_zero_itertools, numpy_zero, numpy_empty]:
    timer = timeit.timeit(fct, number=1000)
    r = fct()
    el_type = type(r[9999]).__name__
    size = sys.getsizeof(r)
    print(f'{timer * 1000:7.3f} usec/loop for {fct.__name__}. Returns a {type(r).__name__} of {len(r)} {el_type} elements, and uses {size} bytes')

 12.132 usec/loop for direct_none. Returns a list of 10000 NoneType elements, and uses 80056 bytes
 12.132 usec/loop for direct_zero. Returns a list of 10000 int elements, and uses 80056 bytes
150.838 usec/loop for inline_loop_none. Returns a list of 10000 NoneType elements, and uses 85176 bytes
137.435 usec/loop for inline_loop_zero. Returns a list of 10000 int elements, and uses 85176 bytes
167.163 usec/loop for loop_none. Returns a list of 10000 NoneType elements, and uses 85176 bytes
167.730 usec/loop for loop_zero. Returns a list of 10000 int elements, and uses 85176 bytes
  0.794 usec/loop for array_zero_simple_int. Returns a array of 10000 int elements, and uses 40080 bytes
  1.328 usec/loop for array_zero_simple_long. Returns a array of 10000 int elements, and uses 80080 bytes
  0.795 usec/loop for array_zero_simple_float. Returns a array of 10000 float elements, and uses 40080 bytes
  1.262 usec/loop for array_zero_simple_double. Returns a array of 10000 float elements, and uses 80080 bytes
303.050 usec/loop for array_zero_itertools. Returns a array of 10000 int elements, and uses 40420 bytes
  1.330 usec/loop for numpy_zero. Returns a ndarray of 10000 float64 elements, and uses 80112 bytes
  0.131 usec/loop for numpy_empty. Returns a ndarray of 10000 float64 elements, and uses 80112 bytes

edited Mar 24, 2024 at 15:40

answered Mar 15, 2024 at 23:16

David

9 Comments

no comment Over a year ago

Bad usage of Python's array. Do array.array("i", (0,)) * 10000. And I get slower times than you for all but the direct_* ones, where I get three times faster times than you. Using Python 3.12. Probably because of your old Python version, which is lacking some optimizations. Use 3.11 or newer.

David Over a year ago

@nocomment the point is not the exact timings which will be computer dependent, but the order of magnitude between the different solutions

no comment Over a year ago

Well your array code is like two or three orders of magnitude slower than the proper one (683 ms vs 1.6 ms when I just tried again). And for me, the first approach isn't 3-5X faster than using for loops but about 30X faster, also an order of magnitude difference from what you report.

David Over a year ago

Well, I upgraded the benchmark to Python 3.12.2, but how you use the array is very different. @nocomment, can you explain why your method is faster even than numpy?

no comment Over a year ago

Thanks. Well, array_zero_simple can just copy its one value 10000 times. Very low-level very fast. Whereas array_zero_itertools has interaction between the array and the iterator for every element, and they're objects whose bare value needs to get extracted. Why is numpy_zero slower? Likely it's twice as large, 8 bytes per element vs 4 bytes. Try print(x.__sizeof__()).
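The size argument in the comment above can be sketched with the array module alone; the names ints and doubles are illustrative, and the item size of typecode "i" is platform dependent (commonly 4 bytes):

```python
import array

ints = array.array("i", (0,)) * 10000     # C int elements
doubles = array.array("d", (0,)) * 10000  # C double elements, 8 bytes each

# Bytes needed for the element data alone, excluding object overhead.
int_bytes = ints.itemsize * len(ints)
double_bytes = doubles.itemsize * len(doubles)

print(ints.itemsize, doubles.itemsize)
print(int_bytes, double_bytes)
```

A buffer of doubles has more bytes to fill than a buffer of ints of the same length wherever the int item size is smaller, which is consistent with the explanation offered for numpy.zeros (float64 by default) being somewhat slower.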
