Difference between ways to generate an index list in Python

Question

I am reading Joel Grus's book Data Science from Scratch and found something a bit mysterious. In some sample code, he wrote

a = [1, 2, 3, 4]
xs = [i for i, _ in enumerate(a)]

Why would he prefer to do it this way, instead of

xs = range(len(a))

Honestly, I don't know. range is more readable than enumerate and avoids the unnecessary generated index...
– Christian Sauer, Commented Apr 15, 2016 at 12:44
This just looks like he doesn't know what he is doing, TBH: an extra throwaway variable, and throwing away the only extra thing enumerate gets you?
– Tommy, Commented Apr 15, 2016 at 12:59

Joel · Accepted Answer · 2016-04-16 03:53:56Z

18

Answer: personal preference of the author. I find

[i for i, _ in enumerate(xs)]

clearer and more readable than

list(range(len(xs)))

which feels clunky to me. (I don't like reading the nested functions.) Your mileage may vary (and apparently does!).

That said, I am pretty sure I didn't say not to do the second, I just happen to prefer the first.

Source: I am the author.

P.S. If you're the commenter who had no intention of reading anything I write about Python, I apologize if you read this answer by accident.

answered Apr 16, 2016 at 3:53

Joel


2 Comments

PM 2Ring Over a year ago

I guess the list comp with enumerate is a little clearer to read than the triply nested function call... but I still don't like it. ;)

PM 2Ring Over a year ago

Perhaps my saying that I have no intention of reading anything you write about Python was a bit extreme. And although I don't like your list comp, I guess it is Pythonic, since "Flat is better than nested". OTOH, "There should be one-- and preferably only one --obvious way to do it". :)

Antti Haapala · Accepted Answer · 2016-04-15 13:35:29Z

8

I looked at the code available on github and frankly, I do not see any other reason for this except the personal preference of the author.

However, the result needs to be a list in places like this:

indexes = [i for i, _ in enumerate(data)]  # create a list of indexes
random.shuffle(indexes)                    # shuffle them
for i in indexes:                          # return the data in that order
    yield data[i]

Using bare range(len(data)) in that part on Python 3 would be wrong, because random.shuffle() requires a mutable sequence as the argument, and the range objects in Python 3 are immutable sequences.
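The point about immutability can be checked directly. The following is a minimal sketch for Python 3; the sample list `data` and the variable names are made up for illustration and are not taken from the book's code.

```python
import random

data = ["a", "b", "c", "d"]  # hypothetical sample data

# A Python 3 range object is an immutable sequence, so the in-place
# swaps that random.shuffle() performs raise TypeError.
try:
    random.shuffle(range(len(data)))
    shuffle_on_range_failed = False
except TypeError:
    shuffle_on_range_failed = True

# Materializing the range as a list gives shuffle() something it can mutate.
indexes = list(range(len(data)))
random.shuffle(indexes)
shuffled = [data[i] for i in indexes]
```

Either `list(range(len(data)))` or the `enumerate` comprehension produces a list here; only the bare `range` fails.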

I personally would use list(range(len(data))) on Python 3 in the case that I linked to, as it is guaranteed to be more efficient and would fail if a generator/iterator was passed in by accident, instead of a sequence.
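To illustrate the "fail early" behaviour, here is a small sketch. The two helper functions are hypothetical names introduced only for this comparison.

```python
def index_list_via_range(data):
    # len() requires a sized object, so iterators and generators are rejected.
    return list(range(len(data)))


def index_list_via_enumerate(data):
    # enumerate() accepts any iterable, including one-shot generators.
    return [i for i, _ in enumerate(data)]


gen = (x * x for x in range(3))
try:
    index_list_via_range(gen)
    raised_type_error = False
except TypeError:
    raised_type_error = True

gen = (x * x for x in range(3))
indexes = index_list_via_enumerate(gen)
leftover = list(gen)  # the generator has been consumed by enumerate()
```

With the `enumerate` version the generator is silently exhausted, so a later `data[i]` lookup would be the first place an error shows up.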

edited Apr 15, 2016 at 13:35

answered Apr 15, 2016 at 13:00

Antti Haapala


1 Comment

PM 2Ring Over a year ago

Nice point about len raising an error (TypeError) if data isn't a valid arg for it.

Simon Fraser · Accepted Answer · 2016-04-15 12:47:16Z

2

Without being the author, I would have to guess, but my guess is that it's for Python 2 and 3 compatibility.

In Python 2:

>>> a = [1,2,3,4]
>>> xs = range(len(a))
>>> xs
[0, 1, 2, 3]
>>> type(xs)
<type 'list'>

In Python 3:

>>> a = [1,2,3,4]
>>> xs = range(len(a))
>>> xs
range(0, 4)
>>> type(xs)
<class 'range'>

Now, that doesn't make a difference when you're directly iterating over the range, but if you're planning to use the index list for something else later on, the author may feel that the enumerate version is simpler to understand than list(range(len(a))).

answered Apr 15, 2016 at 12:47

Simon Fraser


2 Comments

PM 2Ring Over a year ago

If the author feels that the enumerate is simpler to understand than list(range(len(a))) I have no intention of reading anything he writes about Python! Sure, list(range(len(a))) is slightly inefficient in Python 2, but both those calls run at C speed so it's still pretty fast, and for large len(a) it will be much faster than the Python speed loop in a list comp using enumerate (or range or xrange).
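The speed claim in the comment above can be measured rather than taken on trust. This is a rough sketch using the standard `timeit` module; the list size and repeat count are arbitrary choices, and the actual numbers will vary by machine and Python version.

```python
import timeit

a = list(range(100000))  # arbitrary test size


def via_range():
    return list(range(len(a)))


def via_enumerate():
    return [i for i, _ in enumerate(a)]


# Both approaches must produce the same index list.
same_result = via_range() == via_enumerate()

t_range = timeit.timeit(via_range, number=20)
t_enum = timeit.timeit(via_enumerate, number=20)
print("list(range(len(a))): {:.4f}s".format(t_range))
print("enumerate list comp: {:.4f}s".format(t_enum))
```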

Simon Fraser Over a year ago

I've not read the book, but it may also be a poor choice of syllabus ordering: if the book hasn't introduced the range function but has introduced enumerate, that might be a reason. Mind you, even if that's true, it's better to introduce range.

Saloparenator · Accepted Answer · 2016-04-15 13:13:22Z

-2

Both are OK. When I started coding in Python I leaned toward list(range(len(a))). Now I lean toward the more Pythonic way. Both are readable.

answered Apr 15, 2016 at 13:13

Saloparenator


7 Comments

PM 2Ring Over a year ago

Sure, range(len(a)) can often be a symptom of un-Pythonic code. And one should often use enumerate instead. But in this case, list(range(len(a))) is more Pythonic than that list comp in the OP.

Saloparenator Over a year ago

I don't think range(len(a)) is unpythonic. And I really like the list comprehension solution.

PM 2Ring Over a year ago

I didn't say that range(len(a)) is unpythonic, I said it can often be a symptom of un-Pythonic code. That's because it's often used to indirectly iterate over a list, IOW, to iterate via the index, rather than to iterate directly over the list items. Generally, it's better to iterate directly, and if you also need the index then you should use enumerate. But to use enumerate simply to get the index when you don't want the list items is just plain weird, IMHO.

PM 2Ring Over a year ago

(cont) Antti Haapala's answer explains why list(range(len(data))) is better here (or just range(len(data)) on Python 2 if you don't care about Python 3 compatibility).
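For readers following the distinction drawn in the comments above, the three iteration styles might look like this. It is an illustrative sketch only; the list `items` and the result names are invented for the example.

```python
items = ["a", "b", "c"]  # hypothetical sample list

# Indirect iteration via the index: works, but is often a sign that
# the loop could be written more simply.
via_index = []
for i in range(len(items)):
    via_index.append(items[i].upper())

# Direct iteration: preferred when only the items are needed.
direct = [item.upper() for item in items]

# enumerate: preferred when both the index and the item are needed.
labelled = ["{}:{}".format(i, item) for i, item in enumerate(items)]
```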

Saloparenator Over a year ago

"I personally would use list(range(len(data)))": this is not an explanation, but a personal choice. I personally find [index for index, _ in enumerate(lst)] really easy to read, and I don't claim it is the only way to do it. reason (still personal preference):
