97

I am interested in understanding the new language design of Python 3.x.

I do enjoy, in Python 2.7, the function map:

Python 2.7.12
In[2]: map(lambda x: x+1, [1,2,3])
Out[2]: [2, 3, 4]

However, in Python 3.x things have changed:

Python 3.5.1
In[2]: map(lambda x: x+1, [1,2,3])
Out[2]: <map at 0x4218390>

I understand the how, but I could not find a reference for the why. Why did the language designers make this choice, which, in my opinion, introduces a great deal of pain? Was this to strong-arm developers into sticking to list comprehensions?

IMO, lists can naturally be thought of as Functors, and I have somehow been taught to think this way:

fmap :: (a -> b) -> f a -> f b
23
  • 3
    The rationale should be the same as to why we use generators instead of list comprehensions. By using lazy evaluation we don't need to keep huge things in memory. Check the accepted answer here: stackoverflow.com/questions/1303347/… Commented Oct 13, 2016 at 8:04
  • 8
    Could you explain why this brings you "a great deal of pain"? Commented Oct 13, 2016 at 8:06
  • 3
    I think it's because years of usage showed that most common uses of map simply iterated over the result. Building a list when you don't need it is inefficient so the devs decided to make map lazy. There's a lot to be gained here for performance and not a lot to be lost (If you need a list, just ask for one ... list(map(...))). Commented Oct 13, 2016 at 8:09
  • 3
    Ok, I find it interesting that rather than keeping the Functor pattern and offering a lazy version of List, they somehow made it a decision to force a lazy evaluation of a list whenever it is mapped. I would have preferred to have the right to make my own choice, aka, Generator -> map -> Generator or List -> map -> List (up to me to decide) Commented Oct 13, 2016 at 8:09
  • 5
    @NoIdeaHowToFixThis, actually it is up to you: if you need the whole list, just convert it to a list, easy as hell Commented Oct 13, 2016 at 8:10

4 Answers

41

I think the reason why map still exists at all when generator expressions also exist is that it can take multiple iterable arguments, which are all looped over in parallel and passed into the function:

>>> list(map(min, [1,2,3,4], [0,10,0,10]))
[0, 2, 0, 4]

That's slightly easier than using zip:

>>> list(min(x, y) for x, y in zip([1,2,3,4], [0,10,0,10]))

Otherwise, it simply doesn't add anything over generator expressions.
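
As a rough illustration of the comparison above, the sketch below spells the same computation three ways in Python 3. The variable names (`lows`, `caps`, and so on) are made up for this example; the list comprehension is included only because it is the usual alternative when a list is wanted.

```python
lows = [1, 2, 3, 4]
caps = [0, 10, 0, 10]

# map() accepts several iterables and passes one item from each to the function.
via_map = list(map(min, lows, caps))

# The generator-expression spelling needs zip() to pair the items up first.
via_genexp = list(min(x, y) for x, y in zip(lows, caps))

# A list comprehension builds the list directly, without the list() call.
via_listcomp = [min(x, y) for x, y in zip(lows, caps)]

print(via_map)  # [0, 2, 0, 4]
```

All three produce the same list; the map() form is just the shortest when the function already takes one argument per iterable.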

5 Comments

If we add that the language designers wanted to stress that list comprehensions are more Pythonic, I think this is the most on-the-spot answer. @vishes_shell somehow does not focus enough on language design.
Produces different results in Python 2 and 3 if the two lists are not of equal length. Try c = list(map(max, [1,2,3,4], [0,10,0,10, 99])) in python 2 and in python 3.
Here is a reference for the original plan to remove map altogether from python3: artima.com/weblogs/viewpost.jsp?thread=98196
Hmm, how odd: when I wrap map in list, I get a list of 1-element lists.
This doesn't really address the why. map returns an iterator so that it can be consumed lazily.
27

Because it returns an iterator, it avoids storing the full list in memory, so you can iterate over it later without putting pressure on memory. You may not even need the full list, only the part of it up to the point where your condition is met.

You may find the docs useful; iterators are awesome.

An object representing a stream of data. Repeated calls to the iterator’s __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again. Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container.
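
The behaviour described in that passage can be seen with a map object directly. This is a minimal sketch for Python 3; `traced_square` and the other names are hypothetical helpers introduced only to make the laziness visible.

```python
squares = map(lambda x: x * x, [1, 2, 3, 4])

# next() pulls one item at a time; nothing is computed until it is asked for.
first = next(squares)          # 1

# Consuming the iterator continues from where it currently is.
rest = list(squares)           # [4, 9, 16]

# The iterator is now exhausted, so a second pass looks like an empty container.
second_pass = list(squares)    # []

# Stopping early: the function is only called for as many items as are needed.
calls = []

def traced_square(x):
    calls.append(x)            # record each input the function actually sees
    return x * x

first_big = next(v for v in map(traced_square, range(1000)) if v > 50)
print(first_big, len(calls))   # 64 9
```

Only the first nine inputs are ever squared, even though the range has a thousand elements.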

14

Guido answers this question here: "since creating a list would just be wasteful".

He also says that the correct transformation is to use a regular for loop.

Converting map() from 2 to 3 might not just be a simple case of sticking a list() around it. Guido also says:

If the input sequences are not of equal length, map() will stop at the termination of the shortest of the sequences. For full compatibility with map() from Python 2.x, also wrap the sequences in itertools.zip_longest(), e.g.

map(func, *sequences)

becomes

list(map(func, itertools.zip_longest(*sequences)))
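
A comment on this answer points out that the quoted form passes each tuple from zip_longest as a single argument, and suggests itertools.starmap instead. The sketch below compares the variants in Python 3; `pair`, `short` and `long_` are hypothetical names chosen for the illustration, and `pair` is deliberately a function that tolerates None.

```python
from itertools import starmap, zip_longest

def pair(a, b):
    # Hypothetical two-argument function used only for this illustration.
    return (a, b)

short = [1, 2, 3]
long_ = [10, 20, 30, 40]

# Python 3 map() stops at the shortest input.
truncated = list(map(pair, short, long_))
# [(1, 10), (2, 20), (3, 30)]

# zip_longest() pads the shorter input with None, and starmap() unpacks
# each tuple into separate arguments.
padded = list(starmap(pair, zip_longest(short, long_)))
# [(1, 10), (2, 20), (3, 30), (None, 40)]

# The quoted form hands each tuple over as ONE argument, so a
# two-argument function raises TypeError.
try:
    list(map(pair, zip_longest(short, long_)))
    quoted_form_works = True
except TypeError:
    quoted_form_works = False

print(padded[-1], quoted_form_works)  # (None, 40) False
```

Note that a function such as max would still fail on the padded None values in Python 3, since None is no longer orderable against integers.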

2 Comments

Guido's comment is about map() invoked for the side effects of the function, not about its use as a functor.
The transformation with zip_longest is wrong. You have to use itertools.starmap for it to be equivalent: list(starmap(func, zip_longest(*sequences))). That's because zip_longest produces tuples, so func would receive a single n-tuple argument instead of n distinct arguments, as is the case when calling map(func, *sequences).
11

In Python 3 many functions (not just map but zip, range and others) return an iterator or another lazy object rather than the full list. You might want an iterator (e.g. to avoid holding the whole list in memory) or you might want a list (e.g. to be able to index).

However, I think the key reason for the change in Python 3 is that, while it is trivial to convert an iterator to a list using list(some_iterator), the reverse, iter(some_list), does not achieve the desired outcome, because the full list has already been built and held in memory.

For example, in Python 3 list(range(n)) works just fine as there is little cost to building the range object and then converting it to a list. However, in Python 2 iter(range(n)) does not save any memory because the full list is constructed by range() before the iterator is built.

Therefore, in Python 2, separate functions are required to create an iterator rather than a list, such as imap for map (although they're not quite equivalent), xrange for range, and izip for zip. By contrast, Python 3 needs just a single function, since a list() call creates the full list if required.
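
The asymmetry described above can be sketched in Python 3 as follows. The names are illustrative only, and the sizes are arbitrary values chosen so that building a real list would be costly.

```python
from itertools import count, islice

# map() over an infinite iterator returns immediately: nothing is computed yet.
doubled = map(lambda x: 2 * x, count(1))

# Take only what is needed; islice() is itself lazy.
first_five = list(islice(doubled, 5))   # [2, 4, 6, 8, 10]

# Going from lazy to list is a single call...
as_list = list(map(str, range(3)))      # ['0', '1', '2']

# ...while iter() on an existing list saves nothing, because the list
# has already been built and is held in memory.
numbers = list(range(5))
it = iter(numbers)
first_number = next(it)                 # 0

# range is a lazy sequence rather than an iterator: it supports len()
# and indexing without materialising its elements.
big = range(10**9)
size = len(big)
item = big[10**8]
print(size, item)  # 1000000000 100000000
```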

7 Comments

AFAIK in Python 2.7 functions from itertools return iterators too. Also, I would not see iterators as lazy lists, since lists can be iterated multiple times and accessed randomly.
@abukaj ok thanks, I've edited my answer to try to be clearer
@IgorRivin what do you mean? Python 3 map objects do have a next() method. Python 3 range objects are not strictly iterators, I know.
@Chris_Rands in my Anaconda distribution python 3.6.2, doing foo = map(lambda x: x, [1, 2, 3]) returns a map object foo. doing foo.next() comes back with an error: 'map' object has no attribute 'next'
@IgorRivin: Methods beginning and ending with __ are reserved to Python; without that reservation, you have the problem distinguishing things for which next is just a method (they're not really iterators) and things that are iterators. In practice, you should skip the methods and just use the next() function (e.g. next(foo)), which works properly on every Python version from 2.6 on. It's the same way you use len(foo) even though foo.__len__() would work just fine; the dunder methods are generally intended not to be called directly, but implicitly as part of some other operation.