I'm comparing ways to run calculations over large arrays and wanted to compare the speed of NumPy broadcasting against the alternatives. I was surprised by the speed of Python's map() function, though, and am wondering if someone could explain why it is so much faster than broadcasting.

Broadcasting

%%timeit farenheit = np.linspace( -10, 20, 1000 )
celcius = (farenheit - 32) * (5/9)

4.5 µs ± 99.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

List comprehension

%%timeit farenheit = np.linspace( -10, 20, 1000 )
[(temp - 32) * (5/9) for temp in farenheit]

886 µs ± 4.56 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Python 3 map()

%%timeit farenheit = np.linspace( -10, 20, 1000 )
celcius = map(lambda temp: (temp - 32) * (5/9), farenheit)

248 ns ± 41.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

  • Why are you including the farenheit = np.linspace( -10, 20, 1000 ) part in the timings too? For better benchmarking (to compare NumPy vs map, etc.), I think it's better to pre-process that part. Commented Sep 14, 2019 at 17:01
  • The cell magic %%timeit should exclude the array creation (whatever is on the first line is ignored). ipython.readthedocs.io/en/stable/interactive/magics.html Commented Sep 14, 2019 at 17:04
  • %%timeit includes everything in that block. So, the linspace one is included too. Commented Sep 14, 2019 at 17:04
  • @Divakar, I routinely use the cell %%timeit to precalculate objects. For example %%timeit x = arr.copy() \n x *= 100 lets me time the *= without timing the copy. Commented Sep 14, 2019 at 19:05
  • @hpaulj My point was that the focus is to compare NumPy vs map, etc., and in this Q&A that seems to mean the calculation of celcius in those different ways. Timing everything doesn't let us do that. I wouldn't mind seeing the timings of the pre-calculation part separately though. Commented Sep 14, 2019 at 19:10

1 Answer

map is so fast because it isn't actually running the calculation. It doesn't return a new list or array of values; it returns a map object (an iterator) that performs the calculation only when the items are requested.

For a fair comparison, you should call list(celcius) at the end of the map() snippet. Only then are the calculations executed. If your lambda (or another function) had a print call in it, you would see that map() by itself doesn't execute it.

To read more on map: https://docs.python.org/3/library/functions.html#map

An example:

def double(x):
    print('hi')
    return x*2

a = [1,2,3]
b = map(double, a)

# nothing has been printed yet, because the calculation hasn't happened either

c = list(b)  # prints 'hi' 3 times and returns the doubled list [2, 4, 6]
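
Applied to the benchmark in the question, the same idea can be sketched with the standard timeit module instead of the %%timeit magic. This is only a rough sketch: the helper names (broadcast, comprehension, lazy_map, consumed_map) and the repeat count are made up for illustration, and the absolute numbers will differ from machine to machine. The array is created once, outside the timed calls, so only the conversion itself is measured.

```python
import timeit

import numpy as np

# Created once, outside the timed calls, so it is not part of any measurement.
farenheit = np.linspace(-10, 20, 1000)


def broadcast(arr):
    # Vectorised NumPy arithmetic over the whole array.
    return (arr - 32) * (5 / 9)


def comprehension(arr):
    # Element-by-element conversion in a Python-level loop.
    return [(temp - 32) * (5 / 9) for temp in arr]


def lazy_map(arr):
    # Only builds the map object; no element is converted here.
    return map(lambda temp: (temp - 32) * (5 / 9), arr)


def consumed_map(arr):
    # list() drains the iterator, so every element is actually computed.
    return list(map(lambda temp: (temp - 32) * (5 / 9), arr))


number = 200  # hypothetical repeat count, kept small so the script runs quickly
for func in (broadcast, comprehension, lazy_map, consumed_map):
    seconds = timeit.timeit(lambda: func(farenheit), number=number)
    print(f"{func.__name__:>14}: {seconds / number * 1e6:.2f} us per call")
```

With the iterator consumed, the map() variant does the same per-element Python work as the list comprehension, so it is reasonable to expect it to land in the same range rather than ahead of broadcasting.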