
As an example I am reading through the following:

https://docs.scipy.org/doc/numpy-dev/neps/new-iterator-ufunc.html

So I ran a bit of code on my computer in IPython, shown in the manual as:

    import numpy as np

    def iter_add_itview(x, y, out=None):
        # No iterator flags; allocate the output if none is given
        it = np.nditer([x, y, out], [],
                       [['readonly'], ['readonly'], ['writeonly', 'allocate']])

        # itviews are memory-ordered views of the operands, so one
        # vectorized add fills the whole output
        (a, b, c) = it.itviews
        np.add(a, b, c)

        return it.operands[2]

which, per the example, produces test cases like the following:

In [10]: a = np.arange(1000000,dtype='f4').reshape(100,100,100).T
In [12]: b = np.arange(10000,dtype='f4').reshape(100,100,1).T
In [11]: c = np.arange(10000,dtype='f4').reshape(1,100,100).T
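For reference, the three transposed arrays all broadcast against each other. A small-scale sketch (my own toy shapes, 2 per axis instead of 100, not the benchmark sizes) shows the shapes involved:

```python
import numpy as np

# Toy versions of the benchmark arrays: .T reverses the axis order
a = np.arange(8, dtype='f4').reshape(2, 2, 2).T   # shape (2, 2, 2), F-ordered
b = np.arange(4, dtype='f4').reshape(2, 2, 1).T   # shape (1, 2, 2)
c = np.arange(4, dtype='f4').reshape(1, 2, 2).T   # shape (2, 2, 1)

# All three broadcast to a common (2, 2, 2) result
result = np.add(np.add(np.add(a, b), c), a)
print(result.shape)   # (2, 2, 2)
```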

In [4]: timeit np.add(np.add(np.add(a,b), c), a)
1 loops, best of 3: 99.5 ms per loop

In [9]: timeit iter_add_itview(iter_add_itview(iter_add_itview(a,b), c), a)
10 loops, best of 3: 29.3 ms per loop
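For what it's worth, the two expressions compute the same values; a quick sanity check on small arrays (my own toy shapes, not the benchmark ones) confirms this:

```python
import numpy as np

def iter_add_itview(x, y, out=None):
    # The function from the NEP: add via memory-ordered itviews
    it = np.nditer([x, y, out], [],
                   [['readonly'], ['readonly'], ['writeonly', 'allocate']])
    (a, b, c) = it.itviews
    np.add(a, b, c)
    return it.operands[2]

a = np.arange(8, dtype='f4').reshape(2, 2, 2).T
b = np.arange(4, dtype='f4').reshape(2, 2, 1).T
c = np.arange(1, 5, dtype='f4').reshape(1, 2, 2).T

plain = np.add(np.add(np.add(a, b), c), a)
via_iter = iter_add_itview(iter_add_itview(iter_add_itview(a, b), c), a)
print(np.array_equal(plain, via_iter))   # True
```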

So naturally I wanted to try this for myself, using NumPy 1.12.1 on Python 2.7 on Linux with an Intel chipset. The only problem is that I consistently see no improvement for the exact same experimental setup as in the example above:

In [12]: timeit np.add(np.add(np.add(a,b), c), a)
100 loops, best of 3: 10.7 ms per loop

In [13]: timeit iter_add_itview(iter_add_itview(iter_add_itview(a,b), c), a)
100 loops, best of 3: 10.7 ms per loop

In this case I expected to see improvements from the iterator's cache-friendly buffering optimization.

Why am I unable to reproduce the results from this section of the dev manual on an up-to-date NumPy?

  • Those improvements have probably been folded into the standard ufunc implementation already. Commented Apr 28, 2017 at 20:31
  • So the dev manual is a release ahead but out of date? Commented Apr 28, 2017 at 20:33
  • They don't take NEPs out of the docs just because they've been implemented. They just put them under the "Implemented NEPs" section, like this one is. Commented Apr 28, 2017 at 20:36
  • Go to the NEP overview page, and you'll see this one in the Implemented section. Commented Apr 28, 2017 at 20:38
  • Note that the document you're reading is from 2010. It's over 6 years old. Commented Apr 28, 2017 at 20:39

1 Answer


Just to be clear, the In [4] and In [9] times are copied from that 6-year-old document. The In [12] and In [13] times are from your own tests (and basically the same as what I get). I haven't studied that old dev document, but I have run the np.nditer examples on

https://docs.scipy.org/doc/numpy/reference/arrays.nditer.html

and know that nditer does not normally speed up Python code. This iterator is meant to be used in C-level code, and its exposure at the Python level is a convenience: it lets us test ideas in Python before moving them to C or Cython. Note the Cython example at the end of that page. Even so, I found I could get better speed in a simple multiplication case by using Cython memoryviews.
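As a concrete illustration (my own toy example following the patterns in that tutorial, not code from the NEP), here is about the simplest Python-level nditer use; it is correct, but no faster than the vectorized equivalent:

```python
import numpy as np

a = np.arange(6.0).reshape(2, 3)
out = np.empty_like(a)

# external_loop hands back whole 1-D chunks instead of scalars,
# which is as close as the Python-level API gets to the C speed path
it = np.nditer([a, out], flags=['external_loop'],
               op_flags=[['readonly'], ['writeonly']])
for x, y in it:
    y[...] = 2 * x

print(np.array_equal(out, 2 * a))   # True
```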

np.ndindex is one of the few numpy functions that uses np.nditer in Python code. I've occasionally suggested a similar pattern to produce depth-limited iteration.
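For instance (my own sketch of that pattern), np.ndindex yields every index tuple of a shape, and looping over only the leading axes gives the depth-limited iteration mentioned above:

```python
import numpy as np

# np.ndindex yields every index tuple of a given shape, in C order
print(list(np.ndindex(2, 3)))
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]

# Depth-limited iteration: loop over the first two axes only,
# getting each trailing axis as a whole subarray
a = np.arange(24).reshape(2, 3, 4)
for idx in np.ndindex(*a.shape[:2]):
    row = a[idx]          # a 1-D slice of shape (4,)
```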

So don't worry too much about mastering np.nditer.
