As an example I am reading through the following:
https://docs.scipy.org/doc/numpy-dev/neps/new-iterator-ufunc.html
So I ran a bit of code from the manual on my computer using IPython. The example is given as:
    def iter_add_itview(x, y, out=None):
        it = np.nditer([x,y,out], [],
                    [['readonly'],['readonly'],['writeonly','allocate']])
        (a, b, c) = it.itviews
        np.add(a, b, c)
        return it.operands[2]
Per the example, this gives the following test cases and timings:
In [10]: a = np.arange(1000000,dtype='f4').reshape(100,100,100).T
In [12]: b = np.arange(10000,dtype='f4').reshape(100,100,1).T
In [11]: c = np.arange(10000,dtype='f4').reshape(1,100,100).T
In [4]: timeit np.add(np.add(np.add(a,b), c), a)
1 loops, best of 3: 99.5 ms per loop
In [9]: timeit iter_add_itview(iter_add_itview(iter_add_itview(a,b), c), a)
10 loops, best of 3: 29.3 ms per loop
So naturally I wanted to try this for myself, using NumPy 1.12.1 in Python 2.7 on Linux with an Intel chipset. The only problem is that I consistently see no speedup at all for the exact same setup as in the example above:
In [12]: timeit np.add(np.add(np.add(a,b), c), a)
100 loops, best of 3: 10.7 ms per loop
In [13]: timeit iter_add_itview(iter_add_itview(iter_add_itview(a,b), c), a)
100 loops, best of 3: 10.7 ms per loop
I expected to see an improvement here from the cache optimization described in the document.
Why can't I replicate this section of the dev manual, which I assume targets a newer NumPy version, with the older release I have installed?
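For anyone who wants to reproduce the comparison outside IPython, here is a minimal, self-contained sketch of what I am timing. The helper names `make_inputs` and `compare` are my own (hypothetical, not from the manual), and the repeat counts are arbitrary. It also checks that both versions return the same values, so any timing difference is not due to different results.

```python
import timeit

import numpy as np


def iter_add_itview(x, y, out=None):
    # Same function as in the manual's example, unchanged in behavior.
    it = np.nditer([x, y, out], [],
                   [['readonly'], ['readonly'], ['writeonly', 'allocate']])
    (a, b, c) = it.itviews
    np.add(a, b, c)
    return it.operands[2]


def make_inputs(n=100):
    # Hypothetical helper: builds the three transposed test arrays
    # from the example, parameterized by the side length n.
    a = np.arange(n ** 3, dtype='f4').reshape(n, n, n).T
    b = np.arange(n ** 2, dtype='f4').reshape(n, n, 1).T
    c = np.arange(n ** 2, dtype='f4').reshape(1, n, n).T
    return a, b, c


def compare(n=100, repeat=3, number=10):
    # Hypothetical helper: times both variants and checks they agree.
    a, b, c = make_inputs(n)

    def plain():
        return np.add(np.add(np.add(a, b), c), a)

    def itview():
        return iter_add_itview(iter_add_itview(iter_add_itview(a, b), c), a)

    same = bool(np.array_equal(plain(), itview()))
    t_plain = min(timeit.repeat(plain, repeat=repeat, number=number)) / number
    t_itview = min(timeit.repeat(itview, repeat=repeat, number=number)) / number
    return same, t_plain, t_itview


if __name__ == "__main__":
    same, t_plain, t_itview = compare()
    print("NumPy version:", np.__version__)
    print("results identical:", same)
    print("np.add chain:     %.2f ms per call" % (t_plain * 1e3))
    print("iter_add_itview:  %.2f ms per call" % (t_itview * 1e3))
```

Running this on different NumPy versions and comparing the two printed timings should show whether the gap reported in the manual appears on a given install.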