
NumPy broadcasting makes it possible to perform basic operations (addition, multiplication, etc.) with arrays of different shapes, under certain conditions on those shapes. For example:

>>> a = np.array([0.0, 10.0, 20.0, 30.0])
>>> b = np.array([1.0, 2.0, 3.0])
>>> a[:, np.newaxis] + b
array([[ 1.,  2.,  3.],
       [11., 12., 13.],
       [21., 22., 23.],
       [31., 32., 33.]])

Each element of the output array is the sum of the corresponding elements of the input arrays. This can be described as an 'outer addition' operation.
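As an aside, NumPy ufuncs expose this pattern directly through their `outer` method; a quick sketch with the same `a` and `b` as above:

```python
import numpy as np

a = np.array([0.0, 10.0, 20.0, 30.0])
b = np.array([1.0, 2.0, 3.0])

# np.add.outer computes the same 'outer addition' as a[:, np.newaxis] + b
out = np.add.outer(a, b)
print(out.shape)  # (4, 3)
```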

Is there a simple way to do this with the concatenation operator instead? In other words, is there a simple way to perform an 'outer concatenation' operation? That is, with the previous example, the output array should be:

array([[[ 0.,  1.],
        [ 0.,  2.],
        [ 0.,  3.]],

       [[10.,  1.],
        [10.,  2.],
        [10.,  3.]],

       [[20.,  1.],
        [20.,  2.],
        [20.,  3.]],

       [[30.,  1.],
        [30.,  2.],
        [30.,  3.]]])

After a few tries, I came up with this solution using numpy.meshgrid, but it looks quite under-optimized to me and could probably be improved:

>>> a = np.array([0.0, 10.0, 20.0, 30.0])
>>> b = np.array([1.0, 2.0, 3.0])
>>> c,d = np.meshgrid(a,b)
>>> np.concatenate((c.T[..., None], d.T[..., None]), axis=-1)
array([[[ 0.,  1.],
        [ 0.,  2.],
        [ 0.,  3.]],

       [[10.,  1.],
        [10.,  2.],
        [10.,  3.]],

       [[20.,  1.],
        [20.,  2.],
        [20.,  3.]],

       [[30.,  1.],
        [30.,  2.],
        [30.,  3.]]])

Is there a more 'pythonic' way to do this?

  • This is a feature I miss often too... I usually go for np.broadcast_arrays to manually do the broadcast, then np.stack or np.concatenate them together. Commented Mar 5, 2024 at 12:54
  • Playing around with different stacks and broadcast orderings, I found: np.dstack(np.broadcast_arrays(a[:, None], b)) Commented Mar 5, 2024 at 13:04
  • @Chrysophylaxs your solution is interesting as it allows concatenating more than two arrays (as regular broadcasting does). np.dstack(np.broadcast_arrays(a[:, None], b, c)) works as well. Commented Mar 5, 2024 at 15:08
  • concatenate (and stack derivatives) doesn't do broadcasting, nor is it an operator. Your use of meshgrid replicates the inputs to a desired size; concatenate then joins them. I could have suggested using repeat and tile, but the idea of using broadcast_arrays is a good one. Commented Mar 5, 2024 at 18:02
  • Does np.array(np.meshgrid(a, b)).T work? Perhaps you want to use itertools.product Commented Mar 5, 2024 at 18:18
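For reference, the `broadcast_arrays` suggestion from the comments as a runnable sketch, using the same `a` and `b` as the question:

```python
import numpy as np

a = np.array([0.0, 10.0, 20.0, 30.0])
b = np.array([1.0, 2.0, 3.0])

# broadcast both inputs to a common (4, 3) shape,
# then stack them along a new last axis
out = np.dstack(np.broadcast_arrays(a[:, None], b))
print(out.shape)  # (4, 3, 2)
```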

1 Answer


I added a fast broadcasted assignment solution at the end.


What does it mean to be more 'pythonic' (or in this case more 'numpy-onic')?

If it runs and produces the desired result, is that enough? 'One-liners' are stylish in Perl, but not in Python. Broadcasting is a valid concept when using operators like +, but concatenate isn't an operator.

In a sense, then, we are left with comparing speeds. But all speed tests need to be qualified by the size of the arrays.

Anyway, for your test case:

In [33]: %%timeit 
    ...: c,d = np.meshgrid(a,b)
    ...: np.concatenate((c.T[..., None], d.T[..., None]), axis=-1)

82.7 µs ± 337 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

The broadcast_arrays approach produces two arrays, similar to meshgrid (actually just their transpose):

In [34]: np.broadcast_arrays(a[:, None], b)
Out[34]: 
[array([[ 0.,  0.,  0.],
        [10., 10., 10.],
        [20., 20., 20.],
        [30., 30., 30.]]),
 array([[1., 2., 3.],
        [1., 2., 3.],
        [1., 2., 3.],
        [1., 2., 3.]])]

Speed is slightly better, but not an 'order of magnitude':

In [35]: timeit np.stack(np.broadcast_arrays(a[:, None], b),axis=2)
77.4 µs ± 2.89 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

But we can construct the same broadcasted arrays with repeat (that's a good compiled numpy method):

In [36]: (a[:,None].repeat(3,axis=1), b[None,:].repeat(4, axis=0))
Out[36]: 
(array([[ 0.,  0.,  0.],
        [10., 10., 10.],
        [20., 20., 20.],
        [30., 30., 30.]]),
 array([[1., 2., 3.],
        [1., 2., 3.],
        [1., 2., 3.],
        [1., 2., 3.]]))

And surprise, it's faster:

In [37]: timeit np.stack((a[:,None].repeat(3,axis=1), b[None,:].repeat(4, axis=0)),axis=2)
29.7 µs ± 99.6 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

So which is more 'pythonic'? For this small example the repeat is faster, even if it is 'wordier'. My guess is that for much larger arrays, the broadcast_arrays will scale better.
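One way to check that scaling guess is to time both variants on larger inputs; a minimal sketch (the array sizes here are chosen arbitrarily):

```python
import numpy as np
import timeit

a = np.arange(1000.0)
b = np.arange(500.0)

def with_broadcast():
    # views via stride tricks, then a stacking copy
    return np.stack(np.broadcast_arrays(a[:, None], b), axis=2)

def with_repeat():
    # explicit copies via repeat, then a stacking copy
    return np.stack((a[:, None].repeat(b.size, axis=1),
                     b[None, :].repeat(a.size, axis=0)), axis=2)

# both produce identical (1000, 500, 2) arrays; only the timings differ
print(timeit.timeit(with_broadcast, number=20))
print(timeit.timeit(with_repeat, number=20))
```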

broadcasting code

Someone found and down-voted my answer to a previous broadcasting question

what happens under the hood of broadcasting a numpy array

Comments suggest looking at the source code found in np.lib.stride_tricks.

broadcast_arrays is Python code, and uses _broadcast_to. That in turn uses np.nditer, which is known to be slow, at least in its Python interface.

The arrays produced by meshgrid and my repeats are copies. But the arrays produced by broadcast_arrays (Out[34]), are views, created from a and b by playing with the shape and strides. Normally making a view is fast since it doesn't have to copy the base values. But my guess is that the setup code for the stride_tricks functions is time consuming.

In [44]: _34[0].base
Out[44]: array([ 0., 10., 20., 30.])

In [45]: _34[1].base
Out[45]: array([1., 2., 3.])

In [46]: _34[0].strides, _34[1].strides
Out[46]: ((8, 0), (0, 8))

As suggested by the module name, stride_tricks, using this broadcast_arrays function is a nice, cute trick, but not essential to understanding or using numpy, and especially not the normal use of broadcasting with operators.
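The zero-stride trick can also be seen with np.broadcast_to, which returns a read-only view rather than a copy:

```python
import numpy as np

a = np.array([0.0, 10.0, 20.0, 30.0])

# a (4, 3) view of a 4-element array: the stride along axis 1 is 0,
# so every column re-reads the same memory instead of copying it
v = np.broadcast_to(a[:, None], (4, 3))
print(v.strides)          # (8, 0) for float64
print(v.flags.writeable)  # False: it's a read-only view
```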

broadcasted assignment

Here's an even better version. Instead of using concatenate, it makes the target array directly, and fills it in with values from a and b:

In [75]: timeit res=np.empty((4,3,2)); res[:,:,0]=a[:,None]; res[:,:,1]=b
4.56 µs ± 14 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

This does utilize broadcasting, as can be seen with an erroneous assignment:

In [76]: res=np.empty((4,3,2)); res[:,:,0]=a
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[76], line 1
----> 1 res=np.empty((4,3,2)); res[:,:,0]=a

ValueError: could not broadcast input array from shape (4,) into shape (4,3)

Broadcasting is not used only by binary operators; it is also used during advanced indexing and, as shown here, when assigning values to a slice of an array.

Here the (4,1) a can fit the (4,3) slot, and the (3,) b can fit it as well.
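The assignment approach generalizes to any number of 1-D inputs. A sketch of such a generalization (outer_concat is a hypothetical helper, not part of NumPy):

```python
import numpy as np

def outer_concat(*arrays):
    """Hypothetical helper: 'outer concatenation' of 1-D arrays via a
    preallocated target and broadcasted assignment."""
    shape = tuple(arr.size for arr in arrays)
    res = np.empty(shape + (len(arrays),), dtype=np.result_type(*arrays))
    for i, arr in enumerate(arrays):
        # reshape so arr broadcasts along its own axis only,
        # e.g. a[:, None] and b[None, :] in the two-array case
        idx = [None] * len(arrays)
        idx[i] = slice(None)
        res[..., i] = arr[tuple(idx)]
    return res

a = np.array([0.0, 10.0, 20.0, 30.0])
b = np.array([1.0, 2.0, 3.0])
print(outer_concat(a, b).shape)  # (4, 3, 2)
```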


3 Comments

Thank you for your (very) detailed answer! Your last section is perhaps the closest to what I had in mind by "use broadcasting for concatenation" or by "pythonic" (sorry for the vagueness of that term...)
I had to step back from the problem as you presented it - the concatenate as operator - and let my mind visualize a different task. To me, programming, especially with numpy, is almost a visual task.
broadcast_to (and related) is slower for small dimensions, but scales better when the dimensions get large. So the initial overhead of using as_strided is high, but no extra memory allocation or copying is needed.
