numpy divide row by row sum

Question

How can I divide each row of a numpy array by the sum of all values in that row?

This is one example. But I'm pretty sure there is a fancy and much more efficient way of doing this:

import numpy as np
e = np.array([[0., 1.],[2., 4.],[1., 5.]])
for row in range(e.shape[0]):
    e[row] /= np.sum(e[row])

Result:

array([[ 0.        ,  1.        ],
       [ 0.33333333,  0.66666667],
       [ 0.16666667,  0.83333333]])

DSM · Accepted Answer · 2015-03-14 17:21:40Z

133

Method #1: use None (or np.newaxis) to add an extra dimension so that broadcasting will behave:

>>> e
array([[ 0.,  1.],
       [ 2.,  4.],
       [ 1.,  5.]])
>>> e/e.sum(axis=1)[:,None]
array([[ 0.        ,  1.        ],
       [ 0.33333333,  0.66666667],
       [ 0.16666667,  0.83333333]])
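
For anyone puzzled by the [:,None] notation, here is a minimal sketch that uses the same e as above and looks at the shapes involved. Indexing with None (np.newaxis is an alias for it) inserts a length-1 axis, so the (3,) vector of row sums becomes a (3, 1) column that broadcasts across each row. The variable names row_sums, col and normalized are only illustrative.

```python
import numpy as np

e = np.array([[0., 1.], [2., 4.], [1., 5.]])

row_sums = e.sum(axis=1)     # shape (3,): one sum per row
col = row_sums[:, None]      # shape (3, 1): the same values as a column

# (3, 2) / (3, 1): the length-1 axis is stretched across both columns,
# so every entry in row i is divided by row_sums[i]
normalized = e / col

print(row_sums.shape, col.shape)
print(normalized)
```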

Method #2: go transpose-happy:

>>> (e.T/e.sum(axis=1)).T
array([[ 0.        ,  1.        ],
       [ 0.33333333,  0.66666667],
       [ 0.16666667,  0.83333333]])

(You can drop the axis= keyword and just write e.sum(1) for conciseness, if you want.)
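
A rough sketch of why the transposes are needed, assuming the same e as above: broadcasting lines shapes up from the trailing axis, so (3, 2) against (3,) does not match, while the transposed (2, 3) against (3,) does. The flag name failed is just for illustration.

```python
import numpy as np

e = np.array([[0., 1.], [2., 4.], [1., 5.]])
row_sums = e.sum(1)          # positional form of axis=1, shape (3,)

try:
    e / row_sums             # (3, 2) vs (3,): trailing axes 2 and 3 differ
    failed = False
except ValueError:
    failed = True

# After transposing, the trailing axis has length 3 and lines up with row_sums
normalized = (e.T / row_sums).T
print(failed)
print(normalized)
```

Note that with a square array the plain e / row_sums would not raise; it would divide each column by the row sums instead, which is a silent mistake rather than an error.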

Method #3: (promoted from Jaime's comment)

Use the keepdims argument on sum to preserve the dimension:

>>> e/e.sum(axis=1, keepdims=True)
array([[ 0.        ,  1.        ],
       [ 0.33333333,  0.66666667],
       [ 0.16666667,  0.83333333]])
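
As a quick sanity check, the three methods can be compared directly. This is only a sketch on the example array; m1, m2 and m3 are illustrative names.

```python
import numpy as np

e = np.array([[0., 1.], [2., 4.], [1., 5.]])

m1 = e / e.sum(axis=1)[:, None]            # Method #1: add an axis with None
m2 = (e.T / e.sum(axis=1)).T               # Method #2: transpose, divide, transpose back
m3 = e / e.sum(axis=1, keepdims=True)      # Method #3: keep the summed axis as length 1

# keepdims=True leaves the row sums with shape (3, 1), like [:, None] does
keepdims_shape = e.sum(axis=1, keepdims=True).shape

print(np.allclose(m1, m2), np.allclose(m1, m3), keepdims_shape)
```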

edited Mar 14, 2015 at 17:21

answered Apr 24, 2013 at 21:25

DSM

6 Comments

Warren Weckesser Over a year ago

I don't see how you can drop the axis=1. Without the axis argument, sum() returns the sum of all the values in the array.

Jaime Over a year ago

In numpy 1.7 there is a keepdims argument that lets you do e/e.sum(axis=1, keepdims=True)

DSM Over a year ago

@WarrenWeckesser: I didn't say you could drop the 1 part, I said you could drop the axis= part.

Warren Weckesser Over a year ago

Ah, I misunderstood what you meant.

Michael Over a year ago

Could you explicitly explain the [:,None] notation? I see the change it makes, but don't get the coding convention.

M. Ali · Accepted Answer · 2016-06-24 16:38:32Z

5

You can do it mathematically as $D^{-1}E$.

Here, E is your original matrix and D is a diagonal matrix where each entry is the sum of the corresponding row in E. If you're lucky enough to have an invertible D (that is, no row of E sums to zero), this is a pretty mathematically convenient way to do things.

In numpy:

import numpy as np

# The row sums of e become the diagonal of D
diagonal_entries = [sum(e[row]) for row in range(e.shape[0])]
D = np.diag(diagonal_entries)
D_inv = np.linalg.inv(D)
# Left-multiply so that row i of e is scaled by 1 / D[i, i]
e = np.dot(D_inv, e)
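
A small sketch of the diagonal-matrix idea on the question's example array, checked against plain broadcasting. Left-multiplying by the inverse of D scales row i by 1 / D[i, i]. The second array z is a made-up example of a row that sums to zero, which makes D singular; the names via_matrix, via_broadcast and singular are illustrative.

```python
import numpy as np

e = np.array([[0., 1.], [2., 4.], [1., 5.]])

D = np.diag(e.sum(axis=1))               # 3x3 diagonal matrix of row sums
via_matrix = np.linalg.inv(D) @ e        # (3, 3) @ (3, 2) -> (3, 2)
via_broadcast = e / e.sum(axis=1, keepdims=True)

# A row summing to zero puts a zero on the diagonal, so D has no inverse
z = np.array([[0., 0.], [2., 4.]])
try:
    np.linalg.inv(np.diag(z.sum(axis=1)))
    singular = False
except np.linalg.LinAlgError:
    singular = True

print(np.allclose(via_matrix, via_broadcast), singular)
```

Building and inverting an n-by-n matrix does far more work than the broadcasting versions, so this is mainly useful as a way to think about the operation.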

answered Jun 24, 2016 at 16:38

M. Ali

1 Comment

Prasad Raghavendra Over a year ago

While this answer might be correct, using a for loop isn't the way to go about it. It is not completely vectorised. Downvoted.

β.εηοιτ.βε · Accepted Answer · 2020-07-15 17:38:34Z

2

You can also use NumPy's reshape method as follows:

e = np.array([[0., 1.],[2., 4.],[1., 5.]])
e/=np.sum(e, axis=1).reshape(-1,1)
e

array([[0.        , 1.        ],
       [0.33333333, 0.66666667],
       [0.16666667, 0.83333333]])
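
One caveat with the in-place /= form, sketched here under the assumption of a reasonably recent NumPy: it needs a float array, because the float result cannot be written back into an integer array. The array ints and the flag raised are hypothetical names for illustration.

```python
import numpy as np

ints = np.array([[0, 1], [2, 4], [1, 5]])    # integer dtype

try:
    # True division yields floats, which cannot be stored in an int array in place
    ints /= ints.sum(axis=1).reshape(-1, 1)
    raised = False
except TypeError:
    raised = True

# Converting to float first makes the in-place form work
e = ints.astype(float)
e /= e.sum(axis=1).reshape(-1, 1)

print(raised)
print(e)
```

The non-in-place form e / e.sum(axis=1).reshape(-1, 1) avoids the issue, since it allocates a new float array for the result.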

edited Jul 15, 2020 at 17:38

β.εηοιτ.βε

answered Jul 15, 2020 at 17:07

dipankar1234
