73

How can I divide a numpy array row by the sum of all values in this row?

Here is one way to do it, but I'm pretty sure there is a fancier and much more efficient way:

import numpy as np
e = np.array([[0., 1.],[2., 4.],[1., 5.]])
for row in range(e.shape[0]):
    e[row] /= np.sum(e[row])

Result:

array([[ 0.        ,  1.        ],
       [ 0.33333333,  0.66666667],
       [ 0.16666667,  0.83333333]])

3 Answers

133

Method #1: use None (or np.newaxis) to add an extra dimension so that broadcasting will behave:

>>> e
array([[ 0.,  1.],
       [ 2.,  4.],
       [ 1.,  5.]])
>>> e/e.sum(axis=1)[:,None]
array([[ 0.        ,  1.        ],
       [ 0.33333333,  0.66666667],
       [ 0.16666667,  0.83333333]])
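
To make the shapes explicit, here is a minimal sketch of what the [:, None] indexing does. The names row_sums, col and normalized are illustrative only and do not come from the original post:

```python
import numpy as np

e = np.array([[0., 1.], [2., 4.], [1., 5.]])

row_sums = e.sum(axis=1)   # 1-D array of row sums, shape (3,)
col = row_sums[:, None]    # same values as a column, shape (3, 1)

# None in an index is an alias for np.newaxis: it inserts a length-1 axis.
# A (3, 2) array divided by a (3, 1) array broadcasts across each row,
# whereas dividing by the plain (3,) array fails because 3 does not match 2.
normalized = e / col

print(row_sums.shape, col.shape)
```

After the division, every row of normalized should sum to 1.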

Method #2: go transpose-happy:

>>> (e.T/e.sum(axis=1)).T
array([[ 0.        ,  1.        ],
       [ 0.33333333,  0.66666667],
       [ 0.16666667,  0.83333333]])

(You can drop the axis= keyword and pass the 1 positionally, as in e.sum(1), for conciseness, if you want.)

Method #3: (promoted from Jaime's comment)

Use the keepdims argument on sum to preserve the dimension:

>>> e/e.sum(axis=1, keepdims=True)
array([[ 0.        ,  1.        ],
       [ 0.33333333,  0.66666667],
       [ 0.16666667,  0.83333333]])
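
As a quick sanity check, the three methods can be compared directly. This is only a sketch; the names m1, m2, m3 and all_equal are made up for illustration:

```python
import numpy as np

e = np.array([[0., 1.], [2., 4.], [1., 5.]])

m1 = e / e.sum(axis=1)[:, None]          # Method #1: add an axis with None
m2 = (e.T / e.sum(axis=1)).T             # Method #2: transpose, divide, transpose back
m3 = e / e.sum(axis=1, keepdims=True)    # Method #3: keepdims (needs numpy >= 1.7)

# All three should agree up to floating-point rounding.
all_equal = np.allclose(m1, m2) and np.allclose(m1, m3)
print(all_equal)
```

None of these modify e in place; each one returns a new array.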

6 Comments

I don't see how you can drop the axis=1. Without the axis argument, sum() returns the sum of all the values in the array.
In numpy 1.7 there is a keepdims argument that lets you do e/e.sum(axis=1, keepdims=True)
@WarrenWeckesser: I didn't say you could drop the 1 part, I said you could drop the axis= part.
Ah, I misunderstood what you meant.
Could you explicitly explain the [:,None] notation? I see the change it makes, but don't get the coding convention.
5

You can do it mathematically as D^{-1} E, that is, by left-multiplying E by the inverse of D.

Here, E is your original matrix and D is a diagonal matrix where each entry is the sum of the corresponding row in E. If you're lucky enough to have an invertible D, this is a pretty mathematically convenient way to do things.

In numpy:

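import numpy as np

# D holds the row sums of e on its diagonal
diagonal_entries = [sum(e[row]) for row in range(e.shape[0])]
D = np.diag(diagonal_entries)
D_inv = np.linalg.inv(D)
# Left-multiply so that each row of e is scaled by 1 / (its row sum)
e = np.dot(D_inv, e)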
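
For comparison, here is a hedged sketch checking that scaling rows with a diagonal matrix gives the same result as broadcasting. It assumes no row sums to zero, and it builds the inverse from reciprocals rather than calling np.linalg.inv; the names via_matrix, via_broadcast and same are illustrative:

```python
import numpy as np

e = np.array([[0., 1.], [2., 4.], [1., 5.]])

row_sums = e.sum(axis=1)            # shape (3,); assumed to contain no zeros
D_inv = np.diag(1.0 / row_sums)     # a diagonal matrix is inverted by taking reciprocals

via_matrix = D_inv @ e              # (3, 3) @ (3, 2) -> (3, 2), scales row i by 1 / row_sums[i]
via_broadcast = e / row_sums[:, None]

same = np.allclose(via_matrix, via_broadcast)
print(same)
```

The matrix product has to be taken from the left: for a non-square e of shape (3, 2), multiplying the 3x3 matrix on the right is not even shape-compatible.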

1 Comment

While this answer might be correct, using a for loop isn't the way to go about it, since the code is not completely vectorised. Downvoted.
2

You can also use numpy's reshape method, as follows:

import numpy as np

e = np.array([[0., 1.], [2., 4.], [1., 5.]])
e /= np.sum(e, axis=1).reshape(-1, 1)  # column of row sums, shape (3, 1)
e

array([[0.        , 1.        ],
       [0.33333333, 0.66666667],
       [0.16666667, 0.83333333]])
