
I am looking at how to speed up one of my functions. The function is called with a number of two-dimensional arrays of the same size. I want to combine these into a 4D array with a 3x3 matrix in the last two dimensions, and then compute the eigenvalues of every matrix in that array.

I have managed to do it using two nested for loops, but it is a bit slower than I would like. Is there a good way of speeding up the code?

def principal(xx, xy, xz, yy, yz, zz):

    import numpy as np

    xx = np.array(xx)
    xy = np.array(xy)
    xz = np.array(xz)
    yy = np.array(yy)
    yz = np.array(yz)
    zz = np.array(zz)

    size = np.shape(xx)
    Princ = np.empty((size[1], size[0], 3, 3))
    for j in range(size[1]):
        for i in range(size[0]):
            Princ[j, i, :, :] = np.array([[xx[i, j], xy[i, j], xz[i, j]],
                                          [xy[i, j], yy[i, j], yz[i, j]],
                                          [xz[i, j], yz[i, j], zz[i, j]]])
    Princ = np.linalg.eigvalsh(Princ)

    return Princ


import numpy as np

number_arrays_1 = 3
number_arrays_2 = 4

xx = np.ones((number_arrays_1, number_arrays_2))*80
xy = np.ones((number_arrays_1, number_arrays_2))*30
xz = np.ones((number_arrays_1, number_arrays_2))*0
yy = np.ones((number_arrays_1, number_arrays_2))*40
yz = np.ones((number_arrays_1, number_arrays_2))*0
zz = np.ones((number_arrays_1, number_arrays_2))*60

Princ = principal(xx, xy, xz, yy, yz, zz)
print(Princ)

The reason I convert with xx = np.array(xx) is that in the larger program, I pass a pandas dataframe rather than a numpy array into the function.

  • You are importing numpy twice, and again on every function call. You should move the imports to the top of the script. Commented Aug 14, 2020 at 11:37
  • The function principal which I define is usually located in a separate file, which I then import into the main file for usage. I believe that I have to import the modules used within each function that I define, or is that not true? Commented Aug 14, 2020 at 11:57
  • @DogukanAltay, the extra imports are not a problem. Commented Aug 14, 2020 at 15:35
  • Will the xx etc always be np.ones(....)*c? Or is that just convenience for this example? Commented Aug 14, 2020 at 15:38
  • It is just for this example. They will be very different, as they are output from a simulation. It is just so that I can see that it gives the correct value (I know what the eigenvalues evaluated at the end should be). Commented Aug 14, 2020 at 18:13

1 Answer

This looks like a simple stack and reshape operation:

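def principal(xx, xy, xz, yy, yz, zz):
    # Transpose each component so the result is indexed [j, i, :, :], as in the original loops
    components = (xx.T, xy.T, xz.T, xy.T, yy.T, yz.T, xz.T, yz.T, zz.T)
    princ = np.stack(components, axis=-1).reshape(*xx.shape[::-1], 3, 3)
    return np.linalg.eigvalsh(princ)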
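
A quick way to convince yourself that stacking and reshaping gives the same result as the nested loops is to run both on random data and compare. This is only a sketch: the names principal_loop and principal_stacked and the random test data are made up for the comparison, not taken from the question.

```python
import numpy as np


def principal_loop(xx, xy, xz, yy, yz, zz):
    # Reference version: fill one 3x3 matrix per (j, i) position, as in the question
    n0, n1 = xx.shape
    out = np.empty((n1, n0, 3, 3))
    for j in range(n1):
        for i in range(n0):
            out[j, i] = [[xx[i, j], xy[i, j], xz[i, j]],
                         [xy[i, j], yy[i, j], yz[i, j]],
                         [xz[i, j], yz[i, j], zz[i, j]]]
    return np.linalg.eigvalsh(out)


def principal_stacked(xx, xy, xz, yy, yz, zz):
    # Vectorized version: stack the nine entries, then split the last axis into 3x3
    components = (xx.T, xy.T, xz.T, xy.T, yy.T, yz.T, xz.T, yz.T, zz.T)
    princ = np.stack(components, axis=-1).reshape(*xx.shape[::-1], 3, 3)
    return np.linalg.eigvalsh(princ)


# Hypothetical test data: six random (3, 4) component arrays
rng = np.random.default_rng(0)
components = rng.normal(size=(6, 3, 4))

loop_result = principal_loop(*components)
stacked_result = principal_stacked(*components)

print(loop_result.shape)                         # one triple of eigenvalues per (j, i)
print(np.allclose(loop_result, stacked_result))  # do both versions agree?
```

Using non-square, non-constant inputs here matters: with constant or square arrays a missing transpose could go unnoticed.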

You don't need to explicitly call np.array on the inputs if they are already arrays. For dataframes, xx.to_numpy() (or the older xx.values attribute, which is not a method and takes no parentheses) returns the underlying numpy values.
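
As a small illustration of the conversion, assuming the caller holds pandas dataframes (the names xx_df and already_array below are invented for this example):

```python
import numpy as np
import pandas as pd

# Hypothetical input: one stress component stored as a DataFrame
xx_df = pd.DataFrame(np.full((3, 4), 80.0))

# Convert once, at the call site, instead of inside the function
xx = xx_df.to_numpy()
print(type(xx).__name__, xx.shape)

# If a function must accept both kinds of input, np.asarray is a cheap guard:
# for an existing ndarray it returns the same object rather than a copy
already_array = np.ones((3, 4))
print(np.asarray(already_array) is already_array)
```

By contrast, np.array(x) copies its input by default, which is where the extra memory and time in the original function came from.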

An alternative approach is to build the array, and then swap out the 3x3 dimensions to the back. This will likely be less efficient since the first approach makes the 3x3 dimensions contiguous, while this one does not:

princ = np.array([[xx, xy, xz], [xy, yy, yz], [xz, yz, zz]]).T
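
A sketch for checking this second construction against the first one, using made-up random inputs. It also shows np.ascontiguousarray, which you could use to force a C-ordered copy if the memory layout turns out to matter for your data sizes; whether that copy pays off is something you would have to time yourself.

```python
import numpy as np

# Hypothetical test data: six random (3, 4) component arrays
rng = np.random.default_rng(1)
xx, xy, xz, yy, yz, zz = rng.normal(size=(6, 3, 4))

# First approach: stack the nine entries and reshape
components = (xx.T, xy.T, xz.T, xy.T, yy.T, yz.T, xz.T, yz.T, zz.T)
stacked = np.stack(components, axis=-1).reshape(*xx.shape[::-1], 3, 3)

# Second approach: build a (3, 3, n0, n1) array and reverse all axes
transposed = np.array([[xx, xy, xz], [xy, yy, yz], [xz, yz, zz]]).T

print(stacked.shape, transposed.shape)
print(np.array_equal(stacked, transposed))  # same values, since each matrix is symmetric

# .T only returns a view with reversed strides, so the result is not C-contiguous
print(transposed.flags['C_CONTIGUOUS'])

# Optional: make an explicit C-ordered copy before the eigenvalue computation
compact = np.ascontiguousarray(transposed)
print(compact.flags['C_CONTIGUOUS'])
print(np.allclose(np.linalg.eigvalsh(compact), np.linalg.eigvalsh(stacked)))
```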

Not really related, but you could generate your arrays faster like this:

target_shape = (3, 4)
values = np.array([80, 30, 0, 40, 0, 60])
xx, xy, xz, yy, yz, zz = np.full((6, *target_shape), values.reshape(-1, 1, 1))

In fact, if your data allows it, you can even save on unpacking:

data = np.full((6, *target_shape), values.reshape(-1, 1, 1))
principal(*data)
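
Putting the pieces together for the example values from the question, as a self-contained sketch. The dtype=float on values is my addition so that the arrays are floats like the np.ones(...)*c originals, and the expected eigenvalues are worked out by hand from the 2x2 block [[80, 30], [30, 40]] plus the isolated 60.

```python
import numpy as np


def principal(xx, xy, xz, yy, yz, zz):
    components = (xx.T, xy.T, xz.T, xy.T, yy.T, yz.T, xz.T, yz.T, zz.T)
    princ = np.stack(components, axis=-1).reshape(*xx.shape[::-1], 3, 3)
    return np.linalg.eigvalsh(princ)


target_shape = (3, 4)
values = np.array([80, 30, 0, 40, 0, 60], dtype=float)

# values.reshape(-1, 1, 1) has shape (6, 1, 1) and broadcasts over target_shape
data = np.full((6, *target_shape), values.reshape(-1, 1, 1))

eig = principal(*data)
print(eig.shape)

# eigvalsh returns eigenvalues in ascending order: 60 - sqrt(1300), 60, 60 + sqrt(1300)
expected = np.array([60 - np.sqrt(1300), 60, 60 + np.sqrt(1300)])
print(np.allclose(eig, expected))
```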

3 Comments

Thanks a lot, the first way with the stacking and then reshaping almost worked, but I had to transpose xx, xy, etc. to get the proper shape. In the for loops of my previous function the notation is Princ[j, i, :, :] = x[i, j]..., so the indices are swapped because of how the data is fed into and out of the function later. Once I transposed the arrays it worked well: the calculation went from about 3-4 minutes to about 30 s, so I would consider that a win! Also, thanks for pointing out that I can skip the xx = np.array(xx) conversion; I think this also saves some memory and time.
Will also try to use .values or, as stated in the pandas documentation, .to_numpy() to see if that also helps.
@RasmusSchützer. I've fixed the answer to implement the transpose properly
