
I've got two arrays:

  • data of shape (2466, 2498, 9), where the dimensions are (asset, date, returns).
  • correlation_matrix of shape (2466, 2466) (with 0's on the diagonal)

I want to compute the dot product that gives the expected returns, i.e. the returns of each asset multiplied by the correlation_matrix. The result should have the same shape as data.

I've tried:

data.transpose([1, 2, 0]) @ correlation_matrix

but this just hangs my PC (been going 10 minutes and counting).

I also tried:

np.einsum('ijk,lm->ijk', data, correlation_matrix)

but I'm less familiar with einsum, and this also hangs.

What am I doing wrong?

  • I think you can just do data * correlation_matrix.sum(), assuming your einsum is doing what you intended. Commented Jun 25, 2020 at 18:34
  • If things are hanging or taking too long, step back and test on something smaller. Make sure the code does what you want with small arrays before stressing memory with something large. Commented Jun 25, 2020 at 18:39
  • Your einsum just sums all the values of correlation_matrix and multiplies data by the resulting scalar. That's probably not what you want; see the sketch after these comments. Commented Jun 25, 2020 at 22:16
  • Have you looked at your task manager or htop to see whether your computer has enough RAM to do the operation without swapping to secondary memory (hard drive)? Commented Apr 27, 2024 at 20:57
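
To see concretely what that einsum does, here is a minimal sketch with tiny, made-up arrays:

import numpy as np

a = np.arange(24).reshape(2, 3, 4)
b = np.arange(4).reshape(2, 2)

# 'l' and 'm' appear only on the input side and not in the output,
# so einsum sums them away: the result is just a scaled by b.sum().
out = np.einsum('ijk,lm->ijk', a, b)
print(np.array_equal(out, a * b.sum()))  # True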

2 Answers


With your data transposed via .transpose((1, 2, 0)), the correct einsum form is:

"ijs,sk"  # -> ijk

Since for tensors A and B we can write:

C_{ijk} = Σ_s A_{ijs} * B_{sk}

If you want to avoid transposing your data beforehand, you can just permute the indices:

"sij,sk"  # -> ijk

To verify:

import numpy as np

p, q, r = 2466, 2498, 9

a = np.random.randint(255, size=(p, q, r))
b = np.random.randint(255, size=(p, p))

c1 = a.transpose((1, 2, 0)) @ b
c2 = np.einsum("sij,sk", a, b)

>>> np.all(c1 == c2)
True

The number of multiplications needed to compute this for (p, q, r)-shaped data is p * np.prod(c1.shape) == p * (q * r * p) == p**2 * q * r. In your case, that is 136_716_549_192 multiplications. You also need approximately the same number of additions, which puts the total somewhere close to 270 billion operations. If you want more speed, consider running the computation on a GPU via cupy.
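
As a quick sanity check on that count (plain arithmetic, nothing library-specific):

p, q, r = 2466, 2498, 9
mults = p * (q * r * p)  # one multiply per summed index s, per output element
print(f"{mults:_}")      # 136_716_549_192

The benchmark below compares NumPy on the CPU with cupy on the GPU: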

import numpy as np
import cupy as cp
from timeit import timeit

def with_np():
    p, q, r = 2466, 2498, 9
    a = np.random.randint(255, size=(p, q, r))
    b = np.random.randint(255, size=(p, p))
    c1 = a.transpose((1, 2, 0)) @ b
    c2 = np.einsum("sij,sk", a, b)

def with_cp():
    p, q, r = 2466, 2498, 9
    a = cp.random.randint(255, size=(p, q, r))
    b = cp.random.randint(255, size=(p, p))
    c1 = a.transpose((1, 2, 0)) @ b
    c2 = cp.einsum("sij,sk", a, b)

>>> timeit(with_np, number=1)
513.066

>>> timeit(with_cp, number=1)
0.197

That's a speedup of roughly 2600×, including memory allocation, initialization, and CPU-to-GPU copy times! (A more realistic benchmark would show an even larger speedup.)


3 Comments

Wow, that's an incredible speedup! What kind of GPU do you have? That might be worth the investment ;)
@cjm2671 It's a bit "outdated" -- an NVIDIA GTX 1060 6GB ($300 when purchased, 5 years ago). You can probably get a newer graphics card model for less than this card sells for nowadays, though. If you want CUDA computations, I recommend sticking with NVIDIA.
Well, it looks like they're around $30 now on eBay; seems like a worthy investment! Thank you! :)

There are several ways to compute this product:

# as you already suggested:
data.transpose([1, 2, 0]) @ correlation_matrix

# using einsum
np.einsum('ijk,il', data, correlation_matrix)

# using tensordot to explicitly specify the axes to sum over
np.tensordot(data, correlation_matrix, axes=(0,0))

All of them give the same result, and on small matrices the timings were more or less the same for me. So your problem is the sheer amount of data, not an inefficient implementation.
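
A quick check on small random arrays (a minimal sketch; the shapes are chosen arbitrarily) confirms the three variants agree:

import numpy as np

A = np.random.rand(5, 7, 3)
B = np.random.rand(5, 5)

c1 = A.transpose([1, 2, 0]) @ B
c2 = np.einsum('ijk,il', A, B)
c3 = np.tensordot(A, B, axes=(0, 0))

print(np.allclose(c1, c2) and np.allclose(c1, c3))  # True

And the timings on small arrays: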

import numpy as np
from timeit import timeit

A = np.arange(100 * 120 * 9).reshape((100, 120, 9))
B = np.arange(100**2).reshape((100, 100))

timeit('A.transpose([1,2,0])@B', globals=globals(), number=100)
# 0.747475513999234
timeit("np.einsum('ijk,il', A, B)", globals=globals(), number=100)
# 0.4993825999990804
timeit('np.tensordot(A, B, axes=(0,0))', globals=globals(), number=100)
# 0.5872082839996438
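
For scale, the arrays themselves fit comfortably in RAM on most machines (a rough back-of-the-envelope sketch, assuming 8-byte elements), so the hang is most likely compute-bound rather than memory-bound:

p, q, r = 2466, 2498, 9
bytes_per_element = 8  # float64 / int64

data_mb = p * q * r * bytes_per_element / 1e6  # input array
out_mb = q * r * p * bytes_per_element / 1e6   # output has the same element count
print(f"data ≈ {data_mb:.0f} MB, output ≈ {out_mb:.0f} MB")  # ≈ 443 MB each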

