I build a variable, corr_matrix, by iterating over rows and columns and correlating the values.

import numpy as np
import random

enc_dict = {k: int(random.uniform(1,24)) for k in range(24)}
ret_dict = {k: int(random.uniform(1,24)) for k in range(24)}

corr_matrix = np.zeros((24,24))
ind_matrix = np.zeros((24,24))

data = np.random.rand(24,24)
for enc_row in range(0,24):
    for ret_col in range(0,24):
        corr_matrix[enc_row, ret_col] = np.corrcoef(data[enc_row,:], data[ret_col,:])[0,1]
        if enc_dict[enc_row] == ret_dict[ret_col]:
            ind_matrix = np.append(ind_matrix, [[enc_row, ret_col]])

I want to store the indices in the matrix where enc_dict[enc_row] == ret_dict[ret_col] as a variable to use for indexing corr_matrix. I can print the values, but I can't figure out how to store them in a variable in a way that allows me to use them for indexing later.

I want to:

  1. Make a variable, ind_matrix, that holds the indices where the above condition is true.

  2. Use ind_matrix to index into my correlation matrix. I want to be able to select the whole row as well as the exact value where the condition (enc_dict[enc_row] == ret_dict[ret_col]) is true.

I tried ind_matrix = np.append(ind_matrix, [[enc_row, ret_col]]), which gives me the correct values, but they are preceded by a lot of 0s for some reason. It also doesn't let me retrieve each pair of indices together to use for indexing. I want to be able to do something like corr_matrix[ind_matrix[1]].

  • Give a Minimal, Complete, Verifiable Example in your question. Commented Mar 5, 2018 at 18:05
  • Don't use np.append. It's slow and hard to use correctly. Commented Mar 5, 2018 at 18:43
  • What should I use instead? Commented Mar 5, 2018 at 18:48
  • @Maria, I've found that building a plain Python list and then converting it to an array with vstack, hstack, or array tends to work best in terms of readability and performance. Commented Mar 5, 2018 at 21:19
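
The list-then-convert approach suggested in the last comment could look roughly like the sketch below. It is only an illustration: the fixed seed, the name `pairs`, and the arrays `enc` and `ret` (standing in for the question's dictionaries) are assumptions made here. It also shows where the leading zeros come from: without an `axis` argument, np.append flattens both inputs, so the 24x24 block of zeros that ind_matrix started with stays at the front of the result.

```python
import numpy as np

rng = np.random.default_rng(0)      # seed chosen only to make the sketch repeatable
enc = rng.integers(1, 24, size=24)  # stand-in for enc_dict (values 1..23)
ret = rng.integers(1, 24, size=24)  # stand-in for ret_dict (values 1..23)

# Why the zeros appeared: without axis=, np.append flattens both inputs,
# so the 576 initial zeros end up in front of the appended pair.
flat = np.append(np.zeros((24, 24)), [[3, 5]])
print(flat.shape)  # (578,)

# Collect matching (row, column) pairs in a plain Python list ...
pairs = []
for enc_row in range(24):
    for ret_col in range(24):
        if enc[enc_row] == ret[ret_col]:
            pairs.append((enc_row, ret_col))

# ... and convert once at the end; reshape keeps a (0, 2) shape if nothing matched.
ind_matrix = np.array(pairs, dtype=int).reshape(-1, 2)
```

Each row of ind_matrix is then one (row, column) pair, so ind_matrix[1] is the second matching pair rather than a single flattened number.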

1 Answer

Here is a modified version of your code containing a couple of suggestions and comments:

import numpy as np

# when the keys are just 0, 1, 2, ... use an array instead of a dictionary
# also, for random integer values use randint (the upper bound is exclusive)
enc_ = np.random.randint(1, 24, (24,))
ret_ = np.random.randint(1, 24, (24,))

data = np.random.rand(24,24)
# np.corrcoef is vectorized, no need to loop:
corr_matrix = np.corrcoef(data)
# the following is the clearest, but maybe not the fastest way of generating
# your index array:
ind_matrix = np.argwhere(np.equal.outer(enc_, ret_))

# this can't be used for indexing directly, you'll have to choose
# one of the following idioms

# EITHER spread to two index arrays
I, J = ind_matrix.T
# or directly I, J = np.where(np.equal.outer(enc_, ret_))
# single index
print(corr_matrix[I[1], J[1]])
# multiple indices
print(corr_matrix[I[[1,2,0]], J[[1,2,0]]])
# whole row
print(corr_matrix[I[1]])

# OR use tuple conversion (argwhere already returns an ndarray,
# so no extra np.array call is needed)
# single index
print(corr_matrix[tuple(ind_matrix[1])])
# multiple indices
print(corr_matrix[tuple(ind_matrix[[1, 2, 0]].T)])
# whole row
print(corr_matrix[ind_matrix[1, 0]])

# OR if you do not plan to use multiple indices
as_tuple = list(map(tuple, ind_matrix))
# single index
print(corr_matrix[as_tuple[1]])
# whole row
print(corr_matrix[as_tuple[1][0]])
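
As a sanity check, the vectorized calls above can be compared against the explicit double loop from the question. The sketch below is an illustration under assumptions, not part of the original answer: the fixed seed and the names `loop_corr`, `loop_pairs`, `same_corr`, `same_pairs` and `matched_values` are invented here.

```python
import numpy as np

rng = np.random.default_rng(1)  # seed chosen only for repeatability
enc_ = rng.integers(1, 24, size=24)
ret_ = rng.integers(1, 24, size=24)
data = rng.random((24, 24))

# Vectorized versions, as in the answer
corr_matrix = np.corrcoef(data)
ind_matrix = np.argwhere(np.equal.outer(enc_, ret_))

# Loop versions, as in the question (collecting pairs in a list)
loop_corr = np.zeros((24, 24))
loop_pairs = []
for enc_row in range(24):
    for ret_col in range(24):
        loop_corr[enc_row, ret_col] = np.corrcoef(data[enc_row], data[ret_col])[0, 1]
        if enc_[enc_row] == ret_[ret_col]:
            loop_pairs.append([enc_row, ret_col])

same_corr = np.allclose(corr_matrix, loop_corr)
same_pairs = np.array_equal(
    ind_matrix, np.array(loop_pairs, dtype=int).reshape(-1, 2)
)
print(same_corr, same_pairs)

# All correlation values at the matching positions in one indexing call
I, J = ind_matrix.T
matched_values = corr_matrix[I, J]
```

Both approaches visit the pairs in row-major order, which is why the index arrays can be compared element by element.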