I have implemented an algorithm to remove some columns and rows from an ndarray in Python 2.7, but I feel there should be a better way to do it. I probably just don't know the idiomatic way in Python, which is why I am asking here. I have searched, but I have not found similar questions or anything relevant in the documentation (for example, the slicing and indexing documentation from scipy).

Assume I have an ndarray with some rows and columns:

import numpy as np

number_of_rows = 3
number_of_columns = 3
a = np.arange(number_of_rows*number_of_columns).reshape(number_of_rows,number_of_columns)
a

The output is:

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

Suppose I want to remove some columns and/or rows from the previous ndarray. In particular, I want to remove column 0 and row 1, so the output should be:

array([[1, 2],
       [7, 8]])

To do that I follow the steps below. However, they do not look very elegant to me, and I feel there should be a better implementation.

  1. I select the columns and rows to remove; in this example:

    rows_to_remove = [1]
    columns_to_remove = [0]
    
  2. Now I create a couple of lists with the columns and rows to keep.

    # sorted() makes the index order explicit; set iteration order is not guaranteed
    rows_to_keep = sorted(set(range(0,a.shape[0]))-set(rows_to_remove))
    columns_to_keep = sorted(set(range(0,a.shape[1]))-set(columns_to_remove))
    

    In Matlab this step would be simpler: you would just use ~ to negate the indexes of the matrix (an ndarray in Python). Is there a better way to do this?

  3. Then I select those columns and rows to keep:

    a[rows_to_keep,:][:,columns_to_keep]
    

Output:

array([[1, 2],
       [7, 8]])

Please note that if you just write:

a[rows_to_keep,columns_to_keep]

the output is:

array([1, 8])

It is a little bit shocking to me that a[rows_to_keep,columns_to_keep] is different from a[rows_to_keep,:][:,columns_to_keep]. Is there a better way to cover those steps?

Thank you very much


2 Answers

You could use the np.delete function to get this done. Using the array given in your question as an example, it would go like this:

import numpy as np

number_of_rows = 3
number_of_columns = 3
a = np.arange(number_of_rows*number_of_columns).reshape(number_of_rows,number_of_columns)
b = np.delete(a,1,0)  # remove row 1 (axis 0)
b = np.delete(b,0,1)  # remove column 0 (axis 1)

And voilà, b contains the output array you want!
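As a minimal sketch, reusing the rows_to_remove and columns_to_remove lists from the question: np.delete also accepts a list of indices, and it returns a new array rather than modifying a in place, so the original matrix remains available afterwards.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
rows_to_remove = [1]
columns_to_remove = [0]

# np.delete returns a copy; `a` itself is left untouched.
b = np.delete(a, rows_to_remove, axis=0)
b = np.delete(b, columns_to_remove, axis=1)

print(b)        # [[1 2]
                #  [7 8]]
print(a.shape)  # (3, 3)
```

Because each call copies the data it keeps, this costs extra memory for large matrices.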

3 Comments

Thanks. I should not do that, for two reasons: a) the position is useful for tracing back the data; b) later I want to use the whole matrix. Best wishes
You could put the positions of the rows and columns to delete in variables before using the delete function. You could also keep the original matrix!
True, but it is a big matrix, and from the point of view of memory that can be inefficient.

For Question 2)

Instead of :

a[rows_to_keep,:][:,columns_to_keep]

use:

a[np.ix_(rows_to_keep,columns_to_keep)]

This is called advanced indexing (see the NumPy documentation and "Writting in sub-ndarray of a ndarray in the most pythonian way. Python 2").
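A small sketch of why the two expressions in the question differ, assuming the 3x3 array a from the question: two index lists are paired element by element, whereas np.ix_ builds an open mesh so that every kept row is combined with every kept column.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
rows_to_keep = [0, 2]
columns_to_keep = [1, 2]

# Two integer lists are broadcast together: this picks a[0, 1] and a[2, 2].
pairwise = a[rows_to_keep, columns_to_keep]

# np.ix_ returns index arrays of shape (2, 1) and (1, 2),
# which broadcast to the full 2x2 sub-matrix.
row_idx, col_idx = np.ix_(rows_to_keep, columns_to_keep)
submatrix = a[row_idx, col_idx]

print(pairwise)   # [1 8]
print(submatrix)  # [[1 2]
                  #  [7 8]]
```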

For Question 1), one option is to create a boolean mask that marks the rows and columns to remove. There are more elegant ways to do it (for example, see "Create a boolean mask from an array"), but for simplicity:

mask = np.zeros(a.shape,dtype=bool)
mask[rows_to_remove,:] = True
mask[:,columns_to_remove] = True 

Now you can select the remaining elements (note that indexing with a 2-D boolean mask returns a flattened 1-D array):

a[~np.array(mask)] 

Note that with this approach there is no need for the answer to Question 2).

Summary:

mask = np.zeros(a.shape,dtype=bool)
mask[rows_to_remove,:] = True
mask[:,columns_to_remove] = True 
a[~np.array(mask)] 

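If you want the result as a 2-D array, you can reshape it (this assumes rows_to_remove and columns_to_remove contain no duplicate indices):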

a[~np.array(mask)].reshape(a.shape[0]-len(rows_to_remove),a.shape[1]-len(columns_to_remove))
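An alternative sketch, closer to the Matlab ~ idiom mentioned in the question: build one 1-D boolean mask per axis and pass both to np.ix_, which accepts boolean sequences. The names keep_rows and keep_columns are illustrative and not from the original post. This gives the 2-D result directly, with no reshape.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
rows_to_remove = [1]
columns_to_remove = [0]

# Start with "keep everything", then switch off the unwanted indices.
keep_rows = np.ones(a.shape[0], dtype=bool)
keep_rows[rows_to_remove] = False
keep_columns = np.ones(a.shape[1], dtype=bool)
keep_columns[columns_to_remove] = False

# np.ix_ turns each boolean mask into the matching integer indices.
result = a[np.ix_(keep_rows, keep_columns)]
print(result)  # [[1 2]
               #  [7 8]]
```

Duplicate entries in the removal lists are harmless here, since setting a mask entry to False twice has no further effect. Like any advanced indexing, the result is a copy of the selected elements, not a view of a.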
