4

I have the following (4x8) numpy array:

In [5]: z
Out[5]: 
array([['1A34', 'RBP', 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
       ['1A9N', 'RBP', 0.0456267, 0.0539268, 0.331932, 0.0464031,
        4.41336e-06, 0.522107],
       ['1AQ3', 'RBP', 0.0444479, 0.201112, 0.268581, 0.0049757,
        1.28505e-12, 0.480883],
       ['1AQ4', 'RBP', 0.0177232, 0.363746, 0.308995, 0.00169861, 0.0,
        0.307837]], dtype=object)

In [6]: z.shape
Out[6]: (4, 8)

What I want to do is to extract the 0th, 2nd and 4th column of the above array yielding (4 x 3) array that looks like this:

    array([['1A34', 0.0,  0.0],
           ['1A9N', 0.0456267,  0.331932],
           ['1AQ3', 0.0444479, 0.268581],
           ['1AQ4', 0.0177232,  0.308995]])

What's the way to do it? Note that the above indexes are just example. In actuality it can be very irregular, e.g. 0th, 3rd, 4th.

2 Answers 2

8

Use slicing:

>>> arr = np.array([['1A34', 'RBP', 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
       ['1A9N', 'RBP', 0.0456267, 0.0539268, 0.331932, 0.0464031,
        4.41336e-06, 0.522107],
       ['1AQ3', 'RBP', 0.0444479, 0.201112, 0.268581, 0.0049757,
        1.28505e-12, 0.480883],
       ['1AQ4', 'RBP', 0.0177232, 0.363746, 0.308995, 0.00169861, 0.0,
        0.307837]], dtype=object)
>>> arr[:,:5:2]
array([['1A34', 0.0, 0.0],
       ['1A9N', 0.0456267, 0.331932],
       ['1AQ3', 0.0444479, 0.268581],
       ['1AQ4', 0.0177232, 0.308995]], dtype=object)

If the column indices are irregular then you can do something like this:

>>> indices = [0, 3, 4]
>>> arr[:, indices]
array([['1A34', 1.0, 0.0],
       ['1A9N', 0.0539268, 0.331932],
       ['1AQ3', 0.201112, 0.268581],
       ['1AQ4', 0.363746, 0.308995]], dtype=object)

Note that there's a subtle but substantial difference between slicing (which is basic indexing) and using a sequence for indexing (also known as advanced indexing or fancy indexing). When using a slice such as arr[:, :5:2], no data is copied, and we get a view of the original array. This implies that mutating the result of arr[:, :5:2] will affect arr itself. With fancy indexing arr[:, [0, 3, 4]] is guaranteed to be a copy: this takes up more memory, and mutating this result will not affect arr.

Sign up to request clarification or add additional context in comments.

1 Comment

That slicing method assumed 'steps'. But in actuality the indexes can be very irregular, e.g. 0th, 3rd, 4th
0

You can access the columns of a numpy array in the following way:

array[:,column_number]

To get the array of specific columns you can do as follows:

z = array([[['1A34', 'RBP', 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
   ['1A9N', 'RBP', 0.0456267, 0.0539268, 0.331932, 0.0464031,
    4.41336e-06, 0.522107],
   ['1AQ3', 'RBP', 0.0444479, 0.201112, 0.268581, 0.0049757,
    1.28505e-12, 0.480883],
   ['1AQ4', 'RBP', 0.0177232, 0.363746, 0.308995, 0.00169861, 0.0,
    0.307837]], dtype=object]) #your array here

op_array = array([ [z:,0], z[:,2], z[:,3] ])

The op_array will have the 0th, 2nd and 3rd columns as rows.

So you need to transpose it to get the output array in the desired format.

op_array.transpose()

op_array will now look as below:

op_array([['1A34', 0.0,  0.0],
       ['1A9N', 0.0456267,  0.331932],
       ['1AQ3', 0.0444479, 0.268581],
       ['1AQ4', 0.0177232,  0.308995])

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.