14

I have a 2d array in the numpy module that looks like:

data = array([[1,2,3],
              [4,5,6],
              [7,8,9]])

I want to get a slice of this array that only includes certain columns of elements. For example, I may want columns 0 and 2:

data = [[1,3],
        [4,6],
        [7,9]]

What is the most Pythonic way to do this? (No for loops please)

I thought this would work:

newArray = data[:,[0,2]]

but it results in a:

TypeError: list indices must be integers, not tuple
1
  • Downvote. This cannot be reproduced in 2021. NameError: name 'array' is not defined. At least "today", the question is plain wrong, and I doubt it was different in 2010. Commented Sep 15, 2021 at 10:55

8 Answers

17

The error says it explicitly: data is not a numpy array but a list of lists.

Try converting it to a numpy array first:

numpy.array(data)[:,[0,2]]
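
A minimal sketch of that conversion, assuming data really is the plain list of lists that the error message suggests (the names arr and cols are only illustrative):

```python
import numpy as np

# Assumption: `data` is a plain list of lists, not an ndarray.
data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

arr = np.array(data)     # convert once, then index it as a real ndarray
cols = arr[:, [0, 2]]    # all rows, columns 0 and 2

print(cols.tolist())
```

If the data is used more than once, it is probably cheaper to keep arr around than to convert on every lookup.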

1 Comment

Nice catch! I bow to your psychic debugging abilities! :)
11

If you want to slice a 2D list, the following function may help:

def get_2d_list_slice(matrix, start_row, end_row, start_col, end_col):
    # Slice the rows first, then slice each remaining row by column.
    return [row[start_col:end_col] for row in matrix[start_row:end_row]]
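
A quick usage sketch, written here as a plain function without self. The end indices are exclusive, as with ordinary slices. Note that this selects a contiguous block, so on its own it cannot pick columns 0 and 2 while skipping column 1:

```python
def get_2d_list_slice(matrix, start_row, end_row, start_col, end_col):
    # Slice the rows first, then slice each remaining row by column.
    return [row[start_col:end_col] for row in matrix[start_row:end_row]]

data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

# Rows 0 and 1, columns 1 and 2 (end indices are exclusive).
block = get_2d_list_slice(data, 0, 2, 1, 3)
print(block)
```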

1 Comment

Great! Good coding here! One should note that the start index is 0.
5

Actually, what you wrote should work just fine... What version of numpy are you using?

Just to verify, the following should work perfectly with any recent version of numpy:

import numpy as np
x = np.arange(9).reshape((3,3)) + 1
print(x[:,[0,2]])

Which, for me, yields:

[[1 3]
 [4 6]
 [7 9]]

as it should...

1 Comment

The example in the question does not "work just fine". Yours works because you create an np array from scratch, whereas the OP just copied the numpy output. In 2010 or now, the behaviour should be the same.
4

This may not be what you are looking for, but it would do: zip(*x)[whatever columns you might need]. In Python 3, zip returns an iterator, so use list(zip(*x))[...] instead.
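
A rough sketch of the transposition idea in Python 3, assuming x is a list of lists like the data in the question (wanted is a hypothetical column selection):

```python
data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

wanted = [0, 2]  # hypothetical column selection

columns = list(zip(*data))                     # transpose: one tuple per column
picked = [columns[c] for c in wanted]          # keep only the wanted columns
result = [list(row) for row in zip(*picked)]   # transpose back to rows

print(result)
```

The second zip is only needed if the result should be laid out row by row again. For a single column, columns[c] is already enough.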

2

Why it works on Numpy but not Python lists

Because with __getitem__ you can program your classes to do whatever you want with : and multiple arguments.

Numpy does this, but built-in lists do not.

More precisely:

class C(object):
    def __getitem__(self, k):
        return k

# Single argument is passed directly.
assert C()[0] == 0

# Multiple indices generate a tuple.
assert C()[0, 1] == (0, 1)

# Slice notation generates a slice object.
assert C()[1:2:3] == slice(1, 2, 3)

# If you omit any part of the slice notation, it becomes None.
assert C()[:] == slice(None, None, None)
assert C()[::] == slice(None, None, None)
assert C()[1::] == slice(1, None, None)
assert C()[:2:] == slice(None, 2, None)
assert C()[::3] == slice(None, None, 3)

# Tuple with a slice object:
assert C()[:, 1] == (slice(None, None, None), 1)

# Ellipsis class object.
assert C()[...] == Ellipsis

We can then open up slice objects as:

s = slice(1, 2, 3)
assert s.start == 1
assert s.stop == 2
assert s.step == 3

So that is why when you write:

[][1, 2]

Python says:

TypeError: list indices must be integers, not tuple

because you are trying to pass (1, 2) to the list's __getitem__, and built-in lists are not programmed to deal with tuple arguments, only integers and slices.
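
To make this concrete, here is a deliberately simplified sketch of a wrapper class that accepts a tuple key. Grid is a hypothetical name, and it does not try to mirror NumPy's full indexing semantics:

```python
class Grid:
    """Hypothetical wrapper giving a list of lists [rows, cols] indexing."""

    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, key):
        # grid[i] or grid[i:j]: fall back to ordinary list behaviour.
        if not isinstance(key, tuple):
            return self.rows[key]
        row_key, col_key = key  # grid[r, c] arrives as the tuple (r, c)
        # Normalise the row selector to a list of rows.
        if isinstance(row_key, slice):
            rows = self.rows[row_key]
        else:
            rows = [self.rows[row_key]]
        # A list picks columns one by one; an int or slice is applied directly.
        if isinstance(col_key, list):
            return [[row[c] for c in col_key] for row in rows]
        return [row[col_key] for row in rows]


grid = Grid([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
selected = grid[:, [0, 2]]
print(selected)
```

The built-in list has no such tuple branch in its __getitem__, which is why the same expression fails on a plain list.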

1

Beware that numpy only accepts regular arrays, where every row has the same size. For a plain list you can use something like [a[i][0:2] for i in range(len(a))] (xrange in Python 2). It's pretty ugly, but it works.

1 Comment

Numpy accepts any combination of slices, integers and arrays. The arrays do not have to be the same size, but they should broadcast against each other. And I believe the expression you are looking for is [[row[0], row[2]] for row in data].
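
A small sketch of both points, assuming data is the 3x3 list from the question (picked, row_idx, col_idx and corners are illustrative names):

```python
import numpy as np

data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

# Pure-Python version: columns 0 and 2 of every row.
picked = [[row[0], row[2]] for row in data]

arr = np.array(data)
# Index arrays of different shapes are fine as long as they broadcast:
# a (2, 1) row index against a (2,) column index gives a (2, 2) result.
row_idx = np.array([[0], [2]])
col_idx = np.array([0, 2])
corners = arr[row_idx, col_idx]

print(picked)
print(corners.tolist())
```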
0

newArray = data[:,0:2]

or am I missing something?
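
One thing worth checking here: a minimal sketch, assuming data is a NumPy array, that compares the slice 0:2 with the index list [0, 2]:

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

first_two = data[:, 0:2]     # slice: columns 0 and 1 (the stop is exclusive)
every_other = data[:, ::2]   # slice with step 2: columns 0 and 2
by_list = data[:, [0, 2]]    # index list: columns 0 and 2

# Slices return views onto the original data; index lists return copies.
print(np.shares_memory(data, every_other))
print(np.shares_memory(data, by_list))
```

So 0:2 selects columns 0 and 1, not 0 and 2. A stepped slice only helps when the wanted columns happen to be evenly spaced.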

0

The example in the question begins with array, not np.array, and array on its own is not defined:

data = array([[1,2,3],
              [4,5,6],
              [7,8,9]])

data[:,[0,2]]

Error:

NameError: name 'array' is not defined

To reproduce the error, you need to drop the array(...) wrapper (without np. in front, it is not defined anyway).

data = [[1,2,3],
        [4,5,6],
        [7,8,9]]

data[:,[0,2]]

Error:

TypeError: list indices must be integers or slices, not tuple

The user probably ran the tests on the inner list but asked the question with a copy of np.array output. At least in 2021, the question is just plain wrong: it cannot be reproduced as written. And I doubt that the behaviour was different in 2010 (numpy is a basic package in the Python ecosystem).

For completeness, as in the other answers:

data = np.array([[1,2,3],
                [4,5,6],
                [7,8,9]])

data[:,[0,2]]

Output:

array([[1, 3],
       [4, 6],
       [7, 9]])

You do not need a nested list to reproduce this. Indexing a one-dimensional list with two dimensions, as in

[1,2][:, 0]

throws the same TypeError: list indices must be integers or slices, not tuple.
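
The reproduction and the fix can be checked together in a short sketch, assuming Python 3 and the plain list from above (message and fixed are illustrative names):

```python
import numpy as np

data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

# A plain list rejects the tuple key (slice(None), [0, 2]).
try:
    data[:, [0, 2]]
except TypeError as exc:
    message = str(exc)

# np.asarray converts the list; an existing ndarray would pass through as is.
fixed = np.asarray(data)[:, [0, 2]]

print(message)
print(fixed.tolist())
```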
