
When calling np.delete(), I don't want to define a new variable for the reduced-size array. I want to perform the delete on the original NumPy array itself. Any thoughts?

>>> arr = np.array([[1,2], [5,6], [9,10]])
>>> arr
array([[ 1,  2],
       [ 5,  6],
       [ 9, 10]])
>>> np.delete(arr, 1, 0)
array([[ 1,  2],
       [ 9, 10]])
>>> arr
array([[ 1,  2],
       [ 5,  6],
       [ 9, 10]])
but I want:
>>> arr
array([[ 1,  2],
       [ 9, 10]])
  • What's wrong with arr = np.delete(arr, 1, 0)? Commented Nov 4, 2016 at 15:47
  • What's wrong with just doing arr = np.delete(arr, 1, 0)? Or you could just call arr without the sections you don't want using brackets? Commented Nov 4, 2016 at 15:47
  • Possible duplicate of deleting rows in numpy array Commented Nov 4, 2016 at 15:50

5 Answers


NumPy arrays are fixed-size, so there can't be an in-place version of np.delete. Any such function would have to change the array's size.

The closest you can get is reassigning the arr variable:

arr = numpy.delete(arr, 1, 0)
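A quick way to see that this only rebinds the name rather than mutating the array (a minimal sketch; `alias` is just an illustrative second reference, not part of the original answer):

```python
import numpy as np

arr = np.array([[1, 2], [5, 6], [9, 10]])
alias = arr                   # a second reference to the same array object

arr = np.delete(arr, 1, 0)    # rebinds the name `arr`; the object is untouched

print(arr.shape)              # the new, smaller array: (2, 2)
print(alias.shape)            # the original array still exists: (3, 2)
```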

1 Comment

Of course it could be done in place. E.g. with an array([1,2,3,4]), deleting the 2nd element would involve moving the 3 to the 2's spot and the 4 to the 3's spot, and finally slicing the array to make its new length 3. This is how std::vector::erase works in C++, for example. It wouldn't be more efficient than np.delete, but it would use less memory, and that can be important.

The np.delete call doesn't modify the original array; it makes a copy and returns that copy with the deletion applied.

>>> arr1 = np.array([[1,2], [5,6], [9,10]])
>>> arr2 = np.delete(arr1, 1, 0)
>>> arr1
array([[ 1,  2],
       [ 5,  6],
       [ 9, 10]])
>>> arr2
array([[ 1,  2],
       [ 9, 10]])



If it's a matter of performance, you might want to try (but test it, since I'm not sure) creating a view* instead of using np.delete. You can do it by indexing, which should be an in-place operation:

import numpy as np

arr = np.array([[1,  2], [5,  6], [9, 10]])
arr = arr[(0, 2), :]
print(arr)

resulting in:

[[ 1  2]
 [ 9 10]]

This, however, will not free the memory occupied by the excluded row. It might increase performance, but memory-wise you might have the same problem or a worse one. Also notice that, as far as I know, there is no way of indexing by exclusion (for instance, arr[~1] would be very useful), which necessarily makes you spend resources building an index array.
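A common workaround for the missing "index by exclusion" is a boolean mask (a standard NumPy idiom, added here for illustration; it is not from the original answer):

```python
import numpy as np

arr = np.array([[1, 2], [5, 6], [9, 10]])

# Build a mask that is True everywhere except the row to drop --
# the closest thing to the hypothetical arr[~1]
mask = np.ones(arr.shape[0], dtype=bool)
mask[1] = False

print(arr[mask])   # rows 0 and 2 only
```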

For most cases I think the suggestion other users have given, namely:

arr = numpy.delete(arr, 1, 0)

is the best option. In some cases it might be worth exploring the other alternative.

EDIT: *This is actually incorrect (thanks @user2357112). Fancy indexing does not create a view but instead returns a copy as can be seen in the documentation (which I should have checked before jumping to conclusions, sorry about that):

Advanced indexing always returns a copy of the data (contrast with basic slicing that returns a view).

So I'm unsure whether the fancy indexing suggestion is worth anything as an actual suggestion, unless it has a performance gain over the np.delete method (which I'll try to verify when the opportunity arises; see EDIT2).
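The view-versus-copy distinction quoted above can be checked directly with np.shares_memory (a small sketch, not part of the original answer):

```python
import numpy as np

arr = np.array([[1, 2], [5, 6], [9, 10]])

basic = arr[0:2, :]       # basic slicing: returns a view
fancy = arr[(0, 2), :]    # advanced (fancy) indexing: returns a copy

print(np.shares_memory(arr, basic))   # True  -- same underlying buffer
print(np.shares_memory(arr, fancy))   # False -- an independent copy
```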

EDIT2: I performed a very simple test to see if there is any performance gain from using fancy indexing as opposed to the delete function. I used timeit (actually the first time I've used it, but it seems the number of executions per snippet is 1 000 000, hence the high numbers for time):

import numpy as np
import timeit

def test1():
    arr = np.array([[1, 2], [5, 6], [9, 10]])
    return arr[(0, 2), :]

def test2():
    arr = np.array([[1, 2], [5, 6], [9, 10]])
    return np.delete(arr, 1, 0)

print("Equality test: ", np.array_equal(test1(), test2()))

print(timeit.timeit("test1()", setup="from __main__ import test1"))
print(timeit.timeit("test2()", setup="from __main__ import test2"))

The results are these:

Equality test:  True
5.43569152576767
9.476918448174644

This represents a very considerable speed gain. Nevertheless, notice that building the sequence for the fancy indexing also takes time. Whether it is worthwhile or not will depend on the problem being solved.

3 Comments

This doesn't actually create a view. Indexing operations classified as advanced indexing, such as what you get with that (0, 2), don't produce views, since they don't produce the consistent strides necessary to create a view.
@user2357112 True. I should have checked the documentation first. My mistake, I'll edit the post. Do you have any idea whether this choice might be faster performance-wise? My suggestion would be pretty useless if it's not.
I think it might avoid some of the overhead numpy.delete has.

You could implement your own version of delete which copies data elements after the elements to be deleted forward, and then returns a view excluding the (now obsolete) last element:

import numpy as np


# in-place delete
def np_delete(arr, obj, axis=None):
    # this is only a simplified example: a single integer index on a 1-D array
    assert isinstance(obj, int)
    assert axis is None

    for i in range(obj + 1, arr.size):
        arr[i - 1] = arr[i]
    return arr[:-1]


Test = 10 * np.arange(10)
print(Test)

deleteIndex = 5
print(np.delete(Test, deleteIndex))
print(np_delete(Test, deleteIndex))
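The element-by-element loop above can also be written block-wise with a single slice assignment (a sketch under the same restrictions: 1-D array, integer index; NumPy buffers overlapping assignments, so the shift is safe):

```python
import numpy as np

def np_delete_blockwise(arr, obj):
    # Shift everything after index `obj` one slot to the left in one assignment,
    # then return a view that excludes the now-stale last element
    arr[obj:-1] = arr[obj + 1:]
    return arr[:-1]

data = 10 * np.arange(10)
print(np_delete_blockwise(data, 5))   # [ 0 10 20 30 40 60 70 80 90]
```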

4 Comments

I use this in an algorithm where a row and column gets deleted in each step. This solution is already faster than using numpy.delete there. Using the @jit decorator from the Numba module on this function makes it even faster still.
@JensRenders It sounds like you are deleting multiple items - I expect you take all deletions into account at once and copy each element only once? Sounds like the implementation I was too lazy to try and post here :) Good to know it's faster in some cases. I guess you also copy data block-wise instead of element by element, as I imply above?
One more thought: is your implementation faster also in the worst case, deleting elements from the beginning?
Yes, shifting blocks makes more sense in the code. After using the @jit decorator, it doesn't make a difference anymore, though. My current code runs faster than numpy.delete in any case, because it doesn't need to allocate new memory. If I time my code plus the allocation of some useless memory, it is about the same speed as numpy.delete.

Nothing is wrong with your code; you just have to overwrite the variable:

    arr = np.array([[1,2], [5,6], [9,10]])
    arr = np.delete(arr, 1, 0)

1 Comment

This is not changing the actual object. You are just pointing the arr name at a new object. See the comments on the other answers.
