I have a NumPy array that looks like this:

array([ 1219,  1220,  2215,  2216,  3459,  3460,  4686,  4687,  5920,
        5921,  7200,  7201,  8498,  8499,  9834,  9835, 10046, 11138,
       11139, 12520, 12521, 12522, 13812, 13813, 14033, 15099, 15100,
       16375, 16376, 17576, 17577, 18634, 18635, 19849, 19850])

I want to delete elements that are very close to each other. For example, I don't want both 2215 and 2216; I want to keep only the first one, 2215. Likewise, for 4686 and 4687, I want to keep only 4686. How can I do this using only NumPy functions?

1 Answer

One solution is to compute the differences between consecutive elements and drop every element whose difference from its predecessor is at most a threshold. This relies on your array being sorted, so close values are always adjacent. The following code works for me.

import numpy as np

arr = np.array([ 1219,  1220,  2215,  2216,  3459,  3460,  4686,  4687,  5920,
    5921,  7200,  7201,  8498,  8499,  9834,  9835, 10046, 11138,
    11139, 12520, 12521, 12522, 13812, 13813, 14033, 15099, 15100,
    16375, 16376, 17576, 17577, 18634, 18635, 19849, 19850])

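threshold = 1  # drop an element if it exceeds its predecessor by this much or less
# diff[i] holds arr[i] - arr[i-1]
diff = np.empty(arr.shape)
diff[0] = np.inf  # always retain the 1st element
diff[1:] = np.diff(arr)
mask = diff > threshold  # True where the gap to the previous element is large enough

new_arr = arr[mask]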

print(new_arr)

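You can adjust the threshold value to change the tolerance. For example, with threshold = 250 the values 10046 and 14033 are also dropped, because each lies within 250 of its predecessor.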
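If you want to reuse this, the same idea can be wrapped in a small function. This is only a sketch: the name drop_close and the shorter sample array are mine, not from the question, and it assumes the input is sorted in ascending order. It builds the mask with np.concatenate instead of filling a float array, which should give the same result.

```python
import numpy as np

def drop_close(arr, threshold=1):
    """Keep the first element of every run of close values.

    Hypothetical helper; assumes `arr` is sorted in ascending order.
    """
    arr = np.asarray(arr)
    if arr.size == 0:
        return arr
    # The first element is always kept; every other element is kept only
    # if it exceeds its predecessor by more than `threshold`.
    keep = np.concatenate(([True], np.diff(arr) > threshold))
    return arr[keep]

sample = np.array([1219, 1220, 2215, 2216, 12520, 12521, 12522, 13812])
print(drop_close(sample).tolist())  # [1219, 2215, 12520, 13812]
```

Note that each element is compared with its immediate predecessor, not with the last element that was kept. So drop_close([0, 1, 2, 3]) returns only [0], even though 3 is more than 1 away from 0. If that is not what you want, you would need a different approach.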

1 Comment

Thank you very much my friend!