I have a NumPy array as follows:

a = np.array([[1, 2, 3, -999.],
              [2, 3, 4, -999.],
              [3, 4, 5, 6]])

How can I remove the value -999. while keeping the dimensions, as such:

array([[   1.,    2.,    3.],
       [   2.,    3.,    4.],
       [   3.,    4.,    5.,    6.]])

I tried:

np.delete(a, np.where(a == -999.))

But this results in:

array([   3.,    2.,    3.,    4., -999.,    3.,    4.,    5.,    6.])

And I tried

a[a == -999.] = np.nan
a[~np.isnan(a)]

While it removes the nan (and so the -999), the numpy array becomes 1D:

array([1., 2., 3., 2., 3., 4., 3., 4., 5., 6.])

EDIT

I use the resulting jagged array (list of lists) for slicing another array where each slice can have a different length.

My use-case:

a = np.random.randint(1,35,size=(100000,5))
a[a == 14] = -999 # set a missing value

Option 1: select the values not equal to the fill value

%%timeit
slices = np.array([i[i != -999] for i in a])

10 loops, best of 3: 183 ms per loop

Option 2, mask and compress

%%timeit
a_ma = np.ma.masked_equal(a, -999)
slices = np.array([i.compressed() for i in a_ma])

1 loop, best of 3: 2.99 s per loop
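
A possible alternative to both options, sketched here on the small example array rather than the 100000-row one: apply the boolean mask once to the whole array, then cut the flattened result back into rows with np.split. The names keep, counts and flat are illustrative only, and no timing is claimed; it would need to be measured with %%timeit on the real data.

```python
import numpy as np

a = np.array([[1, 2, 3, -999],
              [2, 3, 4, -999],
              [3, 4, 5, 6]])

keep = a != -999               # boolean mask, same shape as a
counts = keep.sum(axis=1)      # number of valid values per row: [3, 3, 4]
flat = a[keep]                 # 1-D array of valid values, in row-major order

# Split the flat array at the cumulative row boundaries (here [3, 6]).
# The result is a plain Python list of 1-D arrays, one per row.
slices = np.split(flat, np.cumsum(counts)[:-1])
```

The result is a list of arrays rather than an object array, which is usually enough when each element is only used to index another array.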
  • NumPy does not really support jagged arrays. Commented Apr 6, 2018 at 19:40
  • As miradulo notes, numpy doesn't support jagged arrays. (Well, you could create an array of objects, but that probably wouldn't solve the problem in this case. The result wouldn't act like a 2-d array.) What are you going to do with the result? Knowing the ultimate goal of your calculation will help guide the answers given here. Commented Apr 6, 2018 at 20:36
  • @WarrenWeckesser thanks for your response. I've created a ipynb with my ultimate goal: nbviewer.jupyter.org/github/mattijn/pynotebook/blob/master/… Commented Apr 6, 2018 at 21:45

2 Answers

While jagged arrays are not really something you should be using, you could do the following using a list comprehension:

In [33]: a = np.array([i[i != -999.] for i in a])

In [34]: a
Out[34]:
array([array([ 1.,  2.,  3.]), array([ 2.,  3.,  4.]),
       array([ 3.,  4.,  5.,  6.])], dtype=object)

In [35]: a[0].shape
Out[35]: (3,)

In [36]: a[1].shape
Out[36]: (3,)

In [37]: a[2].shape
Out[37]: (4,)
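
One caveat, as far as I know: recent NumPy releases refuse to build an array from rows of different lengths unless dtype=object is given explicitly, so the session above may raise a ValueError there. A minimal sketch that avoids the problem by preallocating the object array and filling it row by row (the names rows and jagged are illustrative):

```python
import numpy as np

a = np.array([[1, 2, 3, -999.],
              [2, 3, 4, -999.],
              [3, 4, 5, 6]])

# Filter each row separately; the results have different lengths.
rows = [row[row != -999.] for row in a]

# Preallocate a 1-D object array and assign one filtered row per slot.
# This always gives shape (len(rows),), even if every filtered row
# happens to end up with the same length.
jagged = np.empty(len(rows), dtype=object)
for k, row in enumerate(rows):
    jagged[k] = row
```

Here jagged.shape is (3,), and each element is an ordinary 1-D float array.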

Which dimensions are you trying to keep? a.shape is (3,4). How can you remove 2 items from a and still have a 3x4 array (3*4=12)?

Your desired display is not a (3,4) array:

In [638]: arr = np.array([[   1.,    2.,    3.],
     ...:        [   2.,    3.,    4.],
     ...:        [   3.,    4.,    5.,    6.]])
     ...:        
In [639]: arr
Out[639]: 
array([list([1.0, 2.0, 3.0]), list([2.0, 3.0, 4.0]),
       list([3.0, 4.0, 5.0, 6.0])], dtype=object)
In [640]: arr.shape
Out[640]: (3,)

Because the rows vary in length, it creates an object dtype array, one element per row. That is basically a list of lists.

For some purposes it is handy to make a MaskedArray:

In [637]: np.ma.masked_equal(a, -999)
Out[637]: 
masked_array(
  data=[[1.0, 2.0, 3.0, --],
        [2.0, 3.0, 4.0, --],
        [3.0, 4.0, 5.0, 6.0]],
  mask=[[False, False, False,  True],
        [False, False, False,  True],
        [False, False, False, False]],
  fill_value=-999.0)
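
As a small illustration of why the masked form can be handy (a sketch, not necessarily what the question needs): the array stays (3,4), and reductions along an axis skip the masked entries. The names row_means and row_counts are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, -999.],
              [2, 3, 4, -999.],
              [3, 4, 5, 6]])

a_ma = np.ma.masked_equal(a, -999)   # shape is still (3, 4)

# Reductions ignore the masked entries.
row_means = a_ma.mean(axis=1)        # 2.0, 3.0, 4.5
row_counts = a_ma.count(axis=1)      # unmasked values per row: 3, 3, 4
```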

I see you have worked with MaskedArrays before: update numpy array where not masked

1 Comment

Thanks for your answer. I would love to use MaskedArrays, but my subsequent step is to use the jagged array for slicing another array, which I cannot do with a MaskedArray. Maybe that's my real problem. My ultimate goal in a notebook: nbviewer.jupyter.org/github/mattijn/pynotebook/blob/master/…
