I have a list of tuples containing floats and np.nan values that looks like this:

a = [(9.62, np.nan, 0.063), (np.nan, np.nan, np.nan), (np.nan, 0.34, np.nan), (9.50, 2.65, 5.85), (np.nan, np.nan, np.nan), (8.9423173497260166e-06, np.nan, np.nan), (np.nan, np.nan, np.nan), (10.53906499271581, np.nan, 3.4981897643207153e-08), (27.945228892337656, np.nan, np.nan), (np.nan, np.nan, np.nan), (0.00015676098048248007, 428.53224664333368, 15.597030989617416), (3.219339103511719e-08, np.nan, np.nan), (351.3486881626871, 118.79412856376891, 96.925698744436318), (np.nan, np.nan, np.nan), (np.nan, np.nan, np.nan), (0.038185812702743384, 0.011979539923543838, 1.4161404311887908e-05), (6.5891883211951452, np.nan, np.nan), (np.nan, np.nan, np.nan), (np.nan, np.nan, np.nan), (np.nan, np.nan, np.nan), (0.01992113565158183, 1.0858887135978378e-08, 6.949483102803238e-08), (np.nan, np.nan, np.nan), (0.0053471054969118897, 32.364223190908589, 0.29950485126829518), (0.022687094833899225, np.nan, 3.3927513616780456e-05), (0.0065459356887503, 5.0304474154655309e-06, 6.1755309734841293e-06), (1.2854278279876815e-07, 110.94572059986106, 2.0737305081677166e-06), (2.8909153747692473, np.nan, np.nan), (np.nan, np.nan, np.nan), (0.00085244354118369653, np.nan, 547.28608997823414), (0.21609437779080298, 2.9772785752782283e-08, 0.024868855470372788), (np.nan, 1.0571674432090431e-08, np.nan), (np.nan, 0.00042711039439664552, np.nan), (np.nan, 3.7576842775630178e-09, np.nan), (np.nan, 1.2436122988008544e-08, np.nan), (np.nan, 0.008772060008242254, np.nan), (np.nan, 2.9731267579988852, np.nan), (np.nan, 152.69348161610276, np.nan), (np.nan, 1.7976907012194907, np.nan), (np.nan, 0.0006232073677262973, np.nan), (np.nan, 1.3468250342036237e-08, np.nan), (np.nan, 6.9699321813542907e-05, np.nan), (np.nan, 5.2001506649804148e-05, np.nan), (np.nan, np.nan, np.nan)]

That is, the list is made up of N sub-lists, each containing the same number of elements M (3 in this case, but it could change), and each element is either a float or np.nan. My actual list has much larger N and M values.

I need to efficiently count the number of non-np.nan values in each sub-list. If the count is zero (the sub-list is all np.nan), np.nan should be stored instead.

The final list/array would look like this (using a from above):

count = [2, nan, 1, 3, ...]

I tried np.count_nonzero(), but it treats np.nan as non-zero, so every count comes out as 3.
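
A minimal snippet that reproduces this behaviour on a single all-NaN row (the names `row` and `nonzero_count` are only illustrative):

```python
import numpy as np

# One all-NaN row, as in the second entry of the list above
row = (np.nan, np.nan, np.nan)

# NaN is not equal to zero, so each NaN is counted as "non-zero"
nonzero_count = np.count_nonzero(row)
print(nonzero_count)  # 3
```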

1 Answer

You can use numpy.isnan to build a boolean array that marks the NaN entries, invert it with ~, and then count the True values in each row with sum (axis=1):

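import numpy as np

# count the non-NaN values in each row (a is the list from the question)
non_nans = (~np.isnan(a)).sum(axis=1)

# replace zero counts with np.nan; the result is a float array because NaN is a float
count = np.where(non_nans == 0, np.nan, non_nans)
# array([  2.,  nan,   1.,   3.,  nan, ...])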
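
As a self-contained sketch, the same two steps can be wrapped in a small helper and run on the first four rows of the question's list. The function name `count_non_nan` and the variable `sample` are hypothetical, introduced here only for illustration:

```python
import numpy as np

def count_non_nan(rows):
    """Per-row count of non-NaN values, with NaN where a row is entirely NaN."""
    arr = np.asarray(rows, dtype=float)      # shape (N, M)
    counts = (~np.isnan(arr)).sum(axis=1)    # integer count for each row
    # np.where returns a float array here, since NaN cannot live in an int array
    return np.where(counts == 0, np.nan, counts)

# First four rows of the list from the question
sample = [
    (9.62, np.nan, 0.063),
    (np.nan, np.nan, np.nan),
    (np.nan, 0.34, np.nan),
    (9.50, 2.65, 5.85),
]
result = count_non_nan(sample)
print(result)  # counts 2, nan, 1, 3
```

Because the work is done by vectorized array operations rather than a Python loop, this should scale reasonably to large N and M, although that is an expectation and has not been benchmarked here.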