
Situation: I have a DataFrame with NaN values. I'm going to make a prognosis of something for next year, so I guess I don't need very old data. I want to check the 'structure' of the NaNs to see whether there are lots of them in the old data and not so many in the new data.

import pandas as pd

df = pd.DataFrame(columns=['A', 'B', 'C'],
                  data=[[1, float('nan'), 3], [2, 26, 7], [5, float('nan'), 6],
                        [1, float('nan'), 42], [1, float('nan'), 13]])

   A     B   C
0  1   NaN   3
1  2  26.0   7
2  5   NaN   6
3  1   NaN  42
4  1   NaN  13

The question is: how can I easily count the NaN values in one column (B) grouped by the values of another column (A)? I know I can loop over every value of A and count the NaNs myself, but I'd like to know whether there is an easier and more elegant way. GroupBy has no isna() method.

In the end I want to see a table like:

   A  B_nan_count
0  1            3
1  2            0
2  5            1

1 Answer

First test B with Series.isna to get a boolean mask, group that mask by A, and aggregate with sum (each True counts as 1):

df1 = df.B.isna().groupby(df.A).sum().reset_index(name='B_nan_count')

Your solution filters rows first, so df[df.B.isna()] returns a DataFrame containing only the NaN rows; count them per group with GroupBy.size. Note that this drops groups with no NaNs (here the A=2 group is missing from the result):

df1 = df[df.B.isna()].groupby('A').size().reset_index(name='B_nan_count')
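A minimal, self-contained run of both snippets (assuming only pandas is installed) shows the practical difference: the mask-and-sum version keeps every group, including A=2 with zero NaNs, while the filter-and-size version drops it:

```python
import pandas as pd

df = pd.DataFrame(columns=['A', 'B', 'C'],
                  data=[[1, float('nan'), 3], [2, 26, 7], [5, float('nan'), 6],
                        [1, float('nan'), 42], [1, float('nan'), 13]])

# Mask + sum: every value of A appears in the result.
# Counts: A=1 -> 3, A=2 -> 0, A=5 -> 1
counts_all = df.B.isna().groupby(df.A).sum().reset_index(name='B_nan_count')
print(counts_all)

# Filter + size: rows where B is not NaN are removed before grouping,
# so the A=2 group disappears entirely.
# Counts: A=1 -> 3, A=5 -> 1
counts_nonzero = df[df.B.isna()].groupby('A').size().reset_index(name='B_nan_count')
print(counts_nonzero)
```

If the zero-count groups matter for the analysis (e.g. to confirm that newer years have no missing water_consumption values), the first form is the one to use.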