
I have the following pandas DataFrame:

import pandas as pd

a = [['01', '12345', 'null'], ['02', '78910', '9870'], ['01', '23456', 'null'], ['01', '98765', '8760']]

df_a = pd.DataFrame(a, columns=['id', 'order', 'location'])

I need a count of how many 'null' values (here 'null' is a string, not a real NaN) occur for each id. The result should look like this:

id   null_count
01    2

I can get basic counts using a groupby:

new_df = df_a.groupby(['id', 'location'])['id'].count()

But the results include more than just the 'null' values:

id  location
01  8760        1
    null        2
02  9870        1

3 Answers


Because the NULLs in your source DataFrame are the string 'null', use:

df_a.groupby('id')['location'].apply(lambda x: (x=='null').sum())\
    .reset_index(name='null_count')

Output:

   id  null_count
0  01          2
1  02          0
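
As an aside, the same result without the per-group apply, as a minimal sketch assuming the same df_a:

df_a['location'].eq('null').groupby(df_a['id']).sum()\
    .reset_index(name='null_count')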

OR

df_a.query('location == "null"').groupby('id')['location'].size()\
    .reset_index(name='null_count')

Output:

   id  null_count
0  01           2
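
Note that query filters first, so any id with no 'null' rows at all (id 02 here) disappears from the result. A minimal sketch, assuming the same df_a, that restores those ids with a zero count:

df_a.query('location == "null"').groupby('id')['location'].size()\
    .reindex(df_a['id'].unique(), fill_value=0)\
    .reset_index(name='null_count')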

Building on your own code, add .loc (note that this is a MultiIndex slice):

df_a.groupby(['id', 'location'])['id'].count().loc[:, 'null']

Output:

id
01    2
Name: id, dtype: int64
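
As with the query variant above, this slice silently drops any id that has no 'null' rows (here id 02). A minimal sketch, assuming the same df_a, that keeps zero counts by unstacking instead:

# id 01 -> 2, id 02 -> 0
df_a.groupby(['id', 'location'])['id'].count().unstack(fill_value=0)['null']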

Another option: set id as the index, compare location against 'null', and sum per index level:

df_a.set_index('id')['location'].eq('null').sum(level=0)

Output:

id
01    2.0
02    0.0
Name: location, dtype: float64
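
Note that Series.sum(level=...) was deprecated in pandas 1.3 and removed in pandas 2.0; on current versions, group by the index level instead (which also returns integer counts rather than floats):

df_a.set_index('id')['location'].eq('null').groupby(level=0).sum()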
