Filter on a specific value in a pandas DataFrame (Python)

Question

I want to create a function that filters on a specific value in a column of a DataFrame.
My DataFrame has the following columns and values:

Zoekterm	High_bias
Man	1
Man	1
Vrouw	1
kind	0

I wrote a function that filters on a specific value, see below:

def most_likey_bias():
    bias = data['High_bias'] == 1
    if bias.any():
        print(data.loc[bias, ['High_bias', 'Zoekterm']])

print(most_likey_bias())

The outcome of the table is:

Zoekterm	High_bias
vrouw	1
kind	1

This table shows which "Zoekterm" values have a "High_bias" of 1.
But because "Zoekterm" contains duplicates of the same name, I want a table that gives a count per Zoekterm: for each "Zoekterm", how many rows have "High_bias" equal to a specific value (1). So the table that I want is:

Zoekterm	High_bias
Man	4
Vrouw	2
kind	5

I tried groupby and count, but I can't get it to work. Could someone give me some tips?

jezrael · Accepted Answer · 2021-06-01 08:08:40Z

Use GroupBy.size on the filtered rows and convert the Series to a DataFrame with Series.reset_index:

def most_likey_bias():
    bias = data['High_bias'] == 1
    if bias.any():
        return data[bias].groupby('Zoekterm').size().reset_index(name='High_bias')

A similar idea is to aggregate with sum:

def most_likey_bias():                                      
    bias = data['High_bias'] == 1                                    
    if bias.any():                                            
        return data[bias].groupby('Zoekterm')['High_bias'].sum().reset_index(name='High_bias')

print (most_likey_bias())
  Zoekterm  High_bias
0      Man          2
1    Vrouw          1
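
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample frame from the question; the function name `count_high_bias`, the `df` and `value` parameters, and dropping the `if bias.any()` guard are my own assumptions for illustration, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame({
    "Zoekterm": ["Man", "Man", "Vrouw", "kind"],
    "High_bias": [1, 1, 1, 0],
})

def count_high_bias(df, value=1):
    """Count, per Zoekterm, the rows where High_bias equals `value`."""
    mask = df["High_bias"] == value
    # Keep only matching rows, count rows per group, and turn the
    # resulting Series back into a two-column DataFrame.
    return df[mask].groupby("Zoekterm").size().reset_index(name="High_bias")

result = count_high_bias(data)
print(result)
```

Passing the frame in as an argument instead of reading a global `data` makes the function easier to test on other frames.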

edited Jun 1, 2021 at 8:08

answered Jun 1, 2021 at 7:45

2 Comments

Leyla Elkhamlichi Over a year ago

Hi jezrael, I get the error "unexpected EOF while parsing". I tried many things but it's not working. This is my code: def most_likey_bias(): bias = data['high_bias'] == 1 if bias.any(): return data[bias].groupby('Zoekterm').size().reset_index(name='High_bias') print(most_likey_bias)

jezrael Over a year ago

@LeylaElkhamlichi - Tested, and it works well for me. You only need to call the function, like print(most_likey_bias()); the () was missing.
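
To illustrate the point about the missing `()`, here is a small sketch with a hypothetical function `greet` (not from the thread): printing the bare name shows the function object, while adding `()` runs it and prints the return value. As a side note, "unexpected EOF while parsing" is a SyntaxError that usually points to an unclosed bracket or parenthesis somewhere in the code, so that is worth checking too.

```python
def greet():
    # Hypothetical stand-in for most_likey_bias()
    return "hello"

without_call = repr(greet)  # text describing the function object itself
with_call = greet()         # the function actually runs here

print(without_call)
print(with_call)
```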
