I want to create a function that filters on a specific value in a column of a DataFrame.
My DataFrame has the following columns and values:

Zoekterm High_bias
Man 1
Man 1
Vrouw 1
kind 0

I wrote a function that filters on a specific value, see below:

def most_likey_bias():
    # Boolean mask: True for rows where High_bias equals 1
    bias = data['High_bias'] == 1
    if bias.any():
        print(data.loc[bias, ['High_bias', 'Zoekterm']])

most_likey_bias()  # the function prints the table itself

The output is:

Zoekterm High_bias
vrouw 1
kind 1

This table shows which "Zoekterm" rows have a value of 1.
But because "Zoekterm" contains duplicates of the same name, I want a table that gives a count per Zoekterm: for each "Zoekterm", how many rows have "High_bias" equal to a specific value (1). So the table that I want is:

Zoekterm High_bias
Man 4
Vrouw 2
kind 5

I tried groupby and count, but I can't get it to work. Could someone give me some tips?

1 Answer

Filter the rows first, then use GroupBy.size and convert the resulting Series to a DataFrame with Series.reset_index:

def most_likey_bias():
    # The column is named 'High_bias' (capital H), as in the question's table
    bias = data['High_bias'] == 1
    if bias.any():
        return data[bias].groupby('Zoekterm').size().reset_index(name='High_bias')
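
To try this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the four rows shown in the question, and the helper name count_high_bias and its df/value parameters are assumptions made for illustration, so the example does not rely on a global data variable. Unlike the function above, it returns an empty DataFrame rather than None when nothing matches.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be representative).
data = pd.DataFrame({
    "Zoekterm": ["Man", "Man", "Vrouw", "kind"],
    "High_bias": [1, 1, 1, 0],
})

def count_high_bias(df, value=1):
    """Count rows per Zoekterm where High_bias equals `value` (hypothetical helper)."""
    mask = df["High_bias"] == value
    # size() counts the rows in each group of the filtered frame
    return df[mask].groupby("Zoekterm").size().reset_index(name="High_bias")

result = count_high_bias(data)
print(result)
```

With this sample, only Man and Vrouw appear in the result, because kind has no row where High_bias is 1.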

A similar approach is to aggregate with sum:

def most_likey_bias():
    bias = data['High_bias'] == 1
    if bias.any():
        return data[bias].groupby('Zoekterm')['High_bias'].sum().reset_index(name='High_bias')

print (most_likey_bias())
  Zoekterm  High_bias
0      Man          2
1    Vrouw          1
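
Note that filtering before grouping drops any Zoekterm that has no matching rows (kind in this output). If those should appear with a count of 0, one option is to skip the filter and sum a boolean mask per group instead. A sketch, with the sample frame again assumed from the question:

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be representative).
data = pd.DataFrame({
    "Zoekterm": ["Man", "Man", "Vrouw", "kind"],
    "High_bias": [1, 1, 1, 0],
})

counts = (
    data["High_bias"].eq(1)        # boolean mask: True where High_bias == 1
    .groupby(data["Zoekterm"])     # group the mask by search term
    .sum()                         # True counts as 1, so this counts matches
    .reset_index(name="High_bias")
)
print(counts)
```

Here every Zoekterm is kept, and terms without any matching row get a count of 0.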

2 Comments

Hi jezrael, I get an error, "unexpected EOF while parsing". I tried many things but it's not working. This is my code: def most_likey_bias(): bias = data['high_bias'] == 1 if bias.any(): return data[bias].groupby('Zoekterm').size().reset_index(name='High_bias') print(most_likey_bias)
@LeylaElkhamlichi - Tested, and it works well for me. You only need to call the function like print (most_likey_bias()); the () was missing.
