I am working with a dataset that looks like the one below (the values are changed, and the real one is much larger):
fruit_type, temp, count
apple, 12, 4
apple, 14, 6
pear, 12, 6
pear, 16, 2
grape, 12, 5
peach, 9, 33
peach, 6, 3
I am trying to use a pandas groupby agg call to find what percentage of its fruit type's total count each row's count represents, for each temp. I would also like a column holding the total count per fruit type. Below is the code that I have been trying:
data3 = data2.groupby('fruit_type')['count'].agg({
    'prob' : lambda count: ((count) / count.sum()),
    'total' : lambda count: count.size
})
The temp values are discrete. I would like the percentage to be computed row by row, with the total being the sum of count within each fruit type. Please let me know what is wrong with my code.
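For reference, here is a minimal sketch of the result described above, not a definitive fix. It assumes data2 is a pandas DataFrame with the three columns shown, and it rebuilds the sample rows inline, reading the grape row and the second peach row as comma-separated (an assumption). The likely sticking point is that agg expects one value per group, while the 'prob' lambda returns one value per row. Recent pandas versions also reject the renaming dict on a single column, and count.size gives the number of rows rather than their sum. The sketch swaps agg for groupby(...).transform('sum'), which returns the group total aligned to every original row.

```python
import pandas as pd

# Sample frame rebuilt from the rows in the question (assumed comma-separated).
data2 = pd.DataFrame({
    "fruit_type": ["apple", "apple", "pear", "pear", "grape", "peach", "peach"],
    "temp": [12, 14, 12, 16, 12, 9, 6],
    "count": [4, 6, 6, 2, 5, 33, 3],
})

# transform("sum") broadcasts each fruit type's total back onto its rows,
# so the result has the same length and index as data2.
group_total = data2.groupby("fruit_type")["count"].transform("sum")

# Use data2["count"] rather than data2.count, which is a DataFrame method.
data3 = data2.assign(
    total=group_total,
    prob=data2["count"] / group_total,
)
```

Multiplying prob by 100 would give a percentage instead of a fraction. Each row keeps its own temp, so the result stays at one row per fruit type and temp combination.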