I have a categorical column, type (house, neighbors, routine), plus four numeric columns. The dataset looks like this:

print(df)

      type    num_before_cleaning      num_after_cleaning    num_before_removing     num_after_removing
0     house          32                       12                    42                      10
1     house          10                       3                     4                       1
2     neighbors      20                       5                     25                      7
3     routine        40                       21                    62                      35 
4     neighbors      14                       2                     21                      9
5     routine        52                       30                    71                      42

For each category in the type column, I want to divide num_before_cleaning by num_after_cleaning and num_before_removing by num_after_removing.

So the outcome would be, for example:

print(house_cleaning)
0.64
print(routine_removing)
0.79

I think I should use np.where, but how can I make it perform a calculation once a specific condition is met? Or is there another way to solve this?

I've tried researching but couldn't find any answers.

  • Can you add expected output? Commented Apr 28, 2020 at 12:05
  • @jezrael I've added the expected output: for each type, it takes the columns ending in cleaning, calculates the sum of num_before_cleaning, and divides it by the sum of num_after_cleaning. The same goes for removing. Commented Apr 28, 2020 at 12:13
  • I only asked to check that I understand what you need, but I've added an answer. If it doesn't work, let me know. Commented Apr 28, 2020 at 12:13

1 Answer

I believe you need:

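# Sum every numeric column within each type
df1 = df.groupby('type').sum()
# pop() returns each source column and drops it from df1, so only the two ratios remain
df1 = df1.assign(clean = df1.pop('num_before_cleaning').div(df1.pop('num_after_cleaning')),
                 remove = df1.pop('num_before_removing').div(df1.pop('num_after_removing')))
print(df1)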
              clean    remove
type                         
house      2.800000  4.181818
neighbors  4.857143  2.875000
routine    1.803922  1.727273
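
On the np.where part of the question: a boolean mask is enough to restrict a calculation to one category. The following is a minimal sketch, assuming the sample data from the question. The helper name ratio_for is hypothetical, and the caller chooses which column is the numerator.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "type": ["house", "house", "neighbors", "routine", "neighbors", "routine"],
    "num_before_cleaning": [32, 10, 20, 40, 14, 52],
    "num_after_cleaning": [12, 3, 5, 21, 2, 30],
    "num_before_removing": [42, 4, 25, 62, 21, 71],
    "num_after_removing": [10, 1, 7, 35, 9, 42],
})


def ratio_for(frame, category, numerator, denominator):
    """Hypothetical helper: ratio of two column sums for a single category."""
    # Boolean mask keeps only the rows of the requested type; np.where is not needed
    rows = frame[frame["type"] == category]
    return rows[numerator].sum() / rows[denominator].sum()


# Same direction as the answer above (before / after) for the house rows
house_cleaning = ratio_for(df, "house", "num_before_cleaning", "num_after_cleaning")
print(house_cleaning)
```

For house this should agree with the clean value in the table above. Swapping the last two arguments gives the inverse ratio.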

6 Comments

Thanks for your answer! But it's not what I meant. The output should be a percentage: calculate the sum of num_before_cleaning and divide it by the sum of num_after_cleaning, and the same for removing. For example, percentage of cleaning for type house: 0.86; percentage of removing for type routine: 0.43.
@Hamad - sorry, I'm trying to understand the percentage. How is 0.86 calculated? The question says 0.64, so I'm not sure about the formula.
Sorry if it caused confusion; it's just an example of the output, not the actual number :)
@Hamad - OK, so what is the formula? I tried to calculate it and still can't get 0.86. :(
df['num_after_cleaning'].sum() / df['num_before_cleaning'].sum(), and the same goes for remove ^^
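
With the formula from the last comment (sum of the "after" column divided by sum of the "before" column, taken per type), a grouped version could look like the sketch below. It assumes the sample data from the question; the output column names clean and remove are borrowed from the answer, not something the asker specified.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "type": ["house", "house", "neighbors", "routine", "neighbors", "routine"],
    "num_before_cleaning": [32, 10, 20, 40, 14, 52],
    "num_after_cleaning": [12, 3, 5, 21, 2, 30],
    "num_before_removing": [42, 4, 25, 62, 21, 71],
    "num_after_removing": [10, 1, 7, 35, 9, 42],
})

# Column sums per type
sums = df.groupby("type").sum()

# after / before, as described in the comment
ratios = pd.DataFrame({
    "clean": sums["num_after_cleaning"] / sums["num_before_cleaning"],
    "remove": sums["num_after_removing"] / sums["num_before_removing"],
})
print(ratios.round(2))

# Single values can be looked up by type and column
house_cleaning = ratios.loc["house", "clean"]
routine_removing = ratios.loc["routine", "remove"]
```

For the sample data this gives roughly 0.36 for house cleaning and 0.58 for routine removing.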