2

I need to create a new column in a pandas DataFrame which is calculated as the ratio of 2 existing columns in the DataFrame. However, the denominator in the ratio calculation will change based on the value of a string which is found in another column in the DataFrame.

Example. Sample dataset :

import pandas as pd
df = pd.DataFrame(data={'hand'      : ['left','left','both','both'], 
                        'exp_force' : [25,28,82,84], 
                        'left_max'  : [38,38,38,38], 
                        'both_max'  : [90,90,90,90]})

I need to create a new DataFrame column df['ratio'] based on the condition of df['hand'].

If df['hand']=='left' then df['ratio'] = df['exp_force'] / df['left_max']

If df['hand']=='both' then df['ratio'] = df['exp_force'] / df['both_max']

1
  • 1
    np.where(df.hand.eq('left'), df['exp_force'] / df['left_max'], df['exp_force'] / df['both_max']) Commented Aug 22, 2020 at 9:02

3 Answers 3

3

You can use np.where():

import pandas as pd
df = pd.DataFrame(data={'hand'      : ['left','left','both','both'], 
                        'exp_force' : [25,28,82,84], 
                        'left_max'  : [38,38,38,38], 
                        'both_max'  : [90,90,90,90]})
df['ratio'] = np.where((df['hand']=='left'), df['exp_force'] / df['left_max'], df['exp_force'] / df['both_max'])
df

Out[42]: 
   hand  exp_force  left_max  both_max     ratio
0  left         25        38        90  0.657895
1  left         28        38        90  0.736842
2  both         82        38        90  0.911111
3  both         84        38        90  0.933333

Alternatively, in a real-life scenario, if you have lots of conditions and results, then you can use np.select(), so that you don't have to keep repeating your np.where() statement as I have done a lot in my older code. It's better to use np.select in these situations:

import pandas as pd
df = pd.DataFrame(data={'hand'      : ['left','left','both','both'], 
                        'exp_force' : [25,28,82,84], 
                        'left_max'  : [38,38,38,38], 
                        'both_max'  : [90,90,90,90]})
c1 = (df['hand']=='left')
c2 = (df['hand']=='both')
r1 = df['exp_force'] / df['left_max']
r2 = df['exp_force'] / df['both_max']
conditions = [c1,c2]
results = [r1,r2]
df['ratio'] = np.select(conditions,results)
df
Out[430]: 
   hand  exp_force  left_max  both_max     ratio
0  left         25        38        90  0.657895
1  left         28        38        90  0.736842
2  both         82        38        90  0.911111
3  both         84        38        90  0.933333
Sign up to request clarification or add additional context in comments.

Comments

2

You can use the apply() method of your dataframe :

df['ratio'] = df.apply(
    lambda x: x['exp_force'] / x['left_max'] if x['hand']=='left' else x['exp_force'] / x['both_max'],
    axis=1
)

2 Comments

make sure to copy and paste the code into your answer and not images.
this will be quite slow, compared to the numpy method
1

Enumerate

for i,e in enumerate(df['hand']):
 
  if e == 'left':
    df.at[i,'ratio'] = df.at[i,'exp_force'] / df.at[i,'left_max']
  if e == 'both':
    df.at[i,'ratio'] = df.at[i,'exp_force'] / df.at[i,'both_max']
df

Output:

    hand    exp_force   left_max    both_max    ratio
0   left    25            38          90      0.657895
1   left    28            38          90      0.736842
2   both    82            38          90      0.911111
3   both    84            38          90      0.933333

1 Comment

this will be quite slow compared to using numpy

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.