I would like to perform operations on Pandas dataframes using fixed columns, rows, or values.

For example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a':(1,2,3), 'b':(4,5,6), 'c':(7,8,9), 'd':(10,11,12),
                  'e':(13,14,15)})

df
Out[57]: 
   a  b  c   d   e
0  1  4  7  10  13
1  2  5  8  11  14
2  3  6  9  12  15

I want to use the values in columns 'a' and 'b' as fixed values.


# It's easy enough to perform the operation I want on one column at a time:
df.loc[:,'f'] = df.loc[:,'c'] + df.loc[:,'a'] + df.loc[:,'b']

# It gets cumbersome if there are many columns to perform the operation on though:
df.loc[:,'g'] = df.loc[:,'d'] / df.loc[:,'a'] * df.loc[:,'b']
df.loc[:,'h'] = df.loc[:,'e'] / df.loc[:,'a'] * df.loc[:,'b']
# etc.

# This returns columns with all NaN values.
df.loc[:,('f','g','h')] = df.loc[:,'c':'e'] / df.loc[:'a']

Is there an idiomatic way to do this in Pandas? I could not find working solutions in the Pandas documentation or this SO thread. I don't think I can use .map() or .applymap(), because I'm under the impression they can only be used for simple equations (one input value). Thanks for reading.

2 Answers

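Use the div and mul methods with axis=0 instead of the / and * operators, so that the Series is aligned with the row index rather than the column labels: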

df[['g', 'h']] = df[['d', 'e']].div(df['a'], axis=0).mul(df['b'], axis=0)
print(df)

# Output
   a  b  c   d   e     g     h
0  1  4  7  10  13  40.0  52.0
1  2  5  8  11  14  27.5  35.0
2  3  6  9  12  15  24.0  30.0
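
A minimal sketch of why axis=0 matters here, reusing the question's data. The names misaligned and aligned are only illustrative. With the plain operator, a Series is matched against the DataFrame's column labels, so nothing lines up; with axis=0 it is matched against the row index.

```python
import pandas as pd

df = pd.DataFrame({'a': (1, 2, 3), 'b': (4, 5, 6), 'c': (7, 8, 9),
                   'd': (10, 11, 12), 'e': (13, 14, 15)})

# Plain `/` matches the Series index (0, 1, 2) against the column
# labels ('d', 'e'). No labels match, so every cell becomes NaN.
misaligned = df[['d', 'e']] / df['a']

# axis=0 matches the Series index against the row index instead,
# so each row is divided by its own 'a' and multiplied by its own 'b'.
aligned = df[['d', 'e']].div(df['a'], axis=0).mul(df['b'], axis=0)

print(misaligned.isna().all().all())
print(aligned)
```

This is probably also why the all-NaN result in the question appears: the operands were aligned on labels that do not overlap.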

With numpy:

arr = df.to_numpy()
arr[:, [3, 4]] / arr[:, [0]] * arr[:, [1]]

# Output
array([[40. , 52. ],
       [27.5, 35. ],
       [24. , 30. ]])
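
A short sketch of what makes the numpy version work, assuming the same df as in the question. Indexing with a list such as [0] keeps the column two-dimensional, with shape (3, 1), so it broadcasts across the selected columns. The variable names and the write-back into new columns g and h are illustrative additions, not part of the snippet above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': (1, 2, 3), 'b': (4, 5, 6), 'c': (7, 8, 9),
                   'd': (10, 11, 12), 'e': (13, 14, 15)})

arr = df.to_numpy()

# A list index keeps the column 2-D, which is what broadcasting needs.
divisor = arr[:, [0]]      # shape (3, 1)
multiplier = arr[:, [1]]   # shape (3, 1)

# (3, 2) / (3, 1) * (3, 1) broadcasts row by row.
result = arr[:, [3, 4]] / divisor * multiplier

# Write the computed columns back into the DataFrame one at a time.
df['g'] = result[:, 0]
df['h'] = result[:, 1]
print(df)
```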

As @Corralien pointed out, it's better to use Pandas dataframe methods such as .div(), but I also found that how .loc[] is used matters.

# Doesn't work:
df.loc[:,['f','g','h']] = df.loc[:,'c':'e'].div(df.loc[:'a'], axis=0)

# Doesn't work:
df[['f','g','h']] = df.loc[:,'c':'e'].div(df.loc[:'a'], axis=0)

# Now works.
df[['f','g','h']] = df.loc[:,'c':'e'].div(df['a'], axis=0)

At the moment, I'm not exactly sure why this is. Any insight would be helpful, thanks.
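
One likely explanation, offered as a sketch rather than a definitive diagnosis: df.loc[:'a'] has no comma, so 'a' is read as a row label to slice up to, not as a column name. Selecting the column needs df.loc[:, 'a'] or df['a']. The exact outcome of the row slice depends on the pandas version, so the example below only records it instead of assuming it.

```python
import pandas as pd

df = pd.DataFrame({'a': (1, 2, 3), 'b': (4, 5, 6), 'c': (7, 8, 9),
                   'd': (10, 11, 12), 'e': (13, 14, 15)})

# With the comma, 'a' is a column label: this is the Series df['a'].
col_a = df.loc[:, 'a']

# Without the comma, 'a' is treated as a row label to slice up to.
# The index here is 0, 1, 2, so this is not a column selection;
# depending on the pandas version it may raise instead of returning rows.
try:
    outcome = type(df.loc[:'a']).__name__
except (TypeError, KeyError) as exc:
    outcome = type(exc).__name__
print(outcome)

# Dividing by the column, aligned on the row index, works as intended.
result = df.loc[:, 'c':'e'].div(col_a, axis=0)
print(result)
```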
