47

I have a dataframe with 10 columns. I want to add a new column 'age_bmi' which should be a calculated column multiplying 'age' * 'bmi'. age is an INT, bmi is a FLOAT.

That then creates the new dataframe with 11 columns.

Something I am doing isn't quite right. I think it's a syntax issue. Any ideas?

Thanks

df2['age_bmi'] = df(['age'] * ['bmi'])
print(df2)

4 Answers

58

Try df2['age_bmi'] = df.age * df.bmi (df2 has to exist already, for example as df2 = df.copy()).

You're trying to call the dataframe as a function, when what you need is the values of the columns. You can access a column by key, like a dictionary, or as an attribute if its name is a valid Python identifier (no spaces) that doesn't clash with an existing DataFrame attribute or method.
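Here is a minimal sketch of the two access styles. The sample values are made up for illustration; only the column names 'age' and 'bmi' come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"age": [25, 40, 58], "bmi": [22.5, 27.1, 30.4]})

# Key access works for any column name.
by_key = df["age"] * df["bmi"]

# Attribute access works only when the name is a valid identifier that
# doesn't collide with an existing DataFrame attribute or method.
by_attr = df.age * df.bmi

# Copy first so the original frame keeps its original columns.
df2 = df.copy()
df2["age_bmi"] = by_key
print(df2)
```

Both expressions return the same Series; key access is the safer habit because it keeps working when a column is called something like 'count' or 'body mass'.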

Someone linked this in a comment the other day and it's pretty awesome. I recommend giving it a watch, even if you don't do the exercises: https://www.youtube.com/watch?v=5JnMutdy6Fw


6 Comments

Perfect thanks Cory, I will check out that video as well
I checked out the first hour of that video so far, it is fantastic. Thank you for that link! The guy has a great flow to his teaching.
Awesome, glad you're enjoying it. I'm still watching it myself, but in the first hour I was like "oh sh!t!" like 3 times in awe over the cool stuff you can do with it.
This either gives me an error that df2 is undefined or, if I use df, it says: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead. See the caveats in the documentation: pandas.pydata.org/pandas-docs/stable/user_guide/…
Is there a way to have the columns auto-computed, like columns in Excel, where changing one column or row automatically updates the values in other columns that depend on it, without having to run apply or anything explicitly over the frame?
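On the SettingWithCopyWarning mentioned in the comments: that warning typically shows up when df was itself produced by slicing or filtering another DataFrame. A minimal sketch of one common way around it, assuming that is the cause here (the names full and subset and the age filter are hypothetical):

```python
import pandas as pd

# Hypothetical data; 'full' stands in for whatever frame df was cut from.
full = pd.DataFrame({"age": [25, 40, 58], "bmi": [22.5, 27.1, 30.4]})

# Take an explicit copy of the filtered rows so that the assignment
# below targets an independent frame rather than a slice of 'full'.
subset = full[full["age"] > 30].copy()
subset["age_bmi"] = subset["age"] * subset["bmi"]
print(subset)
```

The new column ends up on subset only; full is left untouched.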
27

As Cory pointed out, you're calling a dataframe as a function, which won't work as you expect. Here are 4 ways to multiply two columns; in most cases you'd use the first method.

In [299]: df['age_bmi'] = df.age * df.bmi

or,

In [300]: df['age_bmi'] = df.eval('age*bmi')

or,

In [301]: df['age_bmi'] = pd.eval('df.age*df.bmi')

or,

In [302]: df['age_bmi'] = df.age.mul(df.bmi)
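To check that the four variants agree, here is a small sketch on made-up data. The values are hypothetical, and local_dict is passed to pd.eval only so the lookup of df does not depend on the calling scope.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"age": [25, 40, 58], "bmi": [22.5, 27.1, 30.4]})

results = {
    "operator": df.age * df.bmi,
    "df.eval": df.eval("age * bmi"),
    "pd.eval": pd.eval("df.age * df.bmi", local_dict={"df": df}),
    "mul": df.age.mul(df.bmi),
}

# Compare every variant against the plain operator version.
baseline = results["operator"]
for name, values in results.items():
    print(name, values.equals(baseline))
```

The eval variants mostly pay off on large frames or long expressions; for a single multiplication the plain operator is the easiest to read.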


4

You have put age and bmi together inside the parentheses and are treating df as a function rather than a dataframe. Instead, index df by column name to get each column:

df2['age_bmi'] = df['age'] * df['bmi']


4

You can also use assign:

df2 = df.assign(age_bmi=df['age'] * df['bmi'])
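assign returns a new DataFrame and leaves df unchanged, which matches the "new dataframe with 11 columns" in the question. It also accepts a callable, which is handy inside a method chain. A small sketch on made-up data (the values and the lambda parameter name d are hypothetical):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"age": [25, 40, 58], "bmi": [22.5, 27.1, 30.4]})

# The callable receives the frame being assigned to, so it keeps
# working even when the frame has no name of its own mid-chain.
df2 = df.assign(age_bmi=lambda d: d["age"] * d["bmi"])

print(df.shape)   # original column count
print(df2.shape)  # one extra column
```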

