I have a pandas DataFrame as follows:

import pandas as pd
raw_data = {'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
'age': [20, 19, 22, 21],
'favorite_color': ['blue', 'blue', 'yellow', "green"],
'grade': [88, 92, 95, 70]}

df = pd.DataFrame(raw_data)

I want to divide the numeric values in the age and grade columns by a divisor that depends on the favorite_color column: 125.0 for blue, 130.0 for yellow, and 135.0 for green. The results must be inserted into new columns age_new and grade_new. The code below raises an error.

df['age_new'] =(df.loc[df['favorite_color']=='blue']/125.0)
df['age_new'] =(df.loc[df['favorite_color']=='yellow']/130.0)
df['age_new'] =(df.loc[df['favorite_color']=='green']/135.0)
df['grade_new'] =(df.loc[df['favorite_color']=='blue']/125.0)
df['grade_new'] =(df.loc[df['favorite_color']=='yellow']/130.0)
df['grade_new'] =(df.loc[df['favorite_color']=='green']/135.0)

Error:

TypeError: unsupported operand type(s) for /: 'str' and 'int'

2 Answers

Use map to look up the divisor for each row's favorite_color:

mods = {'blue': 125, 'yellow': 130, 'green': 135}

df.assign(
    mods=df.favorite_color.map(mods),
    age_new=lambda d: d.age / d.mods,
    grade_new=lambda d: d.grade / d.mods
)

               name  age favorite_color  grade  mods   age_new  grade_new
0    Willard Morris   20           blue     88   125  0.160000   0.704000
1       Al Jennings   19           blue     92   125  0.152000   0.736000
2      Omar Mullins   22         yellow     95   130  0.169231   0.730769
3  Spencer McDaniel   21          green     70   135  0.155556   0.518519

A similar approach, using div and join so that no helper mods column is added:

mods = {'blue': 125, 'yellow': 130, 'green': 135}

df.join(df[['age', 'grade']].div(df.favorite_color.map(mods), axis=0).add_suffix('_new'))

               name  age favorite_color  grade   age_new  grade_new
0    Willard Morris   20           blue     88  0.160000   0.704000
1       Al Jennings   19           blue     92  0.152000   0.736000
2      Omar Mullins   22         yellow     95  0.169231   0.730769
3  Spencer McDaniel   21          green     70  0.155556   0.518519
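
As for why the original attempt fails: df.loc[df['favorite_color']=='blue'] selects entire rows, including the string columns name and favorite_color, so the division is attempted on strings. If you prefer to keep the mask-based style, a minimal sketch (assuming the same data as in the question; the divisors dict and the loop are only for illustration) restricts each division to one numeric column at a time:

```python
import pandas as pd

raw_data = {'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
            'age': [20, 19, 22, 21],
            'favorite_color': ['blue', 'blue', 'yellow', 'green'],
            'grade': [88, 92, 95, 70]}
df = pd.DataFrame(raw_data)

# Illustrative mapping from color to divisor (values taken from the question).
divisors = {'blue': 125.0, 'yellow': 130.0, 'green': 135.0}

# Create the target columns up front so every row has a defined (NaN) value.
for col in ['age', 'grade']:
    df[col + '_new'] = float('nan')

for color, divisor in divisors.items():
    mask = df['favorite_color'] == color
    for col in ['age', 'grade']:
        # Select only the matching rows of a single numeric column, then divide.
        df.loc[mask, col + '_new'] = df.loc[mask, col] / divisor

print(df)
```

This does the same work as the map version, just with one pass per color, so map is usually the shorter option.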

You can use .replace instead of .loc, so that you only perform the operation once.

import pandas as pd

raw_data = {
    'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
    'age': [20, 19, 22, 21],
    'favorite_color': ['blue', 'blue', 'yellow', "green"],
    'grade': [88, 92, 95, 70]}

df = pd.DataFrame(raw_data)

color_d = {
    "blue": 125,
    "yellow": 130,
    "green": 135
}

df[["age_new", "grade_new"]] = df[["age", "grade"]].div(
    df['favorite_color'].replace(color_d), 
    axis=0)

df.head()

Which gives

               name  age favorite_color  grade   age_new  grade_new
0    Willard Morris   20           blue     88  0.160000   0.704000
1       Al Jennings   19           blue     92  0.152000   0.736000
2      Omar Mullins   22         yellow     95  0.169231   0.730769
3  Spencer McDaniel   21          green     70  0.155556   0.518519
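
One caveat worth keeping in mind: .replace leaves any value that is not a key in the dict unchanged, so a color missing from color_d would stay a string and the division would likely fail with a TypeError much like the one in the question. A small sketch of the alternative, assuming a hypothetical extra color 'red' that has no divisor, uses .map, which yields NaN for missing keys:

```python
import pandas as pd

df = pd.DataFrame({
    'age': [20, 22, 30],
    'favorite_color': ['blue', 'yellow', 'red'],  # 'red' is a hypothetical unmapped color
    'grade': [88, 95, 60],
})

color_d = {"blue": 125, "yellow": 130, "green": 135}

# map returns NaN for colors that are not keys in color_d,
# so the divisor stays numeric and the division does not raise.
divisor = df['favorite_color'].map(color_d)

df['age_new'] = df['age'] / divisor
df['grade_new'] = df['grade'] / divisor

print(df)
```

Rows with an unmapped color end up with NaN in age_new and grade_new, which you can then detect with isna() or fill with a default.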
