Conditionally fill column values based on another column's value in pandas

Question

I have a DataFrame with a few columns. One column contains a symbol for the currency being used, for instance a euro or a dollar sign. Another column contains a budget value. So one row could mean a budget of 5000 in euros, and the next row a budget of 2000 in dollars.

In pandas I would like to add an extra column to my DataFrame that normalizes the budgets to euros. For each row, the value in the new column should be the budget value * 1 if the symbol in the currency column is a euro sign, and the budget value * 0.78125 if it is a dollar sign.

I know how to add a column, fill it with values, copy values from another column etc. but not how to fill the new column conditionally based on the value of another column.

Any suggestions?

Wes McKinney · Accepted Answer · 2012-05-23 19:05:57Z

131

You probably want to do

df['Normalized'] = np.where(df['Currency'] == '$', df['Budget'] * 0.78125, df['Budget'])
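For anyone who wants to try this end to end, here is a minimal sketch. The sample data is made up for illustration; the column names and the 0.78125 rate come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question
df = pd.DataFrame({
    "Currency": ["€", "$", "$"],
    "Budget": [5000, 2000, 1000],
})

# Where the currency is '$', convert to euros; otherwise keep the budget as-is
df["Normalized"] = np.where(
    df["Currency"] == "$",
    df["Budget"] * 0.78125,
    df["Budget"],
)
print(df)
```

np.where evaluates both branches for the whole column and then picks per row, so it stays vectorized and tends to be much faster than a row-by-row loop.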

answered May 23, 2012 at 19:05

Wes McKinney

2 Comments

jake wong Over a year ago

Is it possible to do something like this but with words instead of numbers?
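If "words" here means producing text labels rather than numbers, np.where accepts strings for both branches as well. A small sketch, assuming a hypothetical CurrencyName column that is not part of the original question:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"Currency": ["€", "$", "€"]})

# Both branches are plain strings instead of numeric expressions
df["CurrencyName"] = np.where(df["Currency"] == "$", "dollar", "euro")
print(df)
```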

K. Mitra Over a year ago

df['Qnty'] = np.where(df['Quantity'].str.extract('([a-z]+)') == 'g', df['Quantity'].str.extract('(\d+)').astype(int) / 1000, df['Quantity'].str.extract('(\d+)').astype(int)) don't know if anyone requires this or not, but still I posted.
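A tidier variant of the same idea, as a sketch: it assumes values shaped like "500g" or "2kg" (made-up sample data), uses raw strings for the regular expressions, and passes expand=False so that str.extract returns a Series rather than a one-column DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: an integer amount followed by a lowercase unit
df = pd.DataFrame({"Quantity": ["500g", "2kg", "250g"]})

# Extract each part once and reuse it
unit = df["Quantity"].str.extract(r"([a-z]+)", expand=False)
amount = df["Quantity"].str.extract(r"(\d+)", expand=False).astype(int)

# Convert grams to kilograms; leave other units unchanged
df["Qnty"] = np.where(unit == "g", amount / 1000, amount)
print(df)
```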

shortorian · Accepted Answer · 2021-01-27 22:08:13Z

25

An option that doesn't require an additional import for numpy:

df['Normalized'] = df['Budget'].where(df['Currency'] != '$', df['Budget'] * 0.78125)
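One thing to watch with Series.where: it keeps the original value on rows where the condition is True and substitutes the other value where it is False, which is the opposite argument order from np.where. Series.mask is the mirror image. A short sketch with made-up data showing both:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"Currency": ["€", "$"], "Budget": [5000, 2000]})
converted = df["Budget"] * 0.78125

# where: keep Budget on rows where the condition is True, else use `converted`
via_where = df["Budget"].where(df["Currency"] != "$", converted)

# mask: replace Budget on rows where the condition is True
via_mask = df["Budget"].mask(df["Currency"] == "$", converted)

print(via_where.tolist())
print(via_mask.tolist())
```

Both calls give the same result here, so which one to use is mostly a matter of which condition reads more naturally.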

answered Jan 27, 2021 at 22:08

shortorian

Prince Kumar Sharma · Accepted Answer · 2019-06-12 16:24:36Z

22

A similar result, in a different style, is to write a function that performs the operation you want on a single row, using row['fieldname'] syntax to access individual values/columns, and then pass it to the DataFrame.apply method.

This echoes the answer to the question linked here: pandas create new column based on values from other columns

def normalise_row(row):
    if row['Currency'] == '$':
        ...
    ...
    ...
    return result

df['Normalized'] = df.apply(lambda row : normalise_row(row), axis=1)
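One possible way to fill in the elided body for the question's case, as a sketch rather than the answerer's exact code. The USD_TO_EUR name and the sample data are made up here; the rate is the one from the question.

```python
import pandas as pd

USD_TO_EUR = 0.78125  # rate taken from the question; the constant name is hypothetical

def normalise_row(row):
    """Return the row's budget expressed in euros."""
    if row["Currency"] == "$":
        return row["Budget"] * USD_TO_EUR
    return row["Budget"]

# Hypothetical sample data
df = pd.DataFrame({"Currency": ["€", "$"], "Budget": [5000, 2000]})

# Passing the function directly is equivalent to wrapping it in a lambda
df["Normalized"] = df.apply(normalise_row, axis=1)
print(df)
```

apply with axis=1 calls the function once per row, so it is flexible but usually slower than the vectorized alternatives on large frames.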

edited Jun 12, 2019 at 16:24

Prince Kumar Sharma

answered Feb 23, 2017 at 14:05

Thomas Kimber

1 Comment

Teepeemm Over a year ago

Should that be lambda row:normalise_row(row)? And couldn't you replace the whole thing with just normalise_row?

Artem Yevtushenko · Accepted Answer · 2017-11-20 03:35:29Z

Taking Tom Kimber's suggestion one step further, you could use a Function Dictionary to set various conditions for your functions. This solution is expanding the scope of the question.

I'm using an example from a personal application.

# write the dictionary

def applyCalculateSpend(df_name, cost_method_col, metric_col, rate_col, total_planned_col):
    # df_name is a single row here, because the function is called through apply(axis=1)
    calculations = {
        'CPMV': df_name[metric_col] / 1000 * df_name[rate_col],
        'Free': 0,
    }
    df_method = df_name[cost_method_col]
    return calculations.get(df_method, "not in dict")

# call the function inside a lambda

test_df['spend'] = test_df.apply(
    lambda row: applyCalculateSpend(
        row,
        cost_method_col='cost method',
        metric_col='metric',
        rate_col='rate',
        total_planned_col='total planned'),
    axis=1)

  cost method  metric  rate  total planned  spend
0        CPMV    2000   100           1000  200.0
1        CPMV    4000   100           1000  400.0
4        Free       1     2              3    0.0
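A variation on the same idea, sketched under my own assumptions: storing callables in the dictionary means only the calculation for the matching cost method is evaluated for each row, instead of computing every entry up front. The CALCULATIONS and calculate_spend names, the None fallback, and the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical mapping from cost method to a per-row calculation
CALCULATIONS = {
    "CPMV": lambda row: row["metric"] / 1000 * row["rate"],
    "Free": lambda row: 0.0,
}

def calculate_spend(row):
    """Look up the calculation for this row's cost method and run it."""
    calc = CALCULATIONS.get(row["cost method"])
    # Unknown methods yield None instead of raising (an assumption of this sketch)
    return calc(row) if calc is not None else None

# Hypothetical sample data modelled on the table above
test_df = pd.DataFrame({
    "cost method": ["CPMV", "CPMV", "Free"],
    "metric": [2000, 4000, 1],
    "rate": [100, 100, 2],
})
test_df["spend"] = test_df.apply(calculate_spend, axis=1)
print(test_df)
```

Adding a new cost method then only needs a new dictionary entry, and a calculation that would fail for unrelated rows (a division by zero, say) is never run for them.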

Dudelstein · Accepted Answer · 2025-04-02 14:39:48Z

4

The pandas loc indexer can also be used, without importing numpy:

# First assign Budget to the entire Normalized column
df['Normalized'] = df['Budget']
# Then convert to euros where Currency equals the dollar sign
df.loc[df['Currency'] == '$', 'Normalized'] = df['Budget'] * 0.78125
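A runnable sketch of the same two-step approach with made-up data. It casts the new column to float first, on the assumption that Budget holds integers: recent pandas versions can warn when float values are written into an integer column.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Currency": ["€", "$", "$"],
    "Budget": [5000, 2000, 1000],
})
is_dollar = df["Currency"] == "$"

# Start from the raw budget, as float so the converted values fit without a dtype change
df["Normalized"] = df["Budget"].astype(float)

# Overwrite only the dollar rows with their euro equivalent
df.loc[is_dollar, "Normalized"] = df.loc[is_dollar, "Budget"] * 0.78125
print(df)
```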

edited Apr 2 at 14:39

answered Aug 25, 2023 at 10:34

Dudelstein

kamran kausar · Accepted Answer · 2023-08-30 16:03:35Z

1

df.loc[df['col1'].isnull(), 'col2'] = values
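To make the one-liner concrete, here is a small sketch. The col1/col2 names follow the line above; the data and the "missing" fill value are made up.

```python
import pandas as pd

# Hypothetical sample data: col1 has one missing entry
df = pd.DataFrame({
    "col1": [1.0, None, 3.0],
    "col2": ["a", "b", "c"],
})

# Overwrite col2 only on rows where col1 is null
df.loc[df["col1"].isnull(), "col2"] = "missing"
print(df)
```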

answered Aug 30, 2023 at 16:03

kamran kausar
