Create new rows in a Pandas Dataframe based on a column from another pandas dataframe

Question

I have a dataframe DF1 which looks like this:

Account Name	Task Type	Flag	Cost
Account 1	Repair	True	$100
Account 2	Repair	True	$200
Account 3	Repair	False	$300

DF2 looks like this:

Country	Percentage
US	30%
Canada	20%
India	50%

I want to create DF3 based on DF1 & DF2 by doing the following:

Filter rows with where the Flag = True
Create a new column 'Calculated_Cost' which will multiply the 'Cost' column in DF1 with percentage column of DF2 & create multiple rows based on the number of rows in DF2

The Final output would look like this:

Account Name	Task Type	Flag	Cost	Country	Calculated_Cost
Account 1	Repair	True	$100	US	$30
Account 1	Repair	True	$100	Canada	$20
Account 1	Repair	True	$100	India	$50
Account 2	Repair	True	$200	US	$60
Account 2	Repair	True	$200	Canada	$40
Account 2	Repair	True	$200	India	$100
Account 3	Repair	False	$300	Nan	Nan

Welcome to SO. Your problem is well formulated, but what have you tried? Is there a minimal reproducible example ? — jlandercy
– jlandercy, Commented May 10, 2022 at 5:13
Hello... I am a beginner level coder and i am a bit stumped with the problem. I have tried multiple approaches such as iterating through rows in the dataframe, building a function which will do the multiplication etc. but I just dont have the skill to solve this... I only have bits and pieces of code that I have written but i dont think that will be helpful for you guys — Anuj Mandal
– Anuj Mandal, Commented May 10, 2022 at 5:21

jezrael · Accepted Answer · 2022-05-10 05:35:41Z

1

Use:

df1['Cost'] = df1['Cost'].str.lstrip('$').astype(int)
df2['Percentage'] = df2['Percentage'].str.rstrip('%').astype(int).div(100)

df = pd.concat([df1[df1['Flag']].merge(df2, how='cross'), df1[~df1['Flag']]])
df['Calculated_Cost'] = df['Cost'].mul(df.pop('Percentage'))
print (df)
  Account Name Task Type   Flag  Cost Country  Calculated_Cost
0    Account 1    Repair   True   100      US             30.0
1    Account 1    Repair   True   100  Canada             20.0
2    Account 1    Repair   True   100   India             50.0
3    Account 2    Repair   True   200      US             60.0
4    Account 2    Repair   True   200  Canada             40.0
5    Account 2    Repair   True   200   India            100.0
2    Account 3    Repair  False   300     NaN              NaN

answered May 10, 2022 at 5:35

jezrael

868k103 gold badges1.4k silver badges1.3k bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Anuj Mandal Over a year ago

Hello... thanks for the answer Jezrael.. I tried the above code but I am getting this error: raise KeyError(f"None of [{key}] are in the [{axis_name}]") KeyError: "None of [Index(['True', 'True', 'False'], dtype='object')] are in the [columns]"

Emilio Silva Over a year ago

In @jezrael 's example, he converted your flags to booleans. Use df1.Flag = df1.Flag == 'True' to convert them to booleans.

Anuj Mandal · Accepted Answer · 2022-05-11 10:02:44Z

0

I am sure there is a more efficient way to do this... but I got it done using the following code:

import pandas as pd

df1 = pd.DataFrame(
    {
     'Account Name': ['Account 1', 'Account 2', 'Account 3'],
     'Task Type': ['Repair', 'Repair', 'Repair'],
     'Flag': ['True', 'True', 'False'],
     'Cost': ['$100', '$200', '$300']
       }
    )

df2 = pd.DataFrame(
    {
     'Country': ['US', 'Canada', 'India'],
     'Percentage': ['30%', '20%', '50%']
       }
    )

df1['Cost'] = df1['Cost'].str.lstrip('$').astype(int)
df2['Percentage'] = df2['Percentage'].str.rstrip('%').astype(int).div(100)
filtered_df_true = df1.loc[df1['Flag'] == 'True'] 
filtered_df_false = df1.loc[df1['Flag'] == 'False']
df3 = filtered_df_true.assign(key=1).merge(df2.assign(key=1), how = 'outer', on='key')
df3['Calculated Cost'] = df3['Cost']*df3['Percentage']
frames = [df3, filtered_df_false]
result = pd.concat(frames)
result.pop('key')
result.pop('Percentage')
print(result)

answered May 11, 2022 at 10:02

Anuj Mandal

11 bronze badge

1 Comment

Emilio Silva Over a year ago

Since this is your own attempt, please edit your question and add it to the end.

Collectives™ on Stack Overflow

Create new rows in a Pandas Dataframe based on a column from another pandas dataframe

2 Answers 2

2 Comments

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

2 Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related