I have a DataFrame in the format below:
year type1 type2 price
2015 apple natural 40
2015 apple organic 35
2016 apple natural 44
2016 apple organic 40
2015 banana natural 20
2015 banana organic 15
2016 banana natural 20
2016 banana organic 18
I need to create a new column, price_new, that is filled when conditions on year, type1 and type2 are met. In other words, for a given year and type1, if type2 is natural, the new column should get a new value; otherwise it should keep the old value.
I tried the following:
df["price_new"] = np.where(((df["year"] == 2015) & (
df["type1"] == "apple") & (df["type2"].isin(['natural']))),
25, df["price"])
df["price_new"] = np.where(((df["year"] == 2016) & (
df["type1"] == "apple") & (df["type2"].isin(['natural']))),
26, df["price"])
df["price_new"] = np.where(((df["year"] == 2015) & (
df["type1"] == "apple") & (~df["type2"].isin(['natural']))),
20, df["price"])
df["price_new"] = np.where(((df["year"] == 2016) & (
df["type1"] == "apple") & (~df["type2"].isin(['natural']))),
22, df["price"])
The expected output is:
year type1 type2 price price_new
2015 apple natural 40 25
2015 apple organic 35 20
2016 apple natural 44 26
2016 apple organic 40 22
2015 banana natural 20
2015 banana organic 15
2016 banana natural 20
2016 banana organic 18
However, only the values from the last condition end up in the column:
year type1 type2 price price_new
2015 apple natural 40 40
2015 apple organic 35 35
2016 apple natural 44 44
2016 apple organic 40 22
2015 banana natural 20
2015 banana organic 15
2016 banana natural 20
2016 banana organic 18
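The overwrite seems to follow from the code itself: every np.where call falls back to df["price"], so each assignment throws away what the previous one wrote. Below is a minimal sketch of one possible way around that, assuming np.select is acceptable here. It evaluates all four conditions in one pass. The sample frame is rebuilt from the table above, and is_apple / is_natural are helper names introduced only for this illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "year": [2015, 2015, 2016, 2016, 2015, 2015, 2016, 2016],
    "type1": ["apple"] * 4 + ["banana"] * 4,
    "type2": ["natural", "organic"] * 4,
    "price": [40, 35, 44, 40, 20, 15, 20, 18],
})

# Helper masks (names are illustrative, not from the original code).
is_apple = df["type1"] == "apple"
is_natural = df["type2"].isin(["natural"])

# Same four conditions as the four np.where calls, in the same order.
conditions = [
    (df["year"] == 2015) & is_apple & is_natural,
    (df["year"] == 2016) & is_apple & is_natural,
    (df["year"] == 2015) & is_apple & ~is_natural,
    (df["year"] == 2016) & is_apple & ~is_natural,
]
choices = [25, 26, 20, 22]

# Rows that match no condition fall back to the old price.
df["price_new"] = np.select(conditions, choices, default=df["price"])
print(df)
```

With this sketch the banana rows keep their old price, which matches the "otherwise keep the old value" rule rather than the blank cells in the tables.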
- How can the price_new column get the new values for all conditions?
- In the real data I have more than 10 types in the type1 column. Is there an efficient way to write this instead of repeating it for each unique value in the type1 column?
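For the many-types case, one approach that might scale better is to keep the new prices in a small lookup table and left-merge it onto the data, so no condition has to be written per type. This is only a sketch under assumptions: new_prices is a hypothetical table made up for this example, and it lists "organic" explicitly, whereas the code in the question treats every non-natural type2 the same way.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "year": [2015, 2015, 2016, 2016, 2015, 2015, 2016, 2016],
    "type1": ["apple"] * 4 + ["banana"] * 4,
    "type2": ["natural", "organic"] * 4,
    "price": [40, 35, 44, 40, 20, 15, 20, 18],
})

# Hypothetical lookup table: one row per (year, type1, type2)
# combination that should receive a new price.
new_prices = pd.DataFrame({
    "year": [2015, 2015, 2016, 2016],
    "type1": ["apple"] * 4,
    "type2": ["natural", "organic", "natural", "organic"],
    "price_new": [25, 20, 26, 22],
})

# A left merge keeps every row of df; unmatched rows get NaN in price_new.
out = df.merge(new_prices, on=["year", "type1", "type2"], how="left")

# Fall back to the old price where no new price was defined.
out["price_new"] = out["price_new"].fillna(out["price"]).astype(int)
print(out)
```

Adding another type1 value then only means adding rows to the lookup table, which could also be loaded from a CSV or a dict.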