1

I have a dataframe:

city_reg     city_live   reg_region    live_region 
 Moscow         Tver        77            69
 Tambov         Tumen'      86            86

I need to replace the values in city_reg with the values from city_live in rows where reg_region == live_region.

I tried:

df.loc[df.reg_region == df.live_region, 'city_reg'] = df['city_live']

but it raises

ValueError: cannot reindex from a duplicate axis

How can I fix that?

2 Answers

2

Use mask or numpy.where, both of which work fine with duplicated indices:

# create a duplicated index for testing
df.index = [0, 0]
print(df)
  city_reg city_live  reg_region  live_region
0   Moscow      Tver          77           69
0   Tambov    Tumen'          86           86

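# replace city_reg with city_live where the condition is True
df['city_reg'] = df['city_reg'].mask(df.reg_region == df.live_region, df['city_live'])

Or:

import numpy as np

# np.where(condition, value_if_true, value_if_false)
df['city_reg'] = np.where(df.reg_region == df.live_region, df['city_live'], df['city_reg'])

print(df)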
  city_reg city_live  reg_region  live_region
0   Moscow      Tver          77           69
0   Tumen'    Tumen'          86           86
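
For reference, here is a self-contained sketch of both approaches on the sample data from the question. The variable names same_region, via_mask and via_where are only illustrative and not part of the original answer:

```python
import numpy as np
import pandas as pd

# Sample data from the question, with a deliberately duplicated index.
df = pd.DataFrame(
    {
        "city_reg": ["Moscow", "Tambov"],
        "city_live": ["Tver", "Tumen'"],
        "reg_region": [77, 86],
        "live_region": [69, 86],
    },
    index=[0, 0],
)

same_region = df["reg_region"] == df["live_region"]

# Series.mask replaces values where the condition is True.
via_mask = df["city_reg"].mask(same_region, df["city_live"])

# np.where(condition, value_if_true, value_if_false) works position by
# position on the underlying arrays and returns a plain NumPy array.
via_where = np.where(same_region, df["city_live"], df["city_reg"])

df["city_reg"] = via_mask
print(df)
```

Both variants should give the same result here. The order of the last two np.where arguments matters: the value used where the condition is True comes first.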

2

Try this:

mask = df.reg_region == df.live_region
df.loc[mask, 'city_reg'] = df.loc[mask, 'city_live']

#   city_reg city_live  reg_region  live_region
# 0   Moscow      Tver          77           69
# 1   Tumen'    Tumen'          86           86

This works because applying the same mask to both sides gives the left- and right-hand sides of the assignment the same index, so pandas has nothing to reindex when it aligns them.
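
If you want to avoid index alignment entirely, one possible variation (an assumption of this sketch, not part of the answer above) is to strip the index from the right-hand side with .to_numpy(), so the values are assigned by position. The helper name copy_city_where_regions_match is hypothetical:

```python
import pandas as pd


def copy_city_where_regions_match(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with city_reg overwritten by city_live
    in rows where reg_region equals live_region (hypothetical helper)."""
    out = df.copy()
    mask = out["reg_region"] == out["live_region"]
    # .to_numpy() drops the index labels, so the selected values are
    # written by position and pandas does not try to align indices.
    out.loc[mask, "city_reg"] = out.loc[mask, "city_live"].to_numpy()
    return out


sample = pd.DataFrame(
    {
        "city_reg": ["Moscow", "Tambov"],
        "city_live": ["Tver", "Tumen'"],
        "reg_region": [77, 86],
        "live_region": [69, 86],
    }
)

# Same data, once with the default unique index and once with duplicates.
unique_result = copy_city_where_regions_match(sample)
duplicate_result = copy_city_where_regions_match(sample.set_axis([0, 0], axis=0))
print(duplicate_result)
```

The helper works on a copy, so the original frame is left unchanged.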
