I have two DataFrames:

 df
 city   mail
  a    satya
  b    def
  c    akash
  d    satya
  e    abc
  f    xyz
 # Another DataFrame, d:
 city   mail
 x      satya
 y      def
 z      akash
 u      ash

Now I need to update city in df with the updated values from d, matching rows on mail. If a mail ID is not found in d, the city should stay as it was. The result should look like this:

 df  # expected output
 city   mail
  x    satya
  y    def
  z    akash
  x    satya  # repeated mail, so the same city should be placed here
  e    abc    # not found in d, so unchanged
  f    xyz

I have tried:

import pandas as pd

s = {'mail': ['satya', 'def', 'akash', 'satya', 'abc', 'xyz'], 'city': ['a', 'b', 'c', 'd', 'e', 'f']}
s1 = {'mail': ['satya', 'def', 'akash', 'ash'], 'city': ['x', 'y', 'z', 'u']}
df = pd.DataFrame(s)
d = pd.DataFrame(s1)
# Attempt found via a web search:
df.loc[df.mail.isin(d.mail), ['city']] = d['city']

# This gives an erroneous result:

 city   mail
  x    satya
  y    def
  z    akash
  u    satya  # this value should be city 'x'
  e    abc
  f    xyz

I can't simply do a merge with on='mail', how='left', because one DataFrame has fewer customers than the other. After merging, how can I fill in the city for the mails that did not match?

Please suggest.

  • What is the expected output? Commented Apr 13, 2016 at 6:01
  • @Alexander There was a typo; please see my edited question. Commented Apr 13, 2016 at 6:01

1 Answer

It looks like you want to update the city value in df from the city value in d. The update function aligns rows on the index, so mail first needs to be set as the index on both frames.

# Add extra columns to dataframe.
df['mobile_no'] = ['212-555-1111'] * len(df)
df['age'] = [20] * len(df)

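# Update city values keyed on `mail`.
new_city = df[['mail', 'city']].set_index('mail')
new_city.update(d.set_index('mail'))  # modifies new_city in place
# Assign positionally (1-D array) because the `mail` index contains duplicates.
df['city'] = new_city['city'].values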

>>> df
  city   mail     mobile_no  age
0    x  satya  212-555-1111   20
1    y    def  212-555-1111   20
2    z  akash  212-555-1111   20
3    x  satya  212-555-1111   20
4    e    abc  212-555-1111   20
5    f    xyz  212-555-1111   20
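
The question mentions a left merge. For comparison, here is a minimal sketch of that route, assuming each mail appears at most once in d (otherwise the merge would duplicate rows). The frames are rebuilt so the snippet runs on its own; the column name city_new and the variable result are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'mail': ['satya', 'def', 'akash', 'satya', 'abc', 'xyz'],
    'city': ['a', 'b', 'c', 'd', 'e', 'f'],
})
d = pd.DataFrame({
    'mail': ['satya', 'def', 'akash', 'ash'],
    'city': ['x', 'y', 'z', 'u'],
})

# A left merge keeps every row of df in its original order.
# Mails that are missing from d get NaN in the `city_new` column.
merged = df.merge(d.rename(columns={'city': 'city_new'}), on='mail', how='left')

# Take the new city where one was found, otherwise keep the old one.
merged['city'] = merged['city_new'].fillna(merged['city'])
result = merged.drop(columns='city_new')
```

With the sample data, the rows for abc and xyz keep their original cities because they have no match in d.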

2 Comments

@Alexander How can I restrict the update to only the columns I want? It seems to update df from every matching column in d.
If df has two extra columns, 'age' and 'mobile_no', and the same two columns also exist in d with updated values, I do not want them copied from d to df. Only city should be updated in df, not age or mobile_no.
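
One way to limit which columns are copied, sketched here with Series.map rather than update, is to list the wanted columns explicitly. This assumes mail is unique in d; the age values in d and the name cols_to_update are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'mail': ['satya', 'def', 'akash', 'satya', 'abc', 'xyz'],
    'city': ['a', 'b', 'c', 'd', 'e', 'f'],
    'age': [20] * 6,
})
# Hypothetical: d also carries an `age` column that must NOT be copied over.
d = pd.DataFrame({
    'mail': ['satya', 'def', 'akash', 'ash'],
    'city': ['x', 'y', 'z', 'u'],
    'age': [99, 99, 99, 99],
})

cols_to_update = ['city']  # only these columns are taken from d

# Lookup table keyed on mail (assumes `mail` is unique in d).
lookup = d.set_index('mail')

for col in cols_to_update:
    # Unmatched mails map to NaN, so fall back to the existing value in df.
    df[col] = df['mail'].map(lookup[col]).fillna(df[col])
```

Columns not named in cols_to_update, such as age here, are never touched.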
