I'm new to Python and can't find the exact answer I'm looking for, though I believe there is an easier way to fill in this information.

I have df1 and df2

df1: FirstName  LastName  PhNo  uniqueid

df2: uniqueid PhNo

I want to fill the missing values in df1['PhNo'] with the corresponding values from df2, matching rows on uniqueid.

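The code I used is as follows:

import pandas as pd

# A left merge keeps every row of df1; the clashing PhNo columns become PhNo_x (df1) and PhNo_y (df2)
dff = pd.merge(df1, df2, on='uniqueid', how='left')
dff['PhNo'] = 0
# Take the df2 value where present, then let an existing df1 value take precedence
dff.loc[dff['PhNo_y'] >= 1, 'PhNo'] = dff['PhNo_y']
dff.loc[dff['PhNo_x'] >= 1, 'PhNo'] = dff['PhNo_x']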

This seems to work, but it does not seem like an efficient way of doing it. I am looking for fewer lines of code and a better technique than merge.

df1

FirstName  LastName  PhNo    uniqueid
Sam        R         123x    1
John       S         345x    2
Paul       K         NaN     3
Laney      P         NaN     4

df2

uniqueid  PhNo
1         213x
3         675x
4         987x

desired output: df1

FirstName  LastName  PhNo    uniqueid
Sam        R         123x    1
John       S         345x    2
Paul       K         **675x**    3
Laney      P         **987x**    4
  • Can you add some data sample, 4-5 rows with expected output? Commented Mar 10, 2019 at 17:21
  • I added the data sample as requested. Commented Mar 10, 2019 at 17:53
  • Thank you. So is the solution working well? Commented Mar 10, 2019 at 17:53

2 Answers

4

I believe you need Series.map with Series.fillna:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'FirstName': list('abcdef'),
    'LastName': list('aaabbb'),
    'PhNo': [7, np.nan, 9, 4, np.nan, np.nan],
    'uniqueid': [5, 3, 6, 9, 2, 4],
})

print(df1)
  FirstName LastName  PhNo  uniqueid
0         a        a   7.0         5
1         b        a   NaN         3
2         c        a   9.0         6
3         d        b   4.0         9
4         e        b   NaN         2
5         f        b   NaN         4

df2 = pd.DataFrame({
    'PhNo': [10, 90, 30, 20],
    'uniqueid': [3, 6, 9, 4],
})
print(df2)
   PhNo  uniqueid
0    10         3
1    90         6
2    30         9
3    20         4

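# Build a lookup Series: the index is uniqueid, the values are PhNo from df2
s = df2.set_index('uniqueid')['PhNo']
# Map each uniqueid in df1 to its df2 PhNo and use it only where df1['PhNo'] is NaN
df1['PhNo'] = df1['PhNo'].fillna(df1['uniqueid'].map(s))
print(df1)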
  FirstName LastName  PhNo  uniqueid
0         a        a   7.0         5
1         b        a  10.0         3
2         c        a   9.0         6
3         d        b   4.0         9
4         e        b   NaN         2
5         f        b  20.0         4
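
For reference, here is a minimal sketch of the same map + fillna approach applied to the sample data from the question. The variable name lookup is only illustrative, and the sketch assumes each uniqueid appears at most once in df2 (the mapping needs a unique index).

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt from the question; missing phone numbers are NaN
df1 = pd.DataFrame({
    "FirstName": ["Sam", "John", "Paul", "Laney"],
    "LastName": ["R", "S", "K", "P"],
    "PhNo": ["123x", "345x", np.nan, np.nan],
    "uniqueid": [1, 2, 3, 4],
})
df2 = pd.DataFrame({
    "uniqueid": [1, 3, 4],
    "PhNo": ["213x", "675x", "987x"],
})

# Lookup Series keyed by uniqueid (assumed unique in df2)
lookup = df2.set_index("uniqueid")["PhNo"]

# Existing values in df1 are kept; only the NaN entries take the mapped value
df1["PhNo"] = df1["PhNo"].fillna(df1["uniqueid"].map(lookup))
print(df1)
```

Sam keeps 123x even though df2 has 213x for uniqueid 1, because fillna only touches the missing entries.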

4 Comments

@anky_91 - I ask for data for 100% verification :)
@jezrael I am getting 0 rather than the value from df2
@jezrael Found the error: my database had 0 rather than an empty string. df1['PhNo'].replace(0,np.nan,inplace=True) did the trick, though. Would a similar solution work for 0 values, or should I post that as a separate question?
@Shri - For 0 values, the solution is df1['PhNo'] = np.where(df1['PhNo'] == 0, df1['uniqueid'].map(s), df1['PhNo'])
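
A small sketch of that np.where variant, assuming 0 is the placeholder for a missing number. The data is made up for illustration and is not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 0 stands in for a missing phone number
df1 = pd.DataFrame({"PhNo": [7, 0, 9, 0], "uniqueid": [5, 3, 6, 2]})
df2 = pd.DataFrame({"PhNo": [10, 90], "uniqueid": [3, 6]})

s = df2.set_index("uniqueid")["PhNo"]

# Where PhNo is 0, take the value mapped from df2; otherwise keep df1's value
df1["PhNo"] = np.where(df1["PhNo"] == 0, df1["uniqueid"].map(s), df1["PhNo"])
print(df1)
```

Note that a 0 with no match in df2 (uniqueid 2 here) becomes NaN rather than staying 0, since map returns NaN for ids it cannot find.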
0

DataFrame.fillna(value= &n)
