I have a DataFrame (20k rows) with two columns, `latitude` and `longitude`, that I would like to update whenever the `latitude` entry in a row is NaN. I wanted to use the code below because it might be a fast way of doing it, but I'm not sure how to change the line `msk = [isinstance(row, float) for row in df['latitude'].tolist()]` so that it selects only the rows that are NaN. The `latitude` column has a float dtype, and NaN is itself a float, so this line currently returns `True` for every row.

    def boolean_mask_loop(df):
        msk = [isinstance(row, float) for row in df['latitude'].tolist()]
        out = []
        for target in df.loc[msk, 'address'].tolist():
            dict_temp = geocoding(target)
            out.append([dict_temp['lat'], dict_temp['long']])
        df.loc[msk, ['latitude', 'longitude']] = out
        return df

| id | address | latitude | longitude |
|---|---|---|---|
| 1 | addr1 | NaN | NaN |
| 2 | addr2 | NaN | NaN |
| 3 | addr3 | 40.7526 | -74.0016 |
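
One possible way to restrict the mask to the NaN rows is to build it with `Series.isna()` instead of `isinstance`. The sketch below is only an illustration under a few assumptions: `geocoding` is replaced by a hypothetical stand-in that returns fixed coordinates (the real function is not shown in the question), the sample frame mirrors the table above, and the geocoded values are converted to a NumPy array before assignment.

```python
import numpy as np
import pandas as pd


def geocoding(address):
    # Hypothetical stand-in for the real geocoder (not shown in the question).
    # It returns fixed coordinates so the example runs on its own.
    return {'lat': 1.0, 'long': 2.0}


def boolean_mask_loop(df):
    # isna() is True only where latitude is missing; isinstance(row, float)
    # is True for every row because NaN is itself a float.
    msk = df['latitude'].isna()
    out = []
    for target in df.loc[msk, 'address'].tolist():
        dict_temp = geocoding(target)
        out.append([dict_temp['lat'], dict_temp['long']])
    if out:  # skip the assignment when no row needs geocoding
        # A 2-D array with one [lat, long] pair per masked row.
        df.loc[msk, ['latitude', 'longitude']] = np.asarray(out, dtype=float)
    return df


# Sample data mirroring the table in the question.
df = pd.DataFrame({
    'id': [1, 2, 3],
    'address': ['addr1', 'addr2', 'addr3'],
    'latitude': [np.nan, np.nan, 40.7526],
    'longitude': [np.nan, np.nan, -74.0016],
})
df = boolean_mask_loop(df)
```

With this mask, only rows 1 and 2 are passed to the geocoder and row 3 keeps its existing coordinates. If a plain Python list is preferred for the mask, `[pd.isna(row) for row in df['latitude'].tolist()]` should select the same rows.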