I've looked around for a solution and tried filtering my df to rows where the longitude and latitude are not null, but to no avail. This is my first time using the geopy package, so my error may stem from that. I have a df that includes long/lat coordinates, and I'm trying to attach city, state, and country to each observation. When I limit my df to just the first 10 rows, my code works like a charm. When I apply it to the whole df (34,556 observations), I get this error: 'NoneType' object has no attribute 'raw'.


import pandas as pd
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="geoapiExercises")

df_power_org = pd.read_csv('global_power_plant_database.csv', low_memory=False)

df_power_org = df_power_org[df_power_org.longitude.notnull()]
df_power_org = df_power_org[df_power_org.latitude.notnull()]

def city_state_country(row):
    coord = f"{row['latitude']}, {row['longitude']}"
    location = geolocator.reverse(coord, exactly_one=True, language='en')
    address = location.raw['address']
    city = address.get('city', '')
    state = address.get('state', '')
    country = address.get('country', '')
    row['city'] = city
    row['state'] = state
    row['country2'] = country
    return row

df_power_org = df_power_org.apply(city_state_country, axis=1)

Any advice is deeply appreciated.

  • Which service are you using for the geocoding? Commented Apr 22, 2022 at 5:03
  • I'm using geolocator, I realize I should add that code chunk to the initial post. I'll update that now. Commented Apr 22, 2022 at 21:24

1 Answer

From geopy's documentation:

Nominatim is free, but provides low request limits.

Digging a little deeper, Nominatim's site states:

No heavy uses (an absolute maximum of 1 request per second).

It's likely that you're being blocked by Nominatim for excessive requests.

If you want to keep using Nominatim and stay within their limit, you can modify your code to pause after each request. At one request per second, all 34,556 requests will take about 10 hours.

import pandas as pd
from geopy.geocoders import Nominatim
from time import sleep

geolocator = Nominatim(user_agent="geoapiExercises")

df_power_org = pd.read_csv('global_power_plant_database.csv', low_memory=False)

df_power_org = df_power_org[df_power_org.longitude.notnull()]
df_power_org = df_power_org[df_power_org.latitude.notnull()]

def city_state_country(row):
    coord = f"{row['latitude']}, {row['longitude']}"
    sleep(1)
    location = geolocator.reverse(coord, exactly_one=True, language='en')
    if not location:
        # If you see many failures in a row, Nominatim is probably blocking you.
        # If it only happens once in a while, those coordinates simply have no result.
        print('Failed with coord: ', coord)
        row['city'], row['state'], row['country2'] = None, None, None
        return row
    address = location.raw['address']
    city = address.get('city', '')
    state = address.get('state', '')
    country = address.get('country', '')
    row['city'] = city
    row['state'] = state
    row['country2'] = country
    return row

df_power_org = df_power_org.apply(city_state_country, axis=1)
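
As a variation on the sleep(1) call above, here is a minimal sketch of a wrapper that only waits for whatever is left of the one-second window, so time already spent on the request itself is not added on top. The name throttled and its min_delay_seconds parameter are hypothetical and chosen for illustration; geopy also ships a RateLimiter class in geopy.extra.rate_limiter that serves the same purpose and is probably the better choice in real code.

```python
import time
from typing import Any, Callable


def throttled(func: Callable[..., Any], min_delay_seconds: float = 1.0) -> Callable[..., Any]:
    """Wrap func so that consecutive calls start at least min_delay_seconds apart.

    Hypothetical helper for illustration; not part of geopy.
    """
    last_start = None  # monotonic timestamp of the previous call's start

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        nonlocal last_start
        if last_start is not None:
            # Sleep only for the part of the window that has not elapsed yet.
            remaining = min_delay_seconds - (time.monotonic() - last_start)
            if remaining > 0:
                time.sleep(remaining)
        last_start = time.monotonic()
        return func(*args, **kwargs)

    return wrapper


if __name__ == "__main__":
    # Stand-in for geolocator.reverse so the sketch runs without network access.
    def fake_reverse(coord: str) -> str:
        return f"looked up {coord}"

    reverse = throttled(fake_reverse, min_delay_seconds=0.2)
    for coord in ["32.3, 65.1", "31.7, 65.8", "34.6, 69.5"]:
        print(reverse(coord))
```

To use it with the code above, you would wrap the geocoder once, for example reverse = throttled(geolocator.reverse), call reverse(coord, exactly_one=True, language='en') inside city_state_country, and drop the sleep(1) line.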

3 Comments

If this is intended for commercial use at all, please consider using a paid service. My suggestion, having used it before for this purpose, would be GoogleV3. 40k reverse requests cost $200, and you get $200 a month in free requests, so you can do the whole dataset for free.
Thank you for that detail on the request limit. I attempted running it with the edit suggested, but after approximately 10 minutes it returned the same error code.
In that case, it may be working properly, but one of your lat/long pairs returns None when reverse geocoded. I'll edit my code with a suggested workaround.
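
The None case described in the comment above can also be isolated in a small helper, so the parsing can be checked without calling the geocoder. This is only a sketch: extract_place and CITY_KEYS are hypothetical names, and the fallback from "city" to "town" or "village" is an assumption based on Nominatim often filing smaller places under those keys rather than "city".

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional

# Keys tried in order for the city column (assumption: smaller places may
# appear under "town" or "village" instead of "city").
CITY_KEYS = ("city", "town", "village")


def extract_place(location: Optional[Any]) -> Dict[str, Optional[str]]:
    """Return city/state/country2 from a geopy-style location, or Nones."""
    if location is None:
        # reverse() found nothing for this coordinate pair.
        return {"city": None, "state": None, "country2": None}
    address = location.raw.get("address", {})
    city = next((address[key] for key in CITY_KEYS if address.get(key)), None)
    return {
        "city": city,
        "state": address.get("state"),
        "country2": address.get("country"),
    }


if __name__ == "__main__":
    # Fake location objects standing in for geolocator.reverse() results.
    found = SimpleNamespace(raw={"address": {"town": "Springfield", "country": "United States"}})
    print(extract_place(found))
    print(extract_place(None))
```

Inside city_state_country, the three row assignments could then be filled from the returned dict, whether or not the lookup succeeded.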
