
I'm trying to make a bunch of requests (~1000) using Asyncio and the aiohttp library, but I am running into a problem that I can't find much info on.

When I run this code with 10 URLs, it runs just fine. When I run it with 100+ URLs, it breaks with a RuntimeError: Event loop is closed error.

import asyncio
import aiohttp


@asyncio.coroutine
def get_status(url):
    code = '000'
    try:
        res = yield from asyncio.wait_for(aiohttp.request('GET', url), 4)
        code = res.status
        res.close()
    except Exception as e:
        print(e)
    print(code)


if __name__ == "__main__":
    urls = ['https://google.com/'] * 100
    coros = [asyncio.Task(get_status(url)) for url in urls]
    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait(coros))
    loop.close()

The stack trace can be found here.

Any help or insight would be greatly appreciated as I've been banging my head over this for a few hours now. Obviously this would suggest that an event loop has been closed that should still be open, but I don't see how that is possible.

  • This is not an asyncio error. It's a Python recursion error: you reached the recursion limit. You need a thread for all non-class functions... Commented Sep 16, 2015 at 5:17
  • First, make sure you are using the latest aiohttp release. I assume you do. Technically, aiohttp needs one loop iteration after finishing a request to close the underlying sockets, so insert loop.run_until_complete(asyncio.sleep(0)) before the loop.close() call. Commented Sep 16, 2015 at 6:00
  • Your traceback suggests that a job submitted to an Executor through run_in_executor returned after the loop has been closed. Weirdly enough, aiohttp and asyncio don't use run_in_executor... Commented Sep 16, 2015 at 13:11
  • @AndrewSvetlov, thanks for the reply - I tried sleeping before close, but still no dice... any other ideas? Commented Sep 16, 2015 at 13:38
  • @Vincent technically they do: DNS resolution is performed via run_in_executor -- but it should finish before the get_status tasks complete. Commented Sep 16, 2015 at 13:44

3 Answers


The bug is filed as https://github.com/python/asyncio/issues/258. Stay tuned.

As a quick workaround, I suggest using a custom executor, e.g.

import concurrent.futures

loop = asyncio.get_event_loop()
executor = concurrent.futures.ThreadPoolExecutor(5)
loop.set_default_executor(executor)

Before finishing your program, please do

executor.shutdown(wait=True)
loop.close()
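A minimal end-to-end sketch of this workaround, using only the standard library (a lambda stands in for the socket.getaddrinfo call that aiohttp would normally dispatch to the executor):

```python
import asyncio
import concurrent.futures

# Create the executor yourself so you can shut it down -- waiting for all
# worker threads -- before closing the loop.
loop = asyncio.new_event_loop()
executor = concurrent.futures.ThreadPoolExecutor(5)
loop.set_default_executor(executor)

async def resolve():
    # Passing None uses the loop's default executor, i.e. the one set above.
    return await loop.run_in_executor(None, lambda: "resolved")

result = loop.run_until_complete(resolve())
executor.shutdown(wait=True)  # block until every worker thread has finished
loop.close()
print(result)
```

Because executor.shutdown(wait=True) runs before loop.close(), no worker thread can finish after the loop is gone and try to schedule a callback on a closed loop.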

3 Comments

Awesome Andrew, thanks for your help. I didn't realize I was talking to part of the team :). Following this on GH
Changed in version 3.5.3: BaseEventLoop.run_in_executor() no longer configures the max_workers of the thread pool executor it creates
Andrew, can you suggest not just a "quick workaround" but a robust workaround for Python 3.5?

You're right, loop.getaddrinfo uses a ThreadPoolExecutor to run socket.getaddrinfo in a thread.

You're using asyncio.wait_for with a timeout, which means res = yield from asyncio.wait_for... will raise an asyncio.TimeoutError after 4 seconds. Then the get_status coroutines return None and the loop stops. If a job finishes after that, it tries to schedule a callback in the event loop and raises an exception, since the loop is already closed.
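A minimal stand-in illustrating this failure mode (modern async/await syntax, no network; a sleeping coroutine plays the role of aiohttp.request): wait_for cancels the inner task, raises asyncio.TimeoutError, the except clause swallows it, and get_status returns the default code.

```python
import asyncio

async def slow_request():
    # Hypothetical stand-in for aiohttp.request: sleeps past the timeout.
    await asyncio.sleep(10)
    return 200

async def get_status():
    code = '000'
    try:
        # wait_for cancels slow_request() and raises TimeoutError after 0.1 s.
        code = await asyncio.wait_for(slow_request(), 0.1)
    except asyncio.TimeoutError:
        pass  # the coroutine returns normally, so the loop can stop
    return code

print(asyncio.run(get_status()))
```

This prints the default '000', mirroring how the original coroutines fall through on timeout while the underlying work may still be in flight.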

5 Comments

Ahh, that makes sense, but this is the only way I have found to implement request timeouts. Do you know of a way that I could timeout without closing the loop?
@PatrickAllen You might want to increase the number of workers, which is 5 by default.
@PatrickAllen Or use loop._default_executor.shutdown(wait=True) before closing the loop.
I'll mark this as answered, because this seems to have fixed the original problem. Should I be limiting the max number of connections? It seems that requests are timing out for no apparent reason. Maybe I'm making too many requests too quickly?
@PatrickAllen Well, 5 worker threads and a thousand requests means you're trying to run 200 socket.getaddrinfo calls in 4 seconds, which seems reasonable to me, even though the number of workers can be increased. You can also give a custom TCPConnector to request in order to specify a connection timeout: connector=aiohttp.TCPConnector(loop=loop, force_close=True, conn_timeout=1)

This is a bug in the interpreter. Fortunately, it was finally fixed in 3.10.6, so you just need to update your installed Python.

