I was trying to reproduce & better understand the TaskPool example in this blog post by Cristian Garcia, and I ran into a very interesting result.
Here are the two scripts that I used. I swapped out the actual network request for a random sleep call.
# task_pool.py
import asyncio


class TaskPool(object):
    def __init__(self, workers):
        self._semaphore = asyncio.Semaphore(workers)
        self._tasks = set()

    async def put(self, coro):
        await self._semaphore.acquire()
        task = asyncio.create_task(coro)
        self._tasks.add(task)
        task.add_done_callback(self._on_task_done)

    def _on_task_done(self, task):
        self._tasks.remove(task)
        self._semaphore.release()

    async def join(self):
        await asyncio.gather(*self._tasks)

    async def __aenter__(self):
        return self

    def __aexit__(self, exc_type, exc, tb):
        print("aexit triggered")
        return self.join()
And
# main.py
import asyncio
import sys
from task_pool import TaskPool
import random

limit = 3


async def fetch(i):
    timereq = random.randrange(5)
    print("request: {} start, delay: {}".format(i, timereq))
    await asyncio.sleep(timereq)
    print("request: {} end".format(i))
    return (timereq, i)


async def _main(total_requests):
    async with TaskPool(limit) as tasks:
        for i in range(total_requests):
            await tasks.put(fetch(i))


loop = asyncio.get_event_loop()
loop.run_until_complete(_main(int(sys.argv[1])))
Running python main.py 10 on Python 3.7.1 yields the following result.
request: 0 start, delay: 3
request: 1 start, delay: 3
request: 2 start, delay: 3
request: 0 end
request: 1 end
request: 2 end
request: 3 start, delay: 4
request: 4 start, delay: 1
request: 5 start, delay: 0
request: 5 end
request: 6 start, delay: 1
request: 4 end
request: 6 end
request: 7 start, delay: 1
request: 8 start, delay: 4
request: 7 end
aexit triggered
request: 9 start, delay: 1
request: 9 end
request: 3 end
request: 8 end
I have a few questions based on this result.
1. I would not have expected the tasks to run until the context manager exited and triggered __aexit__, because that is the only trigger for asyncio.gather. However, the print statements strongly suggest that the fetch jobs are occurring even before the aexit. What's happening, exactly? Are the tasks running? If so, what started them?
2. Related to (1). Why is the context manager exiting before all the jobs have returned?
3. The fetch job is supposed to return a tuple. How can I access this value? For a web-based application, I imagine the developer may want to do operations on the data returned by the website.
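To narrow down questions 1 and 3, here is a minimal sketch that tries to isolate the behaviour outside of TaskPool. The names job, demo and events are hypothetical and not part of the scripts above, and it assumes Python 3.7+ for asyncio.create_task and asyncio.run. It appears to show that create_task alone is enough to schedule a coroutine, and that gather on already-finished tasks simply hands back their return values.

```python
import asyncio

events = []  # records the order in which things happen


async def job(i):
    # Hypothetical stand-in for fetch(): a short sleep, then a tuple result.
    events.append("job {} start".format(i))
    await asyncio.sleep(0.01)
    events.append("job {} end".format(i))
    return (i, i * 10)


async def demo():
    # create_task() schedules the coroutine on the running loop right away;
    # gather() is not what starts it.
    tasks = [asyncio.create_task(job(i)) for i in range(2)]
    events.append("tasks created")

    # Any await that yields to the event loop gives the scheduled tasks
    # a chance to run, so both jobs finish during this sleep.
    await asyncio.sleep(0.1)
    events.append("before gather")

    # gather() on tasks that are already done only collects their results,
    # in the same order the tasks were passed in.
    return await asyncio.gather(*tasks)


results = asyncio.run(demo())
print(events)
print(results)
```

If this sketch is representative, the await inside put would be what lets earlier tasks run before __aexit__. It may also matter for question 3 that _on_task_done removes finished tasks from _tasks, so the gather call in join only ever sees the tasks that are still pending.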
Any help is greatly appreciated!