
I am trying to understand how async / await works from a high-level, semi-abstract perspective.

There are quite a few long and complicated explanations of async / await online, some of which appear excellent for what they set out to do (this might include some of the answers and links in this and this SO question, for example). But I also suspect that there is a simpler mechanistic explanation using stacks or queues for "scientists and engineers" who just want a sense of how asyncio prioritizes various chunks of code as it runs over an async function.

To keep the question concrete, I'll focus on a relatively simple example in which an asynchronous library like aiohttp or httpx is used to make a series of GET requests for a collection of URLs. Something like:

import time
import asyncio
import aiohttp

async def fetch_url(session, url):
    async with session.get(url) as response:
        return await response.text()

async def fetch_urls(session, url):
    print(url + ' fetch 1 started')
    text_response = await fetch_url(session, url)
    print(url + ' fetch 1 ended')
    await asyncio.sleep(4)
    print(url + ' fetch 2 started')
    text_response = await fetch_url(session, url)
    print(url + ' fetch 2 ended')
    return text_response

async def main():
    start_time = time.time()
    urls = [
        'https://httpbin.org/delay/1',
        'https://httpbin.org/delay/2',
        'https://httpbin.org/delay/3',
    ]
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_urls(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
        for i, result in enumerate(results):
            print(f'Response {i+1} received with {len(result)} chars')
    print(f'Time taken: {time.time() - start_time:.2f} seconds')

if __name__ == '__main__':
    asyncio.run(main())

The above code could have many more URLs of different sorts in urls, and many more repetitions of asyncio.sleep and fetch_url in the fetch_urls function. Additionally, the response delays of the URLs could be highly variable and we could implement highly variable periods of asyncio.sleep as well.

In an effort to speak precisely, I'll now define a few terms.

Let a "chunk" of code refer to a sequence of commands that are executed one after another such that the execution of each command in the chunk must finish before execution of the next command can begin. A non-async function would be a single chunk of code.

In the above code, it seems loosely like an async before a def tells Python that the code of the corresponding async function (more accurately, "coroutine") should not be treated as a single chunk but rather a series of chunks, with the break points between the chunks indicated by the await commands. For example, fetch_urls above has 4 chunks which we could describe as print/fetch, print/sleep, print/fetch, and print/return. Between each sequential pair of chunks there is typically a delay of some sort, often of unknown length.
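As a sanity check on that abstraction, here is a toy sketch of two coroutines that each consist of two chunks separated by a single await (the names are mine, not from the code above; the resumption order follows from the sleep durations):

```python
import asyncio

async def two_chunks(name, delay, log):
    log.append(f'{name}: chunk 1')  # first chunk runs up to the await
    await asyncio.sleep(delay)      # suspension point (the "delay")
    log.append(f'{name}: chunk 2')  # second chunk runs after resumption

async def main():
    log = []
    # A starts first but has the longer delay, so its second chunk runs last
    await asyncio.gather(two_chunks('A', 0.2, log), two_chunks('B', 0.1, log))
    return log

log = asyncio.run(main())
print(log)  # ['A: chunk 1', 'B: chunk 1', 'B: chunk 2', 'A: chunk 2']
```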

Is this an acceptable abstraction so far?

Assuming so, each executed instance of fetch_urls seems to be called a "task". Thus, we could represent the first task as an ordered sequence: t0 = (c00, d00, c01, d01, c02, d02, c03), where t represents "task", c represents "chunk", and d represents "delay".

It then seems that asyncio starts the run of main by directing Python to execute the first chunk of each task in the order that the tasks are provided in main. Once the execution of a chunk completes, the delay between it and the next chunk begins, and once this delay ends, the next chunk of the task "arrives" for prioritization, and seems to be prioritized in something like a FIFO manner. But what specifically is this scheme of prioritization? For example, suppose that the execution of chunk c03 is currently underway and chunk c13 arrives and then chunk c22 arrives. Is c13 or c22 executed first after the execution of c03 concludes?


UPDATE: Below is a simple example of code that seems to suggest that at least some part of the queuing prioritization of asyncio is FIFO rather than, for example, ordering by activation time.

from time import perf_counter as pc
import asyncio

pc0 = pc()

async def sleep_then_sum(k, d, n):
    print(f'sleep_then_sum {k} started at {pc()-pc0} sec')
    print(f'sleep_then_sum {k} falls asleep at {pc()-pc0} sec')
    await asyncio.sleep(d)
    print(f'sleep_then_sum {k} starts computing sum at {pc()-pc0} sec')
    sn = sum(range(n))
    print(f'sleep_then_sum {k} done at {pc()-pc0} sec')
    return sn

async def main():
    m = 3
    task_list = m * [None]
    delay = [1, 3, 2]
    n_to_sum = [2*10**8, 10**8, 10**8]
    async with asyncio.TaskGroup() as tg:
        for k in range(m):
            task_list[k] = tg.create_task(sleep_then_sum(k, delay[k], n_to_sum[k]))
            await asyncio.sleep(0.2)
    sn_list = [task.result() for task in task_list]
    return sn_list

sn_list = asyncio.run(main())
print(sn_list)

print(f'Total time taken: {pc()-pc0} sec')
Comments:

  • Your understanding is correct. There is no explicit prioritization; it's a very simple FIFO queue. My recommendation is to watch this lecture by David Beazley, where he introduces async/await from first principles in a very relaxed manner by building a simple event loop. (Don't be thrown off by his use of generators and yield from: await is just yield from under the hood, and async def is implemented as a generator.) Commented May 22 at 9:06
  • @user4815162342 Awesome talk. Thanks for recommending it. My initial tests had several complexities that made me think the prioritization was more complicated than basic FIFO, but I see now that it does look that simple. I appreciate the comment. Commented May 22 at 17:37

1 Answer


For readers looking for a short answer:

An async task is not interrupted at every await:

Case 1. An await can return a value to the caller "immediately", in which case code execution continues normally. By "immediately" I mean without waiting for something; taking some time for computation is fine. In other words, when a task has everything it needs, it runs to the end.

Case 2. When a value cannot be returned immediately, the await signals this by yielding a "pending" future object (through a pipeline of all nested awaits) to the scheduler. When this happens, the task is automatically suspended (like any other generator that has just yielded) and the scheduler picks the next routine for execution from a queue (see the note at the bottom).
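Both cases can be sketched with a hand-made pending future via loop.create_future() (a real asyncio API; the coroutine names here are mine):

```python
import asyncio

order = []

async def inner_ready():
    return 'value'              # nothing to wait for: case 1

async def waiter(fut):
    await inner_ready()         # case 1: completes without yielding to the loop
    order.append('after case-1 await')
    await fut                   # case 2: the future is pending, so the task
                                # yields it to the scheduler and suspends
    order.append('after case-2 await')

async def main():
    fut = asyncio.get_running_loop().create_future()
    t = asyncio.create_task(waiter(fut))
    await asyncio.sleep(0)      # let the task run its first step
    order.append('task is suspended here')
    fut.set_result(None)        # make the pending future "done"
    await t

asyncio.run(main())
print(order)
# ['after case-1 await', 'task is suspended here', 'after case-2 await']
```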

A "chunk" from the task's beginning or from the last await, up to the task's end or the next await, is internally called a step in asyncio. Keep in mind that only awaits that do not immediately return a value (case 2) divide task execution into steps.

Later, when a value is assigned to the future the task is waiting for, the future's state changes from "pending" to "done" and its associated callbacks are scheduled, i.e. put into the queue. One of those callbacks runs the task's next step. The await where the task was suspended then completes as in case 1.
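This can be sketched as follows: right after set_result() the waiting task has not resumed yet; its next step only runs once control returns to the event loop (the names are mine):

```python
import asyncio

events = []

async def waiter(fut):
    await fut                           # suspends: fut is pending
    events.append('waiter resumed')     # the task's next step

async def main():
    fut = asyncio.get_running_loop().create_future()
    t = asyncio.create_task(waiter(fut))
    await asyncio.sleep(0)              # let waiter start and suspend on fut
    fut.set_result('done')              # flips fut to "done"; the waiter's next
                                        # step is only scheduled, not run inline
    events.append('right after set_result')
    await t                             # yield to the loop; the step runs now
    events.append('after awaiting the task')

asyncio.run(main())
print(events)
# ['right after set_result', 'waiter resumed', 'after awaiting the task']
```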


Notes:

The queue is a FIFO only for routines scheduled to run without a delay. Routines scheduled with a delay are sorted by their activation time.
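The note can be sketched directly with the low-level scheduling calls (call_soon and call_later are real asyncio loop methods; the labels are mine):

```python
import asyncio

order = []

async def main():
    loop = asyncio.get_running_loop()
    # call_soon callbacks run in FIFO order of scheduling
    loop.call_soon(order.append, 'soon-1')
    loop.call_soon(order.append, 'soon-2')
    # call_later callbacks run sorted by activation time,
    # not by the order in which they were scheduled
    loop.call_later(0.2, order.append, 'later-0.2s')
    loop.call_later(0.1, order.append, 'later-0.1s')
    await asyncio.sleep(0.3)            # let all callbacks fire

asyncio.run(main())
print(order)  # ['soon-1', 'soon-2', 'later-0.1s', 'later-0.2s']
```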


7 Comments

Thank you, this is helpful. Two simple follow-up questions to determine if I understand: (1) If fetch_urls contains await asyncio.sleep(0) rather than await asyncio.sleep(4) as above, then would this line be an example of case 1? That is, could we represent the first task as t0 = (s00, d00, s01, d01, s02) where s represents "step" (i.e., "chunk")? (2) Additionally, your "Notes" seem to predict that if, during the execution of step s01, delay d20 ends and then delay d10 ends, then s11 will run before s20 (while FIFO would predict the opposite). Is this correct?
@SapereAude await asyncio.sleep(0) is special. It was designed to be case 2, as explained here: docs.python.org/3/library/asyncio-task.html#asyncio.sleep
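A sketch of that special case: even with a zero delay, await asyncio.sleep(0) suspends the task and hands control back to the loop (the names here are mine):

```python
import asyncio

order = []

async def yielder():
    order.append('task before sleep(0)')
    await asyncio.sleep(0)      # case 2: yields control even with no delay
    order.append('task after sleep(0)')

async def main():
    t = asyncio.create_task(yielder())
    await asyncio.sleep(0)      # let the task run its first step
    order.append('main runs while task is suspended')
    await t

asyncio.run(main())
print(order)
# ['task before sleep(0)', 'main runs while task is suspended', 'task after sleep(0)']
```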
@SapereAude delays scheduled by asyncio.sleep(delay) are converted to absolute time (now + delay) when they start. Actually the scheduler's queue is implemented as two queues: one is a simple FIFO (the corresponding low-level function is call_soon) and one is a sorted queue (a heap) whose main low-level function is call_at; the more frequently used call_later just converts the relative delay to an absolute timestamp and then calls call_at.
This sounds intriguing, but I'm not sure I fully follow. To understand I think I may need a simple mechanistic example (that could omit any or all of the specific functions under the hood). I have given such an example for FIFO queuing as an update in the original question. In that example, during the execution of step s01, delay d20 ends and then delay d10 ends, and then s21 runs before s11, suggesting FIFO queuing but not queuing by activation time. Is there a similar example that shows queuing by activation time but not FIFO?
(3) When the scheduled callbacks are going to be executed, all items from the queue with activation time that are due (their time has come) are moved to the end of the "plain" FIFO. Then all the items from the FIFO are executed (in the FIFO order, of course). Execution of those routines may create new scheduled items, but those new ones are processed at the next event loop iteration. Each event loop iteration has basically two steps: handle pending I/O and handle scheduled callbacks.
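A sketch of the last point: a callback scheduled from inside another callback is handled at a later event-loop iteration, after the callbacks that were already queued (the names are mine):

```python
import asyncio

order = []

async def main():
    loop = asyncio.get_running_loop()

    def first():
        order.append('first')
        # scheduled from inside a callback: asyncio defers this to a
        # later event-loop iteration, after the already-queued callbacks
        loop.call_soon(order.append, 'queued by first')

    loop.call_soon(first)
    loop.call_soon(order.append, 'second')
    await asyncio.sleep(0.05)   # give the loop time to drain the queue

asyncio.run(main())
print(order)  # ['first', 'second', 'queued by first']
```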
