I do not understand why resp.json() needs to be awaited. From my understanding async/await is useful when dealing with I/O. But when I call resp.json() in the example below, has the web request not already been processed with session.get() in the line above?

async with session.get('https://api.github.com/events') as resp:
    print(await resp.json())

Why do you think session.get() (an asynchronous call) will have completed before you get to resp.json()? Commented Jan 11, 2020 at 20:32

2 Answers

But when I call resp.json() in the example below, has the web request not already been processed with session.get() in the line above?

No. session.get() reads only the HTTP status line and headers; to get the response body you need to read the rest of the response, which is what resp.json() does.

This is useful because you can check the status and headers and avoid reading the rest of the response if, say, the server returned an unexpected HTTP status code.
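A minimal sketch of that idea, assuming `session` is an aiohttp.ClientSession. The names `fetch_json_if_ok` and `main` are hypothetical, and treating anything other than 200 as a failure is a simplification for illustration:

```python
import asyncio


async def fetch_json_if_ok(session, url):
    """Return the decoded JSON body, or None if the status is not 200.

    `session` is assumed to be an aiohttp.ClientSession (or any object
    with the same `get()` interface).
    """
    async with session.get(url) as resp:
        # Only the status line and headers are available at this point.
        if resp.status != 200:
            return None  # we never ask for the body
        # Reading the body may have to wait on the network, hence the await.
        return await resp.json()


async def main():
    # Third-party dependency, imported here so the helper above
    # does not require it.
    import aiohttp

    async with aiohttp.ClientSession() as session:
        print(await fetch_json_if_ok(session, "https://api.github.com/events"))


# To try it: asyncio.run(main())
```

The status check itself is not awaited, because by the time `session.get()` returns the headers have already been parsed.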

Another example: if you expect the response body to be big, you can read it in chunks to avoid excessive RAM usage (check note here).
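Reading in chunks could look roughly like the sketch below. It relies on aiohttp's `resp.content.iter_chunked()`; the helper name `save_in_chunks`, the 64 KiB default chunk size and the output file name are assumptions made for illustration:

```python
import asyncio


async def save_in_chunks(session, url, out, chunk_size=64 * 1024):
    """Stream the response body into `out`, a binary file-like object.

    Returns the number of bytes written. At most `chunk_size` bytes of the
    body are held by this function at any one time.
    """
    total = 0
    async with session.get(url) as resp:
        # Each iteration awaits the next piece of the body from the network.
        async for chunk in resp.content.iter_chunked(chunk_size):
            out.write(chunk)
            total += len(chunk)
    return total


async def main():
    # Third-party dependency, imported here so the helper above
    # does not require it.
    import aiohttp

    async with aiohttp.ClientSession() as session:
        with open("events.json", "wb") as fh:  # hypothetical output path
            size = await save_in_chunks(
                session, "https://api.github.com/events", fh
            )
            print(f"wrote {size} bytes")


# To try it: asyncio.run(main())
```

Note that `out.write()` here is an ordinary blocking call; for a sketch that is fine, but a program writing to slow storage might want to hand the writes off to a thread.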

3 Comments

Isn't the header in the same TCP packet as the body? If so, the I/O heavy work has already been done, no? And both header and body are already in memory anyway?
@Daniel I don't know about TCP, but it's surely not already in memory. Did you ever download a big file over HTTP? :) While you get the headers pretty soon, reading the body from the network takes a noticeable amount of time.
I don't think that's the way it works. When you make a request to a server, I don't think it sends the headers and then waits for the body to be requested. The whole response starts being sent and doesn't stop until it is complete or the TCP connection is lost. So the await is mainly useful for letting the requester continue working while the rest of the response is received.
I went down the rabbit hole with this, so let me explain.

You are of course right that a conversion to JSON is not I/O-bound, and it's hardly CPU-bound either (unless the response body is really big).

However, in the case of aiohttp, it's not the conversion to JSON that is being awaited, although the interface misleadingly makes it appear so.

Even though the server returns everything (the status code, headers and body) in one go (assuming a non-streaming situation; that's just how HTTP works), the aiohttp client can buffer it into memory and read it in steps. So it can read just the status code first and, depending on its value, proceed with the other steps or not. Each step is, technically speaking, a separate I/O operation, even though we generally don't consider reading from memory a bottleneck, since it's very fast.

Also, if the response body is huge, it may not have fully arrived yet, so the extra step would involve network I/O too.
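To see the headers arriving before the body, here is a small sketch that does not use aiohttp at all, only plain asyncio streams on localhost. The server, the 0.3 second pause standing in for a large body still in transit, and the names `slow_server` and `demo` are all made up for illustration:

```python
import asyncio
import time

BODY = b'{"hello": "world"}'


async def slow_server(reader, writer):
    """Send the headers immediately, then the body after a short delay."""
    await reader.readuntil(b"\r\n\r\n")  # consume the request
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Length: {len(BODY)}\r\n"
        "Connection: close\r\n\r\n"
    ).encode()
    writer.write(head)
    await writer.drain()
    await asyncio.sleep(0.3)  # stand-in for a body that is still in transit
    writer.write(BODY)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def demo():
    """Return (headers, body, seconds until headers, seconds until body)."""
    server = await asyncio.start_server(slow_server, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        await writer.drain()
        start = time.perf_counter()
        # Step 1: the status line and headers are available almost at once.
        headers = await reader.readuntil(b"\r\n\r\n")
        t_headers = time.perf_counter() - start
        # Step 2: the body needs a second wait. A real client would take the
        # length from the Content-Length header instead of a shared constant.
        body = await reader.readexactly(len(BODY))
        t_body = time.perf_counter() - start
        writer.close()
        await writer.wait_closed()
    return headers, body, t_headers, t_body


# To try it: print(asyncio.run(demo()))
```

While the client is suspended in step 2, the event loop is free to run other tasks, which is the reason a body-reading call is worth awaiting.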

This separation is quite useful for large responses, but otherwise it mostly just adds asyncio overhead. Remember that async programming should be used only where it makes sense; otherwise every function would be awaitable by default and we wouldn't need the async/await keywords.

It's a baked-in decision in aiohttp though, so there's not much you can do about it. Overall, you gain more than you lose here (compared to using requests throughout your project), so you should probably be fine with the trade-off.
