Requests Session asynchronous usage?

Question

My current code implicitly creates a separate Session object for every request, because each request goes through the requests.get() function:

content_getters.py (the relevant part):

import requests


def get_page_content(link: str) -> bytes:
    headers = {"User-Agent": "Mozilla/5.0 (Macintosh; "
                             "Intel Mac OS X 10_11_6) "
                             "AppleWebKit/537.36 (KHTML, like Gecko) "
                             "Chrome/61.0.3163.100 Safari/537.36"}

    # each requests.get() call uses its own short-lived Session
    response = requests.get(link, headers=headers)

    if response.status_code != requests.codes.ok:
        raise ConnectionError("Page", link, "returned status code",
                              response.status_code)

    return response.content


def parse_single_page(link):
    content = get_page_content(link)
    # rest of very long function

main.py:

from concurrent.futures.thread import ThreadPoolExecutor

from content_getters import get_page_content, extract_links, parse_single_page

if __name__ == "__main__":
    MAX_THREADS = 30

    # get links
    html: str = get_page_content(
        "https://www.d20pfsrd.com/bestiary/bestiary-hub/monsters-by-cr/") \
        .decode("utf-8")

    links = extract_links(html)

    num_threads = min(MAX_THREADS, len(links))
    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        # asynchronous, threads will return results when they finish their
        # own work
        results = [result for result
                   in executor.map(parse_single_page, links)]

The requests docs (link) state that "if you’re making several requests to the same host, the underlying TCP connection will be reused, which can result in a significant performance increase". I suppose that each of my separate calls to .get() creates its own Session object, so reusing a single Session could be faster.

Question: Is the Session object synchronous (sequential) for all requests made with it? Will I still get concurrent requests if I share the same Session object between all threads in concurrent.futures.thread.ThreadPoolExecutor, instead of one Session per request as I'm doing now?

This might help stackoverflow.com/questions/18188044/…

– Prajwal, Commented May 28, 2021 at 8:34

rawrex · Accepted Answer · 2021-05-28 08:43:09Z

2

In short, Session is not thread-safe; you can check the issue discussion on GitHub.

For your case, I would highly recommend looking into asyncio and the aiohttp module, where you are free to pass a session around since everything runs in one thread. It also incurs less overhead than multithreading. As they say:

Use asyncio when you can, use threads when you must

The documentation on aiohttp.
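
A minimal sketch of what that could look like, assuming aiohttp is installed. The names fetch_all, main and max_concurrent are made up for illustration, and the shortened User-Agent is a placeholder for the full string from the question. One ClientSession is shared by all requests, and a semaphore caps the number of requests in flight, roughly like MAX_THREADS does in the question:

```python
import asyncio

import aiohttp

# Placeholder header; substitute the full User-Agent string from the question.
HEADERS = {"User-Agent": "Mozilla/5.0"}


async def get_page_content(session, link: str) -> bytes:
    # One shared session serves every request; everything runs in one thread.
    async with session.get(link) as response:
        # Raises aiohttp.ClientResponseError for 4xx/5xx status codes.
        response.raise_for_status()
        return await response.read()


async def fetch_all(session, links, max_concurrent: int = 30):
    # Hypothetical helper: limits how many requests run at the same time.
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded_get(link):
        async with semaphore:
            return await get_page_content(session, link)

    # gather() returns the results in the same order as the links.
    return await asyncio.gather(*(bounded_get(link) for link in links))


async def main(links):
    async with aiohttp.ClientSession(headers=HEADERS) as session:
        return await fetch_all(session, links)
```

You would then call it with something like pages = asyncio.run(main(links)) and do the CPU-bound parsing on the returned bytes afterwards.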

edited May 28, 2021 at 8:43

answered May 28, 2021 at 8:36


2 Comments

qalis Over a year ago

Very interesting, I completely forgot about the asyncio, thanks!

rawrex Over a year ago

@qalis it is awesome! It may have a bit of a learning curve, but it's totally worth it. I would suggest checking this article before the official documentation, which is quite verbose.

Bishwajit Ghosh · Accepted Answer · 2023-11-22 08:07:50Z

2

As per the documentation, requests.Session uses urllib3's connection pooling for the sessions. And as per urllib3's documentation, it is a thread-safe system now.

It probably wasn't when the question was originally posted, but judging by a GitHub comment, it has most likely been made thread-safe for good since then.
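
If you would rather not depend on a shared Session being thread-safe, a conservative middle ground is one Session per worker thread, so each thread still reuses its own connections. This is only a sketch: get_session and fetch_all are hypothetical helper names, and the 10-second timeout is an arbitrary assumption.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

import requests

_thread_local = threading.local()


def get_session() -> requests.Session:
    # Create the Session lazily, once per thread, and reuse it afterwards.
    if not hasattr(_thread_local, "session"):
        _thread_local.session = requests.Session()
    return _thread_local.session


def get_page_content(link: str) -> bytes:
    # The timeout value is an assumption; tune it for your target site.
    response = get_session().get(link, timeout=10)
    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
    return response.content


def fetch_all(links, max_threads: int = 30):
    # At least one worker, so an empty list does not raise ValueError.
    num_threads = max(1, min(max_threads, len(links)))
    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        return list(executor.map(get_page_content, links))
```

No Session object is ever touched by two threads at once, so the question of Session thread-safety does not come up.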

answered Nov 22, 2023 at 8:07


1 Comment

LISTERINE Over a year ago

Just adding a link to the commit based on the comment you posted github.com/urllib3/urllib3/pull/2661
