I need to make an API request for several pieces of data, and then process each result. The request is paginated, so I'm currently doing

def get_results():
    while True:
        response = api(num_results=5)
        if response is None:  # No more results
            break
        yield response

def process_data():
    for page in get_results():
        for result in page:
            do_stuff(result)

process_data()

I'm hoping to use asyncio to retrieve the next page of results from the API while I'm processing the current one, instead of waiting for results, processing them, then waiting again. I've modified the code to

import asyncio

async def get_results():
    while True:
        response = api(num_results=5)
        if response is None:  # No more results
            break
        yield response

async def process_data():
    async for page in get_results():
        for result in page:
            do_stuff(result)

asyncio.run(process_data())

I'm not sure if this is doing what I intend it to. Is this the right way to make processing the current page of API results and getting the next page of results asynchronous?

  • To use asyncio, the API you're calling needs to be async itself. A good indicator is that you must await its result. If none of your async def functions contain an await, that's a hint that they're not really async. Commented Nov 23, 2019 at 8:28

1 Answer

You could use asyncio.Queue to refactor your code into a producer/consumer pattern:

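import asyncio
import random

async def api(num_results):
    # Stand-in for a real async request (you could use aiohttp to fetch the API).
    await asyncio.sleep(1)  # simulate network latency
    fake_response = random.random()
    if fake_response < 0.1:
        return None  # signals that there are no more pages
    return fake_response

async def get_results(q):
    # Producer: fetch pages and put them on the queue.
    while True:
        response = await api(num_results=5)
        if response is None:
            # Sentinel tells the consumer that the producer is done.
            print('Producer Done')
            await q.put(None)
            break
        print('Producer: ', response)
        await q.put(response)

async def process_data(q):
    # Consumer: take pages off the queue and process them.
    while True:
        data = await q.get()
        if data is None:  # compare with None so falsy data (0, []) is not dropped
            print('Consumer Done')
            break
        # Process the data however you want. If the work is CPU-intensive,
        # hand it to loop.run_in_executor so it doesn't block the event loop.
        # Here the processing is faked with a sleep.
        await asyncio.sleep(3)
        print('Consume', data)

async def main():
    # Create the queue inside the running loop and share it explicitly.
    q = asyncio.Queue()
    producer = asyncio.create_task(get_results(q))
    await process_data(q)
    await producer

asyncio.run(main())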

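Coming back to the question:

Is this the right way to make processing the current page of API results and getting the next page of results asynchronous?

It's not the right way, because get_results() is only resumed to fetch the next page after do_stuff(result) has finished for every result in the current page, so fetching and processing still happen one after the other.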
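As an alternative to a queue, the next request can be started as a task before the current page is processed. This is a sketch, not a definitive implementation: `api` here is a hypothetical async stand-in that takes a page number and serves three made-up pages, `do_stuff` is assumed to be a coroutine, and `events` exists only to show the ordering.

```python
import asyncio

# Hypothetical async stand-in for the real API: three made-up pages, then None.
PAGES = [[1, 2], [3, 4], [5, 6]]
events = []  # records the order of fetches and processing, for illustration

async def api(page_no, num_results=5):
    events.append(f"fetch {page_no} start")
    await asyncio.sleep(0.05)  # simulate network latency
    return PAGES[page_no] if page_no < len(PAGES) else None

async def do_stuff(result):
    await asyncio.sleep(0.01)  # simulate processing that yields to the loop
    events.append(f"processed {result}")

async def process_data():
    page_no = 0
    next_page = asyncio.create_task(api(page_no))
    while True:
        page = await next_page
        if page is None:  # No more results
            break
        # Start fetching the following page before processing this one.
        page_no += 1
        next_page = asyncio.create_task(api(page_no))
        for result in page:
            await do_stuff(result)

asyncio.run(process_data())
print(events)
```

The overlap only happens if the processing gives control back to the event loop. If `do_stuff` is a plain blocking function, the prefetch task cannot start until the next `await`, so such work would need to go through something like `loop.run_in_executor`.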
