I am currently working with pandas DataFrames in Python. I am a beginner, and I wrote a for loop that collects data from an API (JSON format), inserts a string into each response based on the search term, appends the result to a list, and finally converts the list to a DataFrame.

The loop works fine, but since it has to go through 1500 entries, it takes a very long time. Can anyone suggest a good Python way to make it faster?

Thank you very much in advance :)

import json

import pandas as pd
import requests

url = "https://api."

team = ["abc", "def", "ghi", ...]  # List of more than 1500 entries movie symbols

abcd = list()
for t in team:
    status_url = requests.get(f"{url}/{t}")
    status_data = status_url.text
    status_data_list = list(status_data)
    # Insert the movie name right after the opening "{" of the JSON text
    status_data_list.insert(1, f"\"Movie_name\":\"{t}\",")
    final_string = ''.join(status_data_list)
    parsed = json.loads(final_string)
    abcd.append(parsed)

Movie_dataframe = pd.DataFrame(abcd)
1 Answer

The slowdown is not in converting the data to a DataFrame. It is in the HTTP requests, which run one after another.

First, you could simplify your code slightly by letting requests parse the JSON and then adding the key directly:

for t in team:
    response = requests.get(f"{url}/{t}")
    status_data = response.json()
    status_data["Movie_name"] = t
    abcd.append(status_data)

Beyond that, you can perform the requests asynchronously, so that they run concurrently instead of one after another. Be aware that your IP might get blacklisted by the website, so check the maximum rate at which you are allowed to make requests.

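import asyncio

import httpx
import pandas as pd

url = "https://api."

teams = ["abc", "def", "ghi"]

async def get_team(client, team):
    # Fetch one entry and tag the parsed JSON with its name
    r = await client.get(f"{url}/{team}")
    status_data = r.json()
    status_data["Movie_name"] = team
    return status_data

async def main():
    # Share one client so connections are reused across requests
    async with httpx.AsyncClient() as client:
        tasks = [get_team(client, team) for team in teams]
        return await asyncio.gather(*tasks)

# asyncio.run replaces the deprecated get_event_loop/run_until_complete pattern
abcd = asyncio.run(main())
Movie_dataframe = pd.DataFrame(abcd)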
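
If the API enforces a request limit, one way to avoid firing all 1500 requests at once is to cap how many are in flight at a time. The following is only a minimal sketch using asyncio.Semaphore. The helper name gather_limited, the max_concurrent value, and the fake_fetch stand-in are assumptions for illustration, not part of the original code; in practice you would pass a coroutine that performs the real HTTP request.

```python
import asyncio


async def gather_limited(items, fetch, max_concurrent=10):
    """Run fetch(item) for every item, with at most max_concurrent running at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(item):
        # Wait for a free slot before starting the request
        async with semaphore:
            return await fetch(item)

    # gather returns the results in the same order as items
    return await asyncio.gather(*(bounded(item) for item in items))


async def fake_fetch(team):
    # Hypothetical stand-in for the real HTTP call, so the sketch runs offline
    await asyncio.sleep(0.01)
    return {"Movie_name": team}


results = asyncio.run(
    gather_limited(["abc", "def", "ghi"], fake_fetch, max_concurrent=2)
)
```

A suitable value for max_concurrent depends on what the API allows, so treat the number above as a placeholder.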
2 Comments

The first idea worked a little better than before, but it still takes a lot of time. For the second idea I get the error "name 'team' is not defined". I am still reading the documentation, and I hope something will work soon :)
@SureshkumarRamachandran I fixed my code, it should work now.
