I am trying to scrape the CDC website for the COVID-19 cases reported in the last 7 days: https://covid.cdc.gov/covid-data-tracker/#cases_casesinlast7days. I've tried to find the table by name, id, and class, and the lookup always returns None. When I print the scraped HTML, I can't locate the table in it manually either, so I'm not sure what I'm doing wrong. Once the data is imported, I need to populate a pandas DataFrame to use later for graphing, and export the data table as a CSV.

  • For extra information: it appears that the table is generated by JavaScript, so Selenium will be needed to get this data. Commented Oct 17, 2020 at 19:33
  • What Taylor says is right. Additionally, I see there is a "download" button on that page, so you might just try clicking that (with Selenium). Commented Oct 17, 2020 at 19:38

1 Answer

You might as well request the data from the API directly (check the Network tab in your browser while refreshing the page):

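import requests
import pandas as pd


# JSON endpoint that the page itself calls (visible in the browser's Network tab)
endpoint = "https://covid.cdc.gov/covid-data-tracker/COVIDData/getAjaxData"
response = requests.get(endpoint, params={"id": "US_MAP_DATA"})
response.raise_for_status()  # stop on an HTTP error instead of parsing an error page
data = response.json()
# The records are stored under the same key as the requested id
df = pd.DataFrame(data["US_MAP_DATA"])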
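
The question also asks for a CSV export. A minimal sketch of that step follows, assuming the payload holds a non-empty list of records under the "US_MAP_DATA" key as in the snippet above. The helper names payload_to_frame and export_csv, the file name cases.csv, and the sample fields "region" and "cases" are made up for illustration; the real column names have to be read from the actual response.

```python
import pandas as pd


def payload_to_frame(payload: dict, key: str = "US_MAP_DATA") -> pd.DataFrame:
    """Build a DataFrame from the list of records stored under `key`."""
    records = payload.get(key)
    if not isinstance(records, list) or not records:
        raise ValueError(f"no records found under {key!r}")
    return pd.DataFrame(records)


def export_csv(df: pd.DataFrame, path: str) -> None:
    """Write the frame to CSV without the numeric row index."""
    df.to_csv(path, index=False)


# Hypothetical sample shaped like a parsed JSON payload; these field names
# are placeholders, not the fields the CDC endpoint actually returns.
sample = {
    "US_MAP_DATA": [
        {"region": "A", "cases": 10},
        {"region": "B", "cases": 25},
    ]
}

frame = payload_to_frame(sample)
export_csv(frame, "cases.csv")
```

With a live response, the same two calls would take the parsed JSON in place of sample. Passing index=False keeps the row numbers out of the file, which is usually what you want when the CSV is later read back for graphing.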


EDIT: Trying to make this answer more general and useful.

How did you discern that this was how to parse the data?

Firstly, you need to inspect the page (Ctrl + Shift + I) and navigate to the Network tab.


Secondly, you need to refresh the page to record network activity.

Where to look?

Select the XHR filter to narrow down the list of recorded requests.

Then click through the remaining requests one by one and check the Preview tab of each response to find out whether it contains the data you need.
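
The same check can be done in code once a candidate response has been parsed. The sketch below is only an illustration: describe_payload is a hypothetical helper, and the sample payload with its "updated", "state" and "cases" fields is invented. It summarizes each top-level key so that the one holding a list of records (the kind that pandas can turn into a table) stands out.

```python
def describe_payload(payload: dict) -> dict:
    """Map each top-level key to a short description of its value."""
    summary = {}
    for key, value in payload.items():
        if isinstance(value, list) and value and isinstance(value[0], dict):
            # A list of dicts is the shape that loads cleanly into a DataFrame.
            fields = sorted(value[0])
            summary[key] = f"{len(value)} records with fields {fields}"
        else:
            summary[key] = type(value).__name__
    return summary


# Hypothetical parsed response; the field names are placeholders.
sample = {
    "updated": "some date string",
    "US_MAP_DATA": [{"state": "A", "cases": 10}],
}

for key, description in describe_payload(sample).items():
    print(f"{key}: {description}")
```

A key reported as a list of records is a good candidate to pass to pd.DataFrame; scalar keys are usually metadata.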


It doesn't always work, but when it does, requesting the data from the API directly is much easier than writing a scraper with requests / bs4 / selenium, and it should be the first thing you try.

2 Comments

Wow! Worked like a charm! It's possible my assignment intended for me to use Beautiful Soup, but this solution is much faster and more effective than Selenium and a Chromium executable. How did you discern that this was how to parse the data?
Thank you! Very thorough explanation
