How to parse a JavaScript-based page

Question

I can't retrieve the HTML of Ontario's coronavirus page, which is rendered using JavaScript. I'm using Nokogiri in Ruby.

What Ruby retrieves is a warning page that says my browser needs JavaScript:

<h1>JavaScript is required to view this site</h1> <p>Ontario.ca needs JavaScript to function properly and provide you with a fast,
stable experience. Please enable JavaScript or check your browser's settings.</p>...Outdated browsers lack safety features that keep your information secure

I tried parsing the page as JSON, with the same result. The page comes back as a StringIO object, and its .string contains the same warning page.

How can I grab this page and any others that are served this way? I suspect this is a recurring issue with JavaScript-served sites.

The pages are loaded through AJAX, so one way to load this site is to use the Watir gem. By the way, which part of the information do you want to retrieve?
– Fernand, Commented Mar 19, 2020 at 1:33
Welcome to SO! Your question is poorly asked. Questions seeking debugging help ("why isn't this code working?") must include the desired behavior, a specific problem or error, and the shortest code necessary to reproduce it in the question itself. See "How to create a Minimal, Reproducible Example". Nokogiri is not designed to process JavaScript-based pages. It can help you locate scripts, and from there you can write code to extract URLs and process those, but DHTML is beyond Nokogiri's scope. Search for "ruby scrape javascript".
– the Tin Man, Commented Mar 19, 2020 at 4:43
@theTinMan All of those items were in my post. I had no clue it was served through AJAX. That was the problem, and it was properly answered by Fernand. It was reproducible by reading my post.
– Rich_F, Commented Mar 19, 2020 at 5:16
You said you tried things but don't show us; evidence of effort is an important part of asking. You ask how you can grab the page, but researching, showing where you researched, and explaining why it didn't help are also part of evidence of effort. That you got a helpful comment doesn't make the question any better, nor does it make it reproducible. YOU know the steps you took, and you might be able to reproduce it, but that also won't make it well asked, because we can't duplicate anything you tried.
– the Tin Man, Commented Mar 19, 2020 at 21:47

Fernand · Accepted Answer · 2020-03-23 04:14:43Z

1

You need to use the Watir gem for this one, since the content is loaded through AJAX. Also, it seems they have an API, so you may want to take a look at that.
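As a rough sketch of the Watir approach (not a tested recipe for this particular site): let a real browser run the page's scripts, then hand the rendered HTML to Nokogiri. The helper names `javascript_placeholder?` and `rendered_html`, the `wait_css` selector `'main'`, and the `'h1, h2'` selector are all assumptions made for illustration, as is the use of headless Chrome with a matching chromedriver. The URL is taken from the command line rather than hard-coded.

```ruby
# Marker text taken from the placeholder page quoted in the question.
JS_REQUIRED_MARKER = 'JavaScript is required to view this site'

# True when the HTML is the "enable JavaScript" placeholder, not real content.
def javascript_placeholder?(html)
  html.include?(JS_REQUIRED_MARKER)
end

# Load the page in a real browser so its scripts run, then return the
# rendered HTML. wait_css is a hypothetical selector for an element that
# only appears once the AJAX content has loaded; adjust it for the real page.
def rendered_html(url, wait_css: 'main')
  require 'watir' # gem install watir; assumes Chrome and chromedriver are installed
  browser = Watir::Browser.new(:chrome, options: { args: ['--headless'] })
  begin
    browser.goto(url)
    browser.element(css: wait_css).wait_until(&:present?)
    browser.html
  ensure
    browser.close
  end
end

# Only runs when a URL is given on the command line.
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  require 'nokogiri'
  html = rendered_html(ARGV[0])
  abort 'Still got the JavaScript placeholder page' if javascript_placeholder?(html)
  doc = Nokogiri::HTML(html)
  puts doc.css('h1, h2').map { |node| node.text.strip } # assumed selector
end
```

Run it as `ruby scrape.rb <page URL>` (the file name is arbitrary). If the site does expose an API, as mentioned above, calling that directly is likely simpler and faster than driving a browser.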

