I'm interested in trying a web scraping project. The target sites use JavaScript to load and update content dynamically. Most of the online discussion about scraping such sites suggests that Node.js, CasperJS, PhantomJS, and Nightmare are all reasonably popular tools for this kind of project. Node.js seems to be used most often.

If I am running a Flask server and want to display the results of, say, a Node.js scrape in tabular format on my site, is this possible? Will I run into compatibility issues? Or should I stick with a Python-based approach to scraping, such as BS4, for the sake of consistency? I ask because Node.js is described as a server, so I assume a conflict would arise if I tried to use it and Flask simultaneously.

1 Answer

If you want to write a web scraper that executes JavaScript, Node.js (with something like PhantomJS) is a great choice. Another popular choice is Selenium. Either way, you may need to simulate user actions to trigger the event handlers that load the content. Let's call this part "scraping". BS4 on its own would not be appropriate because it only parses HTML; it cannot execute JavaScript.
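To illustrate why a static parser is not enough, here is a minimal sketch using the standard library's `html.parser` as a stand-in for BS4 (both only read the markup as delivered). The page content, the `prices` table id, and the helper names are made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page: the table body stays empty until the script runs in a browser.
PAGE = """
<html><body>
  <table id="prices"><tbody></tbody></table>
  <script>
    document.querySelector("#prices tbody").innerHTML = "<tr><td>42</td></tr>";
  </script>
</body></html>
"""


class CellCounter(HTMLParser):
    """Counts <td> elements that exist in the markup as delivered."""

    def __init__(self):
        super().__init__()
        self.cells = 0

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.cells += 1


def count_cells(html: str) -> int:
    """Parse the HTML statically; script bodies are treated as plain text."""
    parser = CellCounter()
    parser.feed(html)
    parser.close()
    return parser.cells
```

Here `count_cells(PAGE)` finds no cells, because the row only exists after a browser (or a browser-driving tool such as Selenium) has run the script.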

Once you have your data saved to disk, displaying the results as an HTML table (let's call this part "reporting") is a separate task that needs its own tool. Flask is a suitable choice.
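As a rough sketch of the reporting side, assuming the scraper has written a JSON list of objects to a file (the `results.json` name, the row shape, and the function names are assumptions, not anything the question specifies):

```python
import json
from html import escape
from pathlib import Path


def render_table(rows):
    """Build an HTML table from a list of dicts; columns come from the first row."""
    if not rows:
        return "<p>No results yet.</p>"
    columns = list(rows[0])
    head = "".join(f"<th>{escape(str(c))}</th>" for c in columns)
    body = "".join(
        "<tr>"
        + "".join(f"<td>{escape(str(row.get(c, '')))}</td>" for c in columns)
        + "</tr>"
        for row in rows
    )
    return f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"


def create_app(results_path="results.json"):
    """Create a Flask app that serves the scraped results as a table."""
    # Imported here so render_table can be reused without Flask installed.
    from flask import Flask

    app = Flask(__name__)

    @app.route("/")
    def report():
        # Re-read the file on each request so new scrape results show up.
        path = Path(results_path)
        rows = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
        return render_table(rows)

    return app
```

Running `create_app().run()` would start Flask's development server. The Flask side only reads a file, so it does not matter which language or tool produced it.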

Since scraping and reporting are separate concerns, no conflict arises if you run the two simultaneously. Node.js is a JavaScript runtime; it can be used to run a web server, but a scraper script written for Selenium or Node.js does not act as one. So it's incorrect to think of this as two web servers in possible conflict.
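One way to picture the separation: the scraper is just a short-lived process that prints or saves data and then exits. The sketch below assumes a scraper that prints JSON to stdout; `run_scraper` and `DEMO_COMMAND` are illustrative names, and the demo command is a stand-in rather than a real scraper.

```python
import json
import subprocess
import sys


def run_scraper(command, timeout=120):
    """Run a scraper as a child process and parse the JSON it prints to stdout."""
    completed = subprocess.run(
        command,
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as str
        check=True,           # raise CalledProcessError on a non-zero exit code
        timeout=timeout,
    )
    return json.loads(completed.stdout)


# Stand-in for a real scraper: a one-line Python program that prints JSON.
DEMO_COMMAND = [sys.executable, "-c", "print('[{\"title\": \"example\"}]')"]
```

With a real Node.js scraper the call might look like `run_scraper(["node", "scrape.js"])`, where `scrape.js` is a hypothetical script. Nothing here listens on a port, so there is nothing for Flask to conflict with.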

1 Comment

Thank you! At first I thought node was just another language, then I started reading and saw it referred to as a web server. Next, phantom, casper, and nightmare came out of the woodwork, which added to the confusion. I sincerely appreciate the clear and concise response!
