Google seems to be failing me today: I'm looking for a way to load a remote HTML page into my Java application. The page contains some JavaScript that generates most of its content. I thought it would be fairly straightforward to open the page in Java and look at the resulting HTML.

When I use URL.openStream() to read the page, I get the HTML source with the JavaScript but without the generated HTML (which is what I would expect). So how do I get from this to the HTML source including the generated content? After a few hours on Google I have become completely entangled in Rhino, EnvJs, and Jsoup, and none of it is really getting me anywhere.
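
For reference, the read described above can be sketched roughly as follows. This is a minimal sketch, not the exact code in question: the class name `RawHtmlFetcher` and the method `fetch` are hypothetical, and decoding the response as UTF-8 is an assumption. It returns the markup exactly as served, so any `<script>` elements come back as unexecuted text.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class RawHtmlFetcher {

    // Reads the bytes behind the URL and decodes them as UTF-8 (an assumption;
    // a real page may declare a different charset). No JavaScript is executed,
    // so the result is the static source only.
    public static String fetch(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical usage: pass the page address as the first argument.
        URL url = URI.create(args[0]).toURL();
        System.out.println(fetch(url));
    }
}
```

Whatever the script would have written into the page at runtime is absent from the returned string, which is why a plain stream read is not enough here.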

Does anyone have any suggestions?

  • This might not be the best solution, but when you put the HTML in a webview, the JavaScript code will be executed, so you can pull the result from the webview again. Commented Oct 23, 2012 at 12:20
  • You need to execute the JS first with some JS-engine to gather its output. Commented Oct 23, 2012 at 12:51

1 Answer

Yes, basically there is no easy solution: to get the generated content you need to actually render the page, which means you need a JavaScript engine (as feeela says).

One solution is to use WebKit. I have only used it from Python, not from Java, but you may want to look at "WebKit browser in Java app on multiple platforms".
