How can I parse an HTML page on Android so that the result includes content generated by JavaScript? The main problem is that if I simply use the Jsoup.connect() method, the Document object doesn't contain the JavaScript output, because the scripts need some time to run. Is it possible to delay the connection?

  • Jsoup doesn't execute the JavaScript; waiting for the DOM to be ready is not the problem. You'll have to use something that has a JavaScript engine, such as Selenium or PhantomJS. Commented Jan 9, 2018 at 9:00
  • I don't think that's the best solution. You're proposing that I add such huge modules to my app just to parse one page. That's very inconvenient. Commented Jan 9, 2018 at 9:09
  • You don't have to use those. I'm just saying that you need something that has a JavaScript engine; Jsoup doesn't have one. Commented Jan 9, 2018 at 9:10
  • I understand, but what is the "something" you are talking about? Commented Jan 9, 2018 at 9:23
  • I'm not sure; you'll have to figure that out for yourself. I've just given you two examples, which you've said are inadequate for what you're trying to do. Commented Jan 9, 2018 at 9:26

1 Answer

As already mentioned in the comments, Jsoup does not run any JavaScript. For that you would need a JavaScript interpreter.

Since you mentioned that the page you want to read takes some time to render, it seems clear that you actually need to run the JavaScript to render the DOM.

However, if you look into the source code of the page you may be able to figure out how the JavaScript actually renders the page. I see two possibilities:

1) The JavaScript only renders the page dynamically from information that was already delivered with the initial request. This is common on modern websites that send all relevant data along with the first response (often called isomorphic rendering). In that case the data you want is usually embedded in the page source as JSON objects. You can extract that JSON and parse it with a JSON parser.

2) The JavaScript actually loads some data asynchronously. In that case you can identify these HTTP requests and use Jsoup to fetch the data. Such data is usually in JSON format, so in this case too it makes sense to use a JSON parser to read out the relevant parts.
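
To make the first possibility more concrete, here is a minimal sketch of pulling an embedded JSON payload out of raw HTML without running any JavaScript. It assumes the page ships its data in a script tag with a known id; the id "__DATA__", the class name EmbeddedJsonExtractor, and the method extractJson are all hypothetical and have to be adapted to whatever the real page source contains. The sketch uses only the standard regex classes to stay dependency-free; with Jsoup already in the app, selecting the script element from the fetched Document would likely be the more natural route.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: read JSON that the server embedded in the page, no JavaScript engine needed. */
final class EmbeddedJsonExtractor {

    /**
     * Returns the trimmed text inside the script tag with the given id,
     * or null if the page has no such tag.
     * The id is an assumption: look it up in the actual page source.
     */
    static String extractJson(String html, String scriptId) {
        Pattern pattern = Pattern.compile(
                "<script\\b[^>]*\\bid\\s*=\\s*[\"']" + Pattern.quote(scriptId)
                        + "[\"'][^>]*>(.*?)</script>",
                Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(html);
        return matcher.find() ? matcher.group(1).trim() : null;
    }

    public static void main(String[] args) {
        // Stand-in for the HTML that a plain HTTP fetch would return.
        String html = "<html><body><div id=\"app\"></div>"
                + "<script id=\"__DATA__\" type=\"application/json\">"
                + "{\"title\":\"Hello\",\"items\":[1,2,3]}"
                + "</script></body></html>";

        String json = extractJson(html, "__DATA__");
        System.out.println(json != null ? json : "no embedded JSON found");
    }
}
```

The returned string can then be handed to a JSON parser, for example org.json.JSONObject, which is available on Android. For the second possibility, the request URL can be found in the browser's network tab; because Jsoup rejects non-HTML content types by default, the fetch would look something like Jsoup.connect(url).ignoreContentType(true).execute().body(), where url is the endpoint you identified.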
