I have an old Java program that used to scrape data from an HTML page. It worked fine a few years ago, but now when I run it there is no data. The page link is:

http://www.batstrading.com/book/ibm/

My Java program still retrieves the HTML table, but the cells are empty. If you open the page in a browser, you can see the data changing dynamically. Why?

The HTML text my Java program now gets from the page is the same as what the browser's "view source" shows. It looks like this:

    <tbody>
      <tr>
        <td class="shares">&nbsp;</td>
        <td class="price">&nbsp;</td>
      </tr>

Instead of data, each cell contains only &nbsp;

How can I fix my code to get the data? To be clear, there is nothing wrong with the Java program itself: it receives the same text as the browser's "view source". The data is missing because the page is now dynamic. So the question is how to use Java to get data from a dynamic page.

2 Answers

2

Scrap the current approach, since the site is updated via JavaScript. You won't be able to just download the HTML and find the data in it.

However, a much easier approach than using Selenium or a JS engine is to request the source data that the JavaScript uses to update the page:

http://www.batstrading.com/json/bzx/book/IBM

It's perfectly valid JSON. Request that link with your HTTP client and parse the JSON using Jackson. This will yield very reliable results.
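
As a rough illustration of the request step, here is a minimal sketch using the JDK's built-in `java.net.http.HttpClient` (Java 11 or later). The class name `BookFetcher` and its method names are hypothetical, and the assumption that other symbols follow the same URL pattern as IBM is mine, not something the site documents.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper class; the names here are illustrative only.
class BookFetcher {
    // Endpoint taken from the link above. That other symbols use the
    // same path layout is an assumption.
    static final String BASE = "http://www.batstrading.com/json/bzx/book/";

    // Builds a GET request for the given ticker symbol (upper-cased).
    static HttpRequest buildRequest(String symbol) {
        return HttpRequest.newBuilder(URI.create(BASE + symbol.toUpperCase(Locale.ROOT)))
                .timeout(Duration.ofSeconds(10))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Sends the request and returns the raw JSON body as a string.
    static String fetchBookJson(String symbol) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response =
                client.send(buildRequest(symbol), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

The string returned by `BookFetcher.fetchBookJson("IBM")` can then be handed to Jackson, for example `new ObjectMapper().readTree(json)`, which gives a `JsonNode` tree to walk. The field names inside the response are not shown here because they depend on what the site actually returns.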

Disclaimer: make sure that what you are doing complies with the Terms of Service of the website you are using. Otherwise you may expose yourself to legal issues.

2 Comments

Personally I feel that learning to use powerful tools that will work in every situation is a better solution than assuming other sites will be as nice as this, but if this is really the limit it's probably a better approach for simplicity's sake.
@SlaterTyranus I believe in using the right tool for the job. In this particular job, Selenium is overkill. But yes, it's a phenomenal tool for other cases (such as QA testing, or screen scraping sites without such friendly JSON)
0

You can't do this by directly downloading the page; you've got two options here. Personally I would use CasperJS or Selenium to interact with the JavaScript on the page. Otherwise you have to manually simulate what the JavaScript is doing, which is generally neither long-lasting nor scalable (read: it will break once they change anything about their site).

These tools will emulate a browser and let you wait until certain elements load.

There are a number of other headless browsers of this kind, but I would highly recommend Casper since it's fast, easy to use, and can be called even from within your Java program, since its scripts are just JavaScript. See this for instructions on calling JavaScript from Java.
