I have a table like the one below, and I want to parse it to get the data-code value of each row with class id, plus the second and third columns of that row.

<table>
    <tr class="id" data-code="100">
       <td></td>
       <td>18</td>
       <td class="name">John</td>
    </tr>
    <tr class="id" data-code="200">
       <td></td>
       <td>21</td>
       <td class="name">Mark</td>
    </tr>
</table>

I want to print out:

100, 18, John
200, 21, Mark

I have tried the following suggestion from the thread "how to parse a table from HTML using jsoup", but it's not selecting anything:

URL url = new URL("http://www.myurl.com");
Document doc = Jsoup.parse(url, 3000);

Elements tables = doc.select("table[class=id]");

for(Element table : tables)
{
     System.out.println(table.toString());
}

EDIT: I also tried using Jsoup.connect() instead of Jsoup.parse():

Document doc = null;
try
{
    doc = Jsoup.connect("http://www.myurl.com").get();
} 
catch (IOException e) 
{
    e.printStackTrace();
}
  • Table doesn't have a class "id"...? Try tr[class=id] Commented Feb 24, 2015 at 13:20
  • It doesn't work, and I have also tried doc.select("table tr.id") and doc.select("table tr[class=id]"). Commented Feb 24, 2015 at 13:25
  • Works fine here... error is probably in the first two lines... does println(doc) output anything? Commented Feb 24, 2015 at 13:35
  • It doesn't print anything for me. Commented Feb 24, 2015 at 13:40
  • Does the link show anything when you put it in your browser? Commented Feb 24, 2015 at 13:40

1 Answer

Try it like this:

URL url = new URL("http://www.myurl.com");
Document doc = Jsoup.parse(url, 3000);
// This should work now: it matches <tr> elements with class "id"
// (note there is no space between "tr" and ".id")
Elements tables = doc.select("table tr.id");
// This probably should work too: substring match on the class attribute
Elements tables2 = doc.select("table tr[class*=id]");

for (Element table : tables)
{
     System.out.println(table.toString());
}
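
To get the exact output asked for in the question, something along these lines might work. It is only a sketch: it parses the sample table from an inline string rather than fetching a URL, it assumes jsoup is on the classpath, and the class name TableRows and the helper extract are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class TableRows {
    // Sample markup from the question, with each row closed by </tr>.
    static final String SAMPLE_HTML =
          "<table>"
        + "<tr class=\"id\" data-code=\"100\"><td></td><td>18</td><td class=\"name\">John</td></tr>"
        + "<tr class=\"id\" data-code=\"200\"><td></td><td>21</td><td class=\"name\">Mark</td></tr>"
        + "</table>";

    // Hypothetical helper: builds one "data-code, 2nd cell, 3rd cell" string per tr.id row.
    static List<String> extract(String html) {
        Document doc = Jsoup.parse(html);
        List<String> lines = new ArrayList<>();
        for (Element row : doc.select("table tr.id")) {
            Elements cells = row.select("td");
            if (cells.size() < 3) {
                continue; // skip rows that lack the expected three columns
            }
            lines.add(row.attr("data-code") + ", "
                    + cells.get(1).text() + ", "
                    + cells.get(2).text());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : extract(SAMPLE_HTML)) {
            System.out.println(line);
        }
    }
}
```

If the page is fetched with Jsoup.connect(...).get() instead, the same loop can be run on the returned Document, provided the fetched markup really contains the table.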

From the documentation:

public Elements select(String cssQuery)

Find elements that match the Selector CSS query, with this element as the starting context. Matched elements may include this element, or any of its children. This method is generally more powerful to use than the DOM-type getElementBy* methods, because multiple filters can be combined, e.g.:

  • el.select("a[href]") - finds links (a tags with href attributes)
  • el.select("a[href*=example.com]") - finds links pointing to example.com (loosely)

See the query syntax documentation in Selector.

Parameters: cssQuery - a Selector CSS-like query
Returns: elements that match the query (empty if none match)
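
Because select returns an empty Elements rather than failing when nothing matches, a wrong selector is easy to miss. The sketch below counts the matches for the selectors mentioned in this thread against the sample table. The class name SelectorCheck and the helper count are hypothetical, and the markup is an inline copy of the question's table, not the real page.

```java
import org.jsoup.Jsoup;

class SelectorCheck {
    // Inline copy of the question's sample table (rows closed with </tr>).
    static final String SAMPLE_HTML =
          "<table>"
        + "<tr class=\"id\" data-code=\"100\"><td></td><td>18</td><td class=\"name\">John</td></tr>"
        + "<tr class=\"id\" data-code=\"200\"><td></td><td>21</td><td class=\"name\">Mark</td></tr>"
        + "</table>";

    // Hypothetical helper: number of elements the CSS query matches in the sample.
    static int count(String cssQuery) {
        return Jsoup.parse(SAMPLE_HTML).select(cssQuery).size();
    }

    public static void main(String[] args) {
        String[] queries = {
            "table[class=id]",      // class is on the rows, not on the table
            "table tr.id",          // rows with class "id"
            "table tr[class=id]",   // same rows, attribute form
            "table tr[class*=id]",  // class attribute contains "id"
            "table tr .id"          // with a space: descendants of a row that have class "id"
        };
        for (String query : queries) {
            System.out.println(query + " -> " + count(query));
        }
    }
}
```

On this sample, the selector with a space ("tr .id") looks for elements inside a row, so it does not match the rows themselves.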

3 Comments

Thank you, it works. How do I get the text inside <td class="name">John</td>?
doc.select("table tr.id td.name").text(); I think.
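
One thing to watch for when reading cell text: calling text() on a whole selection joins the text of every matched element into one string, while eachText() keeps one entry per element. A small sketch, assuming jsoup is on the classpath and using an inline copy of the sample table (the class NameCells and its two helpers are invented for this example):

```java
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.select.Elements;

class NameCells {
    // Inline copy of the question's sample table (rows closed with </tr>).
    static final String SAMPLE_HTML =
          "<table>"
        + "<tr class=\"id\" data-code=\"100\"><td></td><td>18</td><td class=\"name\">John</td></tr>"
        + "<tr class=\"id\" data-code=\"200\"><td></td><td>21</td><td class=\"name\">Mark</td></tr>"
        + "</table>";

    // Select the name cells inside rows that have class "id".
    static Elements nameCells(String html) {
        return Jsoup.parse(html).select("table tr.id td.name");
    }

    // One string per matched cell.
    static List<String> names(String html) {
        return nameCells(html).eachText();
    }

    // Text of all matched cells combined into a single string.
    static String joined(String html) {
        return nameCells(html).text();
    }

    public static void main(String[] args) {
        System.out.println(names(SAMPLE_HTML));
        System.out.println(joined(SAMPLE_HTML));
    }
}
```

For per-row output, iterating over the rows and reading the cell from each row is usually easier than splitting the combined string afterwards.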
