I am trying to use the jsoup library to parse an HTML file and extract all the data from the table with class="scl_list", shown below. This is only a small part of the HTML page.

<table class="scl_list">
        <tr>
            <th align="center">Id:</th>
            <th align="center">Name:</th>
            <th align="center">Serial:</th>
            <th align="center">Status:</th>
            <th align="center">Ladestrom:</th>
            <th align="center">Z&auml;hleradresse:</th>
            <th align="center">Z&auml;hlerstand:</th>
        </tr>
        <tr>
            <th align="center">7</th>
            <th align="center">7</th>
            <th align="center">c3001c0020333347156a66</th>
            <th align="center">Idle</th>
            <th align="center">16.0</th>
            <th align="center">40100021</th>
            <th align="center">12464.25</th>
        </tr>
        <tr>
            <th align="center">21</th>
            <th align="center">21</th>
            <th align="center">c3002a003c343551086869</th>
            <th align="center">Idle</th>
            <th align="center">16.0</th>
            <th align="center">540100371</th>
            <th align="center">1219.73</th>
        </tr>
    </table>

For every <tr>, I then need to get every <th> and save the data in a table or vector. Unfortunately, I can't find many jsoup examples that do something similar.

So far I have this, where html_string is my HTML page, but I'm not sure how to proceed. Any help is much appreciated:

Document doc = Jsoup.parse(html_string);
Elements els = doc.getElementsContainingText("table class=\"scl_list\"");

1 Answer

Jsoup is a simple and intuitive library, and there are many examples online of how to read HTML tables. Have a look at the jsoup cookbook in the documentation, especially the section on selector syntax. To get back to your question, an easy way would be the following:

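import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Place this method inside any class.
public static void main(String[] args) {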
    String html =   "<table class=\"scl_list\">\n" +
                    "        <tr>\n" +
                    "            <th align=\"center\">Id:</th>\n" +
                    "            <th align=\"center\">Name:</th>\n" +
                    "            <th align=\"center\">Serial:</th>\n" +
                    "            <th align=\"center\">Status:</th>\n" +
                    "            <th align=\"center\">Ladestrom:</th>\n" +
                    "            <th align=\"center\">Z&auml;hleradresse:</th>\n" +
                    "            <th align=\"center\">Z&auml;hlerstand:</th>\n" +
                    "        </tr>\n" +
                    "        <tr>\n" +
                    "            <th align=\"center\">7</th>\n" +
                    "            <th align=\"center\">7</th>\n" +
                    "            <th align=\"center\">c3001c0020333347156a66</th>\n" +
                    "            <th align=\"center\">Idle</th>\n" +
                    "            <th align=\"center\">16.0</th>\n" +
                    "            <th align=\"center\">40100021</th>\n" +
                    "            <th align=\"center\">12464.25</th>\n" +
                    "        </tr>\n" +
                    "        <tr>\n" +
                    "            <th align=\"center\">21</th>\n" +
                    "            <th align=\"center\">21</th>\n" +
                    "            <th align=\"center\">c3002a003c343551086869</th>\n" +
                    "            <th align=\"center\">Idle</th>\n" +
                    "            <th align=\"center\">16.0</th>\n" +
                    "            <th align=\"center\">540100371</th>\n" +
                    "            <th align=\"center\">1219.73</th>\n" +
                    "        </tr>\n" +
                    "    </table>";
    Document doc = Jsoup.parse(html);
    Elements trs = doc.select("table.scl_list tr");
    List<List<String>> data = new ArrayList<>();
    for(Element tr : trs){
        List<String> row = tr.select("th").stream().map(e -> e.text())
                                .collect(Collectors.toList());
        data.add(row);
    }
    data.forEach(System.out::println);
}

The output should be something like:

[Id:, Name:, Serial:, Status:, Ladestrom:, Zähleradresse:, Zählerstand:]
[7, 7, c3001c0020333347156a66, Idle, 16.0, 40100021, 12464.25]
[21, 21, c3002a003c343551086869, Idle, 16.0, 540100371, 1219.73]

Since the first row contains only the table headings, you can skip it by using a plain for loop that starts at the second element (index 1).
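If you would rather keep the for-each loop, another option is to drop the header row from the collected list afterwards. The following is only a minimal sketch using plain java.util lists in place of the parsed rows. The class name HeaderSkipDemo and the helper withoutHeader are made up for illustration and are not part of jsoup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HeaderSkipDemo {

    // Hypothetical helper: returns a copy of the rows without the first (header) row.
    // An empty or header-only input yields an empty list.
    static List<List<String>> withoutHeader(List<List<String>> rows) {
        if (rows.size() <= 1) {
            return new ArrayList<>();
        }
        return new ArrayList<>(rows.subList(1, rows.size()));
    }

    public static void main(String[] args) {
        // Stand-in for the "data" list collected from the table rows.
        List<List<String>> data = new ArrayList<>();
        data.add(Arrays.asList("Id:", "Name:", "Serial:"));
        data.add(Arrays.asList("7", "7", "c3001c0020333347156a66"));
        data.add(Arrays.asList("21", "21", "c3002a003c343551086869"));

        withoutHeader(data).forEach(System.out::println);
    }
}
```

Copying the sublist into a new ArrayList keeps the result independent of the original list, which matters if you modify either one later.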

Since I assume that your data represents electricity meters, I would recommend implementing a small class as a data container, which could look like this:

class Meter{
    int id;
    String name;
    String serial;
    String status;
    double chargingCurrent;
    String address;
    double meterReading;

    public Meter(List<String> data) {
        this.id = Integer.parseInt(data.get(0));
        this.name = data.get(1);            
        this.serial = data.get(2);
        this.status = data.get(3);
        this.chargingCurrent = Double.parseDouble(data.get(4));
        this.address = data.get(5);
        this.meterReading = Double.parseDouble(data.get(6));
    }
    // getters & setters
}
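Note that this constructor throws if a row has fewer than seven cells or if a numeric cell cannot be parsed, which is what would happen if the header row were passed in. If the real page might contain such rows, one option is to check each row first. This is just a sketch: MeterRowCheck, isValid and EXPECTED_COLUMNS are hypothetical names, and the assumption of exactly seven columns comes from the table in the question.

```java
import java.util.Arrays;
import java.util.List;

class MeterRowCheck {

    // Assumption: every data row has the seven columns shown in the question.
    static final int EXPECTED_COLUMNS = 7;

    // Returns true only if the row has seven cells and the numeric cells parse.
    static boolean isValid(List<String> row) {
        if (row == null || row.size() != EXPECTED_COLUMNS) {
            return false;
        }
        try {
            Integer.parseInt(row.get(0));    // id
            Double.parseDouble(row.get(4));  // charging current
            Double.parseDouble(row.get(6));  // meter reading
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> header = Arrays.asList("Id:", "Name:", "Serial:", "Status:",
                "Ladestrom:", "Zähleradresse:", "Zählerstand:");
        List<String> row = Arrays.asList("7", "7", "c3001c0020333347156a66", "Idle",
                "16.0", "40100021", "12464.25");

        System.out.println(isValid(header));
        System.out.println(isValid(row));
    }
}
```

With a check like this you could call isValid(row) before new Meter(row) and simply skip rows that fail, instead of relying only on the row index.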

The code above could then be rewritten as something like:

Document doc = Jsoup.parse(html);
Elements trs = doc.select("table.scl_list tr");
List<Meter> meters = new ArrayList<>();
// Start at index 1 to skip the header row.
for (int i = 1; i < trs.size(); i++) {
    List<String> row = trs.get(i).select("th").stream().map(e -> e.text())
                            .collect(Collectors.toList());
    meters.add(new Meter(row));
} 
meters.forEach(System.out::println);

With a corresponding toString method, the output will look like:

Meter{id=7, name=7, serial=c3001c0020333347156a66, status=Idle, chargingCurrent=16.0, address=40100021, meterReading=12464.25}
Meter{id=21, name=21, serial=c3002a003c343551086869, status=Idle, chargingCurrent=16.0, address=540100371, meterReading=1219.73}
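The toString method itself is not shown above, so here is one way it might look. Treat it as a sketch: the field names follow the Meter class from earlier, the string layout is simply chosen to match the output format shown, and MeterToStringDemo is a made-up driver class that feeds in a hard-coded row instead of parsed HTML.

```java
import java.util.Arrays;
import java.util.List;

class Meter {
    int id;
    String name;
    String serial;
    String status;
    double chargingCurrent;
    String address;
    double meterReading;

    Meter(List<String> data) {
        this.id = Integer.parseInt(data.get(0));
        this.name = data.get(1);
        this.serial = data.get(2);
        this.status = data.get(3);
        this.chargingCurrent = Double.parseDouble(data.get(4));
        this.address = data.get(5);
        this.meterReading = Double.parseDouble(data.get(6));
    }

    // Builds the "Meter{field=value, ...}" representation used when printing.
    @Override
    public String toString() {
        return "Meter{id=" + id
                + ", name=" + name
                + ", serial=" + serial
                + ", status=" + status
                + ", chargingCurrent=" + chargingCurrent
                + ", address=" + address
                + ", meterReading=" + meterReading + "}";
    }
}

class MeterToStringDemo {
    public static void main(String[] args) {
        // Hard-coded stand-in for one parsed table row.
        List<String> row = Arrays.asList("7", "7", "c3001c0020333347156a66",
                "Idle", "16.0", "40100021", "12464.25");
        System.out.println(new Meter(row));
    }
}
```

Most IDEs can generate a toString like this for you, so there is usually no need to write it by hand.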