Create custom object from XML with Jackson

Question

I receive a large XML document and need to extract some of its fields and return them. The problem is that when I looked at various solutions for deserializing objects with Jackson, they mostly used 1-to-1 mapping or a custom parser. My situation looks more or less like this:

XML

<a>
 <b>
   <c>val</c>
   <d x='val' z='val'><e>val</e><f>lot of irrelevant fields</f></d>
   <g>lot of irrelevant fields</g>
  <b>
<a>

and I'm interested only in the values of c, x, z and e, so recreating the entire structure in Java is a definite no-no. Implementing a custom parser also sounds like overkill. Is there a nicer solution, e.g. via annotations or something similar? I remember seeing a library some time ago that allowed this via annotations, but now I'm a bit restricted in terms of the libraries I can use.

You could build a minimal DTO and annotate the class with @JsonIgnoreProperties(ignoreUnknown = true); see baeldung.com/jackson-deserialize-json-unknown-properties
– Michael Kreutz, Commented May 12, 2020 at 18:55
@MichaelKreutz That example is about JSON, while I'm parsing XML. Will that work? Do I need to replicate the nested structure in my DTO? It is slightly more complicated than in my example.
– user902383, Commented May 12, 2020 at 19:48
I did not try it out, but I think it should work for XML as well. You need to model the structure of the fields you are interested in; all others you can omit. baeldung.com/jackson-xml-serialization-and-deserialization also uses @Json-prefixed annotations in combination with XML parsing...
– Michael Kreutz, Commented May 12, 2020 at 19:55

stdunbar · Accepted Answer · 2020-05-12 21:54:41Z

The most obvious way is with XPath. It is included in Java, so no extra libraries are needed. While there are many ways to get what you want, I wrote a quick test:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class XPathDemo {
    private static final String xmlString = "<a>\n" +
            " <b>\n" +
            "   <c>val</c>\n" +
            "   <d x=\"x-val\" z=\"z-val\"><e>e-val</e><f>lot of irrelevant fields</f></d>\n" +
            "   <g>lot of irrelevant fields</g>\n" +
            "  </b>\n" +
            "</a>";

    public static void main(String[] argv) throws IOException, SAXException, ParserConfigurationException, XPathExpressionException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document document = db.parse(new ByteArrayInputStream(xmlString.getBytes(StandardCharsets.UTF_8)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        String c_value = (String) xpath.evaluate("/a/b/c/text()", document, XPathConstants.STRING);
        System.out.println( "value of c is \"" + c_value + "\"");

        String x_value = (String) xpath.evaluate("/a/b/d/@x", document, XPathConstants.STRING);
        System.out.println( "value of x is \"" + x_value + "\"");

        String z_value = (String) xpath.evaluate("/a/b/d/@z", document, XPathConstants.STRING);
        System.out.println( "value of z is \"" + z_value + "\"");

        String e_value = (String) xpath.evaluate("/a/b/d/e/text()", document, XPathConstants.STRING);
        System.out.println( "value of e is \"" + e_value + "\"");
    }
}

Output:

value of c is "val"
value of x is "x-val"
value of z is "z-val"
value of e is "e-val"

This is a super simple example. It gets harder when the same basic structure is repeated many times. I'd read up on XPath syntax: it is very powerful, but getting exactly what you want can sometimes be a bit of a pain.
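For the repeated case, one option is to select the repeating elements as a node set and then evaluate relative expressions against each node. A minimal sketch, assuming a hypothetical input in which <b> occurs twice (the class name, the extractRows method and the sample values are made up for illustration):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class RepeatedXPathDemo {
    // Hypothetical input: the structure from the question with <b> repeated.
    public static final String XML = "<a>"
            + "<b><c>c1</c><d x=\"x1\" z=\"z1\"><e>e1</e><f>ignored</f></d></b>"
            + "<b><c>c2</c><d x=\"x2\" z=\"z2\"><e>e2</e><f>ignored</f></d></b>"
            + "</a>";

    // Returns one "c,x,z,e" line per <b> element.
    public static List<String> extractRows(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Select every <b> under <a> as a node set.
        NodeList bNodes = (NodeList) xpath.evaluate("/a/b", document, XPathConstants.NODESET);

        List<String> rows = new ArrayList<>();
        for (int i = 0; i < bNodes.getLength(); i++) {
            Node b = bNodes.item(i);
            // Relative expressions are evaluated against the current <b> node.
            String c = xpath.evaluate("c", b);
            String x = xpath.evaluate("d/@x", b);
            String z = xpath.evaluate("d/@z", b);
            String e = xpath.evaluate("d/e", b);
            rows.add(c + "," + x + "," + z + "," + e);
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        extractRows(XML).forEach(System.out::println);
    }
}
```

The two-argument xpath.evaluate(expression, item) returns the string value directly, so no cast is needed for the per-node lookups.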

There are a few caveats that you should know about:

You need valid XML. What you posted is not and wouldn't work.
This will read the entire document into memory. That's fine if you have a few thousand lines. But if you've got a 10GB document you may need another way.
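For the very large document case, one "other way" that also ships with the JDK is StAX (javax.xml.stream), which walks the document as a stream of events instead of building a tree in memory. The following is only a sketch under some assumptions: the class name and the extract method are hypothetical, the sample is the XML from above, and elements are matched by name alone (not by full path), which is fine only if c, d and e do not occur elsewhere in the irrelevant parts.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxDemo {
    public static final String XML = "<a>"
            + "<b><c>val</c>"
            + "<d x=\"x-val\" z=\"z-val\"><e>e-val</e><f>lot of irrelevant fields</f></d>"
            + "<g>lot of irrelevant fields</g></b>"
            + "</a>";

    // Collects c, x, z and e in document order without building a DOM.
    public static Map<String, String> extract(String xml) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // For a real multi-gigabyte file, pass a FileInputStream instead of a StringReader.
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        Map<String, String> values = new LinkedHashMap<>();
        try {
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue; // only element starts are interesting here
                }
                String name = reader.getLocalName();
                if (name.equals("c") || name.equals("e")) {
                    // getElementText() reads the text and moves to the end tag.
                    values.put(name, reader.getElementText());
                } else if (name.equals("d")) {
                    // A null namespace URI means the namespace is not checked.
                    values.put("x", reader.getAttributeValue(null, "x"));
                    values.put("z", reader.getAttributeValue(null, "z"));
                }
            }
        } finally {
            reader.close();
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        extract(XML).forEach((k, v) -> System.out.println("value of " + k + " is \"" + v + "\""));
    }
}
```

The trade-off is that you track position in the document yourself, whereas XPath lets you state the path declaratively.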

mfe · Accepted Answer · 2020-05-12 20:55:53Z


You should look at the DSM library. It does exactly what you want.

https://github.com/mfatihercik/dsm

user902383 Over a year ago

As I mentioned, I'm not interested in other libraries. I'm working in a very closed environment, so adding some random library is not a valid approach for me.
