unable to parse a node from xml string with dom4j

Question

I'm parsing an XML string with dom4j and using XPath to select an element from it. The code is:

    String test = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><epp xmlns=\"urn:ietf:params:xml:ns:epp-1.0\"><response><result code=\"1000\"><msg lang=\"en-US\">Command completed successfully</msg></result><trID><clTRID>87285586-99412370</clTRID><svTRID>52639BB8-1-ARNES</svTRID></trID></response></epp>";
    SAXReader reader = new SAXReader();
    reader.setIncludeExternalDTDDeclarations(false);
    reader.setIncludeInternalDTDDeclarations(false);
    reader.setValidation(false);
    Document xmlDoc;
    try {
        xmlDoc = reader.read(new StringReader(test));
        xmlDoc.getRootElement();
        Node nodeStatus = xmlDoc.selectSingleNode("//epp/response/result");

        System.out.print(nodeStatus.getText());
    } catch (DocumentException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }

I always get null for the nodeStatus variable. I actually need to read the code attribute from the result node in the XML:

<result code="1000">

This is the XML that I am reading from the String test:

<?xml version="1.0" encoding="UTF-8"?>
<epp xmlns="urn:ietf:params:xml:ns:epp-1.0">
    <response>
        <result code="1000">
            <msg lang="en-US">Command completed successfully</msg>
        </result>
        <trID>
            <clTRID>87285586-99412370</clTRID>
            <svTRID>52639BB8-1-ARNES</svTRID>
        </trID>
    </response>
</epp>

Any hints?

helderdarocha · Accepted Answer · 2014-02-26 18:50:15Z


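Your XML has a namespace. The root element declares a default namespace (urn:ietf:params:xml:ns:epp-1.0), and every child element inherits it. In XPath 1.0 an unprefixed name such as epp matches only elements that are in no namespace, so //epp/response/result matches nothing and dom4j returns null.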
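The rule is not specific to dom4j. As a minimal sketch, the check below uses the JDK's built-in javax.xml.xpath (swapped in here instead of dom4j so that it runs without extra dependencies) on a shortened copy of the XML. The class name NamespaceCheck and the helper evaluate are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NamespaceCheck {
    // Shortened copy of the XML from the question (same default namespace).
    static final String XML =
        "<epp xmlns=\"urn:ietf:params:xml:ns:epp-1.0\">"
        + "<response><result code=\"1000\"/></response></epp>";

    // Hypothetical helper: evaluates an XPath 1.0 expression and returns its string value.
    static String evaluate(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // the JDK parser ignores namespaces by default
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed names only match elements in no namespace, so this is empty.
        System.out.println("[" + evaluate("/epp/response/result/@code") + "]");
        // The root element really does live in the EPP namespace.
        System.out.println(evaluate("namespace-uri(/*)"));
    }
}
```

The first expression selects nothing even though the element is plainly there; the second shows the namespace the root element belongs to.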

To make it work, you first have to register the namespace under a prefix and use that prefix in your XPath. The prefix can be anything you like; it only has to be the same in the registration and in the expression.

You could use tns for "target namespace". Then create an XPath object with it like this:

XPath xpath = new DefaultXPath("/tns:epp/tns:response/tns:result");

To register the namespace, create a Map, add the namespace URI under the prefix you used in the XPath expression, and pass the map to the setNamespaceURIs() method.

Map<String, String> namespaces = new HashMap<>();
namespaces.put("tns", "urn:ietf:params:xml:ns:epp-1.0");
xpath.setNamespaceURIs(namespaces);

Now you can call selectSingleNode, but you will call it on your XPath object passing the document as the argument:

Node nodeStatus = xpath.selectSingleNode(xmlDoc);

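From there you can extract the data you need. Calling getText() on the result element won't give you the code, because it returns the element's own text content, not its attributes or child markup. If you want the contents of the result node as XML, you can use: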

nodeStatus.asXML()

Edit: to retrieve just the code, change your XPath to:

/tns:epp/tns:response/tns:result/@code

And retrieve the result with

nodeStatus.getText();
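Putting the steps together, a minimal sketch might look like the following. It assumes dom4j and Jaxen (which dom4j's XPath support relies on) are on the classpath; the class name EppResultCode, the method readResultCode and the constant SAMPLE are made up for illustration.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Node;
import org.dom4j.XPath;
import org.dom4j.io.SAXReader;
import org.dom4j.xpath.DefaultXPath;

public class EppResultCode {
    // The XML from the question, on one line.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<epp xmlns=\"urn:ietf:params:xml:ns:epp-1.0\"><response>"
        + "<result code=\"1000\"><msg lang=\"en-US\">Command completed successfully</msg></result>"
        + "<trID><clTRID>87285586-99412370</clTRID><svTRID>52639BB8-1-ARNES</svTRID></trID>"
        + "</response></epp>";

    // Hypothetical helper: returns the code attribute of the result element, or null if absent.
    static String readResultCode(String xml) {
        try {
            Document doc = new SAXReader().read(new StringReader(xml));

            // Bind the arbitrary prefix "tns" to the document's default namespace.
            Map<String, String> namespaces = new HashMap<>();
            namespaces.put("tns", "urn:ietf:params:xml:ns:epp-1.0");

            XPath xpath = new DefaultXPath("/tns:epp/tns:response/tns:result/@code");
            xpath.setNamespaceURIs(namespaces);

            // The attribute node's text is the attribute value.
            Node code = xpath.selectSingleNode(doc);
            return code == null ? null : code.getText();
        } catch (DocumentException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readResultCode(SAMPLE));
    }
}
```

Note that the attribute itself is written without a prefix: unprefixed attributes are not in the default namespace, so @code is correct as it stands.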

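I replaced the double slash // (shorthand for the descendant-or-self axis) with / since the expression contains the full path and / is more efficient. But if you only have one result node in your whole file, you can use:

//tns:result/@code

to extract the data. The prefix is still required, because result is in the namespace too. The expression matches every result element anywhere in the document. selectSingleNode returns only the first match, so if there can be more than one result, call selectNodes to get the whole list.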
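For the case with several result elements, a sketch along these lines collects every code with selectNodes. It keeps the tns prefix in the shortened path, since result sits in the namespace as well. The class name AllResultCodes, the method readAllCodes and the two-result sample document are assumptions made for illustration, and dom4j plus Jaxen are assumed to be on the classpath.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Node;
import org.dom4j.XPath;

public class AllResultCodes {
    // Hypothetical response carrying two result elements.
    static final String SAMPLE =
        "<epp xmlns=\"urn:ietf:params:xml:ns:epp-1.0\"><response>"
        + "<result code=\"2001\"/><result code=\"2306\"/>"
        + "</response></epp>";

    // Hypothetical helper: returns the code attribute of every result element.
    static List<String> readAllCodes(String xml) {
        try {
            Document doc = DocumentHelper.parseText(xml);
            XPath xpath = doc.createXPath("//tns:result/@code");
            xpath.setNamespaceURIs(
                Collections.singletonMap("tns", "urn:ietf:params:xml:ns:epp-1.0"));

            List<String> codes = new ArrayList<>();
            // Iterate as Object so the loop compiles against both raw and generic List returns.
            for (Object match : xpath.selectNodes(doc)) {
                codes.add(((Node) match).getText());
            }
            return codes;
        } catch (DocumentException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAllCodes(SAMPLE));
    }
}
```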

edited Feb 26, 2014 at 18:50

answered Feb 26, 2014 at 17:22


5 Comments

simonC Over a year ago

Thanks for the info. I'm a little hazy on the namespace terminology; I haven't worked with XML in this way until now. What about the @ in XPath? I have read somewhere that with @ it is possible to read an attribute from an XML node, for example //epp/response/result/@code?

helderdarocha Over a year ago

Yes. If you just want the code string add /@code to the end of your XPath expression. Now you can retrieve the text using getText().

simonC Over a year ago

OK, if I add a namespace it works, but the output of asXML() looks strange. For /tns:epp/tns:response/tns:result I get the namespace added to the result node, for example <result xmlns="urn:ietf:params:xml:ns:epp-1.0" code="1000"><msg lang="en-US">Command completed successfully</msg></result>. In the original XML there is no namespace on the result node; it is added during parsing.

helderdarocha Over a year ago

The namespace actually is part of the original document, since it's declared in the root element and inherited by all child nodes. Once the result element is serialized on its own, it has to carry the declaration to stay in the same namespace. I am not sure how you can remove the namespace in dom4j. There probably is a remove() method where you can remove a namespace. You can also extract the individual values you need (//tns:result/@code, //tns:result/tns:msg) and deliver the response in a new XML by adding them to a new <result> node.

helderdarocha Over a year ago

But if all you need is the code, use getText() with the XPath //tns:result/@code and you get just the string, without any namespace information.
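The suggestion of delivering the values in a new <result> node could be sketched as below: read the two values with namespace-aware XPaths, then build a fresh element that is in no namespace. The class name PlainResultBuilder and the method rebuildWithoutNamespace are made up for illustration, the lang attribute on msg is left out for brevity, and dom4j plus Jaxen are assumed to be on the classpath.

```java
import java.util.Collections;
import java.util.Map;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.XPath;

public class PlainResultBuilder {
    static final Map<String, String> NAMESPACES =
        Collections.singletonMap("tns", "urn:ietf:params:xml:ns:epp-1.0");

    // Evaluates a namespace-aware XPath and returns its string value.
    static String valueOf(Document doc, String expression) {
        XPath xpath = doc.createXPath(expression);
        xpath.setNamespaceURIs(NAMESPACES);
        return xpath.valueOf(doc);
    }

    // Hypothetical helper: copies code and msg into a new, namespace-free result element.
    static String rebuildWithoutNamespace(String xml) {
        try {
            Document doc = DocumentHelper.parseText(xml);
            String code = valueOf(doc, "/tns:epp/tns:response/tns:result/@code");
            String message = valueOf(doc, "/tns:epp/tns:response/tns:result/tns:msg");

            // createElement with a plain name yields an element in no namespace.
            Element result = DocumentHelper.createElement("result");
            result.addAttribute("code", code);
            result.addElement("msg").addText(message);
            return result.asXML();
        } catch (DocumentException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<epp xmlns=\"urn:ietf:params:xml:ns:epp-1.0\"><response>"
            + "<result code=\"1000\"><msg lang=\"en-US\">Command completed successfully</msg></result>"
            + "</response></epp>";
        System.out.println(rebuildWithoutNamespace(xml));
    }
}
```

The rebuilt element is no longer an EPP element in the strict sense, so this only makes sense when the consumer of the output does not care about the namespace.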
