I'm having some trouble parsing an XML file in Java. The file takes the form:

<root>
  <thing>
    <name>Thing1</name>
    <property>
      <name>Property1</name>
    </property>
    ...
  </thing>
  ...
</root>

Ultimately, I would like to convert this file into a list of Thing objects, which will have a String name (Thing1) and a list of Property objects, which will each also have a name (Property1).

I've been trying to use XPath to get this data out, but when I try to get just the name of a 'thing', I get every name that appears inside that 'thing', including the names of its 'property' elements. My code is:

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document dom = db.parse(filename);
XPath xpath = XPathFactory.newInstance().newXPath();


XPathExpression thingExpr = xpath.compile("//thing");
NodeList things = (NodeList)thingExpr.evaluate(dom, XPathConstants.NODESET);
for(int count = 0; count < things.getLength(); count++)
{
    Element thing = (Element)things.item(count);
    XPathExpression nameExpr = xpath.compile(".//name/text()");
    NodeList name = (NodeList) nameExpr.evaluate(thing, XPathConstants.NODESET);
    for(int i = 0; i < name.getLength(); i++)
    {
        System.out.println(name.item(i).getNodeValue());    
    }
}

Can anyone help? Thanks in advance!

  • You don't seem to have stated exactly what you want to produce using XPath, even when the comments are taken into account. XPath is used to select the specific nodes we are interested in. Which nodes are they in your case, and which data do you want to extract from them? Please edit the question and add this missing information. Commented Oct 19, 2012 at 5:36

3 Answers

1

You could try something like...

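import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class TestXPath {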

    public static void main(String[] args) {
        String xml =
                        "<root>\n"
                        + "    <thing>\n"
                        + "        <name>Thing1</name>\n"
                        + "        <property>\n"
                        + "            <name>Property1</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property2</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property3</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property4</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property5</name>\n"
                        + "        </property>\n"
                        + "    </thing>\n"
                        + "    <NoAThin>\n"
                        + "        <name>Thing2</name>\n"
                        + "        <property>\n"
                        + "            <name>Property1</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property2</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property3</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property4</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property5</name>\n"
                        + "        </property>\n"
                        + "    </NoAThin>\n"
                        + "</root>";

        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            ByteArrayInputStream bais = new ByteArrayInputStream(xml.getBytes());
            Document dom = db.parse(bais);
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Find the "thing" node...
            XPathExpression thingExpr = xpath.compile("/root/thing");
            NodeList things = (NodeList) thingExpr.evaluate(dom, XPathConstants.NODESET);

            System.out.println("Found " + things.getLength() + " thing nodes...");

            // Find the property nodes of thing
            XPathExpression expr = xpath.compile("property");
            NodeList nodes = (NodeList) expr.evaluate(things.item(0), XPathConstants.NODESET);

            System.out.println("Found " + nodes.getLength() + " thing/property nodes...");

            // Find all the property "name" nodes under thing
            expr = xpath.compile("property/name");
            nodes = (NodeList) expr.evaluate(things.item(0), XPathConstants.NODESET);

            System.out.println("Found " + nodes.getLength() + " name nodes...");
            System.out.println("Property value = " + nodes.item(0).getTextContent());

            // Find every property node under any child of root
            XPathExpression exprAll = xpath.compile("/root/*/property");
            NodeList nodesAll = (NodeList) exprAll.evaluate(dom, XPathConstants.NODESET);
            System.out.println("Found " + nodesAll.getLength() + " property nodes...");

        } catch (Exception exp) {
            exp.printStackTrace();
        }
    }
}

Which will give you an output of something like

Found 1 thing nodes...
Found 5 thing/property nodes...
Found 5 name nodes...
Property value = Property1
Found 10 property nodes...
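
To get from these node lists to the objects described in the question, one option is to evaluate the relative expressions `name` and `property/name` against each thing node in turn. The following is a minimal sketch under that assumption, not a definitive implementation: the `ThingParser` and `Thing` names are hypothetical, and each property is reduced to its name string rather than a full Property object.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ThingParser {

    /** Hypothetical holder: a thing's name plus the names of its properties. */
    static final class Thing {
        final String name;
        final List<String> propertyNames = new ArrayList<>();

        Thing(String name) {
            this.name = name;
        }
    }

    /** Builds one Thing per /root/thing element in the given XML text. */
    static List<Thing> parse(String xml) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Compile once; the relative expressions are reused for every thing node.
            XPathExpression thingExpr = xpath.compile("/root/thing");
            XPathExpression nameExpr = xpath.compile("name");
            XPathExpression propertyNameExpr = xpath.compile("property/name");

            List<Thing> result = new ArrayList<>();
            NodeList things = (NodeList) thingExpr.evaluate(dom, XPathConstants.NODESET);
            for (int i = 0; i < things.getLength(); i++) {
                Node thingNode = things.item(i);

                // "name" matches only direct children of the thing node.
                Thing thing = new Thing(nameExpr.evaluate(thingNode));

                // "property/name" matches the names one level further down.
                NodeList names = (NodeList) propertyNameExpr.evaluate(thingNode, XPathConstants.NODESET);
                for (int j = 0; j < names.getLength(); j++) {
                    thing.propertyNames.add(names.item(j).getTextContent());
                }
                result.add(thing);
            }
            return result;
        } catch (Exception e) {
            // Sketch only: wrap parser and XPath failures in an unchecked exception.
            throw new IllegalStateException("Could not parse things", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><thing><name>Thing1</name>"
                + "<property><name>Property1</name></property></thing></root>";
        for (Thing thing : parse(xml)) {
            System.out.println(thing.name + " " + thing.propertyNames);
        }
    }
}
```

Because the property names are looked up relative to the same node as the thing name, each property stays attached to the thing it belongs to.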

0

How about "//thing/name/text()" ?

The double slashes you have now before name mean "anywhere in the tree, not necessarily direct child nodes".
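
The difference can be checked by evaluating a few relative expressions against a single thing node. This is a small sketch on a made-up document shaped like the one in the question; the `RelativeNameDemo` class and its `countMatches` helper are hypothetical names used only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RelativeNameDemo {

    // Illustrative sample: one thing with a name and two named properties.
    static final String XML =
            "<root><thing><name>Thing1</name>"
            + "<property><name>Property1</name></property>"
            + "<property><name>Property2</name></property>"
            + "</thing></root>";

    /** Counts the nodes an expression selects relative to the first thing element. */
    static int countMatches(String expression) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node thing = (Node) xpath.evaluate("/root/thing", dom, XPathConstants.NODE);
            NodeList matches = (NodeList) xpath.evaluate(expression, thing, XPathConstants.NODESET);
            return matches.getLength();
        } catch (Exception e) {
            // Sketch only: wrap parser and XPath failures in an unchecked exception.
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countMatches(".//name"));       // all name descendants of thing
        System.out.println(countMatches("name"));          // only the direct name child
        System.out.println(countMatches("property/name")); // only the property names
    }
}
```

On this sample, `.//name` selects three nodes, `name` selects one, and `property/name` selects two, so dropping the `//` while still evaluating against the thing node keeps the thing name separate from the property names.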

1 Comment

That does get me all of the names, but if I later do the same thing with the Property names, I don't know how to match them up :/ . I do use the ".//" later, because I've been told that's for a relative xpath.
0

Use these XPath expressions:

//thing[name='Thing1']

This selects every thing element in the XML document that has a name child whose string value is "Thing1".

Also use:

//property[name='Property1']

This selects every property element in the XML document that has a name child whose string value is "Property1".

Update:

To get all text nodes, each containing a string value of a thing element, just do:

//thing/text()

In XPath 2.0 one can get a sequence of the strings themselves, using:

//thing/string(.)

This isn't possible with a single XPath 1.0 expression, but one can get the string value of a particular (the n-th) thing element like this:

string((//thing)[$n])

where $n must be substituted with a specific number from 1 to count(//thing).

So, in your programming language, you can first determine cnt by evaluating this XPath expression:

count(//thing)

Then, in a loop with $n running from 1 to cnt, dynamically produce the XPath expression and evaluate it:

string((//thing)[$n])

Exactly the same goes for obtaining all the values for property elements.
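
In Java, the count-then-loop approach might look like the sketch below. It is an adaptation, not the exact expression above: `/name` is appended so that each step yields only the thing's own name, since the string value of a whole thing element also includes the text of its properties. The `CountLoopDemo` class and `thingNames` method are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CountLoopDemo {

    /** Returns the name of every thing element, fetched one index at a time. */
    static List<String> thingNames(String xml) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // count(...) is an XPath number, which the Java API returns as a Double.
            Double count = (Double) xpath.evaluate("count(//thing)", dom, XPathConstants.NUMBER);
            int cnt = count.intValue();

            List<String> names = new ArrayList<>();
            for (int n = 1; n <= cnt; n++) { // XPath positions start at 1
                // Substitute n into the expression text, as described above.
                names.add(xpath.evaluate("string((//thing)[" + n + "]/name)", dom));
            }
            return names;
        } catch (Exception e) {
            // Sketch only: wrap parser and XPath failures in an unchecked exception.
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><thing><name>Thing1</name></thing>"
                + "<thing><name>Thing2</name></thing></root>";
        System.out.println(thingNames(xml));
    }
}
```

Selecting the thing nodes once as a node set and evaluating relative expressions against each node avoids rebuilding the expression string on every iteration, so this loop is mainly useful when only string results are needed.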

1 Comment

That's assuming I already know the name of the Things, which I don't... that's what I'm looking for :)
