How to extract details from XML files using Java?

Question

I have the following type of XML file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE eSummaryResult PUBLIC "-//NLM//DTD eSummaryResult, 29 October 2004//EN" "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eSummary_041029.dtd">
<eSummaryResult>
<DocSum>
    <Id>224589801</Id>
    <Item Name="Caption" Type="String">NC_000010</Item>
    <Item Name="Title" Type="String">Homo sapiens chromosome 10, GRCh37.p10 Primary Assembly</Item>
    <Item Name="Extra" Type="String">gi|224589801|gnl|ASM:GCF_000001305|10|ref|NC_000010.10||gpp|GPC_000000034.1||gnl|NCBI_GENOMES|10[224589801]</Item>
    <Item Name="Gi" Type="Integer">224589801</Item>
    <Item Name="CreateDate" Type="String">2002/08/29</Item>
    <Item Name="UpdateDate" Type="String">2012/10/30</Item>
    <Item Name="Flags" Type="Integer">544</Item>
    <Item Name="TaxId" Type="Integer">9606</Item>
    <Item Name="Length" Type="Integer">135534747</Item>
    <Item Name="Status" Type="String">live</Item>
    <Item Name="ReplacedBy" Type="String"/>
    <Item Name="Comment" Type="String"><![CDATA[  ]]></Item>
</DocSum>

</eSummaryResult>

How can I extract the details from the Item nodes based on the value of their Name attribute? Also, is it better to use the standard Java DOM XML API or another XML parser library for this purpose?

Evgeniy Dorofeev · Accepted Answer · 2012-12-07 13:05:39Z

1

I suggest StAX (javax.xml.stream.*). Try this:

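    XMLInputFactory f = XMLInputFactory.newInstance();
    // Pass a byte stream so the parser honors the encoding declared in the XML prolog
    XMLStreamReader rdr = f.createXMLStreamReader(new FileInputStream("test.xml"));
    while (rdr.hasNext()) {
        if (rdr.next() == XMLStreamConstants.START_ELEMENT) {
            if (rdr.getLocalName().equals("Item")) {
                // null means the attribute namespace is not checked
                System.out.println(rdr.getAttributeValue(null, "Name"));
                // getElementText() returns the text and moves to the matching end tag
                System.out.println(rdr.getElementText());
            }
        }
    }
    rdr.close();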

StAX should be the first thing to consider: it is a streaming pull parser, so it does not need to load the whole document into memory. See http://en.wikipedia.org/wiki/StAX for details.
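
As a minimal sketch of how the loop above could answer the "lookup by name" part of the question, the StAX reader can collect every Item into a map keyed by its Name attribute. The class name StaxItemReader and the method readItems are hypothetical names chosen for illustration, and the XML is a shortened inline copy of the sample file.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class; the names are illustrative only.
class StaxItemReader {

    /** Maps the Name attribute of every Item element to its text content. */
    static Map<String, String> readItems(Reader source) throws XMLStreamException {
        Map<String, String> items = new LinkedHashMap<>();
        XMLStreamReader rdr = XMLInputFactory.newInstance().createXMLStreamReader(source);
        try {
            while (rdr.hasNext()) {
                if (rdr.next() == XMLStreamConstants.START_ELEMENT
                        && "Item".equals(rdr.getLocalName())) {
                    // Read the attribute first: getElementText() advances the reader
                    String name = rdr.getAttributeValue(null, "Name");
                    items.put(name, rdr.getElementText());
                }
            }
        } finally {
            rdr.close(); // XMLStreamReader is not AutoCloseable, so close it explicitly
        }
        return items;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<eSummaryResult><DocSum><Id>224589801</Id>"
                + "<Item Name=\"Caption\" Type=\"String\">NC_000010</Item>"
                + "<Item Name=\"TaxId\" Type=\"Integer\">9606</Item>"
                + "<Item Name=\"ReplacedBy\" Type=\"String\"/>"
                + "</DocSum></eSummaryResult>";
        Map<String, String> items = readItems(new StringReader(xml));
        System.out.println(items.get("Caption")); // NC_000010
        System.out.println(items.get("TaxId"));   // 9606
    }
}
```

Empty elements such as ReplacedBy end up in the map with an empty string as their value.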

DDK · Accepted Answer · 2012-12-07 12:41:24Z

1

Try the code below:

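/* Create a Document object (doc) from the XML first */
NodeList list = doc.getElementsByTagName("Item");

for (int i = 0; i < list.getLength(); i++) {
    Node node = list.item(i);
    NamedNodeMap namedNodeMap = node.getAttributes();
    // getNamedItem returns null if the element has no Name attribute
    Node nameAttr = namedNodeMap.getNamedItem("Name");
    // Print the text of the Item whose Name attribute is "Caption"
    if (nameAttr != null && nameAttr.getTextContent().equalsIgnoreCase("Caption")) {
        System.out.println(node.getTextContent());
    }
}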

The output should be NC_000010
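
For completeness, here is one way the "create a Document" step and the loop could fit together. This is only a sketch: DomItemLookup, parse and findItem are hypothetical names, and the XML is a shortened inline string rather than the real file. The sample file declares an external DTD, which a DOM parser may try to fetch while parsing; the inline string leaves the DOCTYPE out to sidestep that.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
class DomItemLookup {

    /** Parses an XML string into a DOM Document. */
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Returns the text of the first Item whose Name attribute equals name, or null. */
    static String findItem(Document doc, String name) {
        NodeList list = doc.getElementsByTagName("Item");
        for (int i = 0; i < list.getLength(); i++) {
            // getElementsByTagName only returns elements, so the cast is safe
            Element item = (Element) list.item(i);
            // getAttribute returns "" when the attribute is missing, never null
            if (name.equals(item.getAttribute("Name"))) {
                return item.getTextContent();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<eSummaryResult><DocSum>"
                + "<Item Name=\"Caption\" Type=\"String\">NC_000010</Item>"
                + "<Item Name=\"Status\" Type=\"String\">live</Item>"
                + "</DocSum></eSummaryResult>";
        Document doc = parse(xml);
        System.out.println(findItem(doc, "Caption")); // NC_000010
    }
}
```

Unlike the snippet above, this sketch compares names case-sensitively, which matches how XML attribute values are normally treated.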

rmuller · Accepted Answer · 2012-12-07 12:58:49Z

If you only want to use standard Java, XPath is the way to go:

private URL xml = getClass().getResource("/example.xml");

@Test
public void testExamples() throws Exception {
    //assertEquals("NC_000010", extractUsingDom("Caption"));
    assertEquals("NC_000010", extractUsingXPath("Caption"));
}

public String extractUsingXPath(final String name) throws XPathExpressionException, IOException {
    // XPathFactory class is not thread-safe so we do not store it
    XPath xpath = XPathFactory.newInstance().newXPath();
    // Close the stream once the expression has been evaluated
    try (InputStream in = xml.openStream()) {
        return xpath.evaluate(
            String.format("/eSummaryResult/DocSum/Item[@Name='%s']", name), // xpath expression
            new InputSource(in));                                           // the XML Document
    }
}
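
One caveat with building the expression through String.format is that a name containing a single quote would break the XPath string literal. A possible alternative, sketched below, is to bind the name as an XPath variable through the standard XPathVariableResolver hook. The class name XPathItemLookup, the method extract and the variable name $name are assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
class XPathItemLookup {

    /** Returns the text of the Item with the given Name, or "" if there is none. */
    static String extract(String xml, String name) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Resolve the $name variable used in the expression below
        xpath.setXPathVariableResolver(
                qname -> "name".equals(qname.getLocalPart()) ? name : null);
        return xpath.evaluate(
                "/eSummaryResult/DocSum/Item[@Name=$name]",
                new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws XPathExpressionException {
        String xml = "<eSummaryResult><DocSum>"
                + "<Item Name=\"Caption\" Type=\"String\">NC_000010</Item>"
                + "<Item Name=\"Length\" Type=\"Integer\">135534747</Item>"
                + "</DocSum></eSummaryResult>";
        System.out.println(extract(xml, "Caption")); // NC_000010
    }
}
```

Because the value is passed as a variable rather than spliced into the expression text, it never needs quoting or escaping.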

mgaert · Accepted Answer · 2012-12-07 12:45:57Z

0

Maybe use XPath?

Document dom = ...;
XPath xpath = XPathFactory.newInstance().newXPath();
String result = xpath.evaluate("/eSummaryResult/DocSum/Item[@Name='Title']", dom);
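
If several values are needed from the same already-parsed Document, the expression can return a node set instead of a single string. The sketch below assumes that use case; XPathOnDocument and allItems are hypothetical names, and the Document is built from a shortened inline string.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
class XPathOnDocument {

    /** Collects all Item elements of the document into a Name -> text map. */
    static Map<String, String> allItems(Document dom) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // NODESET makes evaluate return every matching node as a NodeList
        NodeList nodes = (NodeList) xpath.evaluate(
                "/eSummaryResult/DocSum/Item", dom, XPathConstants.NODESET);
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element item = (Element) nodes.item(i); // the path selects elements only
            result.put(item.getAttribute("Name"), item.getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<eSummaryResult><DocSum>"
                + "<Item Name=\"Title\" Type=\"String\">Homo sapiens chromosome 10</Item>"
                + "<Item Name=\"TaxId\" Type=\"Integer\">9606</Item>"
                + "</DocSum></eSummaryResult>";
        Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        System.out.println(allItems(dom).get("TaxId")); // 9606
    }
}
```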
