
I am trying to walk through an XML document generically, that is, knowing nothing about it except how many levels it has:

<nodelevel1>
    <nodelevel2 attribute="xyz">
    </nodelevel2>
</nodelevel1>

I took the XML document below and extracted all the information in it generically (so no XPath, no .getElementsByTagName("carname").item(0).getTextContent(), etc.). I am doing this to better understand working with XML, not to arrive at a perfect solution; I'm aware that there are simpler and better approaches. This is for learning purposes only.

I was able to get everything out generically except for the attributes (company="Ferrari", company="Lamborgini", etc.). For those I had to name the attribute explicitly: "Company: " + eElement.getAttribute("company").

So how can I get the attributes of the nodes (here the companies) without specifying their names?

sportscars.xml

     <?xml version="1.0"?>
     <cars>
        <supercars company="Ferrari">
           <carname type="formula one">Ferarri 101</carname>
           <carname type="sports car">Ferarri 201</carname>
           <carname type="sports car">Ferarri 301</carname>
        </supercars>
        <supercars company="Lamborgini">
           <carname>Lamborgini 001</carname>
           <carname>Lamborgini 002</carname>
           <carname>Lamborgini 003</carname>
        </supercars>
        <luxurycars company="Benteley">
           <carname>Benteley 1</carname>
           <carname>Benteley 2</carname>
           <carname>Benteley 3</carname>
        </luxurycars>
     </cars>

My Java class QueryXmlFileDemo.java:

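    import java.io.File;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;

    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    public class QueryXmlFileDemo {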

        public static void main(String[] args) {
            try {
                File inputFile = new File("sportscars.xml");
                DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
                DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
                Document inputDocument = dBuilder.parse(inputFile);
                inputDocument.getDocumentElement().normalize();
                // getDocumentElement() always returns the root element, even if a comment precedes it
                Node carsNode = inputDocument.getDocumentElement();
                NodeList carsNodeList = carsNode.getChildNodes();
                for (int i = 0; i < carsNodeList.getLength(); i++) {
                    Node carTypes = carsNodeList.item(i);

                    // Skip the whitespace-only #text nodes between elements
                    if (carTypes.getNodeType() != Node.ELEMENT_NODE) {
                        continue;
                    }
                    Element eElement = (Element) carTypes;
                    // Line I want to do generically without specifying the attribute's name
                    System.out.println("Company: " + eElement.getAttribute("company"));
                    System.out.println("CarType: " + carTypes.getNodeName());
                    NodeList carNamesList = carTypes.getChildNodes();
                    for (int j = 0; j < carNamesList.getLength(); j++) {
                        Node carNameNode = carNamesList.item(j);
                        if (Node.ELEMENT_NODE != carNameNode.getNodeType()) {
                            continue;
                        }
                        System.out.println("Car: " + carNameNode.getTextContent());
                    }
                    System.out.println("");
                }
            } catch (Exception e) {
                // Report the failure instead of silently swallowing it
                e.printStackTrace();
            }
        }
    }

Output:

Company: Ferrari
CarType: supercars
Car: Ferarri 101
Car: Ferarri 201
Car: Ferarri 301

Company: Lamborgini
CarType: supercars
Car: Lamborgini 001
Car: Lamborgini 002
Car: Lamborgini 003

Company: Benteley
CarType: luxurycars
Car: Benteley 1
Car: Benteley 2
Car: Benteley 3
  • Use JAXP, preferably the StAX API. It lets you "walk" the XML tree and "pull" the next node without knowing anything about it. It is also rather fast and can process arbitrarily large XML files, because it doesn't read the whole document into memory. Commented Dec 18, 2015 at 10:52
  • Will look into it, but another question: is JAXP an XML parser itself, or is it a structure to "embed" a DOM/SAX/StAX parser? It seems I can parse, edit, ... XML documents with it just like with the DOM parser, but the terminology confuses me. Commented Dec 18, 2015 at 11:42
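
To make the StAX suggestion in the first comment concrete, here is a minimal sketch of a pull loop that reports every attribute without knowing element or attribute names in advance. It is not taken from the commenter; the class name `StaxAttributeLister` and the method `listAttributes` are made up for illustration, and the XML is read from a string so the example runs without a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, named for illustration only.
class StaxAttributeLister {

    // Returns one entry per attribute in the form element/@name=value.
    static List<String> listAttributes(String xml) {
        List<String> found = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            // Pull events one at a time; only start tags can carry attributes.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    for (int i = 0; i < reader.getAttributeCount(); i++) {
                        found.add(reader.getLocalName() + "/@"
                                + reader.getAttributeLocalName(i) + "="
                                + reader.getAttributeValue(i));
                    }
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read XML", e);
        }
        return found;
    }

    public static void main(String[] args) {
        String xml = "<cars><supercars company=\"Ferrari\">"
                + "<carname type=\"formula one\">Ferarri 101</carname>"
                + "</supercars></cars>";
        listAttributes(xml).forEach(System.out::println);
    }
}
```

For the shortened document in `main`, this prints `supercars/@company=Ferrari` followed by `carname/@type=formula one`.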

2 Answers


To iterate over all attributes of an Element:

NamedNodeMap attrs = element.getAttributes();
for (int i = 0; i < attrs.getLength(); i++) {
    Attr attr = (Attr)attrs.item(i);
    String name = attr.getName();
    String value = attr.getValue();
    // use here
}
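
Applied to the question's document, the loop above can be combined with a recursive walk so that neither element names nor attribute names are hard-coded. The following is only a sketch under a few assumptions: the class name `AttributeWalker` and the method `describe` are made up for illustration, and the XML is parsed from a string rather than from sportscars.xml so that the example is self-contained.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
class AttributeWalker {

    // Returns one line per element: indentation by depth, the element name,
    // then every attribute of that element as name="value".
    static String describe(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    private static void walk(Node node, int depth, StringBuilder out) {
        // getAttributes() returns null for non-element nodes such as #text, so skip them.
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return;
        }
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.getNodeName());

        NamedNodeMap attrs = node.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr attr = (Attr) attrs.item(i);
            out.append(' ').append(attr.getName())
               .append("=\"").append(attr.getValue()).append('"');
        }
        out.append('\n');

        // Recurse into the children; the depth of the document does not need to be known.
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, out);
        }
    }

    public static void main(String[] args) {
        String xml = "<cars><supercars company=\"Ferrari\">"
                + "<carname type=\"formula one\">Ferarri 101</carname>"
                + "</supercars></cars>";
        System.out.print(describe(xml));
    }
}
```

For the shortened document in `main`, this prints `cars`, then `  supercars company="Ferrari"`, then `    carname type="formula one"`, each on its own line. Note that a NamedNodeMap does not promise any particular order when an element has several attributes.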

2 Comments

Works fine! But how did you know that you had to use NamedNodeMap to hold the attributes of the Element? I don't know how I could have found NamedNodeMap by looking at the Java API of Element or Node.
@hamena314 By looking at the return value of the getAttributes() method of Node.

Here's a full answer using SAX (please handle exceptions etc. properly):

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxParserExample {
  public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {
    SAXParserFactory spf = SAXParserFactory.newInstance();
    spf.setNamespaceAware(true);
    SAXParser saxParser = spf.newSAXParser();
    File file = new File("c:\\temp\\sportscars.xml");
    InputStream inputStream = new FileInputStream(file);
    Reader reader = new InputStreamReader(inputStream, "UTF-8");
    InputSource is = new InputSource(reader);
    is.setEncoding("UTF-8");
    saxParser.parse(is, new DefaultHandler() {
      // Buffer for the text of the current <carname>; null while outside one
      StringBuilder carName = null;

      @Override
      public void startElement(String uri, String localName, String qName, Attributes attributes)
          throws SAXException {
        String company = attributes.getValue("company");
        if (company != null) {
          System.out.println("Company: " + company);
          System.out.println("CarType: " + localName);
        }
        if ("carname".equals(localName)) {
          carName = new StringBuilder();
        }
      }

      @Override
      public void endElement(String uri, String localName, String qName) throws SAXException {
        if ("carname".equals(localName)) {
          System.out.println("Car: " + carName);
          carName = null;
        }
      }

      @Override
      public void characters(char[] ch, int start, int length) throws SAXException {
        // SAX may deliver one text node in several chunks, so append instead of printing
        if (carName != null) {
          carName.append(ch, start, length);
        }
      }
    });
  }
}

Output:

Company: Ferrari
CarType: supercars
Car: Ferarri 101
Car: Ferarri 201
Car: Ferarri 301
Company: Lamborgini
CarType: supercars
Car: Lamborgini 001
Car: Lamborgini 002
Car: Lamborgini 003
Company: Benteley
CarType: luxurycars
Car: Benteley 1
Car: Benteley 2
Car: Benteley 3
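
The handler above still looks up the attribute by its name ("company"). If the goal is to avoid naming attributes at all, the SAX `Attributes` object can also be iterated by index. The following is a minimal sketch of that variation, not part of the answer's code: the class name `SaxAttributeLister` and the method `listAttributes` are made up for illustration, the XML comes from a string, and namespace awareness is left at its default (off), so the qualified name `qName` is used instead of `localName`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named for illustration only.
class SaxAttributeLister {

    // Returns one entry per attribute in the form element/@name=value.
    static List<String> listAttributes(String xml) {
        List<String> found = new ArrayList<>();
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                                         Attributes attributes) {
                    // Walk the attributes by index; no attribute name is hard-coded.
                    for (int i = 0; i < attributes.getLength(); i++) {
                        found.add(qName + "/@" + attributes.getQName(i)
                                + "=" + attributes.getValue(i));
                    }
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return found;
    }

    public static void main(String[] args) {
        String xml = "<cars><supercars company=\"Ferrari\">"
                + "<carname type=\"formula one\">Ferarri 101</carname>"
                + "</supercars></cars>";
        listAttributes(xml).forEach(System.out::println);
    }
}
```

For the shortened document in `main`, this prints `supercars/@company=Ferrari` followed by `carname/@type=formula one`.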
