Is it possible to get the content of an element from an XML file inside the startElement method, the overridden callback of the SAX handler?

Below is the specification.

1) XML file

<employees>
   <employee id="111">
      <firstName>Rakesh</firstName>
      <lastName>Mishra</lastName>
      <location>Bangalore</location>
   </employee>
   <employee id="112">
      <firstName>John</firstName>
      <lastName>Davis</lastName>
      <location>Chennai</location>
   </employee>
   <employee id="113">
      <firstName>Rajesh</firstName>
      <lastName>Sharma</lastName>
      <location>Pune</location>
   </employee>
</employees>

2) startElement function

@Override
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
    .......code in here..........
}

3) Expected result

element name   : employee
attribute name : id
attribute value: 111
firstName      : Rakesh
lastName       : Mishra
location       : Bangalore

element name   : employee
attribute name : id
attribute value: 112
firstName      : John
lastName       : Davis
location       : Chennai

element name   : employee
attribute name : id
attribute value: 113
firstName      : Rajesh
lastName       : Sharma
location       : Pune
  • mkyong.com/java/how-to-read-xml-file-in-java-sax-parser Commented Jun 9, 2014 at 4:15
  • @PawanAryan, thank you. I have already checked that one. If I want to write code only in the startElement function, is that possible? Commented Jun 9, 2014 at 4:18
  • You only get attributes in startElement. Any text values arrive in characters. Use startElement to detect when an element has started. Inside it you can set flags, which you then check in the characters method. Knowing which element is current inside characters, you can get its value. Remember to reset those flags in endElement. Commented Jun 9, 2014 at 4:26
  • Using startElement() together with the other callback methods is the only way to access data in XML with SAX. I don't think it's possible to write everything in startElement(). A SAX parser differs from a DOM parser because it doesn't load the complete XML into memory; it reads the XML document sequentially. startElement(): called every time the SAX parser encounters an opening tag. endElement(): called every time it encounters a closing tag. characters(): called every time it encounters character data between tags, which you handle according to the state set in startElement(). Commented Jun 9, 2014 at 4:31
  • @PawanAryan, thank you for the easy-to-understand explanation. How about this option? I want a set consisting of tagName, attName, attValue, and the tag's value. I ask because I need to use it in another thread. Commented Jun 9, 2014 at 4:41

1 Answer

You can get the element's name in startElement and endElement. Attributes are available only in startElement. Text values arrive in characters.

Here is a very basic example on how to get the value of an element using a ContentHandler:

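import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class YourHandler extends DefaultHandler {

    // True while the parser is between <firstName> and </firstName>
    boolean inFirstNameElement = false;

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes atts) throws SAXException {
        if (qName.equals("firstName")) {
            inFirstNameElement = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (qName.equals("firstName")) {
            inFirstNameElement = false;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (inFirstNameElement) {
            // do something with the characters in the <firstName> element
        }
    }
}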

If you have a simple example, setting a boolean flag for each tag is OK. In a more complex scenario, you might prefer to store the flags in a map using element names as keys, or even create one or more Employee classes mapped to your XML: instantiate one every time <employee> is found in startElement, populate its properties, and add it to a Collection in endElement.
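The map-of-flags idea might look roughly like the sketch below. The names FlagMapDemo, FlagMapHandler, and parse are made up for illustration, and the sketch assumes the watched elements are leaf elements that never nest inside each other.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class FlagMapDemo {

    // Hypothetical handler: one flag per watched element name instead of one boolean field each.
    static class FlagMapHandler extends DefaultHandler {
        private final Map<String, Boolean> inside = new HashMap<>();
        private final StringBuilder buffer = new StringBuilder();
        final List<String> values = new ArrayList<>();

        FlagMapHandler(String... watched) {
            for (String name : watched) {
                inside.put(name, false);
            }
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (inside.containsKey(qName)) {
                inside.put(qName, true);
                buffer.setLength(0); // start collecting text for this element
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Append, because the parser may deliver the text in several chunks.
            if (inside.containsValue(true)) {
                buffer.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (Boolean.TRUE.equals(inside.get(qName))) {
                values.add(qName + "=" + buffer);
                inside.put(qName, false); // reset the flag
            }
        }
    }

    // Parses an XML string and returns "name=value" for each watched element, in document order.
    static List<String> parse(String xml, String... watched) throws Exception {
        FlagMapHandler handler = new FlagMapHandler(watched);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }
}
```

Adding another tag to watch then only means passing one more name to the constructor, rather than adding a field and three if statements.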

Here is a complete ContentHandler example that works with your example file. I hope it helps you get started:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SimpleHandler extends DefaultHandler {

    class Employee {
        // Start as empty strings so text chunks can be appended in characters()
        public String firstName = "";
        public String lastName = "";
        public String location = "";
        public Map<String, String> attributes = new HashMap<>();
    }
    boolean isFirstName, isLastName, isLocation;
    Employee currentEmployee;
    List<Employee> employees = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes atts) throws SAXException {
        if(qName.equals("employee")) {
            currentEmployee = new Employee();
            for(int i = 0; i < atts.getLength(); i++) {
                currentEmployee.attributes.put(atts.getQName(i),atts.getValue(i));
            }
        }
        if(qName.equals("firstName")) { isFirstName = true; }
        if(qName.equals("lastName"))  { isLastName = true;  }
        if(qName.equals("location"))  { isLocation = true;  }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if(qName.equals("employee")) {
            employees.add(currentEmployee);
            currentEmployee = null;
        }
        if(qName.equals("firstName")) { isFirstName = false; }
        if(qName.equals("lastName"))  { isLastName = false;  }
        if(qName.equals("location"))  { isLocation = false;  }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // The parser may split one text node into several calls, so append
        if (isFirstName) {
            currentEmployee.firstName += new String(ch, start, length);
        }
        if (isLastName) {
            currentEmployee.lastName += new String(ch, start, length);
        }
        if (isLocation) {
            currentEmployee.location += new String(ch, start, length);
        }
    }

    @Override
    public void endDocument() throws SAXException {
        for(Employee e: employees) {
            System.out.println("Employee ID: " + e.attributes.get("id"));
            System.out.println("  First Name: " + e.firstName);
            System.out.println("  Last Name: " + e.lastName);
            System.out.println("  Location: " + e.location);
        }
    }
}
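
To try a handler like the ones above, you pass an instance to a SAXParser. The sketch below shows one way to wire that up; SaxRunDemo and TagLogger are hypothetical names, and TagLogger is only a small stand-in handler so that the example is self-contained. It parses from a string here, but the parser also accepts a File or an InputStream.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxRunDemo {

    // Hypothetical stand-in handler: records each start tag together with its attributes.
    static class TagLogger extends DefaultHandler {
        final List<String> seen = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            StringBuilder sb = new StringBuilder(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                sb.append(' ').append(atts.getQName(i)).append('=').append(atts.getValue(i));
            }
            seen.add(sb.toString());
        }
    }

    // Creates a parser, runs the handler over the XML string, and returns what it recorded.
    static List<String> parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        TagLogger handler = new TagLogger();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employees><employee id=\"111\">"
                + "<firstName>Rakesh</firstName></employee></employees>";
        parse(xml).forEach(System.out::println);
    }
}
```

Swapping TagLogger for SimpleHandler in parse should give the employee listing printed from endDocument.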

6 Comments

Excuse me, how about this option? Suppose we don't know the tag names in advance, but we want to get tagName, attName, attValue, and tagValue at once. Is that possible?
As shown above, in the startElement method you can read the tag name (qName) and all attributes you can read from the atts variable (atts.getQName(i) and atts.getValue(i)), but to read the tag's text value you need to use the characters method and use flags as shown above. If you run the example above you should get the result you are expecting.
What about a different XML file? Do we need to write other code? I notice that your code uses specific tag names in its conditions. If we don't know the specific names, what should we do?
SAX is intended for sequential reading of an XML file, which is necessary when you need to extract bits of information from large files. If you want to obtain all the data at once by simply using methods to extract the data you wish, you might prefer to use an object model API, such as DOM or XPath.
Please note that "SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks". Thus you need to accumulate data.
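
For the case raised in these comments, where the tag names are not known in advance and text may arrive in chunks, one possible approach is a handler that keeps a stack of open elements and appends text to the innermost one. This is only a sketch: GenericSaxDemo, Element, and GenericHandler are hypothetical names, and each Element holds just the text found directly inside it.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class GenericSaxDemo {

    // Hypothetical record of one element: its name, its attributes, and its own direct text.
    static class Element {
        final String name;
        final Map<String, String> attributes = new LinkedHashMap<>();
        final StringBuilder text = new StringBuilder();

        Element(String name) {
            this.name = name;
        }
    }

    static class GenericHandler extends DefaultHandler {
        final List<Element> finished = new ArrayList<>();       // filled in end-tag order
        private final Deque<Element> open = new ArrayDeque<>();  // currently open elements

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            Element e = new Element(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                e.attributes.put(atts.getQName(i), atts.getValue(i));
            }
            open.push(e);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one text node, so accumulate.
            if (!open.isEmpty()) {
                open.peek().text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // Name, attributes, and text are all known only at this point.
            finished.add(open.pop());
        }
    }

    static List<Element> parse(String xml) throws Exception {
        GenericHandler handler = new GenericHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.finished;
    }
}
```

The complete set of name, attributes, and value for an element exists only once endElement fires, so that is the natural place to hand a finished Element to another thread, for example through a queue.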