
When XML is passed to a SAX parser as an input stream and some of the elements are empty, the handler's characters() method is not called for those elements, and I get the wrong result.

For example,

XML input:

<root>
    <salutation>Hello Sir</salutation>
    <userName />
    <parent>
        <child>a</child>
    </parent>
    <parent>
        <child>b</child>
    </parent>
    <parent>
        <child>c</child>
    </parent>
    <success>yes</success>
    <hoursSpent />
</root>

Parser Implementation:

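import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Extends DefaultHandler so that only the callbacks used here need to be written.
public class MyContentHandler extends DefaultHandler {

    private String salutation;
    private String userName;
    private String success;
    private String hoursSpent;
    private String tmpValue = "";

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        // Copy the most recently stored text into the field for this element.
        if ("salutation".equals(qName)) {
            salutation = tmpValue;
        } else if ("userName".equals(qName)) {
            userName = tmpValue;
        } else if ("success".equals(qName)) {
            success = tmpValue;
        } else if ("hoursSpent".equals(qName)) {
            hoursSpent = tmpValue;
        }
    }

    @Override
    public void characters(char[] ch, int begin, int length) throws SAXException {
        // Store the text reported by the parser for use in endElement.
        tmpValue = new String(ch, begin, length).trim();
    }

    public String getUserName() {
        return userName;
    }

    public String getHoursSpent() {
        return hoursSpent;
    }
}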

Main Program:

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class MainProgram {

    public static void main(String[] args) throws Exception {
        SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
        SAXParser saxParser = saxParserFactory.newSAXParser();

        XMLReader xmlReader = saxParser.getXMLReader();

        MyContentHandler contentHandler = new MyContentHandler();
        xmlReader.setContentHandler(contentHandler);

        String input = "<root><salutation>Hello Sir</salutation><userName /><parent><child>a</child></parent><parent><child>b</child></parent><success>yes</success><hoursSpent /></root>";
        InputStream stream = new ByteArrayInputStream(input.getBytes());
        xmlReader.parse(new InputSource(stream));
        System.out.println(contentHandler.getUserName());   // prints "Hello Sir" instead of null
        System.out.println(contentHandler.getHoursSpent()); // prints "yes" instead of null
    }
}

If an empty XML element is written in self-closing form, as below,

<userName />

instead of <userName></userName>, then the characters() method in the handler class is not executed and the wrong value is set. This issue occurs only when I supply the input XML as an input stream. Please help me solve this issue.

  • Why not just override startElement and have that method reset tmpValue to an empty String? Commented Jun 15, 2017 at 15:46
  • Yes, it works as expected Commented Jun 16, 2017 at 8:24

1 Answer


The parser is behaving exactly as specified; it is your code that is wrong.

In general the parser makes zero-to-many calls on the characters() method between a start tag and the corresponding end tag. You need to initialize an empty buffer in startElement(), append to the buffer in characters(), and then use the accumulated value in endElement().

The way you have written it, you will not only get the wrong result for an empty element, you will also get the wrong result if the parser breaks the text up into multiple calls, which often happens if (a) there are entity references in the text, or (b) the text is very long, or (c) the text happens to span two chunks that are read from the input stream in separate read() calls.
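The approach described above might look roughly like the following. This is only a sketch under some assumptions: the class name BufferingHandler, the static parse() helper and the getters are made up for illustration, and the simple "reset in startElement" logic is only meant for leaf elements that contain text alone, like the ones in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler name; it illustrates the buffer-per-element pattern.
class BufferingHandler extends DefaultHandler {

    private final StringBuilder buffer = new StringBuilder();
    private String salutation;
    private String userName;
    private String success;
    private String hoursSpent;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Every element starts with an empty buffer, so no stale text leaks in.
        buffer.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Append instead of assign: the text may arrive in several calls.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Use whatever was accumulated; an empty element yields "".
        String value = buffer.toString().trim();
        switch (qName) {
            case "salutation": salutation = value; break;
            case "userName":   userName = value;   break;
            case "success":    success = value;    break;
            case "hoursSpent": hoursSpent = value; break;
            default: break; // other elements are ignored in this sketch
        }
        buffer.setLength(0);
    }

    public String getSalutation() { return salutation; }
    public String getUserName()   { return userName; }
    public String getSuccess()    { return success; }
    public String getHoursSpent() { return hoursSpent; }

    // Illustrative helper: parses an XML string through an input stream.
    static BufferingHandler parse(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        BufferingHandler handler = new BufferingHandler();
        reader.setContentHandler(handler);
        InputStream stream = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        reader.parse(new InputSource(stream));
        return handler;
    }

    public static void main(String[] args) throws Exception {
        BufferingHandler h = parse("<root><salutation>Hello &amp; welcome</salutation>"
                + "<userName /><success>yes</success><hoursSpent /></root>");
        System.out.println("[" + h.getSalutation() + "]");
        System.out.println("[" + h.getUserName() + "]");
        System.out.println("[" + h.getHoursSpent() + "]");
    }
}
```

With this handler an empty element produces an empty String rather than the previous element's text. If you would rather have null for empty elements, convert the empty value in endElement(). The entity reference in the sample input is there because that is one of the cases where the text can be delivered in more than one characters() call.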
