Skip the contents of an element in SAX parsing in Java

Question

I am parsing a custom XML configuration file in a Java application. I am trying to use the SAX parser, mainly because I need to report errors in the configuration with line numbers.

There are a lot of code samples online that implement a handler class, and things seem fairly straightforward for normal processing - for example, http://tutorials.jenkov.com/java-xml/sax-example.html

But in my case, sometimes I need to skip an entire tree under an element:

<sampledocument>
    <sampletag>
         <process/>
         <these/>
         <tags/>
    </sampletag>
    <sampletag skip="yes">
         <do_not/>
         <process/>
         <these/>
         <tags/>
    </sampletag>
</sampledocument>

LATER ADDITION: Moreover, I only know whether to skip at runtime. In a somewhat contrived example, I would need to open a file to process the tags under <sampletag>, and if the file is not found, not process them:

<sampledocument>
    <sampletag file="file1">
         <process/>
         <these/>
         <tags/>
         <if_file1_exists/>
    </sampletag>
    <sampletag file="file2">
         <process/>
         <these/>
         <tags/>
         <if_file2_exists/>
    </sampletag>
</sampledocument>

Of course, I can just track skipping in the handler code, but this is a bit awkward. Can I somehow tell SAX in the startElement() method to just skip the contents of this element?

Michael Kay · Accepted Answer · 2017-05-09 08:36:51Z

Write a filter class to sit on the pipeline between the SAX parser and your existing ContentHandler. You can do this by extending XMLFilterImpl. This filter should have an integer variable skipDepth, initially zero.

In startElement, if you recognize an element that you want to deep-skip, or if skipDepth > 0, then increment skipDepth.

In endElement, if skipDepth > 0, decrement skipDepth.

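In all event handlers, pass the event on down the pipeline (by calling super.xxx()) if and only if skipDepth == 0. Make that check after the increment in startElement and before the decrement in endElement, so that both the start and the end tag of the skipped element itself are suppressed.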

If you want to be smart, you can write this filter in a generic way, so it takes a parameter which is a callback function that accepts the node name and attributes and returns a boolean indicating whether to skip the element. Then you can reuse your code next time you want to skip elements, but with different skip conditions.
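The steps above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class name SkipFilter, the BiPredicate callback type, and the choice to match on qName (the parser here is not namespace-aware) are all assumptions made for the example. Only startElement, endElement, characters, ignorableWhitespace and processingInstruction are filtered; other events such as prefix mappings are left to XMLFilterImpl's default forwarding.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: suppresses an element and its whole subtree
// whenever the supplied callback returns true for that element.
class SkipFilter extends XMLFilterImpl {
    private final BiPredicate<String, Attributes> shouldSkip; // (qName, attributes) -> skip?
    private int skipDepth = 0; // 0 means "not inside a skipped subtree"

    SkipFilter(XMLReader parent, BiPredicate<String, Attributes> shouldSkip) {
        super(parent);
        this.shouldSkip = shouldSkip;
    }

    @Override
    public void startDocument() throws SAXException {
        skipDepth = 0; // reset so the filter can be reused for another parse
        super.startDocument();
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (skipDepth > 0 || shouldSkip.test(qName, atts)) {
            skipDepth++; // entering (or already inside) a skipped subtree
        } else {
            super.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (skipDepth > 0) {
            skipDepth--; // the end tag of a skipped element is suppressed too
        } else {
            super.endElement(uri, localName, qName);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (skipDepth == 0) super.characters(ch, start, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        if (skipDepth == 0) super.ignorableWhitespace(ch, start, length);
    }

    @Override
    public void processingInstruction(String target, String data) throws SAXException {
        if (skipDepth == 0) super.processingInstruction(target, data);
    }

    // Demo using the first example from the question (with the tags closed).
    public static void main(String[] args) throws Exception {
        String xml = "<sampledocument>"
                + "<sampletag><process/><these/><tags/></sampletag>"
                + "<sampletag skip=\"yes\"><do_not/><process/><these/><tags/></sampletag>"
                + "</sampledocument>";
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        SkipFilter filter = new SkipFilter(parser,
                (name, atts) -> name.equals("sampletag") && "yes".equals(atts.getValue("skip")));
        List<String> seen = new ArrayList<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                seen.add(qName); // stands in for the existing ContentHandler
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        System.out.println(seen); // prints [sampledocument, sampletag, process, these, tags]
    }
}
```

Because the callback runs at parse time, it can make a runtime decision, for instance checking whether the file named in an attribute exists. XMLFilterImpl also forwards setDocumentLocator to the downstream handler, so a ContentHandler that reports line numbers through a Locator should keep working behind the filter.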

5 Comments

Mikhail Ramendik Over a year ago

Thanks! But how is this different from simply maintaining skipDepth in the ContentHandler? In my real task, the ContentHandler must actually process the element before determining whether to skip a tree, so if I have a separate filter, the ContentHandler will have to trigger the skipping anyway.

Michael Kay Over a year ago

SAX code is always best written as a pipeline, one step in the pipeline for each separable task. Otherwise you quickly end up with spaghetti code in your ContentHandler (you already said it was "a bit awkward"). With a properly constructed pipeline you end up with maintainable, reusable code that is easy to modify and debug; if you put everything in the ContentHandler you end up with an unmaintainable mess. Of course, if your example differs from the real task then I can't advise you how to break up the functionality in the real task.

Mikhail Ramendik Over a year ago

I have modified the example to test for files at runtime. The real code verifies the correctness of the configuration. Explaining how it verifies it would make a very long question, but it is a call to a separate class, so it is somewhat similar to checking for a file.

Michael Kay Over a year ago

Changing the question in such a way as to invalidate existing answers is just thoroughly unfriendly.

Mikhail Ramendik Over a year ago

Sorry - I did get thoroughly confused. I have restored the previous example and placed the new one after it, seeing as providing the new one in a comment is impossible.