
Hi, I have written generic Java code that parses an XML input file without knowing its structure and outputs the values as comma-separated values (CSV). Say I have the following in my XML document:

<Employee>
    <Name>XYZ</Name>
    <Id>123</Id>
    <Address>
        <Office_Address>office address here</Office_Address>
    </Address>
</Employee>

My Java code parses the XML above into CSV output as follows:

Employee (File 1): Name, Id
Address (File 2): Office_Address

That is, for each nested element it outputs a new CSV file whose columns correspond to that element's child nodes.

This works fine, but here is the problem. Suppose the same XML file instead looks like this:

<Employee>
    <Name>XYZ</Name>
    <Id>123</Id>
    <Address/>
</Employee>

In this case, when my generic Java code processes the file, the output is:

Employee (File 1): Name, Id, Address

So instead of two output files I get one, and File 1 sometimes has 3 columns instead of 2. This happens because the Address element is sometimes present as nested and sometimes as flat. When it is nested, the Java code creates a new CSV file for it, but when it is not nested, the code outputs just one file.

I could solve this by hard-coding the logic for this element, but I do not want to do that, because then there would be no point in having generic XML parsing code.

So my question is: is there any way to figure out that an element in XML files generated from the same source may sometimes be present as nested and sometimes as flat? Using an XSD, or any other way? I have researched this a lot but have not been able to figure anything out.

Thanks in advance. I am hoping for a solution or a few good suggestions.

  • @AndrewThompson: It's just a dummy example I made to explain the problem I am facing. I did not think about this. Thanks for pointing it out, but let me know if you have any ideas to fix the original problem. Commented May 6, 2013 at 16:32
  • You mention "XSD". Do you have an XSD for the XML? If so, then yes, you can solve the problem. If not, you will have a tough time solving this in the general sense. Commented May 6, 2013 at 16:56
  • Can you tell me how I could solve this problem if I had the XSD for the XML file? Are you suggesting that I read the complete XML file once and somehow get access to its structure in my code? But then my generic parsing code won't be generic, because whenever I process a new XML file I would need to change the code. Commented May 6, 2013 at 18:09
  • I explained my comment in my answer below. Commented May 6, 2013 at 18:56
  • XSD is a well-documented specification... Commented May 6, 2013 at 19:29

2 Answers

1

This happens because the Address element is sometimes present as nested and sometimes as flat.

That statement is not correct. Address is still nested under the Employee element; in the second case, it is just empty. If you can test for an "empty" element (an Address element with no children) in your generic code, then this issue can be solved.
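As a rough illustration of what such an emptiness test could look like, here is a minimal sketch using the JDK's built-in DOM parser. The class name `EmptyElementCheck` and its helper methods are made up for this example and are not taken from the asker's code. It only reports whether an element is empty; by itself it cannot tell whether an empty element would have had children or plain text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class EmptyElementCheck {

    /** Parses an XML string and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** True if the element has at least one child element. */
    static boolean hasChildElements(Element e) {
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return true;
            }
        }
        return false;
    }

    /** True if the element has no child elements and no non-whitespace text. */
    static boolean isEmpty(Element e) {
        return !hasChildElements(e) && e.getTextContent().trim().isEmpty();
    }

    public static void main(String[] args) {
        Element employee = parseRoot(
                "<Employee><Name>XYZ</Name><Id>123</Id><Address/></Employee>");
        // Report the emptiness of each child element of Employee.
        for (Node n = employee.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                Element child = (Element) n;
                System.out.println(child.getTagName() + " empty=" + isEmpty(child));
            }
        }
    }
}
```

With the second example document from the question, only Address is reported as empty.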


3 Comments

In the XML file there are various other elements that are empty most of the time or, when not empty, have a plain text value (unlike the Address element, which has its own child element when not empty). I agree it is still nested under Employee, but when Address is empty it has no child of its own, so in that sense it is not nested.
Also, if I test for an empty element (one that might have children versus one that might have a text value when not empty), how will I figure out whether this empty element should go in a new file or in the same file as its parent? Let me know if you understand what I am saying.
@user1188611 Post your code with an example. A JUnit test is even better.
1

If you have an XSD, then you could parse the XSD file and determine which elements support nested elements.
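Since an XSD is itself an XML document, one simple way to try this is to read it with a namespace-aware DOM parser and collect every `xs:element` declaration that contains further `xs:element` declarations. The sketch below makes several assumptions: the schema shown is a hypothetical one written to match the Employee example, the class name `XsdContainerElements` is invented, and only inline (anonymous) complex types are handled. Named types referenced through a `type` attribute, `ref` attributes, and imported schemas would need extra work.

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class XsdContainerElements {

    static final String XS = "http://www.w3.org/2001/XMLSchema";

    // Hypothetical schema for the Employee example; not provided by the asker.
    static final String SAMPLE_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='Employee'><xs:complexType><xs:sequence>"
        + "<xs:element name='Name' type='xs:string'/>"
        + "<xs:element name='Id' type='xs:string'/>"
        + "<xs:element name='Address' minOccurs='0'><xs:complexType><xs:sequence>"
        + "<xs:element name='Office_Address' type='xs:string' minOccurs='0'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    /** Returns the names of declared elements whose inline type allows child elements. */
    static Set<String> containerElements(String xsd) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getElementsByTagNameNS
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xsd)));
            Set<String> containers = new TreeSet<>();
            NodeList declarations = doc.getElementsByTagNameNS(XS, "element");
            for (int i = 0; i < declarations.getLength(); i++) {
                Element declaration = (Element) declarations.item(i);
                // A declaration with nested xs:element descendants can hold child elements.
                boolean allowsChildren =
                        declaration.getElementsByTagNameNS(XS, "element").getLength() > 0;
                if (declaration.hasAttribute("name") && allowsChildren) {
                    containers.add(declaration.getAttribute("name"));
                }
            }
            return containers;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XSD", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(containerElements(SAMPLE_XSD)); // prints [Address, Employee]
    }
}
```

With this set available, the generic code can send Address to its own file even when a particular document contains only `<Address/>`, because the decision comes from the schema rather than from the instance.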

If you don't have an XSD, then you basically have to parse the entire XML file once to determine all the possible nesting (i.e., you are basically inspecting the XML file to build your own XSD), then parse it again to output the final result based on the knowledge you gained from the first pass.
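A minimal sketch of that two-pass idea, assuming the whole document fits in memory as a DOM tree: the first pass records every tag name that has child elements anywhere in the file, and the second pass uses that set to decide which children are columns of the parent record. The class name `TwoPassStructure`, the `Employees` wrapper element, and the second employee record are invented for the example. Keying on the tag name alone is a simplification; if the same name is used in different contexts, keying on the full element path would be safer.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class TwoPassStructure {

    // Hypothetical input: one record with a nested Address, one with an empty Address.
    static final String SAMPLE =
          "<Employees>"
        + "<Employee><Name>XYZ</Name><Id>123</Id>"
        + "<Address><Office_Address>office address here</Office_Address></Address></Employee>"
        + "<Employee><Name>ABC</Name><Id>456</Id><Address/></Employee>"
        + "</Employees>";

    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    static List<Element> childElements(Element e) {
        List<Element> children = new ArrayList<>();
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                children.add((Element) n);
            }
        }
        return children;
    }

    /** Pass 1: collect every tag name that has child elements somewhere in the document. */
    static void collectContainers(Element e, Set<String> containers) {
        List<Element> children = childElements(e);
        if (!children.isEmpty()) {
            containers.add(e.getTagName());
        }
        for (Element child : children) {
            collectContainers(child, containers);
        }
    }

    /** Pass 2: for each record under the root, list the children that are plain columns. */
    static List<List<String>> columnsPerRecord(String xml) {
        Element root = parse(xml);
        Set<String> containers = new TreeSet<>();
        collectContainers(root, containers);
        List<List<String>> result = new ArrayList<>();
        for (Element record : childElements(root)) {
            List<String> columns = new ArrayList<>();
            for (Element child : childElements(record)) {
                // Known containers go to their own file, even when empty in this record.
                if (!containers.contains(child.getTagName())) {
                    columns.add(child.getTagName());
                }
            }
            result.add(columns);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(columnsPerRecord(SAMPLE)); // prints [[Name, Id], [Name, Id]]
    }
}
```

Both records end up with the same two columns, because the nested Address in the first record tells the second pass that the empty `<Address/>` in the second record belongs in a separate file. An element that is empty in every record of a file still cannot be classified this way, which is where an XSD would be needed.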
