I am trying to create a nested XML document from a flat XML file using XSLT. However, my stylesheet only creates one nested group and ignores the rest of the records in the source XML.
My XML input looks like this:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<!-- Data -->
<table name="ecatalogue">
<!-- Row 1 -->
<tuple>
<atom name="irn">2470</atom>
<atom name="EADUnitID">da.01</atom>
<atom name="EADUnitTitle">Some title</atom>
<tuple name="AssParentObjectRef" />
</tuple>
<!-- Row 2 -->
<tuple>
<atom name="irn">5416</atom>
<atom name="EADUnitID">da.01.01</atom>
<atom name="EADUnitTitle">Child of Some title</atom>
<tuple name="AssParentObjectRef">
<atom name="EADUnitTitle">Some Title</atom>
<atom name="irn">2470</atom>
</tuple>
</tuple>
<!-- Row 3 -->
<tuple>
<atom name="irn">6</atom>
<atom name="EADUnitID">da.01.02</atom>
<atom name="EADUnitTitle">Child of Some title 2</atom>
<tuple name="AssParentObjectRef">
<atom name="EADUnitTitle">Some Title</atom>
<atom name="irn">2470</atom>
</tuple>
</tuple>
<!-- Row 4 -->
<tuple>
<atom name="irn">8</atom>
<atom name="EADUnitID">da.01.02.01</atom>
<atom name="EADUnitTitle">3rd Generation</atom>
<tuple name="AssParentObjectRef">
<atom name="EADUnitTitle">Child of Some Title 2</atom>
<atom name="irn">6</atom>
</tuple>
</tuple>
<!-- Row 5 -->
<tuple>
<atom name="irn">1130</atom>
<atom name="EADUnitID">da.02</atom>
<atom name="EADUnitTitle">Another title</atom>
<tuple name="AssParentObjectRef" />
</tuple>
<!-- Row 6 -->
<tuple>
<atom name="irn">54</atom>
<atom name="EADUnitID">da.02.01</atom>
<atom name="EADUnitTitle">Child of Another title</atom>
<tuple name="AssParentObjectRef">
<atom name="EADUnitTitle">Another Title</atom>
<atom name="irn">1130</atom>
</tuple>
</tuple>
<!-- Row 7 -->
<tuple>
<atom name="irn">16</atom>
<atom name="EADUnitID">da.02.02</atom>
<atom name="EADUnitTitle">Child of Another Title 2</atom>
<tuple name="AssParentObjectRef">
<atom name="EADUnitTitle">Another Title</atom>
<atom name="irn">1130</atom>
</tuple>
</tuple>
<!-- Row 8 -->
<tuple>
<atom name="irn">22</atom>
<atom name="EADUnitID">da.02.02.01</atom>
<atom name="EADUnitTitle">3rd Generation</atom>
<tuple name="AssParentObjectRef">
<atom name="EADUnitTitle">Child of Another Title 2</atom>
<atom name="irn">1130</atom>
</tuple>
</tuple>
</table>
The XSLT should identify each top-level record and then nest its children beneath it. For a top-level record it should duplicate its irn and EADUnitTitle as TopID and TopTitle respectively. For each child it should include the immediate parent's irn and title as ParentID and ParentTitle, as well as the TopID and TopTitle. The output should look like:
<?xml version="1.0" encoding="UTF-8"?>
<table name="ecatalogue">
<collection>
<tuple>
<atom name="irn">2470</atom>
<atom name="EADUnitID">da.01</atom>
<atom name="EADUnitTitle">Some title</atom>
<atom name="TopTitle">Some title</atom>
<atom name="TopID">2470</atom>
<tuple name="children">
<tuple>
<atom name="irn">5416</atom>
<atom name="EADUnitID">da.01.01</atom>
<atom name="EADUnitTitle">Child of Some title</atom>
<atom name="ParentTitle">Some title</atom>
<atom name="ParentID">2470</atom>
<atom name="TopTitle">Some title</atom>
<atom name="TopID">2470</atom>
</tuple>
<tuple>
<atom name="irn">6</atom>
<atom name="EADUnitID">da.01.02</atom>
<atom name="EADUnitTitle">Child of Some title 2</atom>
<atom name="ParentTitle">Some title</atom>
<atom name="ParentID">2470</atom>
<atom name="TopTitle">Some title</atom>
<atom name="TopID">2470</atom>
<tuple name="children">
<tuple>
<atom name="irn">8</atom>
<atom name="EADUnitID">da.01.02.01</atom>
<atom name="EADUnitTitle">3rd Generation</atom>
<atom name="ParentTitle">Child of Some title 2</atom>
<atom name="ParentID">6</atom>
<atom name="TopTitle">Some title</atom>
<atom name="TopID">2470</atom>
</tuple>
</tuple>
</tuple>
</tuple>
</tuple>
</collection>
<collection>
<tuple>
<atom name="irn">1130</atom>
<atom name="EADUnitID">da.02</atom>
<atom name="EADUnitTitle">Another title</atom>
<atom name="TopTitle">Another title</atom>
<atom name="TopID">1130</atom>
<tuple name="children">
<tuple>
<atom name="irn">54</atom>
<atom name="EADUnitID">da.02.01</atom>
<atom name="EADUnitTitle">Child of Another title</atom>
<atom name="ParentTitle">Another title</atom>
<atom name="ParentID">1130</atom>
<atom name="TopTitle">Another title</atom>
<atom name="TopID">1130</atom>
</tuple>
<tuple>
<atom name="irn">16</atom>
<atom name="EADUnitID">da.02.02</atom>
<atom name="EADUnitTitle">Child of Another title 2</atom>
<atom name="ParentTitle">Another title</atom>
<atom name="ParentID">1130</atom>
<atom name="TopTitle">Another title</atom>
<atom name="TopID">1130</atom>
<tuple name="children">
<tuple>
<atom name="irn">22</atom>
<atom name="EADUnitID">da.02.02.01</atom>
<atom name="EADUnitTitle">3rd Generation</atom>
<atom name="ParentTitle">Child of Another title 2</atom>
<atom name="ParentID">16</atom>
<atom name="TopTitle">Another title</atom>
<atom name="TopID">1130</atom>
</tuple>
</tuple>
</tuple>
</tuple>
</tuple>
....
</collection>
</table>
The XSLT I have is:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="child" match="tuple" use="tuple[@name='AssParentObjectRef']/atom[@name='irn']" />
<xsl:template match="/table">
<table name="ecatalogue">
<collection>
<xsl:apply-templates select="tuple[not(tuple[@name='AssParentObjectRef']/atom[@name='irn'])]"/>
</collection>
</table>
</xsl:template>
<xsl:template match="tuple">
<tuple>
<xsl:copy-of select="atom"/>
<xsl:if test="key('child', atom[@name='irn'])">
<tuple name="children">
<xsl:apply-templates select="key('child', atom[@name='irn'])"/>
</tuple>
</xsl:if>
</tuple>
</xsl:template>
</xsl:stylesheet>
Whilst this groups the records, the output contains just one of these collections: from a file of 3524 records, I get a single collection of 24 records.
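The single collection, at least, follows from the literal `<collection>` element sitting in the `/table` template, outside the per-record processing, so it is emitted exactly once. The stylesheet also never writes the TopID/TopTitle or ParentID/ParentTitle atoms. Below is a minimal sketch of a variant that opens one collection per top-level record and threads those values down as parameters. The parameter names (`topID`, `topTitle`, `parentID`, `parentTitle`) and the narrower key pattern `/table/tuple` are assumptions made for illustration, and it has not been run against the full 3524-record file.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Index record-level tuples by the irn of their immediate parent.
       (Assumption: only direct children of <table> are records.) -->
  <xsl:key name="child" match="/table/tuple"
           use="tuple[@name='AssParentObjectRef']/atom[@name='irn']"/>

  <xsl:template match="/table">
    <table name="ecatalogue">
      <!-- One <collection> per record that names no parent irn. -->
      <xsl:for-each select="tuple[not(tuple[@name='AssParentObjectRef']/atom[@name='irn'])]">
        <collection>
          <xsl:apply-templates select=".">
            <!-- A top-level record is its own "top". -->
            <xsl:with-param name="topID" select="atom[@name='irn']"/>
            <xsl:with-param name="topTitle" select="atom[@name='EADUnitTitle']"/>
          </xsl:apply-templates>
        </collection>
      </xsl:for-each>
    </table>
  </xsl:template>

  <xsl:template match="tuple">
    <!-- Hypothetical parameter names; parentID stays empty for top-level records. -->
    <xsl:param name="topID"/>
    <xsl:param name="topTitle"/>
    <xsl:param name="parentID"/>
    <xsl:param name="parentTitle"/>
    <xsl:variable name="kids" select="key('child', atom[@name='irn'])"/>
    <tuple>
      <xsl:copy-of select="atom"/>
      <!-- Parent atoms are written only when a parent was passed in. -->
      <xsl:if test="$parentID">
        <atom name="ParentTitle"><xsl:value-of select="$parentTitle"/></atom>
        <atom name="ParentID"><xsl:value-of select="$parentID"/></atom>
      </xsl:if>
      <atom name="TopTitle"><xsl:value-of select="$topTitle"/></atom>
      <atom name="TopID"><xsl:value-of select="$topID"/></atom>
      <xsl:if test="$kids">
        <tuple name="children">
          <xsl:apply-templates select="$kids">
            <!-- Top values pass through unchanged; the current record becomes the parent. -->
            <xsl:with-param name="topID" select="$topID"/>
            <xsl:with-param name="topTitle" select="$topTitle"/>
            <xsl:with-param name="parentID" select="atom[@name='irn']"/>
            <xsl:with-param name="parentTitle" select="atom[@name='EADUnitTitle']"/>
          </xsl:apply-templates>
        </tuple>
      </xsl:if>
    </tuple>
  </xsl:template>
</xsl:stylesheet>
```

The parent title here is taken from the parent record itself rather than from the copy inside AssParentObjectRef, which is why the capitalisation follows the parent's own EADUnitTitle. Nesting is driven purely by the irn inside AssParentObjectRef, so a record is placed under whatever parent irn the input gives it.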
I've experimented with the XSLT, replacing:
<xsl:template match="/table">
<table name="ecatalogue">
<collection>
<xsl:apply-templates select="tuple[not(tuple[@name='AssParentObjectRef']/atom[@name='irn'])]"/>
</collection>
</table>
</xsl:template>
With:
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
Whilst this returns all the nested structures, it also repeats every nested record at the top level, so each child becomes a collection in its own right.
Any ideas on where I'm going wrong?
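One check that may help narrow down why so few of the 3524 records come through: any record whose AssParentObjectRef names an irn that is not present in the file can never be reached, because it is neither a top-level record nor a child of any exported record. Whether that is what happens here is only a guess. The diagnostic sketch below lists such records as plain text; the key name `by-irn` is an assumption made for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical key: look up record-level tuples by their own irn. -->
  <xsl:key name="by-irn" match="/table/tuple" use="atom[@name='irn']"/>

  <xsl:template match="/table">
    <!-- Visit only records that name a parent irn. -->
    <xsl:for-each select="tuple[tuple[@name='AssParentObjectRef']/atom[@name='irn']]">
      <xsl:variable name="parent"
                    select="tuple[@name='AssParentObjectRef']/atom[@name='irn']"/>
      <!-- Report the record when no record in the file carries that irn. -->
      <xsl:if test="not(key('by-irn', $parent))">
        <xsl:value-of select="concat(atom[@name='irn'], ' -> missing parent ', $parent, '&#10;')"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

An empty result would rule this explanation out, and the cause would have to be looked for elsewhere.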
EDIT 06/06/17
When I use:
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
I get duplicates (note: the 'id' in the below example is added for illustration):
<record id='1'>
<children>
<record id='2'>
<children>
<record id='3'>
<children>
<record id='4'></record>
</children>
</record>
</children>
</record>
</children>
</record>
<record id='2'>
<children>
<record id='3'>
<children>
<record id='4'></record>
</children>
</record>
</children>
</record>
<record id='3'>
<children>
<record id='4'></record>
</children>
</record>
<record id='2'></record>
<record id='3'></record>
<record id='4'></record>
Is there any way to remove the duplicates so I'm just left with the nested records?
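The duplication appears to come from the identity template visiting every tuple directly under table. Each of those tuples is then handled by the more specific `match="tuple"` template, which prints the record together with its descendants, so a child shows up once under its parent and again at the top level. A sketch of one way to keep the identity template but start only from parentless records is to add an override for the table element, so that child records are reached solely through `key('child', ...)`:

```xml
<!-- Identity template, unchanged. -->
<xsl:template match="node()|@*">
  <xsl:copy>
    <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
</xsl:template>

<!-- Override for the root table: copy it and its attributes, but descend
     only into records that name no parent irn. -->
<xsl:template match="/table">
  <xsl:copy>
    <xsl:apply-templates select="@*"/>
    <xsl:apply-templates select="tuple[not(tuple[@name='AssParentObjectRef']/atom[@name='irn'])]"/>
  </xsl:copy>
</xsl:template>
```

This fragment is meant to sit alongside the existing key declaration and `match="tuple"` template. The pattern `/table` has a higher default priority than `node()|@*`, so it takes precedence for the root element without needing an explicit priority attribute.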
EDIT - Problem Tuples
<!-- Row 3378 -->
<tuple>
<atom name="irn">115024</atom>
<atom name="ObjectType">Archives</atom>
<atom name="EADLevelAttribute">Series</atom>
<atom name="EADUnitID">D42.PL.05</atom>
<atom name="EADUnitTitle">Correspondence and Company Administration: Box Files</atom>
<atom name="EADScopeAndContent">Box files of Port Line official company correspondence and administrative papers. These papers were collected towards historical research and include correspondence from earlier periods c.1890 although the bulk of the papers relate to the two periods 1937-1939 and 1949-1951.</atom>
<atom name="EADBiographyOrHistory"></atom>
<tuple name="AssParentObjectRef">
</tuple>
<atom name="EADArrangement">The papers in this series have been retained in the original order as stored by Port Line Ltd. The contents of each box file are listed as a typescript paper and have been listed in this catalogue. Box file titles have been listed in the title field of each item in this series.</atom>
<atom name="EADUnitDate">1890-1952</atom>
<table name="EADExtent_tab">
<tuple>
<atom name="EADExtent">7 boxes.</atom>
</tuple>
</table>
<atom name="EADAccruals"></atom>
<atom name="EADOtherFindingAid"></atom>
<atom name="EADRelatedMaterial"></atom>
<tuple name="EADAcquisitionInformationRef">
</tuple>
<atom name="EADAppraisalInformation"></atom>
<atom name="EADSeparatedMaterial"></atom>
<atom name="EADTitleProper"></atom>
<atom name="EADPublicationStatement"></atom>
<atom name="EADCustodialHistory"></atom>
<atom name="EADSource"></atom>
<atom name="EADNote"></atom>
<atom name="EADAccessRestrictions">Some items in this series are closed access.</atom>
<atom name="EADUseRestrictions"></atom>
</tuple>
<!-- Row 3379 -->
<tuple>
<atom name="irn">115025</atom>
<atom name="ObjectType">Archives</atom>
<atom name="EADLevelAttribute">Item</atom>
<atom name="EADUnitID">D42.PL.05.01</atom>
<atom name="EADUnitTitle">File: Australian Homeward Trade</atom>
<atom name="EADScopeAndContent">Various papers relating to Australian Homeward Trade and includes the following:For proof copies of the Australian Homeward Agreement see D42/PL5/6.</atom>
<atom name="EADBiographyOrHistory"></atom>
<tuple name="AssParentObjectRef">
<atom name="EADUnitTitle">Correspondence and Company Administration: Box Files</atom>
<atom name="irn">115024</atom>
</tuple>
<atom name="EADArrangement"></atom>
<atom name="EADUnitDate">1920-1936</atom>
<table name="EADExtent_tab">
<tuple>
<atom name="EADExtent">1 file.</atom>
</tuple>
</table>
<atom name="EADAccruals"></atom>
<atom name="EADOtherFindingAid"></atom>
<atom name="EADRelatedMaterial"></atom>
<tuple name="EADAcquisitionInformationRef">
</tuple>
<atom name="EADAppraisalInformation"></atom>
<atom name="EADSeparatedMaterial"></atom>
<atom name="EADTitleProper"></atom>
<atom name="EADPublicationStatement"></atom>
<atom name="EADCustodialHistory"></atom>
<atom name="EADSource"></atom>
<atom name="EADNote"></atom>
<atom name="EADAccessRestrictions"></atom>
<atom name="EADUseRestrictions"></atom>
</tuple>
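What exactly makes these tuples a problem is not stated, but they differ from the simplified sample in one relevant way: each record contains a nested `<table name="EADExtent_tab">` with its own tuple, plus further child tuples such as EADAcquisitionInformationRef. With the stylesheet as posted, `<xsl:copy-of select="atom"/>` copies only the atom children, so those nested elements are dropped from the output. A sketch of the record template that copies everything except the parent reference, on the assumption that the nested content should be kept:

```xml
<xsl:template match="tuple">
  <tuple>
    <!-- Copy every child element (atoms, nested tables, other tuples)
         except the AssParentObjectRef tuple used for grouping. -->
    <xsl:copy-of select="*[not(self::tuple[@name='AssParentObjectRef'])]"/>
    <xsl:if test="key('child', atom[@name='irn'])">
      <tuple name="children">
        <xsl:apply-templates select="key('child', atom[@name='irn'])"/>
      </tuple>
    </xsl:if>
  </tuple>
</xsl:template>
```

The nested tuples inside EADExtent_tab are copied verbatim by `xsl:copy-of` and are never passed to this template, so they do not pick up a children wrapper of their own.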
The atom @name="irn" value in the last tuple (for 3rd Generation) is erroneously listed as 1130, when it should be 16 instead. (Probably a copy-paste error?)