1

I am struggling to convert an excel xml file to another xml format.

Here is a simplified extract of the source:

<Workbook>
  <Worksheet>
    <Table>
      <Row>
        <Cell>
          <Data>Test 1</Data>
        </Cell>
        <Cell>
          <Data>Preconditions for test 1</Data>
        </Cell>
        <Cell>
          <Data>The setup for test 1</Data>
        </Cell>
        <Cell />
      </Row>
      <Row>
        <Cell />
        <Cell>
          <Data>Step 1</Data>
        </Cell>
        <Cell>
          <Data>Todo in step 1</Data>
        </Cell>
        <Cell>
          <Data>Expected result</Data>  <!--omitted if empty-->
        </Cell>
      </Row>
         .
         .
      <Row>
        <Cell />
        <Cell>
          <Data>Step n</Data>
        </Cell>
        <Cell>
          <Data>Todo in step n</Data>
        </Cell>
        <Cell>
          <Data>Expected result</Data>  <!--omitted if empty-->
        </Cell>
      </Row>
         .
         .
 -----------------
         .
         .
      <Row>
        <Cell>
          <Data>Test m</Data>
        </Cell>
        <Cell>
          <Data>Preconditions for test m</Data>
        </Cell>
        <Cell>
          <Data>The setup for test m</Data>
        </Cell>
        <Cell />
      </Row>
      <Row>
        <Cell />
        <Cell>
          <Data>Step 1</Data>
        </Cell>
        <Cell>
          <Data>Todo in step 1</Data>
        </Cell>
        <Cell />
        <Cell>
          <Data>Expected result</Data>  <!--omitted if empty-->
        </Cell>
      </Row>          
         .
         .
      <Row>
        <Cell />
        <Cell>
          <Data>Step k</Data>
        </Cell>
        <Cell>
          <Data>Todo in step k</Data>
        </Cell>
        <Cell>
          <Data>Expected result</Data> <!--omitted if empty-->
        </Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>

Column 2 (Cell[2]) can be seen as header for cols 3-4

My desired output should be in the following format:

<case>
  <title>Test 1</title>
  <precond>The setup for test 1</precond>
  <step>
    <index>1</index>
    <content>Todo in step 1</content>
    <expected>Expected result</expected> <!--omitted if empty-->
  </step>
     .
     .
  <step>
    <index>n</index>
    <content>Todo in step n</content>
    <expected>Expected result</expected> <!--omitted if empty-->
  </step>
</case>
   .
   .
.......
   .
   .
<case>
  <title>Test m</title>
  <precond>The setup for test m</precond>
  <step>
    <index>1</index>
    <content>Todo in step 1</content>
    <expected>Expected result</expected>  <!--omitted if empty-->
  </step>
    .
    .
  <step>
    <index>n</index>
    <content>Todo in step n</content>
    <expected>Expected result</expected>  <!--omitted if empty-->
  </step>
</case>

My problem is to define the xslt so that the <case> tags encompass all the rows up until the next row which has a data entry in Cell[1]. If I use <xsl:if test="Cell[1]/Data"> to find the next test, the closing </case> must be entered before </xsl:if> and thus I need to iterate the following sibling rows (test steps) within that statement. How do I accomplish that?

Here's my feeble attempt at an xslt:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="utf-8" indent="yes" />


<xsl:template match="Workbook/Worksheet/Table">
  <xsl:for-each select="Row">
    <xsl:if test="Cell[1]/Data">   
      <case>
        <title>
          <xsl:value-of select="Cell[1]/Data"/>
        </title>
        <xsl:if test="Cell[3]/Data">
          <precond>
            <xsl:value-of select="Cell[3]/Data"/>
          </precond>
        </xsl:if>

        <!-- Here I need to iterate "while not" following-sibling::Cell[1]/Data and extract the data from cells 2-4.-->

      </case>
    </xsl:if>
  </xsl:for-each>
</xsl:template>
3
  • Does it need to be XSLT 1.0 or do you have the ability to use XSLT 2.0? Commented Nov 8, 2013 at 9:10
  • Mark, I haven't really considered the differences. I am not well acquainted with either. I have made a little C# program using System.Xml.Xsl as an engine in .net. From here it seems that xslt 1.0 is necessary Commented Nov 8, 2013 at 9:50
  • Also added a XSLT 1.0 solution. Commented Nov 8, 2013 at 9:56

2 Answers 2

1

If you are stuck with XSLT 1.0, Indexing is a possible and easy way:

<xsl:key name="steps" match="Row[not(Cell[1]/Data)]" use="generate-id(preceding-sibling::Row[Cell[1]/Data][1])"/>

<xsl:template match="/">
  <result>
    <xsl:for-each select="//Row[Cell[1]/Data]">
      <case>
        <title><xsl:value-of select="Cell[1]/Data"/></title>
        <precond><xsl:value-of select="Cell[3]/Data"/></precond>
        <xsl:for-each select="key('steps', generate-id(.))">
          <step>
            <index><xsl:value-of select="Cell[2]/Data" /></index>
            <content><xsl:value-of select="Cell[3]/Data" /></content>
            <expected><xsl:value-of select="Cell[4]/Data" /></expected>
          </step>
        </xsl:for-each>
      </case>
    </xsl:for-each>     
  </result>
</xsl:template>
Sign up to request clarification or add additional context in comments.

1 Comment

Thank you. Both this and Mark's solutions work. Not caring about efficiency, I pick this solution for code size.
1

If you are able to use XSLT 2.0, you could use the following XSLT as an example:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

    <!-- Identity template to 'loop' through all input XML (nodes and attributes) -->
    <xsl:template match="@*|node()">
        <xsl:apply-templates select="@*|node()" />
    </xsl:template>

    <!-- create new root element -->
    <xsl:template match="Workbook">
        <data>
            <xsl:apply-templates select="@*|node()" />
        </data>
    </xsl:template>

    <!-- Match on Table element, to do some grouping -->
    <xsl:template match="Table">
        <!-- Group all rows together, each group starts with a Row where Cell[1] is not empty -->
        <xsl:for-each-group select="Row" group-starting-with="Row[Cell[1] != '']">
            <case>
                <title><xsl:value-of select="Cell[1]/Data" /></title>
                <precond><xsl:value-of select="Cell[3]/Data" /></precond>
                <!-- Loop through the group, but forget the first occurence, since it is the header, use before this -->
                <xsl:for-each select="current-group()[position() &gt; 1]">
                    <step>
                        <index><xsl:value-of select="Cell[2]/Data" /></index>
                        <content><xsl:value-of select="Cell[3]/Data" /></content>
                        <expected><xsl:value-of select="Cell[4]/Data" /></expected>
                    </step>
                </xsl:for-each>
            </case>
        </xsl:for-each-group>
    </xsl:template>
</xsl:stylesheet>

The following XSLT will work in XSLT 1.0 and even in XSLT 2.0:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

    <!-- Identity template to 'loop' through all input XML (nodes and attributes) -->
    <xsl:template match="@*|node()">
        <xsl:apply-templates select="@*|node()" />
    </xsl:template>

    <!-- create new root element -->
    <xsl:template match="Workbook">
        <data>
            <xsl:apply-templates select="descendant::Row[Cell[1] != '']" />
        </data>
    </xsl:template>

    <!-- Match Rows where first Cell is empty and create the <case> element with header information -->
    <xsl:template match="Row[Cell[1] != '']">
        <case>
            <title><xsl:value-of select="Cell[1]/Data" /></title>
            <precond><xsl:value-of select="Cell[3]/Data" /></precond>

            <xsl:choose>
                <xsl:when test="count(following-sibling::Row[Cell[1] != '']) = 0">
                    <!-- When there are no next Row elements with empty first Cells, we can just process the remaining Rows -->
                    <xsl:apply-templates select="following-sibling::Row[Cell[1] = '']" />
                </xsl:when>
                <xsl:otherwise>
                    <!-- There are still Rows with empty first Cells, so we only process the Rows in between -->
                    <xsl:apply-templates select="following-sibling::Row[Cell[1] != ''][1]/preceding-sibling::Row[Cell[1] = '']" />
                </xsl:otherwise>
            </xsl:choose>
        </case>
    </xsl:template>

    <!-- Process Rows with empty first Cells, which will be the step information -->
    <xsl:template match="Row[Cell[1] = '']">
        <step>
            <index><xsl:value-of select="Cell[2]/Data" /></index>
            <content><xsl:value-of select="Cell[3]/Data" /></content>
            <expected><xsl:value-of select="Cell[4]/Data" /></expected>
        </step>
    </xsl:template>
</xsl:stylesheet>

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.