I want to split an XML file by the attribute values of a particular node, creating separate XML files. Each file should start with the same top-level nodes, followed by the node with that attribute and its contents up to the end of that node.

All separate XML files then need to end with the same closing nodes.

Example XML file:

<?xml version="1.0" encoding="UTF-8"?>
<node1>
  <node2>
    <node3 attribute='1'>item</node3>
    <node3 attribute='2'>item</node3>
    <node3 attribute='3'>item</node3>
  </node2>
<node6 attribute='1'>
    <node7>item = (node3 attribute2)</node7>
    <node8>item = (node3 attribute3)</node8>
</node6>
<node6 attribute='2'>
    <node9>item = (node3 attribute1)</node9>
    <node10>item = (node3 attribute2)</node10>
</node6>
</node1>

In this example I want the attribute of node6 to be the breakpoint for creating a new XML file, resulting in 2 XML files that look like this:

Separated XML 1:

<?xml version="1.0" encoding="UTF-8"?>
<node1>
  <node2>
    <node3 attribute='1'>item</node3>
    <node3 attribute='2'>item</node3>
    <node3 attribute='3'>item</node3>
  </node2>
<node6 attribute='1'>
    <node7>item = (node3 attribute2)</node7>
    <node8>item = (node3 attribute3)</node8>
</node6>
</node1>

Separated XML 2:

<?xml version="1.0" encoding="UTF-8"?>
<node1>
  <node2>
    <node3 attribute='1'>item</node3>
    <node3 attribute='2'>item</node3>
    <node3 attribute='3'>item</node3>
  </node2>
<node6 attribute='2'>
    <node9>item = (node3 attribute1)</node9>
    <node10>item = (node3 attribute2)</node10>
</node6>
</node1>

I have looked at and worked with all of these answers, but they did not help me find the right code to do what is described above.

https://stackoverflow.com/questions/30374533/split-xml-files-newbie

How to split an xml file in vb

Splitting Xml Document according to node

Can someone help me figure out what the best way to do this is?

2
  • Are you familiar with XSLT? It can do the job for you; for example, see stackoverflow.com/questions/5578602/…. I am also a VB programmer, but I would not recommend VB or any similar programming language for this type of task (unless a really tight schedule forces you to play dirty tricks instead of producing a regular solution). I recommend you check out XSLT and use it instead of VB. It is a suitable tool for the job, so you can get the result with less effort. Commented Dec 29, 2016 at 20:49
  • Thanks for the info miroxlav. Unfortunately I am not familiar with XSLT at all. My goal is to write a Windows Forms program that lets another user split their XML files. Is it possible to run an XSLT program inside a VB Windows Form? Commented Dec 29, 2016 at 21:03
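
On the question of running XSLT from a Windows Forms program: the .NET class `XslCompiledTransform` can apply a stylesheet from code, and since it is part of the framework it can be called from VB as well as C#. The following is a minimal sketch in C#, not a tested solution for the real files. The class name `XsltSplitSketch`, the method `KeepOnly`, the parameter name `keep`, and the stylesheet itself are assumptions made for illustration. The stylesheet copies everything and drops each `node6` whose `attribute` differs from the value passed in.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

static class XsltSplitSketch
{
    // Hypothetical stylesheet: an identity transform plus a rule that keeps a
    // node6 element only when its attribute equals the $keep parameter.
    const string Stylesheet = @"<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:param name='keep'/>
  <xsl:template match='@*|node()'>
    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>
  </xsl:template>
  <xsl:template match='node6'>
    <xsl:if test='@attribute = $keep'>
      <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>";

    // Returns the input document reduced to the node6 element(s) whose
    // attribute has the given value; all other content is copied unchanged.
    public static string KeepOnly(string inputXml, string attributeValue)
    {
        var xslt = new XslCompiledTransform();
        using (var sheet = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(sheet);
        }

        // Pass the attribute value to the stylesheet as the global parameter "keep".
        var args = new XsltArgumentList();
        args.AddParam("keep", "", attributeValue);

        var output = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(inputXml)))
        {
            xslt.Transform(input, args, output);
        }
        return output.ToString();
    }

    static void Main()
    {
        const string xml =
            "<node1><node2/>" +
            "<node6 attribute='1'><node7/></node6>" +
            "<node6 attribute='2'><node9/></node6>" +
            "</node1>";
        Console.WriteLine(KeepOnly(xml, "2"));
    }
}
```

Calling `KeepOnly` once per attribute value would give one result per output file. Because the result is produced through a `StringWriter`, its XML declaration reports utf-16; when writing files directly, passing an `XmlWriter` created for the target path to `Transform` is a common alternative.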

2 Answers

1

I am aware you asked specifically for a VB solution, but here's a C# one you may be able to adapt.

using System;
using System.Windows.Forms;
using System.Xml.Linq;
using System.IO;

namespace SplitXmlFile_41385730
{
    public partial class Form1 : Form
    {
        public static string incomingXML = @"M:\StackOverflowQuestionsAndAnswers\SplitXmlFile_41385730\SplitXmlFile_41385730\Samples\data.xml";
        public static string outgoingXML = @"M:\StackOverflowQuestionsAndAnswers\SplitXmlFile_41385730\SplitXmlFile_41385730\Samples\data_out.xml";
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            // Load the incoming XML and work on a copy of its root element
            XElement theincomingDoc = new XElement(XDocument.Load(incomingXML).Root);

            // Store the header of your files: a copy of the root without the node6
            // elements, since those need to be parked in their own files
            XElement header = new XElement(theincomingDoc);
            header.Elements("node6").Remove();
            int fileCounter = 0; // numbers the output files

            // Loop through the different nodes you're interested in
            foreach (XElement item in theincomingDoc.Elements("node6"))
            {
                fileCounter++;

                // Come up with a file name that suits your needs,
                // here data_out1.xml, data_out2.xml, ... next to outgoingXML
                string outfilename = Path.Combine(
                    Path.GetDirectoryName(outgoingXML),
                    Path.GetFileNameWithoutExtension(outgoingXML) + fileCounter + Path.GetExtension(outgoingXML));

                // Create a new document and start it with a copy of the stored header
                XDocument newoutfile = new XDocument(new XElement(header));

                // Add the node you need separated; item still belongs to
                // theincomingDoc, so Add inserts a copy of it
                newoutfile.Root.Add(item);
                newoutfile.Save(outfilename, SaveOptions.None);
            }

        }
    }
}

Input file is this:

<?xml version="1.0"?>
<node1>
  <node2>
    <node3 attribute="1">item</node3>
    <node3 attribute="2">item</node3>
    <node3 attribute="3">item</node3>
  </node2>
<node6 attribute="1">
    <node7>item = (node3 attribute2)</node7>
    <node8>item = (node3 attribute3)</node8>
</node6>
<node6 attribute="2">
    <node9>item = (node3 attribute1)</node9>
    <node10>item = (node3 attribute2)</node10>
</node6>
</node1>

I got 2 files out that looked like this:

data_out1.xml

<?xml version="1.0" encoding="utf-8"?>
<node1>
  <node2>
    <node3 attribute="1">item</node3>
    <node3 attribute="2">item</node3>
    <node3 attribute="3">item</node3>
  </node2>
  <node6 attribute="1">
    <node7>item = (node3 attribute2)</node7>
    <node8>item = (node3 attribute3)</node8>
  </node6>
</node1>

data_out2.xml

<?xml version="1.0" encoding="utf-8"?>
<node1>
  <node2>
    <node3 attribute="1">item</node3>
    <node3 attribute="2">item</node3>
    <node3 attribute="3">item</node3>
  </node2>
  <node6 attribute="2">
    <node9>item = (node3 attribute1)</node9>
    <node10>item = (node3 attribute2)</node10>
  </node6>
</node1>
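
If several node6 elements can share the same attribute value and should end up in the same file, the loop above can be adapted to group by that value first. The following is a minimal sketch of that variation under stated assumptions: the class name `GroupedSplitSketch` and the method `SplitByAttribute` are hypothetical, and the method returns the documents in memory instead of saving them, so the caller decides on file names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class GroupedSplitSketch
{
    // Builds one document per distinct value of attributeName found on the
    // child elements named splitName. Elements without the attribute are
    // grouped under the empty string.
    public static Dictionary<string, XDocument> SplitByAttribute(
        XElement root, string splitName, string attributeName)
    {
        // Shared header: a copy of the root without the elements being split out.
        var header = new XElement(root);
        header.Elements(splitName).Remove();

        var result = new Dictionary<string, XDocument>();
        var groups = root.Elements(splitName)
                         .GroupBy(e => (string)e.Attribute(attributeName) ?? "");
        foreach (var group in groups)
        {
            var copy = new XElement(header);
            foreach (var element in group)
            {
                // element still has a parent, so Add inserts a copy of it.
                copy.Add(element);
            }
            result[group.Key] = new XDocument(copy);
        }
        return result;
    }

    static void Main()
    {
        var root = XElement.Parse(
            "<node1><node2><node3 attribute='1'>item</node3></node2>" +
            "<node6 attribute='1'><node7>a</node7></node6>" +
            "<node6 attribute='2'><node9>b</node9></node6>" +
            "<node6 attribute='1'><node8>c</node8></node6>" +
            "</node1>");

        foreach (var pair in SplitByAttribute(root, "node6", "attribute"))
        {
            int count = pair.Value.Root.Elements("node6").Count();
            Console.WriteLine(pair.Key + ": " + count + " node6 element(s)");
        }
    }
}
```

Each returned `XDocument` can then be saved with `Save`, for example using the attribute value as part of the file name.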

1 Comment

Thanks Blaze. I used your input (converted to VB) along with other feedback to create a working solution, described in my answer. I hope this will help future users with a similar challenge.
0

Thank you all for the support!

Using Bibek Gautam's question and answer as a reference, along with feedback from all of you and others, I came up with the following working code (in a separate class). It divides the example XML into separate XML files, each built from a common string plus file-specific strings. Nodes 7, 8, 9 and 10 determine which node3 attributes go into a file's string, so the other node3 entries are excluded.

The code contains some extra nodes not mentioned in the example XML.

I posted almost the complete code so others with similar objectives can use this as reference.

Code:

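Imports System.Windows.Forms
Imports System.Xml

' The method below lives in a separate class; the class declaration is omitted here.
Shared Sub CreateXML()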

    Dim xOrgXml As New XmlDocument
    Dim pSavelocation As String = "mysavelocation"
    Dim pProgressbar As ProgressBar = Form3.f3.ProgressBar1
    Dim cCommonString As String
    Dim dDocumentRootNodes As XmlNodeList
    'implemented progressbar'
    Dim mProgressBarMaximum As Integer
    Dim mFoldername As String

    Try
        'Public class containing shared location of source XML'
        xOrgXml.Load(ClsSharedProperties.filePath)
        cCommonString = "<?xml version=""1.0""?>" & "<Node1>"
        dDocumentRootNodes = xOrgXml.GetElementsByTagName("Node1")
        mProgressBarMaximum = xOrgXml.GetElementsByTagName("Node6").Count + xOrgXml.GetElementsByTagName("Node3").Count

        pProgressbar.Minimum = 0
        pProgressbar.Maximum = mProgressBarMaximum
        pProgressbar.Value = 0
        pProgressbar.Visible = True

        '==================================================================================================================='
        'Building Common String'   
        '==================================================================================================================='

        For Each Node1Node As XmlNode In dDocumentRootNodes
            Dim Node1ChildNodes As XmlNodeList = Node1Node.ChildNodes

            For Each Node1Childnode As XmlNode In Node1ChildNodes
                ' Copy the nodes that every output file shares
                If Node1Childnode.Name = "node4" OrElse
                   Node1Childnode.Name = "node5" OrElse
                   Node1Childnode.Name = "node12" Then
                    cCommonString = cCommonString & Node1Childnode.OuterXml
                End If
            Next
        Next

        Dim mXMLDocSave As XmlDocument
        Dim mFileName As String
        Dim fFullString As String

        mXMLDocSave = New XmlDocument()

        '=============================================================='
        'Creating Directories and files For xml Getting Name and attribute value from Node6-NODE1node'
        '==============================================================='

        For Each Node1Node As XmlNode In dDocumentRootNodes
            Dim Node1ChildNodes As XmlNodeList = Node1Node.ChildNodes
            For Each NODE1node As XmlNode In Node1ChildNodes
                
                If NODE1node.Name = "Node6" Then

                    Dim Node6Attribute As XmlAttributeCollection = NODE1node.Attributes
                    If Node6Attribute.GetNamedItem("attribute").Value = "1" Then
                        pProgressbar.Increment(1)
                        Dim cCommonStringNode6_1 As String = cCommonString

                        Dim i As Integer
                        Dim s As String

                        For i = 0 To (Form3.f3.CheckedListBox1.Items.Count - 1)
                            If Form3.f3.CheckedListBox1.GetItemChecked(i) = True Then
                                s = Form3.f3.CheckedListBox1.Items(i).ToString
                                If s = "EXAMPLE A" Then
                                    mFoldername = "EXAMPLE A"
                                    If (Not IO.Directory.Exists(pSavelocation & "\" & mFoldername)) Then
                                        IO.Directory.CreateDirectory(pSavelocation & "\" & mFoldername)
                                    End If
                                ElseIf s = "EXAMPLE B" Then
                                    mFoldername = "EXAMPLE B"
                                    If (Not IO.Directory.Exists(pSavelocation & "\" & mFoldername)) Then
                                        IO.Directory.CreateDirectory(pSavelocation & "\" & mFoldername)
                                    End If
                                End If
                            End If
                        Next
                        For i = 0 To (Form3.f3.CheckedListBox1.Items.Count - 1)
                            If Form3.f3.CheckedListBox1.GetItemChecked(i) = True Then
                                s = Form3.f3.CheckedListBox1.Items(i).ToString
                                If s = "EXAMPLE A" Then
                                    mFileName = Date.Now.ToString("yyyyMMdd-HHmm") + "_" + NODE1node.Name.ToString + "_" + (Node6Attribute.GetNamedItem("attribute").Value).ToString + "_" + "EXAMPLE A"
                                    mFileName = mFileName.Replace(".", "_").Replace(" ", "_").Replace("''", "_").Replace("<", "").Replace(">", "").Replace("d", "D")
                                ElseIf s = "EXAMPLE B" Then
                                    mFileName = Date.Now.ToString("yyyyMMdd-HHmm") + "_" + NODE1node.Name.ToString + "_" + (Node6Attribute.GetNamedItem("attribute").Value).ToString + "_" + "EXAMPLE B"
                                    mFileName = mFileName.Replace(".", "_").Replace(" ", "_").Replace("''", "_").Replace("<", "").Replace(">", "").Replace("d", "D")

                                End If
                            End If
                        Next
                        For Each Node1Node2 As XmlNode In dDocumentRootNodes
                            Dim Node1ChildNodes2 As XmlNodeList = Node1Node2.ChildNodes
                            For Each NODE1node2 As XmlNode In Node1ChildNodes2

                                If NODE1node2.Name = "Node3" Then
                                    pProgressbar.Increment(1)
                                    Dim xNode6Node3List As XmlNodeList = xOrgXml.SelectNodes("/Node1/Node6[@attribute='1']//Node3")

                                    For Each Node6NODE3_Name As XmlNode In xNode6Node3List
                                        pProgressbar.Increment(1)
                                        If (Node6NODE3_Name.InnerText).ToString = (NODE1node2.Attributes("attribute").Value).ToString Then

                                            Dim NODE1_NODE3_Node_String As String = NODE1node2.OuterXml.ToString
                                            'check if node specific string already contains the selected node. If not add it else skip it'
                                            If cCommonStringNode6_1.Contains(NODE1_NODE3_Node_String) = False Then
                                                cCommonStringNode6_1 = cCommonStringNode6_1 & NODE1node2.OuterXml
                                            End If
                                        End If
                                    Next
                                End If
                            Next
                        Next
                        'create the fullstring to be saved as new XML document'
                        fFullString = cCommonStringNode6_1 & NODE1node.OuterXml & "</Node1>"
                        mXMLDocSave.LoadXml(fFullString)
                        'Make all node6 attributes have value "1"'
                        For Each node2 As XmlAttribute In mXMLDocSave.SelectNodes("//Node6/@attribute")
                            node2.Value = "1"
                        Next
                        Dim countervalue As Integer = 0
                        For Each Node1Childnode As XmlNode In mXMLDocSave.SelectNodes("/Node1/Node3")
                            If Node1Childnode.Name = "Node3" Then

                                Dim NODE3_NodeList As XmlNodeList = Node1Childnode.ChildNodes
                                For Each NODE3_Node As XmlNode In NODE3_NodeList

                                    If NODE3_Node.Name = "Node11" Then
                                        countervalue += 1
                                        NODE3_Node.InnerText = countervalue.ToString
                                    End If
                                Next
                            End If
                        Next
                        mXMLDocSave.Save(pSavelocation & "\" & mFoldername & "\" & mFileName & ".xml")
                        mXMLDocSave = New XmlDocument()
                        fFullString = String.Empty
                        mFoldername = String.Empty
                        mFileName = String.Empty
                    End If
                End If
            Next
        Next
    Catch ex As Exception
        MessageBox.Show(ex.Message & vbCrLf & "Stack Trace: " & vbCrLf & ex.StackTrace)
    End Try
End Sub
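
The central step of the routine above, keeping only the top-level Node3 entries that a given Node6 refers to, can also be sketched with LINQ to XML in C#. This is an illustrative sketch, not a replacement for the full routine. It assumes, as the XPath `/Node1/Node6[@attribute='1']//Node3` in the code suggests, that each Node6 contains Node3 descendants whose text is the attribute value of a top-level Node3. The class name `ReferencedHeaderSketch` and the method `BuildFor` are hypothetical, and the shared nodes (node4, node5, node12), the progress bar, and the file naming are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class ReferencedHeaderSketch
{
    // Builds a document containing the top-level Node3 elements referenced
    // from inside the given Node6 section, followed by the section itself.
    public static XDocument BuildFor(XElement root, XElement section)
    {
        // Attribute values referenced by Node3 descendants of this section.
        var wanted = new HashSet<string>(
            section.Descendants("Node3").Select(n => n.Value.Trim()));

        var newRoot = new XElement(root.Name);
        foreach (var node3 in root.Elements("Node3"))
        {
            var key = (string)node3.Attribute("attribute");
            if (key != null && wanted.Contains(key))
            {
                newRoot.Add(node3); // Add inserts a copy, the source stays intact
            }
        }
        newRoot.Add(section);
        return new XDocument(newRoot);
    }

    static void Main()
    {
        // Made-up sample data that follows the assumed structure.
        var root = XElement.Parse(
            "<Node1>" +
            "<Node3 attribute='1'>a</Node3>" +
            "<Node3 attribute='2'>b</Node3>" +
            "<Node3 attribute='3'>c</Node3>" +
            "<Node6 attribute='1'>" +
            "<Node7><Node3>2</Node3></Node7>" +
            "<Node8><Node3>3</Node3></Node8>" +
            "</Node6>" +
            "</Node1>");

        var section = root.Elements("Node6").First();
        Console.WriteLine(BuildFor(root, section));
    }
}
```

Collecting the referenced values in a `HashSet` first avoids the string `Contains` check on the growing output and the nested loop over every Node3 for every reference.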
