2

first time poster here and somewhat newbie to VBscript. I could really use some help from you guys who know this like second nature. I've tried to include some pertinent information and hopefully not too much.

I've been trying to get this to work and am finally reaching out after a few days time of attempts and a dozen iterations of code. I haven't found examples of extracting data from multiple levels (noes and chidlren) within the XML document.

I've been tasked with extracting data from an XML file using VBScript. the specific items are: Year, Account Number, Current Amount Due, Has Delinquent? (true/false) and Formatted Warrant Number.

The format of the XML file is as below, with anywhere from 1,000 to 10,000+ nodes filled with this data along with plenty of 'misc' nodes in there as well.

  <BillData>
    <BillHeader>
      <Year>2010</Year>
      <misc></misc>
      <misc2></misc2>
      <misc3></misc3>
      <AcctNumber>0002566129</AcctNumber>
      <misc4></misc4>
      <PayAmounts>
         <CurrentAmountDue>133.06</CurrentAmountDue>
         <misc5></misc5>
      </PayAmounts>
      <misc6></misc6>
      <HasDelinquents>true</HasDelinquents>
      <WarrantInfo>
         <FormattedWarrantNumber>201115447</FormattedWarrantNumber>
      </WarrantInfo>
     </BillHeader>
   </BillData>

CurrentAmountDue and FormattedWarrantNumber may not always be present. by this I dont mean they are blank, but the entire entry of CurrentAmountDue may be missing, as shown below.

<PayAmounts>
   <misc5></misc5>
</PayAmounts>

I need the extract this data to a comma separated text file. If the data is not present then I just need to insert the comman, so when the output is eventually imported to Excel it can be noted to be blank.

The challenge for me is to get into the different child nodes and extract the data correctly. I can't seem to select the different nodes correctly.

These are some links I've used as reference, but can't seem to get it working.

http://technet.microsoft.com/en-us/magazine/2007.02.heyscriptingguy.aspx this seemed to be the direction to go in, but I get an error "Node Test Expected Here":

  Set colNodes=xmlDoc.SelectNodes("/BillData/BillHeader/*" (Year | Account | CurrentAmountDue)")

I found a post on Stack which suggested using this technique below, but it doesn't work for me once I get past two values, whereas I have more. I'm guessing this is due to the CurrentAmountDue and FormattedWarrantNumber are deeper levels into the XML so to speak.

  strQuery = "/BillData/BillHeader/ " & _
  "[name()='Year' or name()='AccountNumber' or name()='HasDelinquents' or name()='CurrentAmountDue' or name()='FormattedWarrantNumber']"

To my surprise, I am able to get this to return some values but not all on the same loop so my output is off (first line will only display year, last line is missing) and is just a comma.

   strQuery = "/BillData/BillHeader/*"
   Set colNodes=xmlDoc.selectNodes(strQuery)
   For Each objNode in colNodes 

   ' some lame if then statements that get the values, but this can't be the correct approach!
   ' these three items (Year, Account and HasDelinquents are under each BillHeader as far as I can tell, but this doesn't seem to be the most effective method.
     if objNode.nodeName = "Year" then strYear = objNode.text  
     if objNode.nodeName = "Account" then strAccount = objNode.text 
     if objNode.nodeName = "HasDelinquents" then strHasDelq = objNode.text 

          for each CurrentAmt in objNode.SelectNodes("./CurrentAmountDue")
                strCurrAmt = CurrentAmt.text
                ' i finally got a value here when I use msgbox to view it.'
          next

          for each WarrantNum in objNode.SelectNodes("./FormattedWarrantNumber")
                strWarNum = WarrantNum.text   
                ' getting this value also when I use msgbox to view it.
          next
   next

So you can see my attempts are futile.

I also tried insert this line below. I put it just before the last NEXT, but it didn't work as intended. I also attempted to insert some IF-Then statements to check for values in Year and Account before writing to the file and then clearing out the values after writing to the file. That almost worked, but my first line and last lines are not producing correct data.

     objFileToWrite.WriteLine(strYear & "," & strAccount & "," & strCurrAmt & "," & strHasDelq & "," & strWarNum)

ok now that you've had a giggle with my prehistoric attempt at coding this, can you lend me a hand? :) let me know if anything else is needed. thanks for any time invested. I know some of you can likely kick this out with ease.

2 Answers 2

2

The low-tech 'design pattern' for the first half of your problem - creating and writing to a .CSV/.TXT file - is:

Get an FSO
Open traget file for writing
WriteLine Header (optional)
Loop over your data to export
    Create empty Array (elements ~ columns)
    Fill elements (if possible)
    WriteLine Join(Array, Delimiter) to traget file
Close file

In code:

  Option Explicit
  Dim oFS     : Set oFS = CreateObject("Scripting.FileSystemObject")
  Dim sFSpec  : sFSpec  = "..\data\step00.csv"
  Dim sDelim  : sDelim  = ";"
  Dim aFields : aFields = Split("Yr ANum Amnt Delq FWNum")
  Dim oTS     : Set oTS = oFS.CreateTextFile(sFSpec)
  Dim nRecs   : nRecs   = 10
  Dim nRec
  oTS.WriteLine Join(aFields, sDelim)
  For nRec = 1 To nRecs
      ReDim aData(UBound(aFields))
      aData(0) = nRec
      If nRec Mod 2 Then aData(1) = "odd"

      oTS.WriteLine Join(aData, sDelim)
  Next
  oTS.Close

  WScript.Echo oFS.OpenTextFile(sFSpec).ReadAll()

Output:

Yr;ANum;Amnt;Delq;FWNum
1;odd;;;
2;;;;
3;odd;;;
4;;;;
5;odd;;;
6;;;;
7;odd;;;
8;;;;
9;odd;;;
10;;;;

Please mark the difference between

oTS.WriteLine Join(aData, sDelim)

and

objFileToWrite.WriteLine(strYear & "," & strAccount & "," & strCurrAmt & "," & strHasDelq & "," & strWarNum)
(spurious param list (), btw)

A skeleton for the second part - looping over structured XML - should look like this

Get an msxml2.domdocument
Configure
Load .XML file
If error
   deal with it
Else
   use top level XPath to get your top level nodelist
   Loop nodelist
      handle sub-parts
End If

in code:

  Option Explicit
  Dim oFS     : Set oFS = CreateObject("Scripting.FileSystemObject")
  Dim sFSpec  : sFSpec  = oFS.GetAbsolutePathName("..\data\step01.xml")
  WScript.Echo oFS.OpenTextFile(sFSpec).ReadAll()

  Dim oXD : Set oXD = CreateObject("msxml2.domdocument")
  oXD.setProperty "SelectionLanguage", "XPath"
  oXD.async = False
  oXD.load sFSpec
  If oXD.parseError.errorCode Then
     WScript.Echo "fail", sFSpec
     WScript.Echo oXD.parseError.reason
  Else
     WScript.Echo "ok", sFSpec
     Dim ndlBills : Set ndlBills = oXD.selectNodes("/Bills/BillData/BillHeader")
     If ndlBills.length Then
        WScript.Echo ndlBills.length, "bill nodes"
        Dim ndBill
        For Each ndBill In ndlBills
            Dim ndSub
            Set ndSub = ndBill.selectSingleNode("Year")
            If ndSub Is Nothing Then
               WScript.Echo "no Year"
            Else
               WScript.Echo "Year", ndSub.text
            End If
            Set ndSub = ndBill.selectSingleNode("PayAmounts/CurrentAmountDue")
            If ndSub Is Nothing Then
               WScript.Echo "no Amount"
            Else
               WScript.Echo "Amount", ndSub.text
            End If
        Next
     End If
  End If

output:

<?xml version="1.0" encoding="utf-8" ?>
<Bills>
 <BillData>
  <BillHeader>
   <Year>2012</Year>
  </BillHeader>
 </BillData>
 <BillData>
  <BillHeader>
   <PayAmounts>
    <CurrentAmountDue>123.45</CurrentAmountDue>
   </PayAmounts>
  </BillHeader>
 </BillData>
</Bills>

ok E:\trials\SoTrials\answers\19571565\data\Step01.xml
2 bill nodes
Year 2012
no Amount
no Year
Amount 123.45

As you want to put the data from each BillHeader into one line of the .CSV and elements are missing, don't risk wrong mappings by using // or other kinds of loose queries. Just get a list of all "/Bills/BillData/BillHeader" and drill down.

The merge of both scripts:

  Option Explicit
  Dim oFS     : Set oFS = CreateObject("Scripting.FileSystemObject")
  Dim sXFSpec : sXFSpec = oFS.GetAbsolutePathName("..\data\step02.xml")
  WScript.Echo oFS.OpenTextFile(sXFSpec).ReadAll()
  Dim sCFSpec : sCFSpec = "..\data\step02.csv"
  Dim sDelim  : sDelim  = ","
  Dim aFields : aFields = Split("Yr ANum Amnt Delq FWNum")
  Dim oTS     : Set oTS = oFS.CreateTextFile(sCFSpec)
  oTS.WriteLine Join(aFields, sDelim)

  Dim oXD : Set oXD = CreateObject("msxml2.domdocument")
  oXD.setProperty "SelectionLanguage", "XPath"
  oXD.async = False
  oXD.load sXFSpec
  If oXD.parseError.errorCode Then
     WScript.Echo "fail", sXFSpec
     WScript.Echo oXD.parseError.reason
  Else
     WScript.Echo "ok", sXFSpec
     Dim ndlBills : Set ndlBills = oXD.selectNodes("/Bills/BillData/BillHeader")
     If ndlBills.length Then
        WScript.Echo ndlBills.length, "bill nodes"
        Dim ndBill
        For Each ndBill In ndlBills
            ReDim aData(UBound(aFields))
            Dim ndSub
            Set ndSub = ndBill.selectSingleNode("Year")
            If Not ndSub Is Nothing Then
               aData(0) = ndSub.text
            End If
            Set ndSub = ndBill.selectSingleNode("PayAmounts/CurrentAmountDue")
            If Not ndSub Is Nothing Then
               aData(2) = ndSub.text
            End If
            oTS.WriteLine Join(aData, sDelim)
        Next
     End If
  End If
  oTS.Close

  WScript.Echo oFS.OpenTextFile(sCFSpec).ReadAll()

output:

<?xml version="1.0" encoding="utf-8" ?>
<Bills>
 <BillData>
  <BillHeader>
   <Year>2012</Year>
  </BillHeader>
 </BillData>

  <BillHeader>
   <Year>0000</Year>
   <PayAmounts>
    <CurrentAmountDue>0.0</CurrentAmountDue>
   </PayAmounts>
   <junk/>
  </BillHeader>

 <BillData>
  <BillHeader>
   <PayAmounts>
    <CurrentAmountDue>123.45</CurrentAmountDue>
   </PayAmounts>
  </BillHeader>
 </BillData>

 <BillData>
  <BillHeader>
   <Year>2013</Year>
   <PayAmounts>
    <CurrentAmountDue>47.11</CurrentAmountDue>
   </PayAmounts>
  </BillHeader>
 </BillData>
</Bills>

ok E:\trials\SoTrials\answers\19571565\data\Step02.xml
3 bill nodes
Yr,ANum,Amnt,Delq,FWNum
2012,,,,
,,123.45,,
2013,,47.11,,

To solve your real-world problem you can weave in more IF clauses like

Set ndSub = ndBill.selectSingleNode("XPath")
If Not ndSub Is Nothing Then
   aData(N) = ndSub.text
End If

or - probably better in the long run

Define an array of queries (in field order)

Dim aQueries : aQueries = Array( _ "Year" _ , "PayAmounts/CurrentAmountDue" _ )

Reduce the innermost loop to

Dim ndBill
For Each ndBill In ndlBills
    oTS.WriteLine Join(getData(ndBill, aQueries), sDelim)
Next

Define getData()

Function getData(ndBill, aQueries)
  Dim nUb : nUb = UBound(aQueries)
  ReDim aData(nUb)
  Dim q
  For q = 0 To nUb
      Dim ndSub
      Set ndSub = ndBill.selectSingleNode(aQueries(q))
      If Not ndSub Is Nothing Then
         aData(q) = ndSub.text
      End If
  Next
  getData = aData
End Function
Sign up to request clarification or add additional context in comments.

1 Comment

wow, that is something else! appreciate all the time you put into that and I'm going to read up on some of the items you listed such as the 'joins', array items, the if not statements, etc. It's a bit overwhelming but I'll attempt it and try to follow it so I can incorporate the warrant number search.
0

You get only the nodes Year and HasDelinquents, because the nodes CurrentAmountDue and FormattedWarrantNumber are not immediate child nodes of /BillData/BillHeader, and there are no nodes named AccountNumber (the correct node name would be AcctNumber). To select nodes from anywhere in the XML tree try an expression like this:

//*[name()='Year' or name()='AcctNumber' or name()='HasDelinquents' or name()='CurrentAmountDue' or name()='FormattedWarrantNumber']

1 Comment

I'll go through one of the older iterations of my attempts in which I tried this approach but it looks like I failed to include the * in the pathname and I'll reduce it to how you have it to see if I can gather the data as I need it. appreciate the comment!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.