2

How to read an XML file and extract data from that file which has large number of tags using PowerShell? I am using the below code to extract tag, but am unable to read data from sub tag.

$xmlFile= "D:\Testing\TestcasesOutput\1ac.xml"
$xmlConfig = [System.Xml.XmlDocument](Get-Content $xmlFile)
$XmlDocument.Breakfast_menu.price

I expect the output to read the whole xml file but unable to read whole xml file.

<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
    <food>
        <food>Belgian Waffles</food>
        <price>$5.95</price>
        <description>Two of our famous Belgian Waffles with plenty of real maple 
        syrup</description>
        <calories>650</calories>
    </food>
    <food>
        <food>Strawberry Belgian Waffles</food>
        <price>$7.95</price>
        <description>Light Belgian waffles covered with strawberries and whipped 
        cream</description>
        <calories>900</calories>
    </food>
    <food>
        <food>Berry-Berry Belgian Waffles</food>
        <price>$8.95</price>
        <description>Light Belgian waffles covered with an assortment of fresh 
        berries and whipped cream</description>
        <calories>900</calories>
    </food>
    <food>
        <food>French Toast</food>
        <price>$4.50</price>
        <description>Thick slices made from our homemade sourdough 
        bread</description>
        <calories>600</calories>
    </food>
    <food>
    <food>Homestyle Breakfast</food>
        <price>$6.95</price>
        <description>Two eggs, bacon or sausage, toast, and our ever-popular hash 
        browns</description>
        <calories>950</calories>
    </food>
</breakfast_menu>
7
  • I am using the below code to extract breakfast_menu tag but unable to read data from breakfast_menu sub tag. $xmlFile= "D:\Testing\TestcasesOutput\1ac.xml" $xmlConfig = [System.Xml.XmlDocument](Get-Content $xmlFile) $XmlDocument.Breakfast_menu.price Commented May 5, 2019 at 13:27
  • Don't comment your own question, instead append these important lines to it. Commented May 5, 2019 at 13:34
  • 3
    Please edit your question and put (the relevant part of) the xml in there. Next, ditto to what @quantummind comments Commented May 5, 2019 at 13:40
  • @Theo <?xml version="1.0" encoding="UTF-8"?> <breakfast_menu> <food> <food>Belgian Waffles</food> <price>$5.95</price> <description>Two of our famous Belgian Waffles with plenty of real maple syrup</description> <calories>650</calories> </food> Commented May 5, 2019 at 13:47
  • 2
    That's a nono I'm afraid.. As commented, click edit underneath your question and put the info in there in a formatted way. Never paste vital information or code in a comment because that becomes unreadable very rapidly. Commented May 5, 2019 at 13:50

2 Answers 2

8

Reading XML with PowerShell is really easy.

Suppose your xml file looks similar to this:

<?xml version="1.0" encoding="UTF-8"?> 
<breakfast_menu> 
    <food> 
        <food>Belgian Waffles</food>
        <price>$5.95</price>
        <description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>
        <calories>650</calories>
    </food>
    <food> 
        <food>Fried Egg</food>
        <price>$1.80</price>
        <description>blahblah</description>
        <calories>3500</calories>
    </food>
</breakfast_menu>

You simply read and have PowerShell parse the file into an object, using this

[xml]$xml = Get-Content 'D:\Testing\TestcasesOutput\1ac.xml'

Next, you can use the properties of this $xml object to get whatever it is you want to extract from it:

For instance, loop through all <food> items and output the information you want

$xml.breakfast_menu.food | ForEach-Object {
    [PSCustomObject]@{
        'MenuItem' = $_.food
        'Price'    = $_.price
    }
}

results in this output:

MenuItem        Price
--------        -----
Belgian Waffles $5.95
Fried Egg       $1.80

Or select just the one the item for 'Belgian Waffles':

$xml.breakfast_menu.food | Where-Object { $_.food -eq 'Belgian Waffles' } | 
                           Select-Object @{Name = 'MenuItem'; Expression = {$_.food}}, Price

outputs:

MenuItem        price
--------        -----
Belgian Waffles $5.95

If all you are after is the price for a certain food item, you can do this:

$xml.breakfast_menu.food | Where-Object { $_.food -eq 'Belgian Waffles' } | 
                           Select-Object -ExpandProperty Price

or even shorten that code:

($xml.breakfast_menu.food | Where-Object { $_.food -eq 'Belgian Waffles' }).price

Hope that explains


Edit

If you need to do this on multiple xml files and these files are inside the same root path, you could loop through using Get-ChildItem to get the xml files and process them like in the examples I gave.

Get-ChildItem -Path 'ROOTFOLDER OF THE FOLDERS WHERE THE XML FILES ARE KEPT' -Filter '*.xml' -File -Recurse | 
    ForEach-Object {
        [xml]$xml = Get-Content -Path $_.FullName
        # in this example simply output the menu items and their price for each xml file
        foreach ($item in $xml.breakfast_menu.food) {
            [PSCustomObject]@{
                'File'     = $_.FullName    # added the file FullName so you know where the item came from
                'MenuItem' = $item.food
                'Price'    = $item.price
            }
        }
    }

Or from several locations:

$folders = 'D:\Testing\TestcasesOutput\1ac7b5a0-2d62-403c-8394-5bd33330cbe7',
           'D:\Testing\TestcasesOutput\227c619a-b7d1-4da6-8fe5-f2c923ddcb7a',
           'D:\Testing\TestcasesOutput\d4370ae1-643f-4c44-ba41-7f640afcc276'

$result = Get-ChildItem -Path $folders -Filter '*.xml' -File | 
    ForEach-Object {
        [xml]$xml = Get-Content -Path $_.FullName
        # in this example simply output the menu items and their price for each xml file
        foreach ($item in $xml.breakfast_menu.food) {
            [PSCustomObject]@{
                'File'     = $_.FullName
                'MenuItem' = $item.food
                'Price'    = $item.price
            }
        }
    }

#output to screen:
$result

# output to CSV
$result | Export-Csv -Path 'PATH AND FILENAME FOR THE OUTPUT CSV FILE' -NoTypeInformation
Sign up to request clarification or add additional context in comments.

17 Comments

Yes it helps alot but i have another xml file which has 3000 lines and by using the same method as you described i parsed the data but unable to get it. Can you please confirm is there some issue on large xml file ?
@Ali 3000 lines should not be a problem.. Study the structure carefully. Is that the same as this xml, only larger?
Thanks for your support. Can you please explain further how to parse 100xml files located at a path and grep the required data and delete the remaining data. It will be very helpful if you guide me further ??
By using the below script I am able to process a single xml file [xml]$xmlDocument= Get-Content -Path "D:\Testing\TestcasesOutput\1ac\Export25.xml" $xmlConfig = [System.Xml.XmlDocument](Get-Content $xmlFile) $xmlDocument.Content.sessionEventTypeItem
But when I am trying to process 100 xml files it is not displaying the output. [xml]$xmlDocument= Get-Content -Path "D:\Testing\TestcasesOutput\1ac\Export25.xml" $xmlConfig = [System.Xml.XmlDocument](Get-Content $xmlFile) $xmlDocument.Content.sessionEventTypeItem
|
0

Another nice way to open an xml file, and avoid any utf8 no bom encoding problems. I couldn't do it in one step. After a load & save, it puts the bom in, if it has an encoding tag. Utf8 is assumed with no tag.

<?xml version="1.0" encoding="utf-8"?>
<name>😊</name>
$xml = [xml]::new()
$xml.load("$pwd\file.xml")
$xml.save("$pwd\file.xml")

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.