Use an XML parser. Personally, I like XML::Twig and Perl.
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
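# Parse the whole document into memory.
my $twig = XML::Twig->new();
$twig->parsefile('your_file.xml');
# Find every saw:user element, wherever it sits, and print its name attribute.
foreach my $saw_user ( $twig->get_xpath('//saw:user') ) {
    print $saw_user->att('name'), "\n";
}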
This prints:
[email protected]
[email protected]
[email protected]
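If the file is large, the same extraction can be done with a handler that fires as each element is parsed, rather than searching the finished tree. What follows is a minimal sketch, not part of the original script: the inline XML is a cut-down stand-in for your file, and the addresses in it are hypothetical placeholders.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample data: a trimmed-down document with placeholder addresses.
my $xml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<saw:ibot xmlns:saw="com.siebel.analytics.web/report/v1">
  <saw:recipients>
    <saw:subscribers>
      <saw:user name="alice@example.com"/>
      <saw:user
        name="bob@example.com"
      />
    </saw:subscribers>
  </saw:recipients>
</saw:ibot>
XML

my @names;
my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once for each saw:user element as soon as it has been parsed.
        'saw:user' => sub {
            my ( $t, $user ) = @_;
            push @names, $user->att('name');
            $t->purge;    # free the part of the tree parsed so far
        },
    },
);
$twig->parse($xml);

print "$_\n" for @names;
```

Note that the second saw:user element is wrapped across three lines and is still picked up, because the parser works on elements rather than on lines.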
If you want a one-liner instead:
perl -MXML::Twig -0777 -e 'print map { $_->att("name")."\n" } ( XML::Twig->parse( <> )->get_xpath("//saw:user") )' your_xml_file
Please, for the sake of future maintenance programmers and sysadmins, DO NOT use regular expressions to parse XML. Why, you may ask? Because, taking your XML as an example, it can look like any of these and still be semantically identical:
(your example +
<?xml version="1.0" encoding="utf-8"?>
<saw:ibot
jobID="36"
priority="normal"
version="1"
xmlns:saw="com.siebel.analytics.web/report/v1">
<saw:schedule
disabled="false"
timeZoneId="(GMT-05:00) Eastern Time (US &amp; Canada)">
<saw:start
endTime="23:59:00"
repeatMinuteInterval="60"
startImmediately="true"
/>
<saw:recurrence runOnce="false">
<saw:weekly
fri="true"
mon="true"
thu="true"
tue="true"
wed="true"
weekInterval="1"
/>
</saw:recurrence>
</saw:schedule>
<saw:dataVisibility
runAs="cgm"
type="recipient"
/>
<saw:choose>
<saw:when condition="true">
<saw:deliveryContent>
<saw:headline>
<saw:caption>
<saw:text>Availability Parity Alert for Next 14 Days (@{NQ_SESSION.LBL_Next_14_Arrival_Days})</saw:text>
</saw:caption>
</saw:headline>
<saw:conditionalReport/>
</saw:deliveryContent>
<saw:postActions/>
</saw:when>
<saw:otherwise/>
</saw:choose>
<saw:deliveryDestinations>
<saw:destination category="dashboard" />
<saw:destination category="activeDeliveryProfile" />
</saw:deliveryDestinations>
<saw:recipients
customize="false"
specificRecipients="false"
subscribers="true">
<saw:subscribers>
<saw:user name="[email protected]" />
<saw:user name="[email protected]" />
<saw:user name="[email protected]" />
</saw:subscribers>
</saw:recipients>
<saw:conditionQuery>
<saw:reportRefNode path="/shared/Quote/Product/Alerts/Daily Availability Parity Alert - Next 14 Days - Content" />
</saw:conditionQuery>
</saw:ibot>
Or like this (note how the tags are no longer wrapped across lines):
<?xml version="1.0" encoding="utf-8"?>
<saw:ibot jobID="36" priority="normal" version="1" xmlns:saw="com.siebel.analytics.web/report/v1">
<saw:schedule disabled="false" timeZoneId="(GMT-05:00) Eastern Time (US &amp; Canada)">
<saw:start endTime="23:59:00" repeatMinuteInterval="60" startImmediately="true"/>
<saw:recurrence runOnce="false">
<saw:weekly fri="true" mon="true" thu="true" tue="true" wed="true" weekInterval="1"/>
</saw:recurrence>
</saw:schedule>
<saw:dataVisibility runAs="cgm" type="recipient"/>
<saw:choose>
<saw:when condition="true">
<saw:deliveryContent>
<saw:headline>
<saw:caption>
<saw:text>Availability Parity Alert for Next 14 Days (@{NQ_SESSION.LBL_Next_14_Arrival_Days})</saw:text>
</saw:caption>
</saw:headline>
<saw:conditionalReport/>
</saw:deliveryContent>
<saw:postActions/>
</saw:when>
<saw:otherwise/>
</saw:choose>
<saw:deliveryDestinations>
<saw:destination category="dashboard"/>
<saw:destination category="activeDeliveryProfile"/>
</saw:deliveryDestinations>
<saw:recipients customize="false" specificRecipients="false" subscribers="true">
<saw:subscribers>
<saw:user name="[email protected]"/>
<saw:user name="[email protected]"/>
<saw:user name="[email protected]"/>
</saw:subscribers>
</saw:recipients>
<saw:conditionQuery>
<saw:reportRefNode path="/shared/Quote/Product/Alerts/Daily Availability Parity Alert - Next 14 Days - Content"/>
</saw:conditionQuery>
</saw:ibot>
Or like this:
<?xml version="1.0" encoding="utf-8"?>
<saw:ibot
jobID="36"
priority="normal"
version="1"
xmlns:saw="com.siebel.analytics.web/report/v1"
><saw:schedule
disabled="false"
timeZoneId="(GMT-05:00) Eastern Time (US &amp; Canada)"
><saw:start
endTime="23:59:00"
repeatMinuteInterval="60"
startImmediately="true"
/><saw:recurrence
runOnce="false"
><saw:weekly
fri="true"
mon="true"
thu="true"
tue="true"
wed="true"
weekInterval="1"
/></saw:recurrence></saw:schedule><saw:dataVisibility
runAs="cgm"
type="recipient"
/><saw:choose
><saw:when
condition="true"
><saw:deliveryContent
><saw:headline
><saw:caption
><saw:text
>Availability Parity Alert for Next 14 Days (@{NQ_SESSION.LBL_Next_14_Arrival_Days})</saw:text></saw:caption></saw:headline><saw:conditionalReport
/></saw:deliveryContent><saw:postActions
/></saw:when><saw:otherwise
/></saw:choose><saw:deliveryDestinations
><saw:destination
category="dashboard"
/><saw:destination
category="activeDeliveryProfile"
/></saw:deliveryDestinations><saw:recipients
customize="false"
specificRecipients="false"
subscribers="true"
><saw:subscribers
><saw:user
name="[email protected]"
/><saw:user
name="[email protected]"
/><saw:user
name="[email protected]"
/></saw:subscribers></saw:recipients><saw:conditionQuery
><saw:reportRefNode
path="/shared/Quote/Product/Alerts/Daily Availability Parity Alert - Next 14 Days - Content"
/></saw:conditionQuery></saw:ibot>
Hopefully, looking at these samples, you can see that reformatting your XML in a PERFECTLY VALID fashion could one day make your regex break mysteriously.
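To make that concrete, here is a small sketch using only core Perl. It assumes a typical line-oriented pattern of the kind one might write for sed or grep; the pattern, the helper name names_by_regex, and the placeholder address are all hypothetical and not taken from your question.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Two hypothetical fragments that an XML parser treats as identical.
my $compact = <<'XML';
<saw:user name="alice@example.com"/>
XML

my $wrapped = <<'XML';
<saw:user
name="alice@example.com"
/>
XML

# A line-by-line pattern that expects the tag and attribute on one line.
sub names_by_regex {
    my ($text) = @_;
    my @found;
    for my $line ( split /\n/, $text ) {
        push @found, $1 if $line =~ /<saw:user name="([^"]*)"/;
    }
    return @found;
}

my @from_compact = names_by_regex($compact);
my @from_wrapped = names_by_regex($wrapped);

printf "compact: %d match(es)\n", scalar @from_compact;
printf "wrapped: %d match(es)\n", scalar @from_wrapped;
```

The compact form yields one match and the wrapped form yields none, even though both describe the same element. You can keep patching the pattern for each new layout, but a parser already handles all of them.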
[...] sed or awk. 2. We can't provide you examples of code to run without seeing the XML that contains the data you want to retrieve. 3. Don't parse XML with sed or awk. 4. Please update your question to provide a minimal example XML file. 5. Don't parse XML with sed or awk.
[...] {} marker to indent the content by four spaces. I'll do it for you once again...
/tmp/xml:33.18: Opening and ending tag mismatch: subscribers line 29 and recipients, and other errors