
I have a configuration file in XML format that I need to parse and convert to JSON. I can convert it with Perl's XML::XML2JSON module, but the problem is that it does not maintain the order of the XML elements. I strictly need the elements in order, otherwise the configuration fails.

My XML file looks something like this. I have to configure an IP address and then set that IP as the gateway for a certain route.

<Config>
<ip>
    <address>1.1.1.1</address>
    <netmask>255.255.255.0</netmask>
</ip>
<route>
    <network>20.20.20.0</network>
    <netmask>55.255.255.0</netmask>
    <gateway>1.1.1.1</gateway>
</route>
</Config>

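This is my Perl code to convert it to JSON:

use strict;
use warnings;
use XML::XML2JSON;   # was missing from the original snippet
use Data::Dumper;

my $file = 'config.xml';
open my $fh, '<', $file or die "Cannot open $file: $!";
my $XML = do { local $/; <$fh> };   # slurp the whole file
close $fh;

my $XML2JSON = XML::XML2JSON->new();
my $Obj = $XML2JSON->xml2obj($XML);
print Dumper($Obj);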

The output I'm getting is,

$VAR1 = {
    'Config' => {
        'route' => {
            'netmask' => { '$t' => '55.255.255.0' },
            'gateway' => { '$t' => '1.1.1.1' },
            'network' => { '$t' => '20.20.20.0' }
        },
        'ip' => {
            'netmask' => { '$t' => '255.255.255.0' },
            'address' => { '$t' => '1.1.1.1' }
        }
    },
    '@encoding' => 'UTF-8',
    '@version' => '1.0'
};

I have a script that reads the JSON object and applies the configuration. It fails because it first tries to set the gateway for a route using an IP address that is not yet configured, and only then adds the IP address.

I strictly want the key ip to come first and then route, so that the configuration is applied without error. I have many dependencies like this where the order of keys is a must.

Is there any way I can tackle this problem? I have tried almost all the XML parsing modules, such as XML::Simple, XML::Twig and XML::Parser, but nothing helped.

  • Hash elements aren't ordered, so if you extract the data into a hash, it will be unordered, no matter what parser you use. My parser of choice is XML::LibXML, and it will provide the elements in the order they are found. But so will two of the ones you mentioned (XML::Twig and the low-level XML::Parser). Commented May 16, 2016 at 14:39
  • {'Config' => {'route' => ... is not JSON. Even if you use the convert method to generate actual JSON, though, the docs say: "The order of child elements is not always preserved. This is because the conversion to json makes use of hashes in the resulting json." Commented May 16, 2016 at 14:43
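
As a rough illustration of the first comment's point, here is a minimal sketch that walks the question's XML with XML::LibXML and lists the elements in document order without ever storing them in a hash. The helper name walk and the outline format it produces are made up for this example.

```perl
use strict;
use warnings;
use XML::LibXML;

my $xml = <<'XML';
<Config>
<ip>
    <address>1.1.1.1</address>
    <netmask>255.255.255.0</netmask>
</ip>
<route>
    <network>20.20.20.0</network>
    <netmask>55.255.255.0</netmask>
    <gateway>1.1.1.1</gateway>
</route>
</Config>
XML

my $doc = XML::LibXML->load_xml( string => $xml );

# Hypothetical helper: return an indented outline of the element tree.
# The XPath '*' selects child elements only, in document order.
sub walk {
    my ( $node, $depth ) = @_;
    my @lines;
    for my $child ( $node->findnodes('*') ) {
        my $pad = '  ' x $depth;
        if ( $child->findnodes('*') ) {
            # Element with child elements: print its name, then recurse
            push @lines, $pad . $child->nodeName;
            push @lines, walk( $child, $depth + 1 );
        }
        else {
            # Leaf element: print name and text content
            push @lines, $pad . $child->nodeName . ' = ' . $child->textContent;
        }
    }
    return @lines;
}

print "$_\n" for walk( $doc->documentElement, 0 );
```

Because the result is a list rather than a hash, ip is always reported before route.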

2 Answers


Here's a program I put together that uses XML::Parser to parse some XML data and generate the equivalent JSON in the same order. It ignores attributes, processing instructions, etc., and requires every XML element to contain either a list of child elements or a text node. Mixing text and elements won't work; this isn't checked, except that the program will die when it tries to dereference a string.

It's intended as a framework for you to enhance as required, but it works fine as it stands with the XML data shown in your question.

use strict;
use warnings 'all';

use XML::Parser;


my $parser = XML::Parser->new(Handlers => {
    Start => \&start_tag,
    End   => \&end_tag,
    Char  => \&text,
});

my $struct;
my @stack;

$parser->parsefile('config.xml');

print_json($struct->[1]);


sub start_tag {
    my $expat = shift;
    my ($tag, %attr) = @_;

    my $elem = [ $tag => [] ];
    if ( $struct ) {
        my $content = $stack[-1][1];
        push @{ $content }, $elem;
    }
    else {
        $struct = $elem;
    }
    push @stack, $elem;
}


sub end_tag {
    my $expat = shift;
    my ($elem) = @_;
    die "$elem <=> $stack[-1][0]" unless $stack[-1][0] eq $elem;
    for my $content ( $stack[-1][1] ) {
        $content = "@$content" unless grep ref, @$content;
    }
    pop @stack;
}


sub text {
    my $expat = shift;
    my ($string) = @_;
    return unless $string =~ /\S/;
    $string =~ s/\A\s+//;
    $string =~ s/\s+\z//;
    push @{ $stack[-1][1] }, $string;
}


sub print_json {
    my ($data, $indent, $comma) = (@_, 0, '');

    print "{\n";

    for my $i ( 0 .. $#$data ) {

        # Note that $data, $indent and $comma are overridden here
        # to reflect the inner context
        #
        my $elem = $data->[$i];
        my $comma = $i < $#$data ? ',' : '';
        my ($tag, $data) = @$elem;
        my $indent = $indent + 1;

        printf qq{%s"%s" : }, '  ' x $indent, $tag;

        if ( ref $data ) {
            print_json($data, $indent, $comma);
        }
        else {
            printf qq{"%s"%s\n}, $data, $comma;
        }
    }

    # $indent and $comma (and $data) are restored here
    #
    printf "%s}%s\n", '  ' x $indent, $comma;
}

output

{
  "ip" : {
    "address" : "1.1.1.1",
    "netmask" : "255.255.255.0"
  },
  "route" : {
    "network" : "20.20.20.0",
    "netmask" : "55.255.255.0",
    "gateway" : "1.1.1.1"
  }
}
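
If the consuming script is also written in Perl, the nested [ tag => content ] arrays built above could be walked directly instead of going through JSON. The following is only a sketch: $config is a hand-written stand-in for $struct->[1], and ordered_steps is a hypothetical helper, not part of the program above.

```perl
use strict;
use warnings;

# Hand-written stand-in for the $struct->[1] that the parser builds:
# every element is a [ tag => content ] pair, where content is either
# a string or another array of pairs.
my $config = [
    [ ip => [
        [ address => '1.1.1.1' ],
        [ netmask => '255.255.255.0' ],
    ] ],
    [ route => [
        [ network => '20.20.20.0' ],
        [ netmask => '55.255.255.0' ],
        [ gateway => '1.1.1.1' ],
    ] ],
];

# Hypothetical helper: flatten the structure into "path = value" steps,
# keeping the order in which the elements appeared in the XML.
sub ordered_steps {
    my ( $items, $prefix ) = @_;
    my @steps;
    for my $item (@$items) {
        my ( $tag, $content ) = @$item;
        my $path = defined $prefix ? "$prefix.$tag" : $tag;
        if ( ref $content ) {
            push @steps, ordered_steps( $content, $path );
        }
        else {
            push @steps, "$path = $content";
        }
    }
    return @steps;
}

print "$_\n" for ordered_steps($config);
```

Arrays keep their order, so the ip steps always come before the route steps.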

The problem isn't so much to do with XML parsing as with the fact that Perl hashes are not ordered. So when you 'write' some JSON, the keys can come out in any order.

One way to avoid this is to apply a sort function when encoding the JSON.

With JSON::PP you can do this by using sort_by to sort explicitly:

#!/usr/bin/env perl
use strict;
use warnings;

use XML::Twig; 
use JSON::PP; 

use Data::Dumper;

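# File-scoped so that code outside order_nodes() can also fill it in
my %rank_of = ( ip => 0, route => 1, address => 2, network => 3, netmask => 4, gateway => 5 );

sub order_nodes {
   # JSON::PP passes the two keys being compared in $JSON::PP::a and $JSON::PP::b
   return $rank_of{$JSON::PP::a} <=> $rank_of{$JSON::PP::b};
}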

my $twig = XML::Twig -> parse (\*DATA); 

my $json = JSON::PP -> new;
$json ->sort_by ( \&order_nodes );
print $json -> encode( $twig -> simplify );

__DATA__
<Config>
<ip>
    <address>1.1.1.1</address>
    <netmask>255.255.255.0</netmask>
</ip>
<route>
    <network>20.20.20.0</network>
    <netmask>55.255.255.0</netmask>
    <gateway>1.1.1.1</gateway>
</route>
</Config>

In some scenarios, setting canonical can help, as that sets ordering to lexical order. (And means your JSON output would be consistently ordered). This doesn't apply to your case.

You could build the node ordering via XML::Twig, either with an XPath expression or by using twig_handlers. I gave it a quick go, but got slightly unstuck working out how to derive a single ordering from address/netmask in one stanza and network/netmask/gateway in the other.

As a simple example you could:

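# Assumes %rank_of is declared (empty) at file scope
my $count = 0;
foreach my $node ( $twig -> get_xpath ( './*' ) ) {
    # Test with exists: a rank of 0 is false and would otherwise be overwritten
    $rank_of{$node->tag} = $count++ unless exists $rank_of{$node->tag};
}

print Dumper \%rank_of;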

This will ensure ip and route are always the right way around. However it doesn't order the subkeys.

That actually gets a bit more complicated, as you'd need to recurse... and then decide how to handle 'collisions' (like netmask - address comes before, but how does it sort compared to network).

Or alternatively:

my $count = 0;
foreach my $node ( $twig->get_xpath('.//*') ) {
   # Test with exists: a rank of 0 is false and would otherwise be overwritten
   $rank_of{ $node->tag } = $count++ unless exists $rank_of{ $node->tag };
}

This walks all the nodes, and puts them in order. It doesn't quite work, because netmask appears in both stanzas though.

You get:

{"ip":{"address":"1.1.1.1","netmask":"255.255.255.0"},"route":{"netmask":"55.255.255.0","network":"20.20.20.0","gateway":"1.1.1.1"}}

I couldn't figure out a neat way of collapsing both lists.
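
One possible way around the collision is to skip the intermediate hash and emit the JSON text straight from the twig, so each stanza keeps its own child order. This is a minimal sketch rather than a tested solution: element_to_json is a hypothetical helper, it assumes XML::Twig's '#ELT' condition to select element children, it ignores attributes and mixed content, and repeated sibling tags would produce duplicate JSON keys.

```perl
use strict;
use warnings;
use XML::Twig;
use JSON::PP;

my $xml = <<'XML';
<Config>
<ip>
    <address>1.1.1.1</address>
    <netmask>255.255.255.0</netmask>
</ip>
<route>
    <network>20.20.20.0</network>
    <netmask>55.255.255.0</netmask>
    <gateway>1.1.1.1</gateway>
</route>
</Config>
XML

# Used only to quote and escape individual strings
my $quote = JSON::PP->new->allow_nonref;

# Hypothetical helper: JSON text for one element, children in document order
sub element_to_json {
    my ($elt) = @_;
    my @kids = $elt->children('#ELT');    # element children, no text nodes
    return $quote->encode( $elt->text ) unless @kids;
    my @members =
        map { $quote->encode( $_->tag ) . ':' . element_to_json($_) } @kids;
    return '{' . join( ',', @members ) . '}';
}

my $twig = XML::Twig->new;
$twig->parse($xml);
print element_to_json( $twig->root ), "\n";
```

No rank table is needed here, so netmask can sit in a different position in each stanza.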

8 Comments

There's no way that setting canonical will duplicate the order in which the elements were found in the XML (unless they just happened to be sorted lexically, which is not the case in the example).
Thanks for the help. But here we are hard-coding the %rank_of hash. Can we do something generic? I shouldn't have to change %rank_of for each config file I parse. Can you please suggest a generic way that works for any XML? My apologies if I'm asking too much.
I've edited with an example of how to build the rank hash. But it doesn't work exhaustively, because you've got a collision in the form of netmask.
But the JSON you got is encoded as a string, and I need the JSON object. If I decode the JSON-encoded string (i.e. your output), the order is lost again. In case I'm not clear: I just want a JSON object / hash built from the XML that keeps the order and that I can dereference.
You can't have an ordered JSON object, for all the reasons you can't have an ordered hash. It just doesn't work that way.
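
JSON arrays, unlike JSON objects, do keep their order through an encode/decode round trip. So if the shape of the data can change, one option is to represent every level as an array of single-key objects. The sketch below uses only core JSON::PP; the layout of $ordered and the helper section_order are assumptions made for illustration, and the consuming script would need to expect this shape.

```perl
use strict;
use warnings;
use JSON::PP;

# Each level is an array of single-key objects, so the order is carried
# by the arrays and does not depend on hash ordering.
my $ordered = [
    { ip => [
        { address => '1.1.1.1' },
        { netmask => '255.255.255.0' },
    ] },
    { route => [
        { network => '20.20.20.0' },
        { netmask => '55.255.255.0' },
        { gateway => '1.1.1.1' },
    ] },
];

my $json_text = JSON::PP->new->encode($ordered);
my $decoded   = JSON::PP->new->decode($json_text);

# Hypothetical helper: the key of each single-key object, in array order
sub section_order {
    my ($sections) = @_;
    return map { ( keys %$_ )[0] } @$sections;
}

print join( ', ', section_order($decoded) ), "\n";
```

After decoding, $decoded->[0]{ip} and $decoded->[1]{route} can be dereferenced as usual, and their positions in the array give the configuration order.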