Perl sort multiple column array

Question

The hostlist.txt file has only one column. The program reads hostlist.txt, removes duplicate hostnames, sorts the list, looks up the IP address of each host in the list, and prints the output to the terminal.

hostlist.txt

host01
host03
host02
host01

output on terminal

host01,192.168.1.15
host02,192.168.1.12
host03,192.168.1.33

Program:

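use Socket;                     # provides inet_ntoa
use List::MoreUtils qw(uniq);   # provides uniq (assumed; the question does not show its imports)

open(HOSTFILE, "hostlist.txt") or die "Couldn't open location file: $!\n";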
while ($hosts = <HOSTFILE>) {
    chomp($hosts);
    push(@hostnames, $hosts);
}
close HOSTFILE;
@hostnameUnique = uniq(@hostnames);
@hostnameUniqueSorted = sort { lc($a) cmp lc($b) } @hostnameUnique; 

foreach $hostname (@hostnameUniqueSorted){
    $ipaddr = inet_ntoa((gethostbyname($hostname))[4]);
    print "$hostname,$ipaddr\n";
}

I want to do the same thing as above, except that the input file newhostlist.txt has three columns: type, hostname, and location. The program should remove duplicate hostnames, sort by the first column ($type), then by the third column ($location), then by the second column ($hostname), look up the IP address of each host, and print the output.
How do I process an array with multiple columns?

newhostlist.txt

dell,host01,dc2
dell,host03,dc1
hp,host02,dc1
dell,host01,dc2

Output:

dell,host03,192.168.1.33,dc1
hp,host02,192.168.1.12,dc1
dell,host01,192.168.1.15,dc2

flesk · Accepted Answer · 2012-01-13 12:24:00Z

3

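#!/usr/bin/perl
use strict;
use warnings;
use Socket;    # provides inet_ntoa

open(my $fh, '<', "newhostlist.txt") or die "Unable to open file: $!\n";

# Hash keys are unique, so identical lines collapse into a single entry.
my %unique = map {$_ => 1} <$fh>;

# Read from the bottom up: trim and split each line, sort by type, then
# location, then hostname, and finally re-join each row with the IP address
# inserted ('-' if the lookup fails).
my @data = 
    map {join",", ($_->[0], $_->[1], (@{$_->[3]}=gethostbyname($_->[1]))?inet_ntoa($_->[3][4]):'-' , $_->[2])}
    sort {$a->[0] cmp $b->[0] ||
          $a->[2] cmp $b->[2] ||
          $a->[1] cmp $b->[1]}
    map {s/^\s+|\s+$//g; [split/,/]} keys %unique;

# One row per line; the rows no longer carry their own newlines.
print "$_\n" for @data;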
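
For readers who find the chained map/sort/map hard to follow, the same steps can be written out one at a time. This is only a sketch under some assumptions: the sub names parse_rows, sort_rows, format_row and dns_lookup are made up for illustration, the sample lines are copied from the question, and the lookup is passed in as a code reference so that the example can run without DNS (the %fake_dns table is a hypothetical stand-in for gethostbyname).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket qw(inet_ntoa);

# Step 1: trim each line, drop duplicates, split into [type, hostname, location].
sub parse_rows {
    my @lines = @_;
    my (%seen, @rows);
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;
        next if $line eq '' or $seen{$line}++;
        push @rows, [ split /,/, $line ];
    }
    return @rows;
}

# Step 2: sort by type (col 1), then location (col 3), then hostname (col 2).
sub sort_rows {
    return sort {
           $a->[0] cmp $b->[0]
        || $a->[2] cmp $b->[2]
        || $a->[1] cmp $b->[1]
    } @_;
}

# Step 3: rebuild the line as type,hostname,ip,location.
# $lookup is a code ref returning a dotted-quad string, or undef on failure.
sub format_row {
    my ($row, $lookup) = @_;
    my $ip = $lookup->($row->[1]);
    return join ',', $row->[0], $row->[1], (defined($ip) ? $ip : '-'), $row->[2];
}

# Real lookup via DNS, for use in place of the fake table below.
sub dns_lookup {
    my ($hostname) = @_;
    my @host = gethostbyname($hostname);
    return @host ? inet_ntoa($host[4]) : undef;
}

# Hypothetical stand-in for DNS so the sketch runs offline.
my %fake_dns = (
    host01 => '192.168.1.15',
    host02 => '192.168.1.12',
    host03 => '192.168.1.33',
);

my @lines = ("dell,host01,dc2\n", "dell,host03,dc1\n", "hp,host02,dc1\n", "dell,host01,dc2\n");
my @data  = map { format_row($_, sub { $fake_dns{ $_[0] } }) } sort_rows(parse_rows(@lines));
print "$_\n" for @data;
```

Passing \&dns_lookup instead of the anonymous sub would switch to real lookups. Note that this follows the order stated in the question (type, then location, then hostname), so both dell rows come before the hp row; the sample output in the question appears to be ordered by location first, which would need the first two comparisons swapped.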

edited Jan 13, 2012 at 12:24

answered Jan 13, 2012 at 6:52

flesk

7 Comments

user11496 Over a year ago

Holy, this is awesome. I only understand the sort portion (from researching online), but have no idea how the other parts of the code work. One thing though: there's a "space" in front of each line from line 2 onward. How do I eliminate that?

user11496 Over a year ago

Found another issue: if the hostname doesn't return an IP address, the script stops. I found this piece on the Net; how do I incorporate it into your code?
my @host = gethostbyname($hostname); if (scalar(@host) == 0) { $ipaddr = "host not found - ip not avail"; } else { $ipaddr = inet_ntoa($host[4]); }

TLP Over a year ago

map looks cool, but it's not the easiest to read, and rather hard to maintain, especially for someone who is not too familiar with perl.

Josh Y. Over a year ago

Note: the map { … } sort { … } map { … } is a common Perl idiom known as the Schwartzian transform.

user11496 Over a year ago

@flesk, Thanks!! The "space" is still there; I got rid of it by adding this block at the end: foreach $data (@data) { chomp($data); }

TLP · Accepted Answer · 2012-01-13 19:58:11Z

3

ETA: Added the check for failed ipaddr lookup.

The easiest way to handle this would be to use the diamond operator, I feel:

use strict;
use warnings;
use Socket;            # provides inet_ntoa
use ARGV::readonly;

my %seen;
while (<>) {
    chomp;  # to avoid missing newline at eof errors
    next if $seen{$_};
    $seen{$_}++; 
    my @row = split /,/, $_;
    my @host = gethostbyname($row[1]);  # the hostname is the second column
    my $ipaddr;
    if (@host == 0) { 
         $ipaddr = "host not found - ip not avail";
    } else {
         $ipaddr = inet_ntoa($host[4]);
    } 

    splice @row, 2, 0, $ipaddr;   # insert the IP address before the location
    print join(",", @row), "\n";  # chomp removed the newline, so add it back
}

Using ARGV::readonly allows for somewhat safer usage of the implicit file opens used with the diamond operator.

After that, simply weed out lines already seen by using a hash, split the row, put in the value you need where you need it, and print out the reassembled row.
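
As a small illustration of the steps just described, the sketch below runs the same seen-hash and splice logic over an in-memory list instead of a file. The %ip_for table is a made-up substitute for gethostbyname so that the example does not depend on DNS; the sample rows come from the question.

```perl
use strict;
use warnings;

# Hypothetical lookup table standing in for gethostbyname/inet_ntoa.
my %ip_for = (
    host01 => '192.168.1.15',
    host02 => '192.168.1.12',
    host03 => '192.168.1.33',
);

my @input = ('dell,host01,dc2', 'dell,host03,dc1', 'hp,host02,dc1', 'dell,host01,dc2');

my (%seen, @output);
for my $line (@input) {
    next if $seen{$line};    # skip lines that were already handled
    $seen{$line}++;
    my @row = split /,/, $line;
    my $ipaddr = exists($ip_for{ $row[1] })
        ? $ip_for{ $row[1] }
        : 'host not found - ip not avail';
    # A splice with length 0 removes nothing: it inserts $ipaddr at index 2
    # and shifts the location column to index 3.
    splice @row, 2, 0, $ipaddr;
    push @output, join ',', @row;
}
print "$_\n" for @output;
```

The rows come out in input order, so a sort like the ones in the other answers would still have to be applied to get the ordering the question asks for.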

If you expect more complicated data in your rows, you might wish to look at a CSV module, such as Text::CSV.

edited Jan 13, 2012 at 19:58

answered Jan 13, 2012 at 7:36

TLP

6 Comments

Øyvind Skaar Over a year ago

+1, very readable. But why use if (@host == 0) instead of unless(@host) ?

TLP Over a year ago

@ØyvindSkaar I just modified the OP's code from a comment; it was originally if (scalar(@host) == 0). unless (@host) works just as well. I was just pointing out that scalar is redundant with an array that is already in scalar context.

flesk Over a year ago

Might I also suggest changing next if $seen{$_}; and $seen{$_}++; to just next if $seen{$_}++;?

TLP Over a year ago

@flesk You could do that, but I don't know if there's any real point to it.

flesk Over a year ago

@TLP: Not other than that they fit together logically, since they're part of the same "skip-if-seen-expression", but that's probably a matter of opinion.
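
To make the two forms discussed in these comments concrete, here is a short side-by-side sketch. The input list and variable names are invented for illustration only.

```perl
use strict;
use warnings;

my @lines = ('a', 'b', 'a', 'c', 'b');

# Two-statement form, as in the answer.
my (%seen_long, @long);
for (@lines) {
    next if $seen_long{$_};
    $seen_long{$_}++;
    push @long, $_;
}

# Combined form: post-increment returns the old count (0 on first sight,
# which is false) and then bumps it, so later sightings are skipped.
my (%seen_short, @short);
for (@lines) {
    next if $seen_short{$_}++;
    push @short, $_;
}

print "@long\n";
print "@short\n";
```

Both loops keep the same elements. The only visible difference is in the hash afterwards: the combined form ends up counting every occurrence, while the two-statement form never increments past 1.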

gangabass · Accepted Answer · 2012-01-13 06:38:25Z

0

I recommend using an array of hashes for this:

.....
my ($type, $hostname, $location) = split /,/, $line;
push @records, {
    type => $type,
    hostname => $hostname,
    location => $location,
};
.....

my @records_sorted = sort {
       $a->{type}     cmp $b->{type}
    || $a->{location} cmp $b->{location}
    || $a->{hostname} cmp $b->{hostname}
} @records;
...
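
A self-contained sketch of this approach might look like the following. The answer leaves the reading loop out, so the loop here is an assumption, and the sample lines are taken from the question rather than read from a file.

```perl
use strict;
use warnings;

# Sample lines from the question; a real script would read them from a file.
my @lines = ('dell,host01,dc2', 'dell,host03,dc1', 'hp,host02,dc1', 'dell,host01,dc2');

my @records;
for my $line (@lines) {
    my ($type, $hostname, $location) = split /,/, $line;
    push @records, {
        type     => $type,
        hostname => $hostname,
        location => $location,
    };
}

# Compare by type first; fall through to location, then hostname, on ties.
my @records_sorted = sort {
       $a->{type}     cmp $b->{type}
    || $a->{location} cmp $b->{location}
    || $a->{hostname} cmp $b->{hostname}
} @records;

# A hash slice pulls the three fields out in a fixed order for printing.
print join(',', @{$_}{qw(type hostname location)}), "\n" for @records_sorted;
```

Because nothing here filters the input, the duplicated host01 line is still present in the sorted result.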

answered Jan 13, 2012 at 6:38

gangabass

2 Comments

user11496 Over a year ago

Thanks, the sorting works, but I still need to filter out the duplicate hostnames. Guess some advanced Perl magic is needed. :)

Ilion Over a year ago

Store them as a hash to begin with instead of an array of hashes and you should be able to avoid duplicates.
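
A minimal sketch of that suggestion, assuming the hostname is what makes a row unique (the %record_for name and the sample data are illustrative, not from the answer):

```perl
use strict;
use warnings;

my @lines = ('dell,host01,dc2', 'dell,host03,dc1', 'hp,host02,dc1', 'dell,host01,dc2');

# Key the records by hostname: a repeated hostname overwrites its earlier
# entry instead of adding a second one.
my %record_for;
for my $line (@lines) {
    my ($type, $hostname, $location) = split /,/, $line;
    $record_for{$hostname} = {
        type     => $type,
        hostname => $hostname,
        location => $location,
    };
}

# Hashes are unordered, so the sort is applied to the values afterwards.
my @records_sorted = sort {
       $a->{type}     cmp $b->{type}
    || $a->{location} cmp $b->{location}
    || $a->{hostname} cmp $b->{hostname}
} values %record_for;

print join(',', @{$_}{qw(type hostname location)}), "\n" for @records_sorted;
```

If the same hostname appears with different type or location values, the last line read wins in this sketch; whether that is the desired behavior depends on the data.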
