There are several interesting subproblems. First, you want to keep track of the most recent header (e.g., =IP1). Second, you want to keep track of the lists of numbers associated with each key. Third, you want to generate range strings.
Here's how I would do it:
#!/usr/bin/env perl
use strict;
use warnings;

my $tl;
my %h;

# First process the lines of the input file.
while (<DATA>) {
    chomp;
    next unless length;
    if (/^(=\w{2}\d+)$/) {    # Recognize and track a top level heading.
        $tl = $1;
        next;
    }
    if (/^(\w+)\[(\d+)\]$/) { # Or grab a key/value pair.
        my ($k, $v) = ($1, $2);
        push @{ $h{$tl}{$k} }, $v; # push the value into the right bucket.
        next;
    }
    warn "Unrecognized format cannot be processed at $.: (($_))\n";
}

# Sort the top level headers alphabetically and numerically.
# Uses a Schwartzian Transform so that we don't need to recompute
# sort keys repeatedly.
my @topkeys = map  { $_->[0] }
              sort { $a->[1] cmp $b->[1] || $a->[2] <=> $b->[2] }
              map  {
                  # \w{2} mirrors the heading pattern above. A greedy \w+
                  # would swallow leading digits and mis-sort, e.g., =IP100
                  # before =IP20.
                  my ($alpha, $num) = $_ =~ m/^=(\w{2})(\d+)$/;
                  [$_, $alpha, $num]
              } keys %h;

# Now iterate through the structure in sorted order, generate range
# strings on the fly, and print our output.
foreach my $top (@topkeys) {
    print "$top\n";
    foreach my $k (sort keys %{ $h{$top} }) {
        my @vl    = sort { $a <=> $b } @{ $h{$top}{$k} };
        my $range = num2range(@vl);
        print "$k\[$range]\n";
    }
}

# Collapse runs of consecutive integers in a comma-separated list
# into "first-last" ranges.
sub num2range {
    local $_ = join ',' => @_;
    s/(?<!\d)(\d+)(?:,((??{$++1}))(?!\d))+/$1-$+/g;
    return $_;
}

__DATA__
=IP1
abc[0]
abc[1]
abc[2]
=IP2
def[4]
def[8]
def[9]
The following output is produced:
=IP1
abc[0-2]
=IP2
def[4,8-9]
This solution could be optimized further if some of the questions that Borodin asked in a comment on your original post were answered. For example, it would be unnecessary to sort our number list before generating a range if we knew that the numbers were already in order. And some complexity (and computational work) might be eliminated if we knew more about what "abc" and "def" are. And if sorted order doesn't matter, we could simplify further while also reducing the amount of work being done.
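To illustrate the first of those points: if the numbers are known to arrive in ascending order, the sort can be dropped and the range string can be built in a single pass with a plain loop instead of the regex. This is only a sketch under that assumption (ascending input, no duplicates), and `ranges_from_sorted` is a name I made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: builds a range string in one pass.
# Assumes the input is already in ascending order with no duplicates.
sub ranges_from_sorted {
    my @nums = @_;
    return '' unless @nums;

    my @parts;
    my ($start, $end) = ($nums[0], $nums[0]);

    foreach my $n (@nums[1 .. $#nums]) {
        if ($n == $end + 1) {
            $end = $n;              # extend the current run
            next;
        }
        # The run is over: record it and begin a new one.
        push @parts, $start == $end ? $start : "$start-$end";
        ($start, $end) = ($n, $n);
    }
    push @parts, $start == $end ? $start : "$start-$end";

    return join ',', @parts;
}

print ranges_from_sorted(0, 1, 2), "\n";   # 0-2
print ranges_from_sorted(4, 8, 9), "\n";   # 4,8-9
```

If the input turns out not to be sorted after all, this will quietly produce fragmented ranges, so it is only worth using once that question has been settled.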
Also, the Set::IntSpan module could probably provide a more robust approach to generating a range string, and is probably worth considering if this script is intended to live beyond the "one off" lifespan. If you choose to use Set::IntSpan, your num2range sub could look like this:
use Set::IntSpan;
sub num2range { return Set::IntSpan->new(@_) }
The Set::IntSpan object has overloaded stringification, so printing it gives a text representation of the range. If you went this route, you could eliminate the code that sorts the lists of numbers -- that's handled by Set::IntSpan internally.
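If you would rather not rely on the overloaded stringification, a slightly more explicit variant is to pass the numbers as an array reference and ask for the run list directly. This is a sketch that assumes Set::IntSpan is installed from CPAN (it is not a core module); the sample input is made up to show that unsorted values and duplicates are tolerated:

```perl
use strict;
use warnings;
use Set::IntSpan;   # CPAN module, not part of core Perl

# Variant of num2range that requests the run list explicitly
# instead of relying on stringification.
sub num2range {
    my @nums = @_;
    return Set::IntSpan->new(\@nums)->run_list;
}

print num2range(9, 4, 8, 8), "\n";   # 4,8-9
```

Because a set has no order and no duplicates, the caller does not need to sort or de-duplicate the list first.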