
I have an array of hashes, all with the same set of keys, e.g.:

my $aoa= [
 {NAME=>'Dave', AGE=>12, SEX=>'M', ID=>123456, NATIONALITY=>'Swedish'},
 {NAME=>'Susan', AGE=>36, SEX=>'F', ID=>543210, NATIONALITY=>'Swedish'},
 {NAME=>'Bart', AGE=>120, SEX=>'M', ID=>987654, NATIONALITY=>'British'},
];

I would like to write a subroutine that will convert this into a hash of hashes using a given key hierarchy:

my $key_hierarchy_a = ['SEX', 'NATIONALITY'];
sub aoh_to_hoh {
 my ($aoa, $key_hierarchy_a) = @_;
 ...
}

will return

{M=>
  {Swedish=>{{NAME=>'Dave', AGE=>12, ID=>123456}},
   British=>{{NAME=>'Bart', AGE=>120, ID=>987654}}},
 F=>
  {Swedish=>{{NAME=>'Susan', AGE=>36, ID=>543210}}}
}

Note that this not only creates the correct key hierarchy but also removes the now-redundant keys.

I'm getting stuck at the point where I need to create the new, innermost hash in its correct hierarchical location.

The problem is that I don't know the "depth" (i.e. the number of keys) in advance. If I had a constant number, I could do something like:

$h{$inner_hash->{$PRIMARY_KEY}}{$inner_hash->{$SECONDARY_KEY}}{...} = filter_copy($inner_hash,[$PRIMARY_KEY,$SECONDARY_KEY])

So perhaps I can write a loop that adds one level at a time, removes that key from the hash, and then adds the remaining hash at the "current" location. That seems a bit cumbersome, though, and I'm also not sure how to keep track of a 'location' in a hash of hashes...

  • Your expected data structure looks wrong. For example, if there were two Swedish females, what should $expected{F}{Swedish} contain? The way you've shown it (hashes all the way down), there isn't a good answer to this question. My assumption is that $expected{F}{Swedish} needs to be an array reference containing the pruned hash references. Commented Oct 3, 2010 at 13:25
  • This actually is not that difficult to do, but you have to define the structure more precisely. Perhaps describe the hierarchy in XML, stating which items are attributes or once-occurring elements and which can be listed multiple times. Commented Oct 3, 2010 at 18:55
  • With regard to what FM has said, you wouldn't need an arrayref, but you would need some sort of unique key system. Arrays are good because they inherently provide a unique index. Commented Oct 3, 2010 at 19:16

2 Answers

use strict;
use warnings;
use Data::Dumper;

my $aoa= [
 {NAME=>'Dave', AGE=>12, SEX=>'M', ID=>123456, NATIONALITY=>'Swedish'},
 {NAME=>'Susan', AGE=>36, SEX=>'F', ID=>543210, NATIONALITY=>'Swedish'},
 {NAME=>'Bart', AGE=>120, SEX=>'M', ID=>987654, NATIONALITY=>'British'},
];

sub aoh_to_hoh {
  my ($aoa, $key_hierarchy_a) = @_;
  my $result = {};
  my $last_key = $key_hierarchy_a->[-1];
  foreach my $orig_element (@$aoa) {
    my $cur = $result;
    # shallow copy, so that the delete below does not modify the caller's hash
    my %element = %$orig_element;
    foreach my $key (@$key_hierarchy_a) {
      my $value = delete $element{$key};
      if ($key eq $last_key) {
        $cur->{$value} ||= [];
        push @{$cur->{$value}}, \%element;
      } else {
        $cur->{$value} ||= {};
        $cur = $cur->{$value};
      }
    }
  }
  return $result;
}

my $key_hierarchy_a = ['SEX', 'NATIONALITY'];
print Dumper(aoh_to_hoh($aoa, $key_hierarchy_a));

As per @FM's comment, you really want an extra array level in there.

The output:

$VAR1 = {
          'F' => {
                   'Swedish' => [
                                  {
                                    'ID' => 543210,
                                    'NAME' => 'Susan',
                                    'AGE' => 36
                                  }
                                ]
                 },
          'M' => {
                   'British' => [
                                  {
                                    'ID' => 987654,
                                    'NAME' => 'Bart',
                                    'AGE' => 120
                                  }
                                ],
                   'Swedish' => [
                                  {
                                    'ID' => 123456,
                                    'NAME' => 'Dave',
                                    'AGE' => 12
                                  }
                                ]
                 }
        };
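
If every record carries a unique field, the "unique key system" mentioned in the comments on the question could replace the array level. The following is only a sketch of that idea: the sub name aoh_to_hoh_by_id and its third parameter are hypothetical, and it assumes the ID values really are unique within each group.

```perl
use strict;
use warnings;

# Hypothetical variant: the innermost level is a hash keyed by a unique
# field (assumed here to be ID) instead of an array reference.
sub aoh_to_hoh_by_id {
  my ($aoh, $key_hierarchy, $id_key) = @_;
  my $result = {};
  foreach my $orig (@$aoh) {
    my %element = %$orig;             # shallow copy; the input stays intact
    my $cur = $result;
    foreach my $key (@$key_hierarchy) {
      my $value = delete $element{$key};
      $cur = $cur->{$value} ||= {};   # descend, creating the level if needed
    }
    my $id = delete $element{$id_key};
    die "duplicate $id_key '$id'" if exists $cur->{$id};
    $cur->{$id} = \%element;
  }
  return $result;
}

my $people = [
  {NAME=>'Dave',  AGE=>12, SEX=>'M', ID=>123456, NATIONALITY=>'Swedish'},
  {NAME=>'Susan', AGE=>36, SEX=>'F', ID=>543210, NATIONALITY=>'Swedish'},
];
my $hoh = aoh_to_hoh_by_id($people, ['SEX', 'NATIONALITY'], 'ID');
print $hoh->{M}{Swedish}{123456}{NAME}, "\n";   # Dave
```

Because every level is a hash here, the loop no longer needs a special case for the last key; the trade-off is that a repeated ID is treated as an error rather than silently appended.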

EDIT: Oh, BTW - if anyone knows how to elegantly clone contents of a reference, please teach. Thanks!

EDIT EDIT: @FM helped. All better now :D


1 Comment

Storable::dclone can be used for generically copying the contents of a deep data structure.
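
To illustrate the difference, here is a small sketch comparing the shallow copy used in the answer (%element = %$orig_element) with Storable::dclone. The TAGS field is made up for the example; the records in the question are flat, so for them a shallow copy is already enough.

```perl
use strict;
use warnings;
use Storable qw(dclone);

my $orig = { NAME => 'Dave', TAGS => ['a', 'b'] };   # TAGS is a made-up nested field

my %shallow = %$orig;          # copies top-level pairs; TAGS still refers to the same array
my $deep    = dclone($orig);   # recursively copies the nested data as well

push @{ $shallow{TAGS} }, 'c';   # also visible through $orig
push @{ $deep->{TAGS} },  'd';   # affects only the deep copy

print join(',', @{ $orig->{TAGS} }), "\n";   # a,b,c
print join(',', @{ $deep->{TAGS} }), "\n";   # a,b,d
```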

As you've experienced, writing code to create hash structures of arbitrary depth is a bit tricky. And the code to access such structures is equally tricky. Which makes one wonder: Do you really want to do this?
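
To give a sense of the access side, this is a minimal sketch of a lookup that follows a key path of arbitrary length. The helper name fetch_path is hypothetical, and the sample data assumes the array-at-the-leaf layout from the other answer.

```perl
use strict;
use warnings;

# Hypothetical helper: follow a list of keys into a nested hash and
# return what is found there, or nothing if any level is missing.
sub fetch_path {
  my ($data, @path) = @_;
  my $cur = $data;
  foreach my $key (@path) {
    # checking with exists avoids autovivifying missing levels
    return unless ref $cur eq 'HASH' && exists $cur->{$key};
    $cur = $cur->{$key};
  }
  return $cur;
}

my $hoh = {
  M => { Swedish => [ { NAME => 'Dave', AGE => 12, ID => 123456 } ] },
};

my $swedish_men = fetch_path($hoh, 'M', 'Swedish');
print $swedish_men->[0]{NAME}, "\n";                 # Dave

my $missing = fetch_path($hoh, 'F', 'British');
print defined $missing ? "found" : "none", "\n";     # none
```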

A simpler approach might be to put the original information in a database. As long as the keys you care about are indexed, the DB engine will be able to retrieve rows of interest very quickly: Give me all persons where SEX = female and NATIONALITY = Swedish. Now that sounds promising!

You might also find this loosely related question of interest.

1 Comment

Perhaps you're right. I should take a look into databases in Perl sometime.
