As shown below, I have a foreach loop in which a value in one array of hashes is replaced with a value from another array of hashes.

The second foreach loop is just to print and test whether the values got assigned correctly.

foreach my $row (0 .. $#row_buff) {
    $row_buff[$row]{'offset'} = $vars[$row]{'expression'};
    print $row_buff[$row]{'offset'},"\n";
}

foreach (0 .. $#row_buff) {
    print $row_buff[$_]{'offset'},"\n";
}

Here @row_buff and @vars are the two arrays of hashes. Both are prefilled with values for all the keys used.

The hashes were pushed into the arrays like so:

push @row_buff, \%hash;

ISSUE: Let's say the print statement in the first foreach loop prints this:

string_a
string_b
string_c
string_d

Then the print statement in the second foreach loop prints this:

string_d
string_d
string_d
string_d

This is what confuses me. Both print statements should print exactly the same output, shouldn't they? But the second print statement prints only the last value, repeated. Could someone please point out what could be going wrong here? Any hint is greatly appreciated. This is my first time posting a question, so pardon me if I missed anything.

UPDATE

There was a bit of information that I should have added; sorry about that, everyone. There was one more line before the first foreach:

@row_buff = (@row_buff) x $itercnt;
foreach my $row (0 .. $#row_buff) {
    $row_buff[$row]{'offset'} = $vars[$row]{'expression'};
    print $row_buff[$row]{'offset'},"\n";
}

foreach (0 .. $#row_buff) {
    print $row_buff[$_]{'offset'},"\n";
}

$itercnt is an integer. I was using it to replicate @row_buff that many times.

  • It's likely that you push a ref to the same hash (\%hash) each time when populating the array. But as toolic says, without seeing the real code that's just a guess. Commented Oct 1, 2021 at 22:16
  • Use push @row_buff, { %hash }; if Dave's guess is correct. Commented Oct 1, 2021 at 22:28
  • Thanks for those hints Dave and Shawn, they helped me figure it out! Commented Oct 2, 2021 at 15:24

1 Answer

This clearly has to do with storing references in the array instead of independent data. How that comes about isn't clear, since details aren't given, but the following discussion should help.

Consider these two basic examples.

First, push a reference to a hash onto an array, changing a value in the hash before each push

use warnings;
use strict;
use feature 'say';
use Data::Dump qw(dd);
# use Storable qw(dclone);

my %h = ( a => 1, b => 2 );

my @ary_w_refs;

for my $i (1..3) {
    $h{a} = $i; 
    push @ary_w_refs, \%h;           # almost certainly WRONG

    # push @ary_w_refs, { %h };      # *copy* data
    # push @ary_w_refs, dclone \%h;  # may be necessary, or just safer
}

dd $_ for @ary_w_refs;

I use Data::Dump for displaying complex data structures, for its simplicity and default compact output. There are other modules for this purpose, Data::Dumper being in the core (installed).

The above prints

{ a => 3, b => 2 }
{ a => 3, b => 2 }
{ a => 3, b => 2 }

See how the value for key a is the same in every element, and equal to the one we assigned last? We changed it in the hash before each push (to 1, 2, 3), so one might expect each array element to hold a different value. (This appears to be the case in the question.)

This is because we assigned a reference to the same hash %h to each element. Even though we change the value for that key every time through the loop, each element ends up holding only a reference to that one hash.

So when the array is queried after the loop we can only get what is in the hash (at key a it's the last assigned number, 3). The array doesn't have its own data, only a pointer to the hash's data. (Thus the hash's data can be changed by writing to the array as well, as seen in the second example further down.)
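One way to confirm this, as a minimal sketch using refaddr from the core Scalar::Util module: count how many distinct hashes the array elements actually point to. The variable names follow the example above; %seen and $distinct are introduced here only for illustration.

```perl
use strict;
use warnings;
use feature 'say';
use Scalar::Util qw(refaddr);

my %h = ( a => 1, b => 2 );
my @ary_w_refs;

for my $i (1..3) {
    $h{a} = $i;
    push @ary_w_refs, \%h;    # the same reference is pushed three times
}

# Collect the distinct addresses that the elements point to
my %seen     = map { refaddr($_) => 1 } @ary_w_refs;
my $distinct = keys %seen;

say "elements: ", scalar @ary_w_refs;    # elements: 3
say "distinct hashes: $distinct";         # distinct hashes: 1
```

Three elements but only one underlying hash, which is why every element shows the last value written.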

Most of the time, we want a separate, independent copy. Solution? Copy the data.

Naively, instead of

push @ary_w_refs, \%h;

we can do

push @ary_w_refs, { %h };

Here {} is a constructor for an anonymous hash, so %h inside gets copied. So actual data gets into the array and all is well? In this case, yes, where hash values are plain strings/numbers.

But what if the hash values themselves are references? Then those references get copied, and @ary_w_refs again does not have its own data! We'll have the exact same problem. (Try the above with the hash being ( a => [1..10] ))
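A small sketch of that experiment, with a shorter nested array so the output stays readable. The array name @ary is an assumption made for this illustration.

```perl
use strict;
use warnings;
use feature 'say';

my %h = ( a => [1..3], b => 2 );

my @ary;
push @ary, { %h };        # shallow copy: only top-level values are copied

$h{b} = 20;               # plain scalar: the copy is unaffected
push @{ $h{a} }, 4;       # nested array: still shared with the copy

say "b in copy: $ary[0]{b}";         # b in copy: 2
say "a in copy: @{ $ary[0]{a} }";    # a in copy: 1 2 3 4
```

The plain value under b was really copied, but the copy's a still points at the original nested array.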

If we have a complex data structure, carrying references for values, we need a deep copy. One good way to do that is with a library; Storable, with its dclone, is a very good choice

use Storable qw(dclone);
...

    push @ary_w_refs, dclone \%h;

Now array elements have their own data, unrelated (but at the time of copy equal) to %h.

This is a good thing to do with a simple hash/array as well, to be safe from future changes, whereby the hash is changed but we forget about the places where it's copied (or the hash and its copies don't even know about each other).
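To see the difference, here is a sketch of the nested-data case again, this time with dclone from the core Storable module. The names %h and @ary are illustrative only.

```perl
use strict;
use warnings;
use feature 'say';
use Storable qw(dclone);

my %h = ( a => [1..3], b => 2 );

my @ary;
push @ary, dclone(\%h);   # deep copy: nested structures are copied too

$h{b} = 20;               # change the original after copying
push @{ $h{a} }, 4;

say "b in copy: $ary[0]{b}";         # b in copy: 2
say "a in copy: @{ $ary[0]{a} }";    # a in copy: 1 2 3
```

Neither change to %h reaches the array element, since the clone owns its own nested array.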

Another example. Let's populate an array with a hashref, and then copy it to another array

use warnings;
use strict;
use feature 'say';    
use Data::Dump qw(dd pp);

my %h = ( a => 1, b => 2 );

my @ary_src = \%h;
say "Source array: ", pp \@ary_src;

my @ary_tgt = $ary_src[0];
say "Target array: ", pp \@ary_tgt;

$h{a} = 10;
say "Target array: ", pp(\@ary_tgt), " (after hash change)";

$ary_src[0]{b} = 20;
say "Target array: ", pp(\@ary_tgt), " (after hash change)";

$ary_tgt[0]{a} = 100;
dd \%h;

(For simplicity I use arrays with only one element.)

This prints

Source array: [{ a => 1, b => 2 }]
Target array: [{ a => 1, b => 2 }]
Target array: [{ a => 10, b => 2 }] (after hash change)
Target array: [{ a => 10, b => 20 }] (after hash change)
{ a => 100, b => 20 }

That "target" array, which supposedly was merely copied off of a source array, changes when the distant hash changes! And when its source array changes. Again, it is because a reference to the hash gets copied, first to one array and then to the other.

In order to get independent data copies, again, copy the data, each time. I'd again advise to be on the safe side and use Storable::dclone (or an equivalent library of course), even with simple hashes and arrays.

Finally, note a slightly sinister last case -- writing to that array changes the hash! This (second-copied) array may be far removed from the hash, in a function (in another module) that the hash doesn't even know of. This kind of an error can be a source of really hidden bugs.

Now if you clarify where references get copied, with a more complete (simple) representation of your problem, we can offer a more specific remedy.


A common and correct way of using a reference is when the structure being referenced is declared as a lexical variable every time through the loop

for my $elem (@data) { 
    my %h = ...
    ... 
    push @results, \%h;  # all good
}

That lexical %h is created anew on every iteration, so each reference pushed onto the array points to its own data. That data is retained, independently for each element, because the array persists beyond the loop.

It is also more efficient doing it this way since the data in %h isn't copied, like it is with { %h }, but is just "re-purposed," so to say, from the lexical %h that gets destroyed at the end of iteration to the reference in the array.

This of course may not always be suitable, if a structure to be copied naturally lives outside of the loop. Then use a deep copy of it.
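A runnable sketch of the per-iteration lexical pattern. The input list @data and the key offset are made up for this illustration.

```perl
use strict;
use warnings;
use feature 'say';

my @data = qw(string_a string_b string_c);   # hypothetical input
my @results;

for my $elem (@data) {
    my %h = ( offset => $elem );   # a brand new %h on every iteration
    push @results, \%h;            # so each reference is to a distinct hash
}

say $_->{offset} for @results;
# string_a
# string_b
# string_c
```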

The same kind of a mechanism works in a function call

sub some_func {
    ...
    my %h = ...
    ...
    return \%h;  # good
}

my $hashref = some_func();

Again, the lexical %h goes out of scope as the function returns and it doesn't exist any more, but the data it carried and a reference to it is preserved, since it is returned and assigned so its refcount is non-zero. (At least returned to the caller, that is; it could've been passed yet elsewhere during the sub's execution so we may still have a mess with multiple actors working with the same reference.) So $hashref has a reference to data that had been created in the sub.
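As a sketch of this, with a hypothetical make_record sub: each call creates its own lexical hash, so the returned references are independent of each other.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper, for illustration only
sub make_record {
    my ($name) = @_;
    my %h = ( name => $name, count => 0 );   # new lexical on every call
    return \%h;                              # the data outlives the sub
}

my $first  = make_record('first');
my $second = make_record('second');

$first->{count}++;    # changes only the first record

say "$first->{name}: $first->{count}";      # first: 1
say "$second->{name}: $second->{count}";    # second: 0
```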

Recall that if a function was passed a reference (when it was called, or during its execution by calling other subs that return references), and it changed and returned that reference, then again we have data changed in some caller, potentially far removed from this part of the program flow.

This is done often of course, with larger pools of data which can't just be copied around all the time, but then one needs to be careful and organize the code (to be as modular as possible, for one) so as to minimize the chance of errors.
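A sketch of that situation, contrasting a sub that writes through the reference it was given with one that works on a copy. Both helper names are hypothetical, and the copy is a shallow one, which is enough here only because the values are plain strings.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: modifies the caller's hash in place
sub normalize_in_place {
    my ($rec) = @_;
    $rec->{offset} = uc $rec->{offset};   # writes through the reference
    return $rec;                          # the same reference goes back out
}

# Hypothetical helper: works on a shallow copy instead
sub normalized_copy {
    my ($rec) = @_;
    my %copy = %$rec;
    $copy{offset} = uc $copy{offset};
    return \%copy;
}

my %row = ( offset => 'string_a' );

my $copy = normalized_copy(\%row);
say "after copy version:     $row{offset}";    # string_a

my $returned = normalize_in_place(\%row);
say "after in-place version: $row{offset}";    # STRING_A

my $same = ( $returned == \%row ) ? 'yes' : 'no';
say "returned ref is the original hash: $same";    # yes
```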

A note on the word "pointer" used earlier: this is a loose use of the word for what a reference does, but if one were to refer to C I'd say that it's a bit of a "dressed" C pointer.

A note on {} as the anonymous hash constructor: in a different context it can be a block.


3 Comments

Thank you so much for the detailed and informative answer. I tried using push @ary_w_refs, \%h;, push @ary_w_refs, { %h }; and push @ary_w_refs, dclone \%h; like you had mentioned but it wasn't fixing the issue. Then I realized I had a line like so: @row_buff = (@row_buff) x $itercnt; In this line the hash references in the array were getting replicated, so I replaced this line using your suggestion in a foreach loop like so:
$size = $#row_buff; foreach (0 .. $itercnt -2){ foreach (0 .. $size){ push @row_buff, dclone \%h; } } Now, instead of replicating the array by duplicating the hash references in it, this deep clones each element of the array and pushes the clone back in. It has the same effect as replicating, but without the issue of duplicated references. This solved the issue.
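For readers who want to reproduce this, here is a minimal sketch of why the x operator causes the symptom, plus one possible way to replicate with independent copies. It is not the commenter's exact code; the sample data and the names @shared and @independent are assumptions.

```perl
use strict;
use warnings;
use feature 'say';
use Storable qw(dclone);

my @row_buff = ( { offset => 'x' }, { offset => 'y' } );   # sample data
my $itercnt  = 2;

# Repetition copies the references: 4 elements, but only 2 distinct hashes
my @shared = (@row_buff) x $itercnt;

# Deep-cloning every replicated element gives 4 distinct hashes
my @independent = map { dclone($_) } (@row_buff) x $itercnt;

$shared[0]{offset}      = 'changed';
$independent[0]{offset} = 'changed';

say "shared[2]:      $shared[2]{offset}";         # changed
say "independent[2]: $independent[2]{offset}";    # x
```

Writing to element 0 of the replicated array also shows up at element 2, because both hold the same reference; the cloned version does not have that coupling.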
@Vishnu3333 Great that you found it and fixed it! :) Glad that this helped :). Note, I've added a bit in the footnote, about returning a reference from a sub. Please let me know if questions pop up.
