Check hash values before pushing to an array in Perl

Question

To create a JSON file, I am pushing hashes onto an array, but some of the values end up duplicated. I don't want to add hashes that are already in the array.

foreach my $corp_id(@{corpId}) {
    foreach my $rcode(@{$brands_map->{$corp_id->{s_brand}}}) {
            my $corpIdAccessCode;
            $corpIdAccessCode->{accessCode} = $corp_id->{s_id};
            $corpIdAccessCode->{corporateId} = $corp_id->{c_id};
            $corpIdAccessCode->{bcode} = $rcode;
            # Before pushing, I want to check whether $corp_id->{s_id},
            # $corp_id->{c_id} and $rcode already exist in $accessCode_array.
            push @{$accessCode_array}, $corpIdAccessCode;
    }
}

So, from the array of hashes below, I don't want the duplicate ones:

[
      {
        "accessCode": "NQ",
        "bcode": "PD",
        "corporateId": "12"
      },
      {
        "accessCode": "NQ",
        "bcode": "CI",
        "corporateId": "2122121"
      },
      {
        "accessCode": "NQ",
        "bcode": "CI",
        "corporateId": "2122121"
      },
      {
        "accessCode": "CD",
        "bcode": "PD",
        "corporateId": "12"
      }
]

The final output from the code changes should look like this:

[
      {
        "accessCode": "NQ",
        "bcode": "PD",
        "corporateId": "12"
      },
      {
        "accessCode": "NQ",
        "bcode": "CI",
        "corporateId": "2122121"
      },

      {
        "accessCode": "CD",
        "bcode": "PD",
        "corporateId": "12"
      }
]

Or is there any way to remove duplicate hashes from the array?

Is my $corpIdaccessCode; a transcription error, or actually in the code?
– jhnc, Commented Mar 20, 2019 at 19:16
Do those hashes have a unique field, or do you need to check all fields?
– UjinT34, Commented Mar 20, 2019 at 19:17
@UjinT34 Before pushing to the array, I want to check whether the values of all three hash keys already exist in the array.
– Developer, Commented Mar 20, 2019 at 19:19

UjinT34 · Accepted Answer · 2019-03-20 19:36:55Z

It would be inefficient to scan the whole array before each push, or to remove duplicates afterwards. Instead, keep track of the data you have already pushed:

my $seen;
foreach my $corp_id(@{corpId}) {
    foreach my $rcode(@{$brands_map->{$corp_id->{s_brand}}}) {
            my ($k1, $k2, $k3) = ($corp_id->{s_id}, $corp_id->{c_id}, $rcode);
            if ($seen->{$k1}->{$k2}->{$k3}) {
                next;
            }
            $seen->{$k1}->{$k2}->{$k3} = 1;

            my $corpIdAccessCode;
            $corpIdAccessCode->{accessCode} = $corp_id->{s_id};
            $corpIdAccessCode->{corporateId} = $corp_id->{c_id};
            $corpIdAccessCode->{bcode} = $rcode;
            push @{$accessCode_array}, $corpIdAccessCode;
    }
}

The my ($k1, $k2, $k3) assignment is only there to make the code shorter and more readable.
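
For anyone who wants to try the approach in isolation, here is a minimal self-contained sketch. The contents of @corpId and $brands_map (including the brandA and brandB names) are made up to reproduce the records in the question; the real structures may look different. JSON::PP is used only to print the result.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical sample data shaped like the question's structures.
my @corpId = (
    { s_id => 'NQ', c_id => '12',      s_brand => 'brandA' },
    { s_id => 'NQ', c_id => '2122121', s_brand => 'brandB' },
    { s_id => 'NQ', c_id => '2122121', s_brand => 'brandB' },   # duplicate record
    { s_id => 'CD', c_id => '12',      s_brand => 'brandA' },
);
my $brands_map = { brandA => ['PD'], brandB => ['CI'] };

my $seen             = {};   # three-level hash: s_id -> c_id -> rcode
my $accessCode_array = [];

foreach my $corp_id (@corpId) {
    foreach my $rcode (@{ $brands_map->{ $corp_id->{s_brand} } }) {
        my ($k1, $k2, $k3) = ($corp_id->{s_id}, $corp_id->{c_id}, $rcode);

        # Skip combinations that were already pushed.
        next if $seen->{$k1}{$k2}{$k3};
        $seen->{$k1}{$k2}{$k3} = 1;

        push @{$accessCode_array},
            { accessCode => $k1, corporateId => $k2, bcode => $k3 };
    }
}

# canonical() sorts the keys so the output is stable between runs.
print JSON::PP->new->canonical->pretty->encode($accessCode_array);
```

With this sample data the array ends up with three entries, in the order they were first seen.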

ikegami · Accepted Answer · 2019-03-20 19:51:32Z

Checking !$seen{$key}++ is a common way of identifying the first of duplicates. For example,

my %seen;
my @uniques = grep { !$seen{$_}++ } @items;
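
The same idiom also covers the "remove duplicates afterwards" case from the question. A small sketch, assuming the array of hashes has already been built; the records are copied from the question's sample, and the three fields are used as nested hash keys:

```perl
use strict;
use warnings;

# Records from the question's sample, duplicate included.
my @records = (
    { accessCode => 'NQ', bcode => 'PD', corporateId => '12' },
    { accessCode => 'NQ', bcode => 'CI', corporateId => '2122121' },
    { accessCode => 'NQ', bcode => 'CI', corporateId => '2122121' },
    { accessCode => 'CD', bcode => 'PD', corporateId => '12' },
);

my %seen;

# Keep a record only the first time its three-field combination appears.
my @uniques = grep {
    !$seen{ $_->{accessCode} }{ $_->{corporateId} }{ $_->{bcode} }++
} @records;

printf "%d of %d records kept\n", scalar @uniques, scalar @records;
# prints "3 of 4 records kept"
```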

This can be unrolled into a foreach loop.

my %seen;
for my $corp_id (@corpId) {
    for my $rcode (@{ $brands_map->{ $corp_id->{s_brand} } }) {
        next if $seen{ $corp_id->{s_id} }{ $corp_id->{c_id} }{ $rcode }++;

        push @$accessCode_array, {
            accessCode  => $corp_id->{s_id},
            corporateId => $corp_id->{c_id},
            bcode       => $rcode,
        };
    }
}

To save memory, you could even replace

next if $seen{ $corp_id->{s_id} }{ $corp_id->{c_id} }{ $rcode }++;

with

next if $seen{ join ":", $corp_id->{s_id}, $corp_id->{c_id}, $rcode }++;

but that assumes none of the three fields can contain :.
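
To make that caveat concrete, here is a small sketch with made-up values showing two different triples that flatten to the same ":"-joined key. As one possible alternative (not part of the original suggestion), it also uses Perl's multi-key subscript, $seen{$x, $y, $z}, which joins the keys with $; (by default "\034") and so only collides if a field contains that control character. Note that this syntax is disabled under "use v5.36" and later feature bundles.

```perl
use strict;
use warnings;

# Two different triples (hypothetical values) that flatten to the same string.
my @first  = ('a:b', 'c',   'd');
my @second = ('a',   'b:c', 'd');

# True: both joins produce "a:b:c:d".
my $colon_collision = join(':', @first) eq join(':', @second);

# $seen{$x, $y, $z} is shorthand for $seen{ join($;, $x, $y, $z) }.
my %seen;
my @kept;
for my $triple (\@first, \@second) {
    my ($s_id, $c_id, $rcode) = @$triple;
    next if $seen{$s_id, $c_id, $rcode}++;
    push @kept, $triple;
}

print "colon-joined keys collide\n" if $colon_collision;
printf "%d triples kept with the \$; separator\n", scalar @kept;
```

If the fields can contain arbitrary bytes, the nested-hash version remains the safe choice.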

edited Mar 20, 2019 at 19:51

answered Mar 20, 2019 at 19:42

7 Comments

UjinT34 Over a year ago

I would write something like this myself, but postfix conditionals and ++ are considered bad style. $seen{ $corp_id->{s_id} }{ $corp_id->{c_id} }{ $rcode }++ is hard to read. And using a delimiter is a bit unreliable.

ikegami Over a year ago

@UjinT34, Re "considered to be a bad style.", Quite the contrary, !$seen{$_}++ is an idiom, so it can't possibly be bad style. It also means that deviations are harder to read by virtue of being departures from the usual.

ikegami Over a year ago

@UjinT34, Re "And using a delimiter is a bit unreliable.", huh?

ysth Over a year ago

@UjinT34 Postfix ++ for uniqifying is a pretty universal idiom. Doing ++$seen{...} == 1 instead of !$seen{...}++ might be considered more readable, I guess, but I think that ship has long sailed.

ikegami Over a year ago

@UjinT34, Actually, you don't. You just need to know that !$seen{$_}++ finds the first of duplicates (and that $seen{$_}++ finds everything else). Knowing how it works is secondary.
