To create a JSON file, I am pushing hashes onto an array, but some of the values end up duplicated. I don't want to add a hash that is already in the array.

foreach my $corp_id (@corpId) {
    foreach my $rcode (@{$brands_map->{$corp_id->{s_brand}}}) {
        my $corpIdAccessCode;
        $corpIdAccessCode->{accessCode} = $corp_id->{s_id};
        $corpIdAccessCode->{corporateId} = $corp_id->{c_id};
        $corpIdAccessCode->{bcode} = $rcode;
        # Before pushing, I want to check whether $corp_id->{s_id},
        # $corp_id->{c_id} and $rcode already exist in $accessCode_array.
        push @{$accessCode_array}, $corpIdAccessCode;
    }
}

So I don't want the duplicate entries in the array of hashes below:

[
      {
        "accessCode": "NQ",
        "bcode": "PD",
        "corporateId": "12"
      },
      {
        "accessCode": "NQ",
        "bcode": "CI",
        "corporateId": "2122121"
      },
      {
        "accessCode": "NQ",
        "bcode": "CI",
        "corporateId": "2122121"
      },
      {
        "accessCode": "CD",
        "bcode": "PD",
        "corporateId": "12"
      }
]

The final output after the code changes should look like this:

[
      {
        "accessCode": "NQ",
        "bcode": "PD",
        "corporateId": "12"
      },
      {
        "accessCode": "NQ",
        "bcode": "CI",
        "corporateId": "2122121"
      },
      {
        "accessCode": "CD",
        "bcode": "PD",
        "corporateId": "12"
      }
]

Alternatively, is there a way to remove duplicate hashes from the array?

4
  • Is my $corpIdaccessCode; a transcription error, or actually in the code? Commented Mar 20, 2019 at 19:16
  • Do those hashes have any uniq field or you need to check all fields? Commented Mar 20, 2019 at 19:17
  • @jhnc I have updated the question, sorry for the confusion. Commented Mar 20, 2019 at 19:18
  • @UjinT34 Before pushing to the array, I want to check whether the values of all three hash keys already exist in the array. Commented Mar 20, 2019 at 19:19

2 Answers

1

It would be inefficient to scan the whole array before every push, or to remove duplicates afterwards. Instead, keep track of which data you have already pushed:

my $seen;
foreach my $corp_id(@{corpId}) {
    foreach my $rcode(@{$brands_map->{$corp_id->{s_brand}}}) {
            my ($k1, $k2, $k3) = ($corp_id->{s_id}, $corp_id->{c_id}, $rcode);
            if ($seen->{$k1}->{$k2}->{$k3}) {
                next;
            }
            $seen->{$k1}->{$k2}->{$k3} = 1;

            my $corpIdAccessCode;
            $corpIdAccessCode->{accessCode} = $corp_id->{s_id};
            $corpIdAccessCode->{corporateId} = $corp_id->{c_id};
            $corpIdAccessCode->{bcode} = $rcode;
            push @{$accessCode_array}, $corpIdAccessCode;
    }
}

The my ($k1, $k2, $k3) assignment is only there to make the lookup shorter and more readable.
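
For anyone who wants to try the approach end to end, here is a minimal self-contained sketch. The contents of @corpId and $brands_map are not shown in the question, so the sample data below (including the brand keys A and B) is invented purely for illustration and chosen to reproduce the question's sample rows. JSON::PP is used only to print the result.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical sample data: the real @corpId and $brands_map are not shown
# in the question, so these values are assumptions.
my @corpId = (
    { s_id => 'NQ', c_id => '12',      s_brand => 'A' },
    { s_id => 'NQ', c_id => '2122121', s_brand => 'B' },
    { s_id => 'NQ', c_id => '2122121', s_brand => 'B' },    # repeated row
    { s_id => 'CD', c_id => '12',      s_brand => 'A' },
);
my $brands_map = { A => ['PD'], B => ['CI'] };

my $seen             = {};    # nested lookup: s_id -> c_id -> rcode
my $accessCode_array = [];

foreach my $corp_id (@corpId) {
    foreach my $rcode (@{ $brands_map->{ $corp_id->{s_brand} } }) {
        my ($k1, $k2, $k3) = ($corp_id->{s_id}, $corp_id->{c_id}, $rcode);

        # Skip any combination that has already been pushed.
        if ($seen->{$k1}{$k2}{$k3}) {
            next;
        }
        $seen->{$k1}{$k2}{$k3} = 1;

        push @{$accessCode_array},
            { accessCode => $k1, corporateId => $k2, bcode => $k3 };
    }
}

# canonical sorts the hash keys so the printed JSON is stable between runs.
print JSON::PP->new->canonical->pretty->encode($accessCode_array);
```

With this sample data the repeated NQ / 2122121 / CI row is pushed only once, so the array ends up with three entries.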

0

Checking !$seen{$key}++ is a common way of identifying the first of duplicates. For example,

my %seen;
my @uniques = grep { !$seen{$_}++ } @items;

This can be unrolled into a foreach loop.

my %seen;
for my $corp_id (@corpId) {
    for my $rcode (@{ $brands_map->{ $corp_id->{s_brand} } }) {
        next if $seen{ $corp_id->{s_id} }{ $corp_id->{c_id} }{ $rcode }++;

        push @$accessCode_array, {
            accessCode  => $corp_id->{s_id},
            corporateId => $corp_id->{c_id},
            bcode       => $rcode,
        };
    }
}
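
The same idiom should also cover the other half of the question, removing duplicates from an array of hashes that has already been built. The following is a rough sketch, assuming every hash has the three keys from the question; @with_dupes is a made-up name holding the question's sample data.

```perl
use strict;
use warnings;

# Hypothetical input: an array that already contains a duplicate entry.
my @with_dupes = (
    { accessCode => 'NQ', bcode => 'PD', corporateId => '12' },
    { accessCode => 'NQ', bcode => 'CI', corporateId => '2122121' },
    { accessCode => 'NQ', bcode => 'CI', corporateId => '2122121' },
    { accessCode => 'CD', bcode => 'PD', corporateId => '12' },
);

# Keep a hash only the first time its three values are seen together.
my %seen;
my @uniques = grep {
    !$seen{ $_->{accessCode} }{ $_->{corporateId} }{ $_->{bcode} }++
} @with_dupes;

# Print one line per surviving hash, using a hash slice for the three fields.
printf "%s %s %s\n", @{$_}{qw(accessCode bcode corporateId)} for @uniques;
```

grep preserves the original order, so the first occurrence of each combination is the one that is kept.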

To save memory, you could even replace

next if $seen{ $corp_id->{s_id} }{ $corp_id->{c_id} }{ $rcode }++;

with

next if $seen{ join ":", $corp_id->{s_id}, $corp_id->{c_id}, $rcode }++;

but that assumes none of the three fields can contain :.
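
If the fields might contain the delimiter, one possible workaround is to prefix each field with its length, so that the joined key cannot be confused with a different combination. This is only a sketch: composite_key is a hypothetical helper, not something from the code above, and it assumes all fields are defined.

```perl
use strict;
use warnings;

# Hypothetical helper: prefixes every field with its length so the resulting
# key stays unambiguous whatever characters the fields contain.
sub composite_key {
    return join '', map { length($_) . ':' . $_ } @_;
}

# With a plain ":" join, these two different triples produce the same key ...
my @first  = ('a:b', 'c',   'd');
my @second = ('a',   'b:c', 'd');
print "plain join collides\n"
    if join(':', @first) eq join(':', @second);

# ... while the length-prefixed keys stay distinct.
print "length-prefixed keys differ\n"
    if composite_key(@first) ne composite_key(@second);
```

In the loop, the check would then read next if $seen{ composite_key($corp_id->{s_id}, $corp_id->{c_id}, $rcode) }++; which keeps the flat, memory-friendly hash without relying on the fields being free of ":".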

7 Comments

I would write something like this myself, but postfix conditionals and ++ are considered to be bad style. $seen{ $corp_id->{s_id} }{ $corp_id->{c_id} }{ $rcode }++ is hard to read. And using a delimiter is a bit unreliable.
@UjinT34, Re "considered to be bad style.", Quite the contrary, !$seen{$_}++ is an idiom, so it can't possibly be bad style. It also means that deviations are harder to read by virtue of being departures from the usual.
@UjinT34, Re "And using a delimiter is a bit unreliable.", huh?
@UjinT34 Postfix ++ for uniqifying is a pretty universal idiom. Doing ++$seen{...} == 1 instead of !$seen{...}++ might be considered more readable, I guess, but I think that ship has long sailed.
@UjinT34, Actually, you don't. You just need to know that !$seen{$_}++ finds the first of duplicates (and that $seen{$_}++ finds everything else). Knowing how it works is secondary.