How do I remove duplicate items from an array in Perl?

Question

I have an array in Perl:

my @my_array = ("one","two","three","two","three");

How do I remove the duplicates from the array?

Wolf · Accepted Answer · 2024-02-29 15:04:52Z

183

You can do something like this as demonstrated in perlfaq4:

sub uniq {
    my %seen;
    grep !$seen{$_}++, @_;
}

my @array = qw(one two three two three);
my @filtered = uniq(@array);

print "@filtered\n";

Outputs:

one two three

If you are aiming for universality, you should take a look at the uniq function. It is included in the core module List::Util as of Perl v5.26.0 (for older versions, use List::MoreUtils). This function differs from the above sketch in that it treats undef as a separate value, different from '', and does not issue a warning.
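As a minimal sketch of that difference, assuming List::Util 1.45 or later is installed (the sample values are made up for illustration):

```perl
use strict;
use warnings;
use List::Util 1.45 qw(uniq);   # uniq was added in List::Util 1.45

# Hypothetical sample data containing both undef and the empty string
my @values = (undef, '', 'a', undef, '', 'a');
my @unique = uniq(@values);

# undef and '' are kept as two distinct values, and no warning is emitted
print scalar(@unique), "\n";                                           # 3
print join(',', map { defined $_ ? "'$_'" : 'undef' } @unique), "\n";  # undef,'','a'
```

The hand-rolled grep version would instead stringify undef to '' as a hash key, warn about it under "use warnings", and collapse the two values into one.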

edited Feb 29, 2024 at 15:04

Wolf


answered Aug 11, 2008 at 10:16

Greg Hewgill


10 Comments

szabgab Over a year ago

please don't use $a or $b in examples as they are the magic globals of sort()

ephemient Over a year ago

It's a my lexical in this scope, so it's fine. That being said, possibly a more descriptive variable name could be chosen.

vol7ron Over a year ago

@ephemient yes, but if you were to add sorting in this function then it would trump $::a and $::b, wouldn't it?

szabgab Over a year ago

@BrianVandenberg Welcome to the world of 1987 - when this was created - and almost 100% backward compatibility for perl - so it cannot be eliminated.

ikegami Over a year ago

sub uniq { my %seen; grep !$seen{$_}++, @_ } is a better implementation since it preserves order at no cost. Or even better, use the one from List::MoreUtils.


brian d foy · Accepted Answer · 2021-12-03 18:41:19Z

131

The Perl documentation comes with a nice collection of FAQs. Your question is frequently asked:

% perldoc -q duplicate

The answer, copied and pasted from the output of the command above, appears below:

Found in /usr/local/lib/perl5/5.10.0/pods/perlfaq4.pod

How can I remove duplicate elements from a list or array? (contributed by brian d foy)

Use a hash. When you think the words "unique" or "duplicated", think "hash keys".

If you don't care about the order of the elements, you could just create the hash then extract the keys. It's not important how you create that hash: just that you use "keys" to get the unique elements.

   my %hash   = map { $_, 1 } @array;
   # or a hash slice: @hash{ @array } = ();
   # or a foreach: $hash{$_} = 1 foreach ( @array );

   my @unique = keys %hash;

If you want to use a module, try the "uniq" function from "List::MoreUtils". In list context it returns the unique elements, preserving their order in the list. In scalar context, it returns the number of unique elements.

   use List::MoreUtils qw(uniq);

   my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7
   my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7

You can also go through each element and skip the ones you've seen before. Use a hash to keep track. The first time the loop sees an element, that element has no key in %Seen. The "next" statement creates the key and immediately uses its value, which is "undef", so the loop continues to the "push" and increments the value for that key. The next time the loop sees that same element, its key exists in the hash and the value for that key is true (since it's not 0 or "undef"), so the next skips that iteration and the loop goes to the next element.

   my @unique = ();
   my %seen   = ();

   foreach my $elem ( @array )
   {
     next if $seen{ $elem }++;
     push @unique, $elem;
   }

You can write this more briefly using a grep, which does the same thing.

   my %seen = ();
   my @unique = grep { ! $seen{ $_ }++ } @array;
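To make the trade-off between the two FAQ approaches concrete, here is a small sketch (the sample array is borrowed from the question; the variable names are only illustrative). The keys-based version gives no ordering guarantee, while the grep version keeps the first occurrence of each element in its original position:

```perl
use strict;
use warnings;

my @array = qw(one two three two three);

# Approach 1: hash keys. The elements are unique, but their order is unspecified.
my %hash      = map { $_ => 1 } @array;
my @unordered = keys %hash;

# Approach 2: grep with a %seen hash. The first occurrence of each element is kept in place.
my %seen;
my @ordered = grep { !$seen{$_}++ } @array;

print "@ordered\n";                       # one two three
print join(' ', sort @unordered), "\n";   # one three two (sorted to get a stable order)
```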

edited Dec 3, 2021 at 18:41

brian d foy


answered Aug 11, 2008 at 14:27

John Siracusa


4 Comments

szabgab Over a year ago

perldoc.perl.org/…

brian d foy Over a year ago

John iz in mah anzers stealing mah rep!

Brad Gilbert Over a year ago

I think you should get bonus points for actually looking the question up.

Parthian Shot Over a year ago

I like that the best answer is 95% copy-paste and 3 sentences of OC. To be perfectly clear, this is the best answer; I just find that fact amusing.

Coding Minds · Accepted Answer · 2016-05-15 14:32:32Z

71

Install List::MoreUtils from CPAN

Then in your code:

use strict;
use warnings;
use List::MoreUtils qw(uniq);

my @dup_list = qw(1 1 1 2 3 4 4);

my @uniq_list = uniq(@dup_list);

edited May 15, 2016 at 14:32

Coding Minds


answered Aug 31, 2008 at 10:01

Ranguard


4 Comments

yPhil Over a year ago

The fact that List::MoreUtils is not bundled w/ perl kinda damages the portability of projects using it :( (I for one won't)

incutonez Over a year ago

@Ranguard: @dup_list should be inside the uniq call, not @dups

Francisco Zarabozo Over a year ago

@yassinphilip CPAN is one of the things that make Perl as powerful and great as it can be. If you are writing your projects based only on core modules, you're putting a huge limit on your code, along with possibly poorly written code that attempts to do what some modules do much better just to avoid using them. Also, using core modules doesn't guarantee anything, as different Perl versions can add or remove core modules from the distribution, so portability is still depending on that.

Sundeep Over a year ago

From Perl v5.26.0 onwards, List::Util has uniq, so MoreUtils wouldn't be needed

Chankey Pathak · Accepted Answer · 2014-07-20 16:35:58Z

25

My usual way of doing this is:

my %unique = ();
foreach my $item (@myarray)
{
    $unique{$item} ++;
}
my @myuniquearray = keys %unique;

If you use a hash and add the items to it, you also get the bonus of knowing how many times each item appears in the list.
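A rough sketch of using those counts, with made-up sample data and a hypothetical @repeated list for the items that occur more than once:

```perl
use strict;
use warnings;

my @myarray = qw(one two three two three three);

# Count how often each item occurs
my %count;
$count{$_}++ for @myarray;

# Items seen more than once; keys are sorted first so the output is stable
my @repeated = grep { $count{$_} > 1 } sort keys %count;

print "$_: $count{$_}\n" for sort keys %count;   # one: 1 / three: 3 / two: 2
print "repeated: @repeated\n";                   # repeated: three two
```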

edited Jul 20, 2014 at 16:35

Chankey Pathak


answered Aug 11, 2008 at 10:18

Xetius


2 Comments

Nathan Fellman Over a year ago

This has the downside of not preserving the original order, if you need it.

Onlyjob Over a year ago

It is better to use slices instead of foreach loop: @unique{@myarray}=()

Wolf · Accepted Answer · 2021-12-03 11:14:37Z

11

Can be done with a simple Perl one-liner.

my @in=qw(1 3 4  6 2 4  3 2 6  3 2 3 4 4 3 2 5 5 32 3); #Sample data 
my @out=keys %{{ map{$_=>1}@in}}; # Perform PFM
print join ' ', sort{$a<=>$b} @out;# Print data back out sorted and in order.

The PFM block does this:

Data in @in is fed into map. map builds an anonymous hash. The keys are extracted from the hash and fed into @out.

edited Dec 3, 2021 at 11:14

Wolf


answered Nov 9, 2011 at 21:23

Hawk


Kamal Nayan · Accepted Answer · 2017-05-09 15:29:44Z

9

Method 1: Use a hash

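Logic: A hash can have only unique keys, so iterate over the array and use each element as a key of the hash, assigning it any value. The keys of the hash are then your unique array.

# keys needs a real hash, so dereference the anonymous hash with %{ ... }
# (calling keys on a hash reference is a fatal error from Perl 5.24 onwards)
my @unique = keys %{ { map { $_ => 1 } @array } };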

Method 2: Extension of method 1 for reusability

Better to make a subroutine if we are supposed to use this functionality multiple times in our code.

sub get_unique {
    my %seen;
    grep !$seen{$_}++, @_;
}
my @unique = get_unique(@array);

Method 3: Use module `List::MoreUtils`

use List::MoreUtils qw(uniq);
my @unique = uniq(@array);

answered May 9, 2017 at 15:29

Kamal Nayan


brian d foy · Accepted Answer · 2021-12-03 18:42:20Z

9

The variable @array is the list with duplicate elements

my %seen = ();
my @unique = grep { ! $seen{$_}++ } @array;

edited Dec 3, 2021 at 18:42

brian d foy


answered Oct 23, 2010 at 16:18

Sreedhar


jh314 · Accepted Answer · 2013-07-16 03:37:30Z

4

That last one was pretty good. I'd just tweak it a bit:

my @arr;
my @uniqarr;

foreach my $var ( @arr ){
  # compare with eq: /$var/ would also match substrings and treat $var as a regex
  if ( ! grep { $_ eq $var } @uniqarr ){
     push( @uniqarr, $var );
  }
}

I think this is probably the most readable way to do it.

edited Jul 16, 2013 at 3:37

jh314


answered Jan 23, 2009 at 23:35

Jay


YenForYang · Accepted Answer · 2019-01-02 02:32:07Z

1

Previous answers pretty much summarize the possible ways of accomplishing this task.

However, I suggest a modification for those who don't care about counting the duplicates, but do care about order.

my @record = qw( yeah I mean uh right right uh yeah so well right I maybe );
my %record;
print grep !$record{$_} && ++$record{$_}, @record;

Note that the previously suggested grep !$seen{$_}++ ... increments $seen{$_} before negating, so the increment occurs regardless of whether it has already been %seen or not. The above, however, short-circuits when $record{$_} is true, leaving what's been heard once 'off the %record'.
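A side-by-side sketch of the two idioms, on a shortened version of the list above (the hash names %seen and %flag are only illustrative), shows that both return the same elements while the stored values differ:

```perl
use strict;
use warnings;

my @record = qw(yeah I mean uh right right uh yeah);

my (%seen, %flag);
my @by_count = grep { !$seen{$_}++ } @record;                # increments on every occurrence
my @by_flag  = grep { !$flag{$_} && ++$flag{$_} } @record;   # increments only the first time

print "@by_count\n";                    # yeah I mean uh right
print "@by_flag\n";                     # yeah I mean uh right
print "$seen{right} $flag{right}\n";    # 2 1
```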

You could also go for this ridiculousness, which takes advantage of autovivification and existence of hash keys:

...
grep !(exists $record{$_} || undef $record{$_}), @record;

That, however, might lead to some confusion.

And if you care about neither order nor duplicate count, you could go for another hack using hash slices and the trick I just mentioned:

...
undef @record{@record};
keys %record; # your record, now probably scrambled but at least deduped

edited Jan 2, 2019 at 2:32

answered Jan 2, 2019 at 0:38

YenForYang


1 Comment

stevesliva Over a year ago

For those comparing: sub uniq{ my %seen; undef @seen{@_}; keys %seen; } Neat.

saschabeaumont · Accepted Answer · 2015-05-26 01:56:44Z

0

Try this. The hash-based uniq function below works on unsorted input; sorting first only changes the order of the result.

use strict;

# Helper function to remove duplicates in a list.
sub uniq {
  my %seen;
  grep !$seen{$_}++, @_;
}

my @teststrings = ("one", "two", "three", "one");

my @filtered = uniq @teststrings;
print "uniq: @filtered\n";
my @sorted = sort @teststrings;
print "sort: @sorted\n";
my @sortedfiltered = uniq sort @teststrings;
print "uniq sort : @sortedfiltered\n";

answered May 26, 2015 at 1:56

saschabeaumont


Sandeep_black · Accepted Answer · 2017-03-30 09:47:16Z

0

Using the concept of unique hash keys:

my @array  = ("a","b","c","b","a","d","c","a","d");
my %hash   = map { $_ => 1 } @array;
my @unique = keys %hash;
print "@unique","\n";

Output (the order may differ between runs, since hash keys are unordered): a c b d

answered Mar 30, 2017 at 9:47

Sandeep_black

