How can I efficiently extract repeated elements in a Ruby array? [duplicate]

Question

I have an array like [1,1,1,2,4,6,3,3] and I would like to get the list of repeated elements, in this case [1,3]. I wrote this:

my_array.select{|obj|my_array.count(obj)>1}.uniq

But it is tragically inefficient (O(n²)). Do you have a better idea? Concise, if possible.

Thanks

MiniQuark · Accepted Answer · 2009-04-24 18:49:26Z

9

Inspired by Ilya Haykinson's answer:

def repeated(array)
  counts = Hash.new(0)
  array.each{|val|counts[val]+=1}
  counts.reject{|val,count|count==1}.keys
end
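For readers on newer Rubies, the same counting idea can be sketched with Enumerable#tally. This assumes Ruby 2.7 or later, which postdates this answer. The method name repeated_with_tally is made up for illustration.

```ruby
# Assumes Ruby >= 2.7, where Enumerable#tally is available.
# repeated_with_tally is a hypothetical name, not part of the original answer.
def repeated_with_tally(array)
  # tally builds a hash of element => occurrence count in a single pass
  array.tally.select { |_val, count| count > 1 }.keys
end

p repeated_with_tally([1, 1, 1, 2, 4, 6, 3, 3])
```

Because Ruby hashes keep insertion order, the result lists repeated elements in order of first appearance.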

answered Apr 24, 2009 at 18:49

community wiki

MiniQuark


4 Comments

Greg Campbell Over a year ago

Yeah, I think that's cleaner than mine. Just for fun, here's that method all on one line, assuming availability of the "tap" method from Ruby >= 1.8.7. array.inject(Hash.new(0)){|counts,val|counts.tap{|c|c[val]+=1}}.reject{|val,count|count==1}.keys I think yours is more readable, though. :)

Marek Příhoda Over a year ago

I really, really like this solution, and I like it because it is the most readable/understandable one among all the O(n) solutions. Here's a one-liner modification, just for fun: array.inject(Hash.new(0)) { |h, i| h[i] += 1; h }.reject { |v, c| c == 1 }.keys

rapcal Over a year ago

Thanks! Amazing... I was suffering with detect, find_all, etc

Adamantish Over a year ago

This is OK, but anyone who thinks it's the best answer needs to get familiar with Set.new. It uses a hash under the hood and is great when you need O(1) hash key access but with the simplicity of an array. Plus it aids readability, as all the logic shrinks to the beautifully obvious dups.add(val) if seen_already.include?(val)

Greg Campbell · Accepted Answer · 2009-04-24 18:09:02Z

6

Using Ruby's Set library:

require 'set'

ary = [1,1,1,2,4,6,3,3]
dups = Set.new
test_set = Set.new
ary.each {|val| dups.add(val) unless test_set.add?(val)}
dups.to_a # [1, 3]

I believe this should be O(n), because Set#add and Set#add? are constant-time operations, as far as I know.
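As a minimal sketch, the same two-set approach can be wrapped in a reusable method. The name repeated_with_sets is hypothetical. It relies on Set#add? returning nil when the element is already present.

```ruby
require 'set'

# repeated_with_sets is a hypothetical helper name used only for illustration.
def repeated_with_sets(array)
  seen = Set.new
  dups = Set.new
  # add? returns nil if val was already in seen, so val is a repeat
  array.each { |val| dups.add(val) unless seen.add?(val) }
  dups.to_a
end

p repeated_with_sets([1, 1, 1, 2, 4, 6, 3, 3])
```

Each element costs a hash lookup on average, so a single pass is enough and the input does not need to be sorted.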

answered Apr 24, 2009 at 18:09

Greg Campbell


MiniQuark · Accepted Answer · 2009-04-24 18:37:38Z

4

How about something like this? It will run in O(n).

a = [1,1,1,2,4,6,3,3]
b = {}
a.each { |v| if b.has_key? v then b[v] = b[v]+1 else b[v]=1 end }
b.reject { |k,v| if v > 1 then false else true end }.keys

edited Apr 24, 2009 at 18:37

MiniQuark


answered Apr 24, 2009 at 18:34

Ilya Haykinson


2 Comments

MiniQuark Over a year ago

I like the idea. You could beautify the last line like this: b.reject{|k,v| v==1}.keys

MiniQuark Over a year ago

Also, you could use b=Hash.new(0), and then you would have a simpler 3rd line: a.each{|v|b[v]+=1}

tokland · Accepted Answer · 2011-12-11 22:18:57Z

3

An O(n) solution (change << x to + [x] and update to merge to make it purely functional):

rs = xs.inject([[], {}]) do |(out, seen), x| 
  [(seen[x] == 1 ? (out << x) : out), seen.update(x => (seen[x] || 0)+1)]
end[0]
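The purely functional variant mentioned above might look like the following sketch. The method name repeated_functional and the variable names are illustrative only. Note that merge and + copy their receivers at every step, so this version gives up the O(n) bound in exchange for not mutating anything.

```ruby
# repeated_functional is a hypothetical name; this is a sketch, not the
# answer's own code. No accumulator is mutated: + and merge return new objects.
def repeated_functional(xs)
  out, _seen = xs.inject([[], {}]) do |(out_acc, seen), x|
    count = seen.fetch(x, 0)
    # Record x exactly once, on its second occurrence
    [(count == 1 ? out_acc + [x] : out_acc), seen.merge(x => count + 1)]
  end
  out
end

p repeated_functional([1, 1, 1, 2, 4, 6, 3, 3])
```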

A much simpler yet less space-efficient approach:

rs = xs.group_by { |x| x }.select { |y, ys| ys.size > 1 }.keys

The same idea avoiding the intermediate hash using a "list-comprehension":

rs = xs.group_by { |x| x }.map { |y, ys| y if ys.size > 1 }.compact

edited Dec 11, 2011 at 22:18

answered Dec 10, 2011 at 19:00

tokland


3 Comments

Jan Over a year ago

There's a problem with this solution. See xs = [1,1,1].

Andrew Grimm Over a year ago

Wouldn't group_by be a better fit?

tokland Over a year ago

@Andrew. I thought there was already a solution using group_by, but it seems it was in the other question. I'll add it. Now that Ruby has ordered hashes we can preserve the order of the original enumerable. However, it's less space efficient than a custom solution.

Diosney · Accepted Answer · 2013-07-19 19:37:11Z

1

Using inject

[1,1,1,2,4,6,3,3].inject({}){ |ele, n| ele[n] = nil; ele }.keys 
# => [1, 2, 4, 6, 3]

EXPLANATION:

The ele hash is initialized to {}. On each iteration, a key for the number n with a nil value is added to the ele hash. At the end, ele is returned as:

{1=>nil, 2=>nil, 4=>nil, 6=>nil, 3=>nil}

We only want the keys, so .keys ends the job.

edited Jul 19, 2013 at 19:37

Diosney


answered Jul 19, 2013 at 18:59

ivanxuu


1 Comment

MiniQuark Over a year ago

Thanks but I only wanted repeated elements, as indicated in the example.

dirkgently · Accepted Answer · 2009-04-24 17:45:17Z

0

Some ideas: you'd have to figure out the correct library data structures:

1. Sort the array, which is O(n log n), then run through it comparing adjacent elements.

2. Create a set; for each array element, search for it in the set, and if it is not found, insert it and proceed. This is O(n log n) again.
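The first idea (sort, then scan) could be sketched in Ruby as follows. The method name repeated_by_sorting is made up for illustration, and the sketch assumes the elements are mutually comparable with <=> so that sort works.

```ruby
# repeated_by_sorting is a hypothetical name; assumes elements are comparable.
def repeated_by_sorting(array)
  array.sort                        # O(n log n); equal elements become adjacent
       .each_cons(2)                # sliding window over neighbouring pairs
       .select { |a, b| a == b }    # keep pairs whose two elements are equal
       .map(&:first)
       .uniq                        # an element repeated 3+ times yields several pairs
end

p repeated_by_sorting([3, 1, 3, 2, 1, 1])
```

Unlike the hash-based answers, the result comes back in sorted order rather than in order of first appearance.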

answered Apr 24, 2009 at 17:45

dirkgently


marekj · Accepted Answer · 2009-12-18 06:20:15Z

0

I was thinking of counting how many times each unique element appears in the array. It may be really inefficient, just like the original suggestion, but it was fun looking at the problem. I didn't do any benchmarks on larger arrays, so this is just an exercise.

a = [1,1,1,2,4,6,3,3]

dupes = []
a.uniq.each do |u|
  c = a.find_all {|e| e == u}.size
  dupes << [u, c] unless c == 1
end

puts dupes.inspect

# dupes = [[1, 3], [3, 2]]
# 1 appears 3 times
# 3 appears twice


# to extract just the elements, a bit cleaner
dupes = a.uniq.select do |u|
  a.find_all {|e| e == u}.size != 1
end
puts dupes.inspect
# returns [1,3]

edited Dec 18, 2009 at 6:20

answered Dec 18, 2009 at 6:05

marekj


Justin Love · Accepted Answer · 2009-12-18 06:45:20Z

0

This will work if the duplicated entries are always consecutive, as in your example; otherwise you would have to sort first. each_cons examines a rolling window of the specified size.

require 'set'

my_array = [1,1,1,2,4,6,3,3]
dups = Set.new
my_array.each_cons(2) {|a,b| dups.add(a) if (a == b)}
p dups.to_a

answered Dec 18, 2009 at 6:45

Justin Love

