3

I have two arrays that have thousands of elements. I need to find what elements are missing in one array by comparing it to another. Is there a way to get the missing elements without iterating through the entire array? Or is there something faster than what I am doing?

Here is what I am using now:

def find_missing(array1, array2)
    missing_elements = []
    array1.each { |e|  
        unless array2.include? e
            missing_elements << e
        end
    }

    return missing_elements
end

array1 = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
array2 = [1, 2, 4, 5, 6, 7, 9]

puts find_missing(array1, array2)

4 Answers 4

9

You want a copy of the first array, but removing any elements that appears in the second array? That's what Array#- (array difference) does:

array1 - array2
# => [10, 8, 3]
Sign up to request clarification or add additional context in comments.

1 Comment

@MrAutoIt The Array class has a bunch of these handy methods. If you ever want to know what tools are there, check the documentation as there's a lot of things to enjoy.
2

You can just use the difference operator

@irb(main):001:0> array1 = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
=> [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
@irb(main):002:0> array2 = [1, 2, 4, 5, 6, 7, 9]
=> [1, 2, 4, 5, 6, 7, 9]
@irb(main):003:0> array1 - array2
=> [10, 8, 3]

Comments

1

I doubt that you can beat array1 - array2, Array#- being coded in C, but you might want to benchmark it against:

require 'set'

def use_sets(array1, array2)
  a1 = array1.uniq
  s2 = array2.to_set
  a1.reject { |e| s2.include?(e) }
end

added..

require 'fruity'

def compare_em(array1, array2)
  compare do 
    _minus { array1 - array2 }
    _sets  { use_sets(array1, array2) }
  end
end

n = 100_000
array1 = (1..n).to_a.shuffle
array2 = (1..n-2).to_a.shuffle
compare_em(array1, array2)
  #_minus is faster than _sets by 2x ± 1.0

n = 1_000_000
array1 = (1..n).to_a.shuffle
array2 = (1..n-2).to_a.shuffle
compare_em(array1, array2)
  #_minus is faster than _sets by 2x ± 1.0

n = 100_000
array1 = (([1]*n).concat [1,2]).shuffle
array2 = [1]*n
compare_em(array1, array2)
  #_minus is faster than _sets by 5x ± 1.0

Comments

0

To add to the original accepted answer, it is worth noting that the below solution would ensure finding the difference between two arrays, irrelevant of the order of comparison.

array1 - array2 | array2 - array1
# => [10, 8, 3]

If for some reason you were to compare the two arrays the other way round, it would return empty array and give the illusion nothing is missing. This may be obvious for small arrays but not for much larger ones.

array2 - array1
# => []

However a caveat to this is if you have differences in both arrays, this would capture the collective difference between them.

array1 = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
array2 = [1, 2, 4, 5, 6, 7, 9, 11]

array1 - array2 | array2 - array1
# => [11, 10, 8, 3]

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.