1

I have 2 hashes, let's say A, B

A: { 'key1' => [a, b], 'key2' => 'c' }
B: { 'key1' => [b, a], 'key2' => 'c' }

What is the best possible way to compare these 2 hashes. The ordering of the array contents does not matter. So in my case, hash A and B are equal

2
  • If the order does not matter, wouldn't it make more sense to make them sets or (if multiplicity matters) multisets / bags in the first place? Then, a simple equality check, i.e. A == B would just automatically work. Commented Feb 16, 2019 at 7:13
  • Are you trying to order the hashes in some way, or are you just interested in determining are those hashes equal or not? Commented Feb 17, 2019 at 8:13

4 Answers 4

2

It's not as easy as it seems at first glance.
It is necessary to take into account several nuances:

  • the number of elements in the hashes may not match;
  • items with the same key in two hashes can be of different types.

A relatively universal solution can be as follows:

def hashes_comp(hash1, hash2)
  return false if hash1.size != hash2.size
  hash1.each do |key, value|
    if value.class == Array
      return false if hash2[key].class != Array || value.sort != hash2[key].sort
    else
      return false if value != hash2[key]
    end
  end
  true
end

hash_a = {'key1' => ['a', 'b'], 'key2' => 'c'}
hash_b = {'key1' => ['b', 'a'], 'key2' => 'c'}
hash_c = {'key1' => ['a', 'c'], 'key2' => 'c'}
hash_d = {'key1' => ['a', 'b'], 'key2' => 'd'}
hash_e = {'key1' => ['a', 'b'], 'key2' => ['a', 'b']}
hash_f = {'key1' => ['a', 'b'], 'key2' => 'c', 'key3' => 'd'}

hashes_comp(hash_a, hash_b) #=> true
hashes_comp(hash_a, hash_c) #=> false
hashes_comp(hash_a, hash_d) #=> false
hashes_comp(hash_a, hash_e) #=> false
hashes_comp(hash_a, hash_f) #=> false
Sign up to request clarification or add additional context in comments.

Comments

1

One can sort the arrays but that can be an expensive operation if the arrays are large. If n equals the size of the array, the time complexity of heapsort, for example, is O(n log(n)). It's faster to replace arrays with counting hashes, the construction of which enjoys a time complexity of O(n).

h1 = { 'k1' => [1, 2, 1, 3, 2, 1], 'k2' => 'c' }
h2 = { 'k1' => [3, 2, 1, 2, 1, 1], 'k2' => 'c' }

def same?(h1, h2)
  return false unless h1.size == h2.size
  h1.all? do |k,v|
    if h2.key?(k)
      vo = h2[k]
      if v.is_a?(Array)
        if vo.is_a?(Array) 
          convert(v) == convert(vo)
        end
      else
        v == vo
      end
    end
  end
end           

def convert(arr)
  arr.each_with_object(Hash.new(0)) { |e,g| g[e] += 1 }
end

same?(h1, h2)
  #=> true

Here

convert([1, 2, 1, 3, 2, 1])
  #=> {1=>3, 2=>2, 3=>1} 
convert([3, 2, 1, 2, 1, 1])
  #=> {3=>1, 2=>2, 1=>3}

and

{1=>3, 2=>2, 3=>1} == {3=>1, 2=>2, 1=>3}
  #=> true

See Hash::new, specifically the case where the method takes an argument that equals the default value.

The guard clause return false unless h1.size == h2.size is to ensure that h2 does not have keys that are not present in h1. Note that the following returns the falsy value nil:

if false
  #...
end
  #=> nil

In a couple of places I've written that rather than the more verbose but equivalent expresion

if false
  #...
else
  nil
end

6 Comments

I like your approach with Array converting. Your solution can be faster than my (stackoverflow.com/a/54719267/10306509). But I think it makes sense for you to give preference to iterating over the first Hash with an member-to-member comparison. It doesn't make sense to convert both Hashes entirely if the difference could be found in their first members.
What if we pass sorted array from method, i.e. using sort instead of each_with_object which will later compare sorted arrays?
@Ivan, I don't understand. One must sort both arrays or construct both hashes. Once that's done, in comparing the two arrays or two hashes Ruby will quit as soon as she discovers they are unequal.
@CarySwoveland, I think the fastest way is to convert-and-compare each corresponding elements of both Hashes, i.e. change value.sort != hash2[key].sort in my algorithm to converting to Hashes and comparing, so we can quit converting-and-comparison after first mismatch.
@Ivan, it finally dawned on me what you are suggesting. It's what you've done for arrays. (I'm a bit slow this evening.) Good idea. I'll edit my answer.
|
0

I would definitely agree with Ivan it's not as easy as it initially seems but I figured I would try doing it with recursion. This has the added benefit of being able to compare beyond just hashes.

hash_a = {'key1' => ['a', 'b'], 'key2' => 'c'}
hash_b = {'key1' => ['b', 'a'], 'key2' => 'c'}
hash_c = {'key1' => ['a', 'c'], 'key2' => 'c'}
hash_d = {'key1' => ['a', 'b'], 'key2' => 'd'}
hash_e = {'key1' => ['a', 'b'], 'key2' => ['a', 'b']}
hash_f = {'key1' => ['a', 'b'], 'key2' => 'c', 'key3' => 'd'}


def recursive_compare(one, two)
  unless one.class == two.class
    return false
  end

  match = false

  # If it's not an Array or Hash...
  unless one.class == Hash || one.class == Array
    return one == two
  end

  # If they're both Hashes...
  if one.class == Hash
    one.each_key do |k|
      match = two.key? k
      break unless match
    end

    two.each_key do |k|
      match = one.key? k
      break unless match
    end

    if match
      one.each do |k, v|
        match = recursive_compare(v, two[k])
        break unless match
      end
    end
  end

  # If they're both Arrays...
  if one.class == Array
    one.each do |v|
      match = two.include? v
      break unless match
    end
    two.each do |v|
      match = one.include? v
      break unless match
    end
  end

  match
end

puts recursive_compare(hash_a, hash_b) #=> true
puts recursive_compare(hash_a, hash_c) #=> false
puts recursive_compare(hash_a, hash_d) #=> false
puts recursive_compare(hash_a, hash_e) #=> false
puts recursive_compare(hash_a, hash_f) #=> false

Comments

0

I came up with this solution:

def are_equals?(a, b)
  (a.keys.sort == b.keys.sort) &&
  a.merge(b) { |k, o_val, n_val| [o_val, n_val].all? { |e| e.kind_of? Array} ? o_val.sort == n_val.sort : o_val == n_val }.values.all?
end


How it works.

The first part tests for key equality, using Hash#keys, which returns the array of keys, sorted of course:

a.keys.sort == b.keys.sort

For the second part I used Hash#merge to compare values related to the same key, and can be expanded in this way:

res = a.merge(b) do |k, o_val, n_val|
  if [o_val, n_val].all? { |e| e.kind_of? Array}
    o_val.sort == n_val.sort
  else
    o_val == n_val
  end
end

#=> {"key1"=>true, "key2"=>true}

It returns a Hash where values are true or false, then checks if all values are true using Enumerable#all?:

res.values.all?
#=> [true, true].all? => true

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.