Hash use array as key in ruby

Question

I have a hash that uses an array as its key. When I change the array, the hash can no longer find the corresponding key and value:

1.9.3p194 :016 > a = [1, 2]
 => [1, 2] 
1.9.3p194 :017 > b = { a => 1 }
 => {[1, 2]=>1} 
1.9.3p194 :018 > b[a]
 => 1 
1.9.3p194 :019 > a.delete_at(1)
 => 2 
1.9.3p194 :020 > a
 => [1] 
1.9.3p194 :021 > b
 => {[1]=>1} 
1.9.3p194 :022 > b[a]
 => nil 
1.9.3p194 :023 > b.keys.include? a
 => true

What am I doing wrong?

Update: OK. Using a.clone is certainly one way to deal with this problem. But what if I want to change "a" and still use "a" to retrieve the corresponding value (since "a" is still one of the keys)?

Here's another snippet to think about: pastie.org/4609694

– Sergio Tulentsev, Commented Aug 29, 2012 at 11:56
weird! seems like a bug?

– tybro0103, Commented Aug 29, 2012 at 12:57

steenslag · Accepted Answer · 2012-08-29 13:37:37Z

19

The #rehash method will recalculate the hash, so after the key changes do:

b.rehash
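
To make that concrete, here is a minimal sketch of the question's scenario with the rehash step added. The variable names come from the question. The lookup before rehash is deliberately not asserted, since a stale key usually misses but is not guaranteed to.

```ruby
a = [1, 2]
b = { a => 1 }

a.delete_at(1)     # mutate the key in place; a is now [1] and its hash code changes

stale = b[a]       # typically nil here, because the entry is still filed under the old hash code

b.rehash           # re-index every key under its current hash code

fresh = b[a]       # 1: the lookup now agrees with where the entry is stored
```

Note that rehash has to be repeated after every mutation of a key, which is easy to forget.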

answered Aug 29, 2012 at 13:37

steenslag


2 Comments

Eagle Over a year ago

This method is what I want. Thank you!!

Jason Over a year ago

Although this may solve the OP's problem, there is a deeper issue at play. #rehash should never need calling to begin with. A need to call #rehash probably means there is something else wrong. There is another answer stackoverflow.com/a/36821695/361855 that talks about some better solutions.

Beni Cherniavsky-Paskin · Answer · 2019-09-19 11:06:41Z

TL;DR: consider Hash#compare_by_identity

You need to decide if you want the hash to work by array value or array identity.

By default, arrays compute .hash and .eql? by value, which is why changing the value confuses Ruby. Consider this variant of your example:

pry(main)> a = [1, 2]
pry(main)> a1 = [1]
pry(main)> a.hash
=> 4266217476190334055
pry(main)> a1.hash
=> -2618378812721208248
pry(main)> h = {a => '12', a1 => '1'}
=> {[1, 2]=>"12", [1]=>"1"}
pry(main)> h[a]
=> "12"
pry(main)> a.delete_at(1)
pry(main)> a
=> [1]
pry(main)> a == a1
=> true
pry(main)> a.hash
=> -2618378812721208248
pry(main)> h[a]
=> "1"

See what happened there? As you discovered, it fails to match on the a key because the .hash value under which it was stored is outdated. [BTW, you can't even rely on that! A mutation might result in the same hash (rare) or a different hash that lands in the same bucket (not so rare).]

But instead of failing by returning nil, it matched on the a1 key.
See, h[a] doesn't care at all about the identity of a vs a1 (the traitor!). It compared the current value you supply — [1] with the value of a1 being [1] and found a match.

That's why using .rehash is just a band-aid. It will recompute the .hash values for all keys and move them to the correct buckets, but it's error-prone, and may also cause trouble:

pry(main)> h.rehash
=> {[1]=>"1"}
pry(main)> h
=> {[1]=>"1"}

Uh-oh. The two entries collapsed into one, since their keys now have the same value (and which one wins is hard to predict).

Solutions

One sane approach is embracing lookup by value, which requires the value to never change. .freeze your keys. Or use .clone/.dup when building the hash, and feel free to mutate the original arrays, but accept that h[a] will look up the current value of a against the values preserved from build time.
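
A minimal sketch of the by-value approach, combining .dup and .freeze so the hash owns an immutable snapshot of the key (combining the two is my choice here, either one alone is what the paragraph above suggests):

```ruby
a = [1, 2]
h = { a.dup.freeze => "12" }   # the key is a frozen copy, independent of a

a << 3                         # mutating the original does not touch the key

by_old_value = h[[1, 2]]       # "12": any array equal to [1, 2] finds the entry
by_current_a = h[a]            # nil: a is now [1, 2, 3], and no key has that value

key_is_frozen = h.keys.first.frozen?   # true, so the key cannot be mutated by accident
```

The trade-off is exactly the one described: a no longer "is" the key, it is only compared against it by value.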

The other, which you seem to want, is deciding you care about identity — lookup by a should find a whatever its current value, and it shouldn't matter if many keys had or now have the same value.
How?

Object hashes by identity. (Arrays don't because types that .== by value tend to also override .hash and .eql? to be by value.) So one option is: don't use arrays as keys, use some custom class (which may hold an array inside).
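
A rough sketch of the custom-class option. ArrayBox and its items reader are hypothetical names made up for illustration; the only thing it relies on is that a plain class inherits identity-based .hash and .eql? from Object.

```ruby
# Hypothetical wrapper class (illustrative name): it does not override
# hash or eql?, so instances are hashed by identity, not by contents.
class ArrayBox
  attr_reader :items

  def initialize(items)
    @items = items
  end
end

x = ArrayBox.new([1])
y = ArrayBox.new([1])
h = { x => "x", y => "y" }   # two entries, even though the contents are equal

x.items << 7                 # mutating the wrapped array does not affect lookup

found_x = h[x]               # "x"
found_y = h[y]               # "y"
```

The cost is that callers now deal with ArrayBox objects rather than plain arrays.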
But what if you want it to behave directly like a hash of arrays? You could subclass Hash or Array, but it's a lot of work to make everything work consistently. Luckily, Ruby has a built-in way: h.compare_by_identity switches a hash to work by identity (with no way to undo, AFAICT). If you do this before you insert anything, you can even have distinct keys with equal values, with no confusion:
```
pry(main)> x = [1]
=> [1]
pry(main)> y = [1]
=> [1]
pry(main)> h = Hash.new.compare_by_identity
=> {}
pry(main)> h[x] = 'x'
=> "x"
pry(main)> h[y] = 'y'
=> "y"
pry(main)> h
=> {[1]=>"x", [1]=>"y"}
pry(main)> x.push(7)
=> [1, 7]
pry(main)> y.push(7)
=> [1, 7]
pry(main)> h
=> {[1, 7]=>"x", [1, 7]=>"y"}
pry(main)> h[x]
=> "x"
pry(main)> h[y]
=> "y"
```
Beware that such hashes are counter-intuitive if you try to put there e.g. strings, because we're really used to strings hashing by value.
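
For instance, here is a small sketch of the string surprise. String.new is used on purpose so that each string is guaranteed to be a distinct object; the key and value contents are arbitrary.

```ruby
h = {}.compare_by_identity

key = String.new("name")
h[key] = "Eagle"

same_object   = h[key]                  # "Eagle": the very same object is found
equal_content = h[String.new("name")]   # nil: equal characters, but a different object

is_identity_hash = h.compare_by_identity?   # true
```

A regular hash would return "Eagle" for both lookups, which is what most code expects from string keys.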

waldrumpus · Answer · 2012-08-29 13:04:14Z

2

Hashes use their key objects' hash codes (a.hash) to group them. Hash codes often depend on the state of the object; in this case, the hash code of a changes when an element has been removed from the array. Since the key has already been inserted into the hash, a is filed under its original hash code.

This means you can't retrieve the value for a in b, even though it looks alright when you print the hash.
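
You can watch the hash code move with a few lines like these (a small sketch; the actual numbers differ between Ruby processes, so only comparisons are shown):

```ruby
a = [1, 2]
code_at_insert = a.hash      # the code the hash table files the key under

a.delete_at(1)               # a is now [1]
code_now = a.hash            # recomputed from the current contents

matches_new_value = (code_now == [1].hash)   # true: equal arrays have equal hash codes
code_changed = (code_now != code_at_insert)  # true in practice; a collision is possible in theory but extremely unlikely
```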

answered Aug 29, 2012 at 13:04

waldrumpus


Kulbir Saini · Answer · 2012-08-29 11:50:18Z

1

You should use a.clone as the key:

irb --> a = [1, 2]
==> [1, 2]

irb --> b = { a.clone => 1 }
==> {[1, 2]=>1}

irb --> b[a]
==> 1

irb --> a.delete_at(1)
==> 2

irb --> a
==> [1]

irb --> b
==> {[1, 2]=>1} # STILL UNCHANGED

irb --> b[a]
==> nil # Trivial, since a has changed

irb --> b.keys.include? a
==> false # Trivial, since a has changed

Using a.clone will make sure that the key is unchanged even when we change a later on.

edited Aug 29, 2012 at 11:50

answered Aug 29, 2012 at 11:44

Kulbir Saini


4 Comments

Sergio Tulentsev Over a year ago

How do you explain the original snippet? When keys contains a, but value can't be retrieved?

Kulbir Saini Over a year ago

@SergioTulentsev You are right. From that perspective, it's weird, because b.keys[0].object_id == a.object_id still returns true after deleting the element when a is used as the key instead of a.clone.

waldrumpus Over a year ago

@SergioTulentsev The reason for that is that the hash code a.hash is different after removing an element from the array, even if the object stays the same. Thus, the key can't be found anymore.

waldrumpus Over a year ago

@SergioTulentsev When looking up a key in a hash, the hash code is used. No luck in this case, because the code has changed since the key was inserted. The keys method of the hash, however, returns an array, and array search tests equality with == rather than using hash codes. Thus, the value is found in the keys array.
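
A short sketch of that difference, using the question's variables. Only the array search is treated as certain; the hash lookup usually misses but is left unasserted.

```ruby
a = [1, 2]
b = { a => 1 }
a.delete_at(1)

in_keys_array = b.keys.include?(a)   # true: Array#include? compares elements with ==, no hash codes involved
same_object   = b.keys.first.equal?(a)   # true: the stored key is still the very same array

via_hash_lookup = b.key?(a)          # usually false: this path goes through the (now stale) hash code
```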

waldrumpus · Answer · 2012-08-29 11:52:22Z

1

As you have already said, the trouble is that the hash key is the exact same object you later modify, meaning that the key changes during program execution.

To avoid this, make a copy of the array to use as a hash key:

a = [1, 2]
b = { a.clone => 1 }

Now you can continue to work with a and leave your hash keys intact.

answered Aug 29, 2012 at 11:52

waldrumpus


2 Comments

Sergio Tulentsev Over a year ago

I think, he wants to be able to modify arrays and still be able to retrieve the values using those modified versions. I might be wrong.

Eagle Over a year ago

Yes, that's what I want. Can't I change "a" but still use "a" as a key?
