
I am trying to implement a generic HashTable. (This question is a continuation of the question asked here.)

I have implemented a generic hash function for the case where the size of the table is fixed. But in real-world situations it's a pretty bad idea to use a HashTable whose size is fixed up front at something like 2^32, since that can waste a lot of memory.

So, what I am now trying to do is dynamically increase the size of the hash table, starting from some initial value, whenever it becomes full.

But when I do this, the hash function will map the previously hashed keys to new positions.

Is there any way to overcome this problem other than re-hashing all the previously hashed keys into the new table?
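For example, with a simple modulo-based hash (illustrative code only, not my actual implementation), a key's bucket index changes as soon as the table size changes:

```c
#include <stdio.h>

/* Hypothetical modulo-based bucket computation, just to illustrate the issue. */
static size_t bucket_index(size_t hash, size_t table_size)
{
    return hash % table_size;
}

int main(void)
{
    size_t hash = 12345;                                 /* hash code of some key */
    printf("old index: %zu\n", bucket_index(hash, 17));  /* old table size 17     */
    printf("new index: %zu\n", bucket_index(hash, 37));  /* new table size 37     */
    /* 12345 % 17 == 3, but 12345 % 37 == 24: the key lands in a different bucket */
    return 0;
}
```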

  • Re-hashing is what most hash tables do. It's not that bad (you're touching the whole table anyway when resizing). Why do you want to avoid it? Commented Apr 27, 2013 at 10:10
  • The error has to do with "dynamically increasing the size of the hash table" then. You need to show some code. Commented Apr 27, 2013 at 10:10
  • @dasblinkenlight: It's not an error; the behavior is expected, since hash functions generally take the value modulo the table size to ensure that the returned indices fall within the table. Commented Apr 27, 2013 at 10:13

1 Answer


You cannot avoid rehashing: the position of the bucket where an element ends up inside your hash table depends on two or three things, based on your collision resolution strategy:

  • The hash code of the element,
  • The size of the table, and
  • With linear probing, the presence or absence of prior elements whose hash codes put them in the same bucket

If you change any of these factors, you need a full rehash: unless you do something really bad, such as picking a non-prime table size, the value of the hashCode % tableSize expression that determines the position is going to be different when you change tableSize. With linear probing, the presence or absence of elements that hash to the same bucket is going to change as well. That is why you need a full rehash.
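Here is a minimal sketch of what that full rehash looks like for a linear-probing table; the struct layout and hash function are illustrative placeholders, not taken from any particular implementation:

```c
#include <stdlib.h>

/* Hypothetical linear-probing table: an empty slot is marked by in_use == 0. */
struct entry { int key; int value; int in_use; };
struct table { struct entry *slots; size_t size; };

static size_t hash(int key) { return (size_t)key * 2654435761u; }  /* example hash */

/* Grow the table and re-insert every element: a full rehash. */
static int rehash(struct table *t, size_t new_size)
{
    struct entry *old_slots = t->slots;
    size_t old_size = t->size;

    t->slots = calloc(new_size, sizeof *t->slots);
    if (!t->slots) { t->slots = old_slots; return -1; }
    t->size = new_size;

    for (size_t i = 0; i < old_size; i++) {
        if (!old_slots[i].in_use)
            continue;
        /* The position depends on the NEW size, so it must be recomputed. */
        size_t j = hash(old_slots[i].key) % new_size;
        while (t->slots[j].in_use)            /* linear probing */
            j = (j + 1) % new_size;
        t->slots[j] = old_slots[i];
    }
    free(old_slots);
    return 0;
}
```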


2 Comments

What you have mentioned is true. To avoid collisions, I have implemented the hash table as an array of linked lists. So re-hashing will mean a lot of messy code with a lot of pointer manipulation. Can you suggest a better way of handling collisions that avoids messy pointer manipulation, since that generally leaves a code smell?
@Sankalp With the strategy that you use (it's called "separate chaining"), the code shouldn't be all that messy: all you need to do is walk all the lists in all the buckets and hash their nodes into their new places. There are other collision resolution strategies you could use: for example, linear probing lets you avoid linked lists entirely. That other strategy may or may not be right for your set of hash keys, though.
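For example, a separate-chaining rehash can simply relink the existing nodes into a new bucket array; here is a rough sketch with illustrative type names, not your actual code:

```c
#include <stdlib.h>

/* Hypothetical separate-chaining layout: each bucket is a singly linked list. */
struct node  { int key; int value; struct node *next; };
struct table { struct node **buckets; size_t size; };

static size_t hash(int key) { return (size_t)key * 2654435761u; }  /* example hash */

/* Grow the bucket array and relink every existing node into its new bucket.
 * No nodes are allocated or freed; only head pointers change. */
static int rehash(struct table *t, size_t new_size)
{
    struct node **new_buckets = calloc(new_size, sizeof *new_buckets);
    if (!new_buckets)
        return -1;

    for (size_t i = 0; i < t->size; i++) {
        struct node *n = t->buckets[i];
        while (n) {
            struct node *next = n->next;          /* remember the rest of the list */
            size_t j = hash(n->key) % new_size;   /* new bucket for this node      */
            n->next = new_buckets[j];             /* push onto the new bucket      */
            new_buckets[j] = n;
            n = next;
        }
    }
    free(t->buckets);
    t->buckets = new_buckets;
    t->size = new_size;
    return 0;
}
```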
