Using a vector < pair < string, int > > to represent a hash table

Question

I am trying to represent a hash table as a vector of pair<string, int>. I use a hash function to compute the index in the vector where I want to place each pair. I have been able to create a pair and hash its string successfully. However, when I try to store the pair at that index in the vector, my program crashes with a segmentation fault. My hash function:

size_t hashfunction(const string& ident){
    unsigned hash = 0;
    // size_t matches the type returned by ident.size() and avoids a signed/unsigned comparison
    for(size_t i = 0; i < ident.size(); ++i) {
        char c = ident[i];
        hash ^= c + 0x9e3779b9 + (hash<<6) + (hash>>2);
    }
    return hash;
}

My main function:

int main(){
    vector < pair < string, int > > hashtable;
    pair <string, int> testone ("bartering", 5);

    size_t testoneindex = hashfunction(testone.first);
    hashtable[testoneindex] = testone;
    return 0;

}

This section of code compiles but produces a segmentation fault at the line

hashtable[testoneindex] = testone;

What am I doing wrong?

You need to modulo your hash index down to the range of indices in your vector. For example, initialize your vector to have 1000 buckets, and use hashfunction(..) % 1000. That brings the next question: how do you plan to handle hash collisions?
– Joe Z, Commented Dec 7, 2013 at 20:00
Thanks for your quick answer. That change worked perfectly. I was planning on using linear probing to handle hash collisions.
– user3078377, Commented Dec 7, 2013 at 20:12

bobah · Accepted Answer · 2013-12-07 20:13:40Z

1

You cannot realistically index the container directly by the raw hash value, because the vector would need one element for every possible hash value and the memory required would be enormous. Instead you'd want the container and insertion code to be closer to classic hash container design, something like this:

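// Needs <algorithm>, <list>, <string>, <utility> and <vector>.
typedef pair<string, int> value_t;
value_t val;  // the pair to insert

// One list per bucket; current_size is the chosen number of buckets (must be > 0).
vector<list<value_t>> buckets;
buckets.resize(current_size);

// Reduce the hash to a valid bucket index (hashfunction is the function from the question).
auto& bucket = buckets[hashfunction(val.first) % buckets.size()];

// Insert only if the key is not already stored in this bucket.
auto itr = find_if(bucket.begin(), bucket.end(), [&](value_t const& other) {
    return other.first == val.first;
});
if (itr == bucket.end()) bucket.push_back(val);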

Joe Z · Accepted Answer · 2013-12-07 20:16:49Z

1

You need to modulo your hash index down to the range of indices in your vector. For example, initialize your vector to have 1000 buckets, and use hashfunction(..) % 1000.
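A minimal sketch of this suggestion, assuming a fixed number of buckets chosen up front. The names bucket_index and put are hypothetical helpers, not part of the question's code, and the hash function is copied from the question with only the loop counter type changed. Collisions are not handled here: a second key that maps to the same bucket simply overwrites the first.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hash function from the question.
std::size_t hashfunction(const std::string& ident) {
    unsigned hash = 0;
    for (std::size_t i = 0; i < ident.size(); ++i) {
        char c = ident[i];
        hash ^= c + 0x9e3779b9 + (hash << 6) + (hash >> 2);
    }
    return hash;
}

// Hypothetical helper: map a key to a valid index in [0, bucket_count).
std::size_t bucket_index(const std::string& key, std::size_t bucket_count) {
    assert(bucket_count > 0);  // modulo by zero is undefined
    return hashfunction(key) % bucket_count;
}

// Hypothetical helper: store the pair in its bucket.
// Overwrites whatever was there before (no collision handling).
void put(std::vector<std::pair<std::string, int> >& table,
         const std::pair<std::string, int>& entry) {
    table[bucket_index(entry.first, table.size())] = entry;
}
```

With this in place, the question's main could create the table as std::vector<std::pair<std::string, int> > hashtable(1000); and call put(hashtable, testone); instead of indexing with the raw hash.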


Dietmar Kühl · Accepted Answer · 2013-12-07 20:19:44Z

The std::vector<...> you created is empty, so it has no elements to assign to. Indexing it with operator[] at any position is undefined behavior, which is what shows up as the segmentation fault. You need to give the hashtable object a suitable size, i.e., the number of buckets, e.g., using

std::size_t number_of_buckets = ...;
std::vector<std::pair<std::string, int> > hashtable(number_of_buckets);
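To make the difference visible, here is a small sketch that uses the bounds-checked at() member instead of operator[]. The names table_t, make_table and try_store are hypothetical and only serve the illustration. std::vector::at throws std::out_of_range for an invalid index, whereas operator[] performs no check.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, int> > table_t;

// Hypothetical helper: create a table with a fixed, non-zero number of
// default-constructed buckets.
table_t make_table(std::size_t number_of_buckets) {
    assert(number_of_buckets > 0);
    return table_t(number_of_buckets);
}

// Hypothetical helper: bounds-checked store.
// Returns false instead of invoking undefined behavior when index is out of range.
bool try_store(table_t& table, std::size_t index,
               const std::pair<std::string, int>& entry) {
    try {
        table.at(index) = entry;  // at() checks the index; operator[] does not
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

Called on a default-constructed (empty) table_t, try_store returns false for every index, which is the situation in the question's main.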

Note that the approach you take for hashing is a bit too simplistic, though: especially with a small number of buckets, there is a chance that two different hashes are keyed to the same bucket. That is, you'll need to deal with collisions. The two approaches for dealing with collisions I'm aware of are:

1. Determine a new bucket for the key if the first bucket found is already used (and keep searching until an empty bucket is found). The main issue with this approach is that you can't simply remove objects: emptying a bucket can break the search for other objects that were moved past it to a later bucket, so they may no longer be found.
2. Use a list in each bucket for all the keys which use the same bucket. This approach has the additional advantage that you don't need to create any key or value until a bucket is actually used (you'd have the lists, though, but this can be made fairly cheap).
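The first approach, with linear probing as the search strategy, might be sketched as follows. This is an illustration under assumptions, not a tuned implementation: the class ProbingTable and all its members are hypothetical names, std::hash<std::string> stands in for the question's hash function, and the table never grows. Removal marks a slot as Deleted (a "tombstone") rather than Empty, which is one common way around the removal problem described above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

class ProbingTable {
public:
    explicit ProbingTable(std::size_t bucket_count) : slots_(bucket_count) {
        assert(bucket_count > 0);
    }

    // Inserts or updates a key. Returns false if no free slot is left.
    bool insert(const std::string& key, int value) {
        const std::size_t none = slots_.size();
        std::size_t free_slot = none;
        const std::size_t start = home(key);
        for (std::size_t n = 0; n < slots_.size(); ++n) {
            const std::size_t i = (start + n) % slots_.size();
            if (slots_[i].state == Used && slots_[i].key == key) {
                slots_[i].value = value;  // key already present: update
                return true;
            }
            if (slots_[i].state != Used && free_slot == none) free_slot = i;
            if (slots_[i].state == Empty) break;  // key cannot be further along
        }
        if (free_slot == none) return false;
        slots_[free_slot].state = Used;
        slots_[free_slot].key = key;
        slots_[free_slot].value = value;
        return true;
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* find(const std::string& key) const {
        const std::size_t start = home(key);
        for (std::size_t n = 0; n < slots_.size(); ++n) {
            const Slot& s = slots_[(start + n) % slots_.size()];
            if (s.state == Empty) return nullptr;  // end of the probe chain
            if (s.state == Used && s.key == key) return &s.value;
        }
        return nullptr;
    }

    // Marks the slot as Deleted so that probe chains passing through it stay intact.
    bool erase(const std::string& key) {
        const std::size_t start = home(key);
        for (std::size_t n = 0; n < slots_.size(); ++n) {
            Slot& s = slots_[(start + n) % slots_.size()];
            if (s.state == Empty) return false;
            if (s.state == Used && s.key == key) {
                s.state = Deleted;
                return true;
            }
        }
        return false;
    }

private:
    enum State { Empty, Used, Deleted };
    struct Slot {
        State state;
        std::string key;
        int value;
        Slot() : state(Empty), value(0) {}
    };

    // First slot to try for a key.
    std::size_t home(const std::string& key) const {
        return std::hash<std::string>()(key) % slots_.size();
    }

    std::vector<Slot> slots_;
};
```

If erase reset the slot to Empty instead, a later find for a key stored further along the same probe chain would stop at that slot and report the key as missing.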
