
The Hash Function

Explore the role of hash functions in managing key sizes within C++ hash tables. Understand how these functions convert large keys into valid array indices using methods like modular arithmetic, truncation, and folding to optimize data structure performance.

Restricting the Key Size

In the last lesson, we learned that an array can be used to implement a hash table in C++. A key maps a value to a position in the array, and the efficiency of a hash table depends on how that position is computed from the key. At first glance, it may seem that we can use the keys directly as indices, because each index is unique.

The problem is that a key can exceed the size of the array, and whenever it does, the array needs to be resized. We can reduce the number of resizes by growing the array's capacity exponentially, but each resize still takes O(n) time because all the elements are copied into the new array.
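To make the cost of direct indexing concrete, here is a minimal sketch. The function name `directInsert` and the use of -1 as an "empty slot" marker are assumptions made for illustration; they are not part of the lesson's code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical direct-addressed table: the key itself is the index.
// Assumes -1 is never stored as a real value; it marks an empty slot here.
void directInsert(std::vector<int>& table, std::size_t key, int value) {
    if (key >= table.size()) {
        // Growing the vector may reallocate and copy every element: O(n)
        table.resize(key + 1, -1);
    }
    table[key] = value;
}
```

Inserting a single item with key 123456 into a 10-slot table forces it to grow to 123,457 slots, almost all of them empty.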

In order to limit the range of the keys to the boundaries of the array, we need a function that converts a large key into a smaller key. This is the job of the hash function.

What Do Hash Functions Do?

Have a look at the following illustration to get the analogy of a hash function.

A hash function simply takes an item’s key and returns the corresponding index in the array for that item. Depending on your program, the calculation of this index can be simple arithmetic or a much more complicated procedure, such as a cryptographic hash. However, it is very important to choose an efficient hash function, as it directly affects the performance of the hash table.
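As a rough sketch of how a hash function sits between keys and array slots, consider the tiny table below. The names `kCapacity`, `hashIndex`, and `TinyTable` are hypothetical, the hash is a placeholder, and collisions are deliberately ignored: a second key that lands on the same slot simply overwrites the first.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical fixed capacity for this sketch
constexpr std::size_t kCapacity = 10;

// Placeholder hash function: maps a key to a slot in [0, kCapacity)
std::size_t hashIndex(unsigned int key) {
    return key % kCapacity;
}

// Hypothetical table; collisions are NOT handled in this sketch
struct TinyTable {
    std::array<std::string, kCapacity> slots{};  // starts as empty strings

    void put(unsigned int key, const std::string& value) {
        slots[hashIndex(key)] = value;  // overwrites whatever was in that slot
    }

    const std::string& get(unsigned int key) const {
        return slots[hashIndex(key)];
    }
};
```

Both `put` and `get` call the same hash function, so a value stored under key 1047 is found again in slot 7 no matter how large the key is.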

Let’s have a look at some of the most common hash functions used in modern programming.

Modular Arithmetic

In this approach, we take the key modulo the array size:

$index = key \bmod tableSize$

Hence, for any non-negative key, the index will always stay between 0 and tableSize - 1.

C++
#include <iostream>
using namespace std;

// Maps a non-negative key to an index in [0, size)
int hashModular(int key, int size) {
    return key % size; // the remainder is the index
}

int main() {
    int size = 10;
    int key = 47; // setting key
    int index = hashModular(key, size); // fit the key into the list size
    cout << "The index for key " << key << " is " << index << endl;
    return 0;
}
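One caveat with the modular approach: in C++, the result of `%` takes the sign of the left operand, so a negative key would produce a negative index. The variant below is a minimal sketch of one way to handle that; the name `hashModularSafe` is hypothetical, and it assumes `size` is positive.

```cpp
// Hypothetical helper: modular hashing that also works for negative keys.
// Assumes size > 0.
int hashModularSafe(int key, int size) {
    int r = key % size;           // r has the sign of key, or is 0
    return r < 0 ? r + size : r;  // shift negative remainders into [0, size)
}
```

For example, key -47 with size 10 gives a remainder of -7, which is shifted to index 3.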

Truncation

Select a part of the key as the index rather than the whole key. Once again, we can use a mod function for this operation, although it does not need to be based on the array size:

$key = 123456 \rightarrow index = 3456$

C++
#include <iostream>
using namespace std;

int hashTruncation(int key) {
    return key % 10000; // keep only the last four digits of the key
}

int main() {
    int key = 123456; // setting key
    int index = hashTruncation(key); // use part of the key as the index
    cout << "The index for key " << key << " is " << index << endl;
    return 0;
}
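Truncation does not have to use the trailing digits. As a sketch, the hypothetical helper `truncateDigits` below drops the `skip` rightmost decimal digits and keeps the next `count` digits. It assumes a non-negative key and small values of `skip` and `count`, so that the powers of 10 fit in an `int`.

```cpp
// Hypothetical helper: keep `count` decimal digits of a non-negative key,
// after dropping the `skip` rightmost digits.
int truncateDigits(int key, int skip, int count) {
    int divisor = 1;
    for (int i = 0; i < skip; ++i) divisor *= 10;   // 10^skip
    int modulus = 1;
    for (int i = 0; i < count; ++i) modulus *= 10;  // 10^count
    return (key / divisor) % modulus;
}
```

With key 123456, `truncateDigits(key, 0, 4)` keeps the last four digits (3456), while `truncateDigits(key, 2, 2)` picks the middle digits (34).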

Folding

Divide the key into small chunks and combine them with an arithmetic operation. For example, you can add all the chunks together:

$key = 456789,\ chunk = 2 \rightarrow index = 45 + 67 + 89$

C++
#include <iostream>
#include <string>
using namespace std;

// Splits a non-negative key into chunks of chunkSize digits and adds them up
int hashFold(int key, int chunkSize) {
    cout << "Key: " << key << endl;
    string strKey = to_string(key); // convert integer into string for slicing
    int hashVal = 0;
    cout << "Chunks: ";
    // advance i by chunkSize every iteration
    for (size_t i = 0; i < strKey.length(); i += chunkSize) {
        // substr stops at the end of the string, so the last chunk may be shorter
        string chunk = strKey.substr(i, chunkSize);
        cout << chunk << " ";
        hashVal += stoi(chunk); // add the chunk's numeric value to hashVal
    }
    return hashVal;
}

int main() {
    int key = 456789;
    int chunkSize = 2;
    int hashVal = hashFold(key, chunkSize); // compute first so the output stays in order
    cout << endl << "Hash Key: " << hashVal << endl;
    return 0;
}
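The folded sum can still be larger than the array, so in practice it is usually reduced once more, for example with the modular approach from earlier. The sketch below is one possible variant, not the lesson's implementation: the name `hashFoldMod` is hypothetical, it uses integer arithmetic instead of strings, and it takes chunks from the right-hand side of the key. That gives the same chunks as the string version only when the number of digits is a multiple of the chunk size. It assumes a non-negative key, `chunkDigits >= 1`, and `tableSize > 0`.

```cpp
// Hypothetical variant: fold a non-negative key in chunks of `chunkDigits`
// decimal digits, starting from the right, then reduce modulo tableSize.
// Assumes chunkDigits >= 1 and tableSize > 0.
int hashFoldMod(int key, int chunkDigits, int tableSize) {
    int chunkBase = 1;
    for (int i = 0; i < chunkDigits; ++i) chunkBase *= 10;  // 10^chunkDigits
    int sum = 0;
    while (key > 0) {
        sum += key % chunkBase;  // take the rightmost chunk
        key /= chunkBase;        // drop it from the key
    }
    return sum % tableSize;      // fit the folded sum into the table
}
```

For key 456789 with two-digit chunks, the sum is 89 + 67 + 45 = 201, so a table of size 10 gives index 1.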