Hashing in Computer Science
The term hashing is not common to a lot of people but it is common to Computer scientists as it is part of the core File processing course that each computer scientist has to undergo during their four (4) or five(5) years of study. This article would answer a lot of questions related to hashing so ensure that you read the article thoroughly.
Meaning of Hashing
Hashing is a technique used basically for security purposes and it is an efficient technique to directly search the location of desired data on the hash table also it is important to note that hashing is composed of two aspects namely:
- Hash Function
- Collision Resolution Method
There are numerous definitions for Hashing functions but the basic one is that Hashing function is a function that is used to map all the sets of search keys to actual record database and they are various types of hash functions and the one to use at a particular time is determined by the person using it. Below is a list of popular hash functions:
- F(key) = Key Mod N where N is the table size
- F(key) = Key Mod P where P is the smallest prime number
For example, let us say that we want to store a set of numbers 12, 32, 54,66,77,87 on a hash table, the hash function would be used to generate the address for each set of data and that is why a hash function is very important.
Collison Resolution Method
As the name implies, collision resolution methods are those methods used to get rid of collision. Collision is the process by which two keys hash to the same address.
Using the example in the previous explanation, if the address of 12 and 32 are the same, then that is what we call resolution and in order to avoid it, then we would have to use collision resolution methods.
Basically, there are two types of collision resolution method and those two are further divided into many examples. The following are the two collision resolution Methods:
- Open Addressing
- Coalesced Hashing/ Direct Chaining
In Open Addressing, all the keys are stored inside the Hash table and none of them is stored outside. Under open addressing, we have linear probing, quadratic probing, and double hashing.
Coalesced Hashing/ Direct Chaining
In this technique, a pointer is used to create an extension and the value is stored there. This technique uses a combination of both separate chaining and open addressing.
Now that we have known how one can resolve collisions, then it is also important to note the classes of hashing functions or techniques in hashing.
Other methods of resolving collision are:
- Changing the hashing function
- Reducing the packing factor
TECHNIQUES OF HASHING
Mod Hashing Method: The system looks for the ASCII code for each record and then divides it by a certain prime number, then stores the result as the index number.
FOLDING METHOD: In the folding method, and arithmetic technique is used to derive its index or hashing function and it is important to know that the folding method is divided into two categories namely:
- Folding by boundary
- Folding by Shifting
Folding By Boundary: The left and the right numbers are folded on a fixed boundary between them and the center. Also, the two outside values are reversed.
Folding By Shifting: The key values are divided into parts whose size matches the size of the required address and added.
Truncation: In truncation, a portion of the key is used as the address space and ignores the rest of the key.
Example: Suppose we have a number 1 2 3 4 5 6 7 8 9, 478 may be used as the index key.
Radix Transformation: Radix transformation can involve changing the number of bases from one base to another.
Polynomial Hashing: In polynomial hashing, the function is used to produce the cyclic check bytes for checking for transmission errors in data that is f (transformation area) – Cyclic check byte
From what you have read so far, you can see that hashing is a very broad and important course for aspiring computer scientists because storage is a very important aspect of computer science also please ensure to read up on linear probing, and quadratic probing because they may seem similar but they are not.