C Programming and Data Structures: Unit IV: b. Hashing

Collision Handling

Definition, Operations, Structure, Example C programs

Definition: If collisions occur then it should be handled by applying some techniques, such techniques are called collision handling techniques.

Collision Handling

AU: Dec.-15, 18, May - 16, 17, 19, Marks 16

Definition: If collisions occur then it should be handled by applying some techniques, such techniques are called collision handling techniques.


1. Chaining

1. Chaining without replacement

In collision handling method chaining is a concept which introduces an additional field with data i.e. chain. A separate chain table is maintained for colliding data. When collision occurs we store the second colliding data by linear probing method. The address of this colliding data can be stored with the first colliding element in the chain table, without replacement.

For example consider elements,

131, 3, 4, 21, 61, 6, 71, 8, 9

From the example, you can see that the chain is maintained the number who demands for location 1. First number 131 comes we will place at index 1. Next comes 21 but collision  occurs so by linear probing we will place 21 at index 2, and chain is maintained by writing 2 in chain table at index 1 similarly next comes 61 by linear probing we can place 61 at index 5 and chain will be maintained at index 2. Thus any element which gives hash key as 1 will be stored by linear probing at empty location but a chain is maintained so that traversing the hash table will be efficient.

The drawback of this method is in finding the next empty location. We are least bothered about the fact that when the element which actually belonging to that empty location cannot obtain its location. This means logic of hash function gets disturbed.


2. Chaining with replacement

The method of chaining without replacement has a drawback of loosing the meaning of the hash function, to overcome this drawback the method known as chaining with replacement is introduced. Let us discuss the example to understand the method. Suppose we have to store following elements:

131, 21, 31, 4, 5


Now next element is 2. As hash function will indicate hash key as 2 but already at index 2. We have stored element 21. But we also know that 21 is not of that position at which currently it is placed.

Hence we will replace 21 by 2 and accordingly chain table will be updated. See the table:


The value -1 in the hash table and chain table indicate the empty location. The advantage of this method is that the meaning of hash function is preserved. But each time some logic is needed to test the element, whether it is at its proper position.

 

2. Open Addressing

Open addressing is a collision handling technique in which the entire hash table is searched in systematic way for empty cell to insert new item if collision occurs.

Various techniques used in open addressing are

1. Linear probing

2. Quadratic probing

3. Double hashing

1. Linear probing

When collision occurs i.e. when two records demand for the same location in the hash table, then the collision can be solved by placing second record linearly down wherever the empty location is found.

For example


In the hash table given in Fig. 6.4.3 the hash function used is number % 10. If the first number which is to be placed is 131 then 131 % 10 = 1 i.e. remainder is 1 so hash key = 1. That means we are supposed to place the record at index 1. Next number is 21 which gives hash key 1 as 21 % 10 = 1. But already 131 is placed at index 1. That means collision is occurred. We will now apply linear probing. In this method, we will search the place for number 21 from location of 131. In this case we can place 21 at index 2. Then 31 at index 3. Similarly 61 can be stored at 6 because number 4 and 5 are stored before 61. Because of this technique, the searching becomes efficient, as we have to search only limited list to obtain the desired number.

 

Ex. 6.4.1: Implement the hash table and perform collision handling by linear probing

Sol. :

#include <stdio.h>

#include<conio.h>

void display(int a[ ], int n)

{

// Displaying complete hash table

for (int i = 0; i < n; i++)

{

printf("\n %d %d",i,a[i]);

}

}

void Linear_prob(int table[ ], int tsize,int num)

{

int key = num % tsize;//Computing the hash key using hash function

while (table[key]!= -1)//if empty slot is not found

{

key (key+1)%tsize;//then move linearly dow in search of empty slot

}

table[key]=num ;//if empty slot gets then insert the element

display(table, tsize);

}

int main()

{

int SIZE = 10;// Size of the hash table

int num;

int hash_table[SIZE];

char ans;

// Initializing the hash table

for (int i = 0; i < SIZE; i++)

{

hash_table[i] = -1;//-1 indicates empty slot

}

do

{

printf("\nEnter The Number: ");

scanf("%d", &num);//This element is to be inserted in hash table

Linear_prob(hash_table,SIZE,num);

printf("\n Do U Wish To Continue? (y/n)");

ans = getch();

}while(ans=='y');

return 0;

}

Output

Enter The Number: 131

0 -1

1 131

2 -1

3 -1

4 -1

5 -1

6 -1

7 -1

8 -1

9 -1

Do U Wish To Continue?(y/n)

Enter The Number: 21

0 -1

1 131

2  21

3 -1

4 -1

5 -1

6 -1

7 -1

8 -1

9 -1

Do U Wish To Continue?(y/n)

Enter The Number: 3

1- 0

1 131

2 21

3  3

4 -1

5 -1

6 -1

7 -1

8 -1

9 -1

Do U Wish To Continue? (y/n)

Enter The Number: 4

0 -1

1  131

2  21

3  3

4 4

5 -1

6 -1

7 -1

8 -1

9 -1

Do U Wish To Continue?(y/n)

Enter The Number: 5

0 -1

1 131

2 21

3 3

4 4

5

LO

5

6 -1

7 -1

8 -1

9 -1

Do U Wish To Continue? (y/n)

Enter The Number: 8

0 -1

1

131

2 21

3 3

4 4

5 5

6 -1

7 -1

8  8

9 -1

Do U Wish To Continue?(y/n)

Enter The Number: 9

0 -1

1 131

2  21

3   3

4  4

5 5

6 -1

7 -1

8  8

9  9

Do U Wish To Continue?(y/n)

Enter The Number: 18

0  18

1  131

2  21

3  3

4  4

5  5

6  -1

7  -1

8  8

9  9

Do U Wish To Continue?(y/n)

Enter The Number: 33

0  18

1  131

2  21

3  3

4  4

5  5

6  33

7 -1

8  8

9  9

Do U Wish To Continue?(y/n)

Problem with linear probing

One major problem with linear probing is primary clustering. Primary clustering is a process in which a block of data is formed in the hash table when collision is resolved.

For example:


This clustering problem can be solved by quadratic probing.

2. Quadratic probing

Quadratic probing operates by taking the original hash value and adding successive values of an arbitrary quadratic polynomial to the starting value. This method uses following formula -

Hi(key) = (Hash(key)+i2)%m

where m can be a table size or any prime number.

For example: If we have to insert following elements in the hash table with table size 10: 37, 90, 55, 22, 11, 17, 49, 87.

We will fill the hash table step by step


Now if we want to place 17 a collision will occur as 17%10-7 and bucket 7 has already an element 37. Hence we will apply quadratic probing to insert this record in the hash table.

Hi(key) = (Hash(key)+i2)%m

we will choose value i = 0, 1, 2, whichever is applicable.

Consider i = 0 then

 (17+02) %10 = 7

 (17+12) %10 = 8, when i = 1

The bucket 8 is empty hence we will place the element at index 8.

Then comes 49 which will be placed at index 9.

49%10 = 9

Now to place 87 we will use quadratic probing.

(87+0)%10 = 7

(87 + 1)%10 = 8 ... but already occupied

(87 +22)%10 = 1 ... already occupied

 (87+32)%10 = 6... this slot is free

We place 87 at 6th index.


It is observed that if we want to place all the necessary elements in the hash table the size of divisor (m) should be twice as large as total number of elements.

 

Ex. 6.4.2 Write a C program to implement quadratic probing.

Sol. :

#include <stdio.h>

#include<conio.h>

void display(int a[], int n)

{

// Displaying complete hash table

for (int i = 0; i < n; i++)

{

printf("\n %d %d",i,a[i]);

}

}

void Quadratic_prob(int table[], int tsize,int num)

{

int key = num % t using size;//computing the hash key mod function

// Insert in the table if there is empty slot

if (table[key] == -1)

table[key] = num;

else.

{

for (int i = 0; i < tsize; i++)//using formula for quadratic probing

{

int index = (key + i* i) % tsize;

if (table[index] == -1)//if empty slot is found

{

table[index] break; = num;//then insert element at that index

break;

}

}

}

display(table, tsize);

}

int main()

{

int SIZE = 10;// Size of the hash table

int num;

int hash_table[SIZE];

char ans;

// Initializing the hash table

for (int i = 0; i < SIZE; i++)

{

hash_table[i] = -1;//-1 indicates empty slot

}

do

{

printf("\nEnter The Number: ");

scanf("%d", &num);//Number to be inserted in the hash table

Quadratic prob(hash_table,SIZE,num);

printf("\n Do U Wish To Continue?(y/n)");

ans = getch();

while(ans= ='y');

return 0;

}

Output

Enter The Number: 37

0 -1

1 -1

2 -1

3 -1

4 -1

5 -1

6 -1

7 37

8 -1

9 -1

Do U Wish To Continue?(y/n)

Enter The Number: 90

0 90

1 -1

2 -1

3 -1

4 -1

5 -1

6 -1

7 37

8 -1

9 -1

Do U Wish To Continue?(y/n)

Enter The Number: 55

0  90

1 -1

2  -1

3  -1

4  -1

5  55

6 -1

7  37

8  -1

9 -1

Do U Wish To Continue?(y/n)

Enter The Number: 22

0  90

1  -1

2  22

3 -1

4  -1

5  55

6 -1

7  37

8 -1

9 -1

Do U Wish To Continue?(y/n)

Enter The Number: 11

0  90

1  11

2  22

3  -1

4  -1

5  55

6 -1

7  37

8  -1

9 -1

Do U Wish To Continue?(y/n)

Enter The Number: 17

0  90

1 11

2  22

3  -1

4  -1

5  55

6  -1

7  37

8  17

9  -1

Do U Wish To Continue?(y/n)

Enter The Number: 49

0  90

1  11

2  22

3  -1

4  -1

5  55

6  -1

7  37

8  17

9  49

Do U Wish To Continue?(y/n)

Enter The Number: 87

0  90

1  11

2  22

3  -1

4  -1

5  55

6  87

7  37

8  17

9  49

Do U Wish To Continue?(y/n)

3. Double hashing

Double hashing is technique in which a second hash function is applied to the key when a collision occurs. By applying the second hash function we will get the number of positions from the point of collision to insert.

There are two important rules to be followed for the second function :

• It must never evaluate to zero.

• Must make sure that all cells can be probed.

The formula to be used for double hashing is

H1(key) = key mod tablesize

H2(key) = M (key mod M)

where M is a prime number smaller than the size of the table.

Consider the following elements to be placed in the hash table of size 10

37, 90, 45, 22, 17, 49, 55

Initially insert the elements using the formula for H1(key).

Insert 37, 90, 45, 22.


Now if 17 is to be inserted then

H1(17) = 17%10 = 7

H2(key) = M - (key%M)

Here M is a prime number smaller than the size of the table. Prime number smaller than table size 10 is 7.

Hence M = 7

H2(17) = 7 (17%7) = 7 – 3 = 4


That means we have to insert the element 17 at 4 places from 37. In short we have to take 4 jumps. Therefore the 17 will be placed at index 1.

Now to insert number 55.

H1 (55) 55%10 = 5... collision

H2(55) = 7-(55%7) = 7 - 6 = 1

That means we have to take one jump from index 5 to place 55. Finally the hash table will be -


Comparison of quadratic probing and double hashing

The double hashing requires another hash function whose probing efficiency is same as some another hash function required when handling random collision.

The double hashing is more complex to implement than quadratic probing. The quadratic probing is fast technique than double hashing.

 

Ex. 6.4.3 Give the input {4371, 1323, 6173, 4199, 4344, 9679, 1989} and hash function X(mod 10), show the results for the following:

i) Open addressing hash table using linear probing

ii) Open addressing hash table using quadratic probing

iii) Open addressing hash table with second hash function h2 (X) = 7 - (X mod 7).

AU: Dec.-18, Marks 15

Sol. i) Open addressing hash table using linear probing :

We assume mod function as mod 10.

4371 mod 10 = 1

1323 mod 10 = 3

6173 mod 10 = 3 collision occurs

Hence by linear probing we will place 6173 at next empty location. That is, at location 4.

4199 mod 10 = 9

4344 mod 10 = 4 but location 4 is not empty.

Hence we will place 4344 at next empty location i.e. 5,

9679 mod 10 = 9 collision occurs so place at next location at 0. The hash table is of size 10. Hence we find the next empty location by rolling the table in forward direction.

1989 mod 10 = 9 collision occurs, so we find the next empty location at index 2.

The hash table will then be


ii) Open addressing hash table using quadratic probing

In quadratic probing we consider the original hash key and then add an arbitrary polynomial. This sum is then considered for hash function. The hash function will be

H(Key) = (Key +i2) % me-

where m can be a table size

If we assume m = 10, then the numbers can be inserted as follows -


iii) Open addressing hash table with second hash function

Step 1:


Step 2:


Step 3:


Step 4:

 

Ex. 6.4.4 What do you understand by collision in hashing? Represent the following keys in memory using linear probing with or without replacement. Use modulo (10) as your hashing function: (24, 13, 16, 15, 19, 20, 22, 14, 17, 26, 84, 96)

Sol. Collision in hashing - Refer section 6.4.

i) Linear probing with replacement

We will consider the hash function as modulo (10) for the hash table size 12, that is from 0 to 11. In linear probing with replacement, we first find the probable position of the key element using hash function. If the location which we obtain from hash function is empty then place the corresponding key element at that location. If the location is not empty and the key element which is present at that location belongs to that location only then, move down in search of empty slot. Place the record at the empty slot.

If the location contains a record which does not belong to that location then replace that record by the current key element. Place the replaced record at some empty slot, which can be obtained by moving linearly down.


We get the empty location at index 7. Hence 14 is placed at index 7.




Ex. 6.4.5: Consider a hash table with 9 slots. The hash function is h(k) = k mod 9. The following keys are inserted in the order 5, 28, 19, 15, 20, 33, 12, 17, 10. Draw the contents of the hash table when the collisions are resolved by

(i) Chaining (ii) Linear probing (iii) Double hashing.

The second hash function h2(x) = 7-(x mod 7).

AU: May-19 Marks 13

Sol.: i) Chaining


Similarly remaining elements can be placed. When collision occurs, double hashing is applied.


 

3. Rehashing

Rehashing is a technique in which the table is resized, i.e., the size of table is doubled by creating a new table. It is preferable if the total size of table is a prime number. There are situations in which the rehashing is required –

• When table is completely full.

• With quadratic probing when the table is filled half.

• When insertions fail due to overflow.

In such situations, we have to transfer entries from old table to the new table by recomputing their positions using suitable hash functions.

Consider we have to insert the elements 37, 90, 55, 22, 17, 49 and 87. The table size is 10 and I will use hash function,


Now this table is almost full and if we try to insert more elements collisions will occur and eventually further insertions will fail. Hence we will rehash by doubling the table size. The old table size is 10 then we should double this size for new table, that becomes 20. But 20 is not a prime number, we will prefer to make the table size as 23. And new hash function will be

Now the hash table is sufficiently large to accommodate new insertions.

Advantages

1. This technique provides the programmer a flexibility to enlarge the table size if required.

2. Only the space gets doubled with simple hash function which avoids occurrence of collisions.

Review Questions

1. Explain the following collision resolution strategies with example.

i) Separate chaining ii) Linear probing iii) Quadratic probing

AU: Dec.-15, Marks 16

2. Explain the following: Rehashing.

AU: May-16, Marks 8

3. Illustrate with example the open addressing and chaining methods of collision resolution techniques in hashing. AU: May-17, Marks 16

4. Explain open addressing in detail. AU: Dec.-18, Marks 6

5. When do you perform rehashing? Illustrate with example.

AU: May-19, Marks 8

 

C Programming and Data Structures: Unit IV: b. Hashing : Tag: : Definition, Operations, Structure, Example C programs - Collision Handling