Hash Table Organization

  • Uploaded by: akhot86
  • 0
  • 0
  • June 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Hash Table Organization as PDF for free.

More details

  • Words: 798
  • Pages: 13
Hash Table Organization • The search prediction depends on the value of s i.e., e is a function of s • Three possibilities exist concerning the predicted entry – the entry may be occupied by s, the entry may be occupied by some other symbol, or the entry may be empty • The situation in the second case (s ≠ se), is called a collision. • Following a collision, the search continues with a new prediction. • In the third case, s is entered in the predicted entry

Algorithm 1. e := h (s) 2. Exit with success if s = se, and with failure if entry e is unoccupied. 3. Repeat steps 1 and 2 with different functions h' and h'', etc. The function h is called the hashing function

Notations n : number of entries in the table f : number of occupied entries in the table k : number of distinct symbols in the source language kp: Number of symbols used in some source program Sp: Set of symbols used in some source program N: Address table of the table K: key space of the system Kp: key space of a program

Properties of a Hash Function • A hashing function has the property that 1 ≤ h(symb) ≤ n • If k ≤ n, we can select a one-one function as the hashing function h • This will eliminate collisions in the symbol table since entry number e given by e = h(s) can only be occupied by s • We refer to the organization as direct entry organization

Properties of a Hash Function • However, k is a very large number in practice hence use of one-to-one function will require a very large symbol table. • For good performance it is adequate if the hashing function implements a mapping Kp => N • The effectiveness of a hashing organization depends on the average value of ps. • For a given size of the hash table, the value of ps can be expected to increase with the value of kp.

Hashing Functions • While hashing, the representation of s is treated as a binary number • The hashing function performs numerical transformation on this number to obtain e • Let the representation has b bits in it and let the host computer uses m bit arithmetic. Call it rs • If b ≤ m, the representation of s can be padded with 0’s. If b > m the representation of s is split into pieces of m bits each, and the bitwise OR operations are performed to obtain rs

Hashing Functions A hashing function h possess the following properties to ensure good search performance: 1. h should not be sensitive to the symbols in Sp. Thus, the value of ps should only depend on kp. 2. h should execute reasonably fast

Hashing Functions Two classes of such hashing functions: • Multiplication Function: Analogous to functions used in random number generator h(s)=(a × rs + b) mod 2m. The table size should be a power of 2, say 2g, such that lower order g bits of h(s) can be used as e • Division function: h(s) = remainder of rs/n + 1 where n is the size of the table. If n is the prime number, it is called prime number hashing

Hashing Functions • A multiplication function has the advantage of being slightly faster but suffers from the drawback that the table size has to be a power of 2. • Prime division hashing is slower but has the advantage that prime numbers are more closely spaced than powers of 2. This provides a wider choice of table size

Collision Handling Two approaches: • Rehashing: accommodate the colliding entry elsewhere in the hash table • Overflow Chaining: accommodate the colliding entry in the separate table

Rehashing • Uses a sequence of hashing functions h1, h2,… to resolve collisions • If the collision occurs at hi(s), then a new prediction can be made at hi+1 (s) • Sequential rehashing: hi+1 (s) = hi(s) mod n + 1 • Drawback: colliding entry accommodated elsewhere may lead to more collisions. This may lead to clustering of entries in the table.

Overflow Chaining • Avoids the problem of clustering by accommodating the entries in the separate table called the overflow table • Thus, a search which encounters a collision in the primary table has to be continued in the overflow table. • A pointer field is added in the primary and overflow tables Symbol

Other info

Pointer

Overflow Chaining • Drawback: extra memory requirement • An organization called as scatter table is often used to reduce the memory requirements. • The hash table contains only pointers and all the symbol table entries are stored in the overflow table

Related Documents

Hash
May 2020 6
Hash Code
June 2020 5
Organization
May 2020 20
Organization
May 2020 31

More Documents from "simply_coool"

June 2020 6