Hash Join Internals
Unix + Oracle =
Ixora
Steve Adams
[email protected]
Hash Functions • A hash function maps arbitrary key values to integer “hash values” • Data is stored in an array based on the hash values • Data access is very efficient • just compute the hash value of the key and lookup the corresponding array element
Unix + Oracle =
Ixora
Steve Adams
[email protected]
Hash Buckets • Too many possible hash values • hash tables would be large and sparse
• Hash values are mapped to hash buckets using the mod() function • 100056732 % 100 = 32
• Each hash table element is a “bucket” • Hash values that map to the same bucket are stored on a “collision chain” Unix + Oracle =
Ixora
Steve Adams
[email protected]
Numeric keys • Key data itself is the hash value • Hash bucket is computed using MOD • binary hashes are cheap to compute • just a bitwise SHIFT operation
• prime number hashes randomize the distribution of keys to hash buckets • this prevents uneven or “skew” distributions
Unix + Oracle =
Ixora
Steve Adams
[email protected]
Non-numeric Keys • Internal hash function to get hash value • examples: • ‘Adams’ 3016007180 • ‘Millsap’ 1765538108
• DBMS_UTILITY.GET_HASH_VALUE()
• Binary MOD() used to map hash values to hash buckets
Unix + Oracle =
Ixora
Steve Adams
[email protected]
Hash Tables
Key Value
Hash Function
Hash Value
Hash Bucket Number
Hash Table hash bucket headers
Bucket 1
Hash Value
collision chains
Bucket 2
Bucket 3
Bucket 4
Unix + Oracle =
Ixora
Key
Bucket 7
Bucket 8
Hash Value
Key
Hash Value
Key
Key
Key Hash Value
Hash Value
Bucket 6
Key Hash Value
Hash Value
Bucket 5
Hash Value
Key
Key
Steve Adams
[email protected]
Hash Joins • Concept • read first row source and build a hash table • read second row source and join via hash table • applicable to (in)-equality joins & CBO only
• Approach • first “partition” both inputs by hash value • for corresponding pairs of partitions • build an in-memory hash table from one input • probe the hash table with rows from the other Unix + Oracle =
Ixora
Steve Adams
[email protected]
Hash Table Partitioning Hash Table Data hash bucket bitmap
hash table partition buffers
Hash Bucket Bitmap
Partition 1
Partition 2
Partition 3
Partition 4
saved partition extents Unix + Oracle =
Ixora
Steve Adams
[email protected]
Bit Vector Filtering • When partitioning the first input • build a bitmap of non-empty hash buckets
• When partitioning the second input • check the hash bucket bitmap • keys that map to empty hash buckets cannot be joined • for equality joins these rows can be immediately excluded
Unix + Oracle =
Ixora
Steve Adams
[email protected]
Partition Histogram • Partition sizes will be uneven if the data distribution is skew • Partition histogram records for each partition pair • number of keys • bytes of memory required
• Allows dynamic role reversal • for each partition, the hash table is built from the input with the smaller memory requirement
• Allows optimum memory use when joining multiple partitions simultaneously Unix + Oracle =
Ixora
Steve Adams
[email protected]
Hash Join Processing • Phases • initialization (planning memory use) • build input partitioning • probe input partitioning • may begin to return rows
• joining of saved partitions • may require sub-partitioning of large partitions
• Hash area memory used differently in each phase Unix + Oracle =
Ixora
Steve Adams
[email protected]
Build Input Partitioning Hash Area Join Key
Non-Key Columns
Input Buffers Partition Histogram
Hash Function Partition Buffers
Hash Partition Number
Hash Value
Hash Bucket Number
Unix + Oracle =
Ixora
Join Key
Non-Key Columns
Hash Bucket Bitmap
Steve Adams
[email protected]
Probe Input Partitioning Hash Area Input Buffers Partition Histogram
Hash Function
Hash Table
Output Rows
Partition Buffers Hash Bucket Bitmap
Unix + Oracle =
Ixora
Steve Adams
[email protected]
Joining Saved Partitions Hash Area Input Buffers Partition Histogram
Hash Table
Unix + Oracle =
Ixora
Output Rows
Steve Adams
[email protected]
Demonstration • Event 10104 to show hash join internals
Unix + Oracle =
Ixora
Steve Adams
[email protected]