Adams - Hash Joins Oracle

  • Uploaded by: rockerabc123
  • 0
  • 0
  • May 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Adams - Hash Joins Oracle as PDF for free.

More details

  • Words: 693
  • Pages: 15
Hash Join Internals

Unix + Oracle =

Ixora

Steve Adams [email protected]

Hash Functions • A hash function maps arbitrary key values to integer “hash values” • Data is stored in an array based on the hash values • Data access is very efficient • just compute the hash value of the key and lookup the corresponding array element

Unix + Oracle =

Ixora

Steve Adams [email protected]

Hash Buckets • Too many possible hash values • hash tables would be large and sparse

• Hash values are mapped to hash buckets using the mod() function • 100056732 % 100 = 32

• Each hash table element is a “bucket” • Hash values that map to the same bucket are stored on a “collision chain” Unix + Oracle =

Ixora

Steve Adams [email protected]

Numeric keys • Key data itself is the hash value • Hash bucket is computed using MOD • binary hashes are cheap to compute • just a bitwise SHIFT operation

• prime number hashes randomize the distribution of keys to hash buckets • this prevents uneven or “skew” distributions

Unix + Oracle =

Ixora

Steve Adams [email protected]

Non-numeric Keys • Internal hash function to get hash value • examples: • ‘Adams’  3016007180 • ‘Millsap’  1765538108

• DBMS_UTILITY.GET_HASH_VALUE()

• Binary MOD() used to map hash values to hash buckets

Unix + Oracle =

Ixora

Steve Adams [email protected]

Hash Tables

Key Value

Hash Function

Hash Value

Hash Bucket Number

Hash Table hash bucket headers

Bucket 1

Hash Value

collision chains

Bucket 2

Bucket 3

Bucket 4

Unix + Oracle =

Ixora

Key

Bucket 7

Bucket 8

Hash Value

Key

Hash Value

Key

Key

Key Hash Value

Hash Value

Bucket 6

Key Hash Value

Hash Value

Bucket 5

Hash Value

Key

Key

Steve Adams [email protected]

Hash Joins • Concept • read first row source and build a hash table • read second row source and join via hash table • applicable to (in)-equality joins & CBO only

• Approach • first “partition” both inputs by hash value • for corresponding pairs of partitions • build an in-memory hash table from one input • probe the hash table with rows from the other Unix + Oracle =

Ixora

Steve Adams [email protected]

Hash Table Partitioning Hash Table Data hash bucket bitmap

hash table partition buffers

Hash Bucket Bitmap

Partition 1

Partition 2

Partition 3

Partition 4

saved partition extents Unix + Oracle =

Ixora

Steve Adams [email protected]

Bit Vector Filtering • When partitioning the first input • build a bitmap of non-empty hash buckets

• When partitioning the second input • check the hash bucket bitmap • keys that map to empty hash buckets cannot be joined • for equality joins these rows can be immediately excluded

Unix + Oracle =

Ixora

Steve Adams [email protected]

Partition Histogram • Partition sizes will be uneven if the data distribution is skew • Partition histogram records for each partition pair • number of keys • bytes of memory required

• Allows dynamic role reversal • for each partition, the hash table is built from the input with the smaller memory requirement

• Allows optimum memory use when joining multiple partitions simultaneously Unix + Oracle =

Ixora

Steve Adams [email protected]

Hash Join Processing • Phases • initialization (planning memory use) • build input partitioning • probe input partitioning • may begin to return rows

• joining of saved partitions • may require sub-partitioning of large partitions

• Hash area memory used differently in each phase Unix + Oracle =

Ixora

Steve Adams [email protected]

Build Input Partitioning Hash Area Join Key

Non-Key Columns

Input Buffers Partition Histogram

Hash Function Partition Buffers

Hash Partition Number

Hash Value

Hash Bucket Number

Unix + Oracle =

Ixora

Join Key

Non-Key Columns

Hash Bucket Bitmap

Steve Adams [email protected]

Probe Input Partitioning Hash Area Input Buffers Partition Histogram

Hash Function

Hash Table

Output Rows

Partition Buffers Hash Bucket Bitmap

Unix + Oracle =

Ixora

Steve Adams [email protected]

Joining Saved Partitions Hash Area Input Buffers Partition Histogram

Hash Table

Unix + Oracle =

Ixora

Output Rows

Steve Adams [email protected]

Demonstration • Event 10104 to show hash join internals

Unix + Oracle =

Ixora

Steve Adams [email protected]

Related Documents

Joins Oracle
June 2020 1
Oracle 10g Joins
July 2020 1
Joins
May 2020 14
Hash
May 2020 6

More Documents from ""