Hotsos Symposium 2005
Memory Management and Latching Improvements in Oracle9i and 10g
Tanel Põder, independent consultant
http://integrid.info
integrid.info
1/34
Introduction & About
• Name: Tanel Põder
• Oracle experience: 7 years
• Occupation: independent consultant
• Company: integrid.info
• Member of the OakTable Network
• Oracle Certified Master DBA
• Europe, Middle East & Africa Oracle User Group board member
Performance and Scalability
• Performance
  – Avoiding unnecessary work
• Scalability
  – Avoiding blocking others doing the same operation (serialization)
• Sometimes sacrifices are needed in one area to gain in another
  – Direct-path insert
  – Redo copy latches
Speed of Data Access
• CPU registers
• CPU cache
  – Caches code and data; also the TLB for virtual memory management
• RAM
  – Directly accessible to local processes
• External storage / network
  – May have cache on the controller or disk array
  – Access requires system calls
Memory As an Intermediate Workarea
• It needs to be used as efficiently as possible
  – Even though memory is "cheap"
  – Allocating precisely the required amounts – heap management
  – Reusing unused memory – LRU and free lists
• Memory must be protected from corruption and inconsistent reads in a multiuser environment
  – Some kind of atomic locking/serialization is needed
  – Oracle uses latches for this
Oracle Memory Management
• Kernel Shared Memory – KSM
  – KSMG – kernel shared memory granule allocation
• Generic Heap Manager – KGH
  – Shared Pool
    • Shared, so concurrently accessible by many processes – latching is needed
    • Contains the library cache, which is managed by KGL
• PGA
  – No shared access, usually private memory
  – No latching needed
Dynamic SGA
• SGA_MAX_SIZE defines the maximum size of the SGA
  – All virtual memory is allocated immediately (not physical memory)
  – Beware when using ISM, lock_sga, _lock_sga_areas or pre_page_sga
  – Problems with non-functioning DISM
• Memory is allocated in granules
  – 4 or 16 MB on Unix, depending on whether sga_max_size > 128 MB
  – 4 or 8 MB on Windows
Automatic SGA Tuning
• SGA_TARGET parameter specifies the total SGA size
• The MMAN process acts as the SGA memory broker
• May resize various components of the SGA
• DBAs should be very cautious with this feature
  – Does it give any benefit?
  – What if MMAN decides to reduce (and flush) half of the buffer cache during heavy operations?
  – Even more dangerous with the shared pool...
Realfree PGA Heap Management
• _use_realfree_heap=true makes Oracle allocate CGA and UGA heaps independently (as top-level heaps), not from the PGA heap
  – New feature in 9.2
  – Set automatically to true when pga_aggregate_target is set
  – Default = true in 10g
• mmap() is used instead of malloc() and brk()
• _realfree_heap_pagesize_hint (10g)
  – Bytes to preallocate at a time
• _use_ism_for_pga (10g)
  – Use ISM for large PGA extent allocations
Realfree Heap Mgmt 2
• ksmarfg() and ksmfrfg() internal functions are used for realfree heap allocation
• strace (truss) output of PGA memory allocation:

mmap2(NULL, 1048576, .., MAP_PRIVATE|MAP_NORESERVE, 8, 0xb6) = 0xb68c6000
mmap2(0xb68c6000, 65536, .., MAP_PRIVATE|MAP_FIXED, 8, 0) = 0xb68c6000
mmap2(0xb68d6000, 65536, .., MAP_PRIVATE|MAP_FIXED, 8, 0) = 0xb68d6000
mmap2(0xb68e6000, 65536, .., MAP_PRIVATE|MAP_FIXED, 8, 0) = 0xb68e6000
mmap2(0xb68f6000, 65536, .., MAP_PRIVATE|MAP_FIXED, 8, 0) = 0xb68f6000
mmap2(0xb6906000, 65536, .., MAP_PRIVATE|MAP_FIXED, 8, 0) = 0xb6906000

The first call reserves a large private extent at an address of the kernel's choosing (length, then flags, then the file descriptor to map); the following calls commit 64 KB pieces at fixed addresses within it. File descriptor 8 is mapped from /dev/zero:

$ ls -l /proc/13786/fd/8
lr-x------ 1 oracle dba 64 Nov 3 01:41 /proc/13786/fd/8 -> /dev/zero
• _pga_large_extent_size, _uga_cga_large_extent_size
  – Set the large extent size for the initial mmap()
_use_realfree_heap=true
• Heapdump level 1 (PGA)

HEAP DUMP heap name="top call heap" desc=0xbb56660
 extent sz=0x213c alt=96 het=32767 rec=0 flg=2 opc=2
 parent=(nil) owner=(nil) nex=(nil) xsz=0x7fffc
EXTENT 0 addr=0xb6946004
  Chunk b694600c sz=    948 freeable "callheap       " ds=0xbb56f80
  Chunk b69463c0 sz=    208 perm     "perm           " alo=208
  Chunk b6946490 sz= 129904 free     "               "
EXTENT 1 addr=0xb6a80004
  Chunk b6a8000c sz=    424 perm     "perm           " alo=424
  Chunk b6a801b4 sz= 107268 free     "               "
  Chunk b6a9a4b8 sz=   4164 freeable "callheap       " ds=0xbb56f80
  Chunk b6a9b4fc sz=   4164 freeable "callheap       " ds=0xbb56f80
  Chunk b6a9c540 sz=   1352 freeable "callheap       " ds=0xbb56f80
  Chunk b6a9ca88 sz=   1072 freeable "callheap       " ds=0xbb56f80
_use_realfree_heap=false
• Heapdump level 1

HEAP DUMP heap name="pga heap" desc=0xbb58c80
 extent sz=0x213c alt=88 het=32767 rec=0 flg=3 opc=2
 parent=(nil) owner=(nil) nex=(nil) xsz=0xe23c
EXTENT 0 addr=0xbc87c80
...
EXTENT 11 addr=0xbc30d88
  Chunk bc30d90 sz=   3140 freeable "session heap   " ds=0xbbd3884
  Chunk bc319d4 sz=    124 free     "               "
  Chunk bc31a50 sz=   1072 freeable "callheap       " ds=0xbb56f80
  Chunk bc31e80 sz=   4164 freeable "callheap       " ds=0xbb56f80
...
EXTENT 16 addr=0xbbf2e60
  Chunk bbf2e68 sz=   4336 free     "               "
  Chunk bbf3f58 sz=   4164 freeable "callheap       " ds=0xbb56f80
Automatic PGA Management
• workarea_size_policy
  – Set to manual for sessions needing manual workarea control (*_area_size parameters)
• pga_aggregate_target
• _smm_max_size – 5% of total aggregate PGA
• _smm_px_max_size – 30% of total aggregate PGA
• _smm_retain_size (in kB)
  – Workarea memory to retain for shared server sessions
• _smm_freeable_retain (in kB)
  – Workarea memory to retain for dedicated servers
V$PGASTAT

SQL> select * from v$pgastat;

NAME                                      VALUE      UNIT
----------------------------------------- ---------- -------
aggregate PGA target parameter              25165824 bytes
aggregate PGA auto target                    4194304 bytes
global memory bound                          1257472 bytes
total PGA inuse                             36829184 bytes
total PGA allocated                         85028864 bytes
maximum PGA allocated                       86201344 bytes
total freeable PGA memory                    3276800 bytes
PGA memory freed back to OS                   524288 bytes
total PGA used for auto workareas                  0 bytes
maximum PGA used for auto workareas           423936 bytes
total PGA used for manual workareas                0 bytes
maximum PGA used for manual workareas              0 bytes
over allocation count                           1136
bytes processed                             15126528 bytes
extra bytes read/written                           0 bytes
cache hit percentage                             100 percent
X$QESMMSGA

SQL> select qesmmsganm name, qesmmsgamu multiplier,
  2         qesmmsgavl value, qesmmsgaun unit
  3  from x$qesmmsga where qesmmsgavs = 0;

NAME                                      MULTIPLIER      VALUE       UNIT
----------------------------------------- ---------- ---------- ----------
total expected memory                           1024          0          0
drifting from expected memory                   1024          0          0
PGA max size parameter                          1024     204800          0
workarea SGA retain size                        1024         40          0
active processes                                   1         19          2
max active processes                               1         20          2
percentage freeable memory to be released        100          0          3
recompute count (queries)                          1          0          2
recompute count (total)                            1        715          2
last recompute time                                1        131          1
maximum recompute time                             1        208          1
cumulated IMM deamon time                          1      89482          1
maximum IMM deamon time                            1        296          1
cumulated v$pga_advice simulation time             1       2731          1
workarea simulated count                           1          0          2
interrupt v$pga_advice simulation count            1          0          2
skip v$pga_advice simulation count                 1          0          2
BUG count in v$pga_advice                          1          0          2
ISM in Solaris
• Intimate Shared Memory
• Allocated in 4 MB pages – fewer PTEs needed
• Locked into physical memory
  – Therefore the virtual-to-physical memory mapping information – page table entries – can be shared among all processes using the shared segment
  – Be careful with SGA_MAX_SIZE (or use DISM)
• VM mappings are likely still usable in the CPU's TLB cache after a context switch
• Statistic: "OS Other system trap CPU time"?
Hugepages In Linux
• Used with use_indirect_data_buffers=true
• VM is allocated in 2-4 MB pages
• Fewer PTEs / less TLB usage – better CPU cache utilization possible
• Can't be paged out
• Requires physically contiguous memory for pages

$ cat /proc/meminfo
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     4096 kB

• Available in kernel 2.6 and in RHAS 3.0's 2.4.21 kernel
Relieving Shared Pool Contention
• If all else fails...
• The shared pool can be split into multiple separate areas, each protected by a different latch
  – Happens if shared pool > 256M and cpu_count >= 4
  – Can be set using the _kghdsidx_count parameter (max. value 7)
  – x$kghlu shows one row per shared pool area (+1 for the java pool, if defined)
  – Every shared pool area is protected by its own shared pool child latch (max 7 latches)
Relieving Shared Pool Contention
• Streams pool – streams_pool_size
  – If not defined, streams objects are stored in the shared pool
    • May take up to 10% of the total shared pool
    • And max 50% of a single shared pool area (9.2.0.5)
• Use the large pool if needed (naturally)
  – _large_pool_min_alloc defaults to 16000
  – Fragmentation is relieved by aligning allocations to fixed boundaries dependent on the above parameter
Redo Logging
• Old feature: redo copy latches
  – _log_simultaneous_copies
  – Acquired in immediate mode
  – Cycled through all latches if needed
• Multiple redo log strands for 16+ CPUs
  – log_parallelism (9i; in 10g _log_parallelism)
  – Single log buffer, but...
  – Multiple redo allocation child latches
  – Acquired in immediate mode
  – For a log switch, all redo allocation latches have to be acquired
Redo Logging 2
• Dynamic parallelism in 10g
  – _log_parallelism_max
  – _log_parallelism_dynamic = true | false

SQL> select name, gets, misses, immediate_gets ig, immediate_misses im
  2  from v$latch_children where name = 'redo allocation';

NAME                       GETS     MISSES         IG         IM
-------------------- ---------- ---------- ---------- ----------
redo allocation            2799          8     398127          4
redo allocation             136          0          0          0
redo allocation              77          0          0          0
redo allocation             207          0          0          0
redo allocation              57          0          0          0
redo allocation              57          0          0          0
...
redo allocation              24          0          0          0

31 rows selected.
Redo Logging 3
• Private redo log strands (zero-copy redo)
• No latching is needed for redo entry writing, since log space is preallocated
  – _log_private_parallelism = true | false
  – _log_private_parallelism_mul (10% of sessions?)
  – _log_private_mul (5% of log buffer is private)
• Issues
  – Incompatibilities with redo-log-dependent tools: LogMiner, Logical Standby, Streams
  – Currently used for benchmarks only
In-Memory Undo (IMU)
• Undo information is saved in memory areas protected by separate "In memory undo" latches, not in undo segments
  – No expensive buffer gets and changes
  – Can be downgraded to a regular transaction (flushed to undo segments)
  – Can also be used in conjunction with private redo strands
• Could be set for recursive transactions as well – using a hidden parameter
  – _recursive_imu_transactions (false in 10g)
In-Memory Undo 2
• _in_memory_undo – 10g new feature, true by default
• _imu_pools – in-memory undo pools
• _db_writer_flush_imu – if false, DBWR will not downgrade IMU transactions

SQL> select name, gets, misses, immediate_gets
  2  from v$latch_children where name like '%undo%';

NAME                         GETS     MISSES IMMEDIATE_GETS
---------------------- ---------- ---------- --------------
undo global data             4618          0              0
In memory undo latch         2568          0            302
In memory undo latch          656          0             18
In memory undo latch           19          0              4
In memory undo latch            4          0              0
In memory undo latch            3          0              0
In-Memory Undo 3

SQL> select * from v$sysstat;

NAME                                                 CLASS      VALUE
----------------------------------------------- ---------- ----------
user commits                                             1     100039
commit cleanouts                                         8     100217
commit cleanouts successfully completed                  8     100215
IMU commits                                            128      97625
...
IMU Flushes                                            128          8
IMU contention                                         128          7
IMU recursive-transaction flush                        128          0
IMU undo retention flush                               128          0
IMU ktichg flush                                       128          0
IMU bind flushes                                       128          0
IMU mbu flush                                          128          0
IMU pool not allocated                                 128       2407
IMU CR rollbacks                                       128          0
IMU undo allocation size                               128  102861548
IMU Redo allocation size                               128       7136
IMU- failed to get a private strand                    128       2407
Latching
• A latch is a memory structure used for atomic locking and concurrency control
• Latches are more than just one-bit switches
  – They contain additional bits storing various control information
• Latches are aligned to match the hardware cache line size
  – A latch takes 32 bytes (platform specific)
  – Important on SMP systems – changing one latch's state won't invalidate any other latch's status in another CPU's cache
Shared Latches
• Feature actually available since 8.0
• Used in a few locations (AQ)
• Maintained using the compare-and-swap (CAS) CPU instruction (or a separate "cas latch")
  – Zero holders – latch lock value is zero (L=0)
  – First shared-mode get increments it by 1 (L=1)
  – Second shared get increments it by 1 (L=2)
  – A latch free decrements it by 1 (L=1)
• Exclusive gets have to wait until L = 0
• The principle is similar to Unix semaphores
Shared CBC Latches
• Reduce cache buffers chains latch contention:
  1. Locate the hash bucket using ROWID & block type
  2. Get the CBC latch in shared mode
  3. Scan through the hash chain to find the best suitable buffer header
  4. Upgrade the CBC latch to exclusive mode
  5. Pin the buffer header in shared mode
  6. Release the CBC latch
  7. Copy the buffer contents into the PGA
  8. Get the CBC latch in exclusive mode
  9. Unpin the buffer header
  10. Release the CBC latch
  11. Read and evaluate rows from the buffer's consistent copy in the PGA
Shared CBC Latches 2
• Exclusive CBC latch gets are required despite the "read-only" operation

SQL> select * from v$sysstat where name like 'shared hash latch%';

STATISTIC# NAME                                      VALUE
---------- ------------------------------------ ----------
       103 shared hash latch upgrades - no wait 2565999715
       104 shared hash latch upgrades - wait             0

• Compare-and-swap latching can be emulated

SQL> select latch#, name, gets, immediate_gets from v$latch
  2  where name like '%cas latch%'
  3  /

    LATCH# NAME                         GETS IMMEDIATE_GETS
---------- ---------------------- ---------- --------------
       207 cas latch                       0              0
       208 rm cas latch                    0              0
SCN Updating
• A normal SCN consists of 6 bytes
  – 4 bytes base
  – 2 bytes wrap#
• On 32-bit systems this allowed latchless SCN incrementation (CAS update)
  – When the 4-byte base capacity has been used, a latch is acquired, the wrap# is updated and the base is zeroed
• Modern Intel 32-bit chips allow latchless 64-bit atomic updates
  – Probably other platforms as well
SCN Updating 2
• Latchless updates are not so latchless... :(
  – 10000 single-row updates with commits in a loop
  – One of the reasons: when > 255 changes are made within a block under the same SCN, the SCN has to be increased (seq# in the block header)

NAME                                 GETS     MISSES     SLEEPS
------------------------------ ---------- ---------- ----------
mostly latch-free SCN              237937         14         14
lgwr LWN SCN                        37566          0          0
redo on-disk SCN                   117268          0          0

• redo on-disk SCN latch gets remain zero if
  – _disable_latch_free_SCN_writes_via_64cas = false (default)
New Wait Events In 10g

SQL> select name from v$event_name where name like 'latch:%' ...

NAME
----------------------------------------
latch: In memory undo latch
latch: KCL gc element parent latch
latch: MQL Tracking Latch
latch: cache buffer handles
latch: cache buffers chains
latch: cache buffers lru chain
latch: checkpoint queue latch
latch: enqueue hash chains
latch: gcs resource hash
latch: ges resource hash list
latch: latch wait list
latch: library cache
latch: library cache lock
latch: library cache pin
latch: messages
...

26 rows selected.

• New events also exist for enqueues...
Summary
• Lots of new features get implemented in each new release
  – Without our knowing
  – Undocumented, but here to stay
• Usage depends on hardware and OS facilities
  – So raw CPU speed might not be the most important factor after all in some cases
• Oracle and Unix provide diagnostic utilities for testing each feature's impact
  – v$ views, dumps, truss, switchable parameters, etc.
Memory Management and Latching Improvements In Oracle9i and 10g

Questions? Thank you!

http://integrid.info
[email protected]