Chapter 19 Database Recovery Techniques
19.1 p 611-617
Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Copyright © 2004 Pearson Education, Inc.
Note These slides have been edited (improved by adding figures and symbols, and in minor ways) by Dr. JDBrownsmith
Outline of 19.1 Recovery Concepts
19.1.1 Recovery Overview 19.1.2 Caching (Buffering) of Disk Blocks 19.1.3 Write-Ahead Logging 19.1.4 Checkpoints 19.1.5 Rollback
Chapter 19-4
19.1.1 Recovery Overview Purpose of Database Recovery • To bring the database into the last consistent state, which existed prior to the failure. • To preserve transaction ACID properties (Atomicity, Consistency, Isolation and Durability). Example: If the system crashes before a fund transfer transaction completes its execution, then either one or both accounts may have incorrect value. Thus, the database must be restored to the state before the transaction modified any of the accounts. p 612 Chapter 19-5
19.1.1 Recovery Overview Types of Failure The database may become unavailable for use due to: • Transaction failure: Transactions may fail because of incorrect input, deadlock, incorrect synchronization. • System failure: System may fail because of addressing error, application error, operating system fault, RAM failure, etc. • Media failure: Disk head crash, power disruption, etc.
Chapter 19-6
Failure Challenge 3 database
Application code
DBMS
memory
memory buffer
AFIM
Log
When a system failure occurs (e.g., a power outage without a backup power system), identify the parts of the system that are affected. Chapter 19-7
How to Recover from Catastrophic Failure tape archival storage DBMS
log
First, restore the back up copy of the database
Then, reapply the operations of committed transactions up to the point of failure.
What about the non-committed transactions?
p 630 Chapter 19-8
I tell you Mike, I can't understand why a committed transaction would have to be redone.
Chapter 19-9
How to Prepare for Recovery from Catastrophic Failure tape archival storage DBMS
log
Periodically copy the entire database to backup storage
Either stop processing and commit all transactions (not always possible) or backup the log p 630 Chapter 19-10
How to Recover from non-Catastrophic Failure
transaction log
DBMS
• Undo the operations that cause inconsistency • Redo operations to restore the database to a consistent state
Chapter 19-11
Transaction Log For recovery from any type of failure, data values prior to modification (BFIM - BeFore Image) and the new value after modification (AFIM – AFter Image) are required. These values and other information are stored in a sequential file called Transaction log.
Sample Transaction Log T ID T1 T1 T2 T1 T1 T3 T1
Back P Next P Operation Data item Begin 0 1 1 4 Write X Begin 0 8 2 5 W Y 4 7 R M 0 9 R N 5 nil End
BFIM
AFIM
X = 100
X = 200
Y = 50 Y = 100 M = 200 M = 200 N = 400 N = 400
Back P and Next P point to the previous and next log records of the same transaction. Chapter 19-12
19.1.1 Recovery Overview
WHEN
• Immediate Update: As soon as a data item is modified in cache, the disk copy is updated. • Deferred Update: All modified data items in the cache are written either after a transaction ends its execution or after a fixed number of transactions have completed their execution.
WHERE
Database Data Update Techniques
• Shadow update: The modified version of a data item does not overwrite its disk copy but is written at a separate disk location. • In-place update: The disk version of the data item is overwritten by the cache version. Chapter 19-13
19.1.1 Recovery Outline Database Data Update Techniques • Immediate Update: As soon as a data item is modified in cache, the disk copy is updated.
database
memory
3.
memory buffer
4.
data catalog
Chapter 19-14
Recall: Introduction to Transaction Processing 2. 3. 4. 5.
write_item(X) command includes the following steps: Find the address of the disk block that contains item X. Copy that disk block into a buffer in main memory (if that disk block is not already in some main memory buffer). Copy item X from the program variable named X into its correct location in the buffer. Store the updated block from the buffer back to disk (either immediately or at some later point in time). 1.
Application code memory
database
DBMS 3.
memory buffer
2.
data catalog
4. Chapter 19-15
Write_item Challenge #14 1.
Application code memory
database
DBMS 3.
memory buffer
2.
data catalog
4.
What types of application activities would cause a memory write to the memory buffer? • SELECT - FROM - WHERE • INSERT INTO ... • DELETE FROM ... • UPDATE ... • CREATE TABLE ... Chapter 19-16
Immediate Update • Immediate Update:
As soon as a data item is modified in cache, the disk copy is updated.
Application code memory
3.
DBMS
database
memory buffer
data
4.
catalog
Something is missing from this picture. What is missing? Chapter 19-17
Immediate Update Immediate Update:
As soon as a data item is modified in cache, the disk copy is updated.
Application code memory
3.
DBMS
database
memory buffer
data
"Force writing" of the transaction log
5. 4.
catalog
Log Chapter 19-18
Deferred Update Deferred Update:
All modified data items in the cache are written to the database either after a transaction ends its execution or after a fixed number of transactions have completed their execution. Application code memory
3.
DBMS
database
memory buffer
data
During transaction commit, write the transaction log
5. 4.
catalog
Log Chapter 19-19
Immediate Update (when) with In-Place Update (where) Immediate Update: As soon as a data item is modified in cache, the disk copy is updated.
In-place update: The disk version of the data item is overwritten by the cache version. database
Application code
DBMS
memory
memory buffer
4. "Force writing" of the transaction log 4.
Log
AFIM 5. Where data is written to
What prevents other transactions from getting this data before the transaction is committed? Chapter 19-20
Immediate Update (when) with Shadow Update (where) Immediate Update: As soon as a data item is modified in cache, the disk copy is updated.
Shadow update: The modified version of a data item does not overwrite its disk copy but is written at a separate disk location. Application code
DBMS
memory
memory buffer
4. "Force writing" of 4. the transaction log
Log
database
5.
BFIM AFIM
Where data is written to
Is a write_lock needed on the BFIM? Chapter 19-21
Deferred Update (when) with In-Place Update (where)
Deferred Update: All modified data items in the cache are written to the database either after a transaction ends its execution or after a fixed number of transactions have completed their execution.
In-place update: The disk version of the data item is overwritten by the cache version. database
Application code
DBMS
memory
memory buffer
4. "Force writing" of the transaction log
4.
Log
AFIM 5. Where data is written to Chapter 19-22
Deferred Update (when) with Shadow Update (where) Deferred Update: All modified data items in the cache are written to the database either after a transaction ends its execution or after a fixed number
Shadow update: The modified version of a data item does not overwrite its disk copy but is written at a separate disk location.
Application code
DBMS
memory
memory buffer
4. "Force writing" of 4. the transaction log
Log
database
5.
BFIM AFIM
Where data is written to Chapter 19-23
DBMS Components
DBMS
Buffer Manager/ Cache Manager Recovery Manager Lock Manager The cache manager manages in-memory buffers. We previously called this the Buffy Manager. Chapter 19-24
19.1.2 Caching of Disk Blocks Data Caching Data items to be modified are first stored into database cache by the Cache Manager (CM) and after modification they are flushed (written) to the disk. The flushing is controlled by Modified and Pin-Unpin bits. Cache Manager
main memory buffers database
Disk page DBMS cache
data catalog
For simplicity, we assume that a buffer contains a single disk page. Chapter 19-25
19.1.2 Caching of Disk Blocks cache directory disk page address, buffer location, dirty bit, pin-unpin bit • Pin-Unpin bit: if pinned, the page can not be written to disk yet. • Modified bit: Indicates the AFIM of the data item. Cache Manager
main memory buffers database
Disk page DBMS cache
data catalog
Chapter 19-26
19.1.3 Write-Ahead Logging (WAL) When in-place update (immediate or deferred) is used, then a log is necessary for recovery and it must be available to recovery manager. This is achieved by the Write-Ahead Logging protocol.
memory buffer "force writing" of the transaction log
Log
p 614 Chapter 19-27
Immediate Update (when) with In-Place Update (where) and WAL Immediate Update: As soon as a data item is modified in cache, the disk copy is updated.
In-place update: The disk version of the data item is overwritten by the cache version. database
Application code
DBMS
memory
memory buffer
4. "Force writing" of 4. the transaction log (for Undo)
Log
AFIM 5. Where data is written to
BFIM Chapter 19-28
19.1.3 Write-Ahead Logging Write-Ahead Logging states that: Before a data item’s AFIM is flushed to the database disk (overwriting the BFIM) its BFIM must be written to the log and the log must be saved on a stable store (log disk). This data is needed for Undo. (See previous slide). Before a transaction executes its commit operation, all its AFIMs must be written to the log and the log must be saved on a stable store. This data is needed for Redo. (See next slide). p 614 Chapter 19-29
Deferred Update (when) with In-Place Update (where) and WAL
Deferred Update: All modified data items in the cache are written to the database either after a transaction ends its execution or after a fixed number of transactions have completed their execution.
In-place update: The disk version of the data item is overwritten by the cache version. database
Application code
DBMS
memory
memory buffer
4. on commit, "Force writing" of the transaction log AFIM for Redo
4.
Log
AFIM 5. Where data is written to
BFIM, AFIM Chapter 19-30
Recovery Challenge 12 (Using immediate, in-place update) database
Application code memory
DBMS 3.
4. on commit, "Force writing" of the transaction log AFIM for Redo
memory buffer 4.
Log
AFIM 5. Where data is written to
BFIM, AFIM
...Assume only one transaction is in progress...
T107 at step 3,4,5 - update the buffer, log BFIM, database AFIM T107 at step 4 commit - write the log (AFIM for T3) T107 at step 4 commit - write to log completed ---- System Failure ---What does the DBMS do upon startup? Chapter 19-31
Recovery Challenge 12 (Using immediate, in-place update) ...Assume only one transaction is in progress...
T107 at step 3,4,5 - update the buffer, log BFIM, database AFIM T107 at step 4 commit - write the log (AFIM for T3) T107 at step 4 commit - write to log completed ---- System Failure ---What does the DBMS do upon startup?
T107 - Although the log says committed, the DBMS does not know that the crash did not happen after the log was written and before the database was updated. Thus it must undo this transaction (to remove the database AFIM), and will re-do (completely) it from the data in the log. Chapter 19-32
19.1.3 Steal/No-Steal and Force/No-Force Possible ways for flushing database cache to database disk: Steal: Cache can be flushed before transaction commits. Reason: Buffer Manager needs a free buffer. No-Steal: Cache cannot be flushed before transaction commit. Force: Cache is immediately flushed (forced) to disk. No-Force: Cache is deferred until transaction commits. Typical database systems employ a steal/no-force strategy p 614 Chapter 19-33
java.nio.MappedByteBuffer A direct byte buffer whose content is a memory-mapped region of a file. public final MappedByteBuffer force() Forces any changes made to this buffer's content to be written to the storage device containing the mapped file. If the file mapped into this buffer resides on a local storage device then when this method returns it is guaranteed that all changes made to the buffer since it was created, or since this method was last invoked, will have been written to that device. If the file does not reside on a local device then no such guarantee is made. If this buffer was not mapped in read/write mode (FileChannel.MapMode.READ_WRITE) then invoking this method has no effect. Chapter 19-34
19.1.4 Checkpoints Checkpointing From time to time (randomly or under some criteria) the database flushes its buffer to database disk to minimize the task of recovery. The following steps defines a checkpoint operation: 2. Suspend execution of transactions temporarily. 3. Force write all modified buffer data to disk. 4. Write a [checkpoint] record to the log, save the log to disk. 5. Resume normal transaction execution. During recovery redo or undo is required to transactions appearing after [checkpoint] record. p 615 Chapter 19-35
Transaction Log Sample Transaction Log T ID T1 T1 T2 T1 T1 T3 T1
Back P Next P Operation Data item Begin 0 1 1 4 Write X Begin 0 8 2 5 W Y 4 7 R M 0 9 R N 5 nil End
BFIM
AFIM
X = 100
X = 200
Y = 50 Y = 100 M = 200 M = 200 N = 400 N = 400
Problem: A failure has occurred. How far back in the transaction log do we search for transactions to redo?
Chapter 19-36
Transaction Log Sample Transaction Log T ID T1 T1 T2 T1 T1 T3 T1
Back P Next P Operation Data item Begin 0 1 1 4 Write X Begin 0 8 2 5 W Y 4 7 R M 0 9 R N 5 nil End checkpoint T1,T2,T3
BFIM
AFIM
X = 100
X = 200
Y = 50 Y = 100 M = 200 M = 200 N = 400 N = 400
The checkpoint log entry identifies all the transactions (Ts) that are active. On subsequent failure, redo those Ts active at the last checkpoint and any subsequent Ts which have a start and a commit record in the log. Undo all Ts which were active at the time of crash. Chapter 19-37
19.1.5 Transaction Rollback Transaction Roll-back (Undo) and Roll-Forward (Redo) To maintain atomicity, a transaction’s operations are redone or undone. Undo: Restore all BFIMs onto disk (Remove all AFIMs). Redo: Restore all AFIMs onto disk. Database recovery is achieved either by performing only Undos or only Redos or by a combination of the two. These operations are recorded in the log as they happen. p 616 Chapter 19-38
19.1.5 Transaction Rollback Roll-back We show the process of roll-back with the help of the following three transactions T1, and T2 and T3.
T1 read_item (A) read_item (D) write_item (D)
T2 read_item (B) write_item (B) read_item (D) write_item (A)
T3 read_item (C) write_item (B) read_item (A) write_item (A)
p 617 Chapter 19-39
19.1.5 Transaction Rollback Roll-back: One execution of T1, T2 and T3 as recorded in the log. A 30 [start_transaction, T3] [read_item, T3, C] * [write_item, T3, B, 15, 12] [start_transaction,T2] [read_item, T2, B] ** [write_item, T2, B, 12, 18] [start_transaction,T1] [read_item, T1, A] [read_item, T1, D] [write_item, T1, D, 20, 25] [read_item, T2, D] ** [write_item, T2, D, 25, 26] [read_item, T3, A]
B 15
C 40
D 20
12 18
---- system crash ---* T3 is rolled back because it did not reach its commit point. ** T2 is rolled back because it reads the value of item B written by T3.
25 26
p 617 Chapter 19-40
19.1.5 Transaction Rollback Roll-back: One execution of T1, T2 and T3 as recorded in the log. T3
READ(C) WRITE(B)
BEGIN
READ(B)
T2 BEGIN
READ(A) WRITE(B)
READ(D) WRITE(D) READ(A) READ(D) WRITE(D)
T1 BEGIN
Time system crash
Illustrating cascading roll-back p 617 Chapter 19-41
Figure19.1 Illustrating cascading rollback (a process that never occurs in strict or cascadeless schedules). (a) The read and write operations of three transactions. (b) System log at point of crash.
Chapter 19-42
Notes
Chapter 19-43
Checkpoint Challenge 6 Log ... checkpoint T21
database
T22 write X T23 write Y commit T22
memory failure! buffer wiped out!
data catalog
Show why it is necessary to redo T22. What is to happen to T21? Why must T23 be undone? How does it get started again? During recovery, another failure occurs. Now what? Chapter 19-44
Notes
Chapter 19-45