Backup and Recovery Optimization
Stephan Haisley, Center of Expertise, Oracle Corporation
Copyright Oracle Corporation, 2005. All rights reserved.
Objectives
• Provide a short introduction to Recovery Manager (RMAN)
• Explain and demonstrate factors that influence the speed of:
  – Backups
  – Restorations
  – Recoveries
• Give you some ideas of what to look at to make backup and recovery faster
What is RMAN?
• Introduced in 8.0
• Allows the DBA to manage backup and recovery operations with ease
• Built into the RDBMS kernel, so it can take advantage of kernel features (e.g. block checking)
• Can back up datafiles, controlfile, archivelogs and SPFILE
• Offers image copy or backupset
  – Image copy: byte-for-byte copy
  – Backupset: files multiplexed together into a proprietary format
Incremental Backups
• Differential

Day of the week             Sun  Mon  Tue  Wed  Thu  Fri  Sat
Incremental backup level    0    1    1    1    1    1    0

• Cumulative

Day of the week             Sun  Mon  Tue  Wed  Thu  Fri  Sat
Incremental backup level    0    1    1    1    1    1    0
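Both schedules look identical on paper. The difference is in what each level 1 backup contains: a differential level 1 holds the blocks changed since the most recent level 0 or level 1 backup, while a cumulative level 1 holds all blocks changed since the most recent level 0. The Python sketch below is illustrative only (`SCHEDULE` and `backups_to_restore` are names invented here, not RMAN features). It lists which backups a restore to a given day would need under each strategy.

```python
# Weekly schedule from the slide: level 0 on Sunday and Saturday, level 1 in between.
SCHEDULE = [("Sun", 0), ("Mon", 1), ("Tue", 1), ("Wed", 1),
            ("Thu", 1), ("Fri", 1), ("Sat", 0)]


def backups_to_restore(schedule, target_day, cumulative):
    """Return the days whose backups are needed to restore up to target_day."""
    days = [day for day, _ in schedule]
    target = days.index(target_day)
    # Most recent level 0 taken on or before the target day.
    base = max(i for i in range(target + 1) if schedule[i][1] == 0)
    if target == base:
        return [days[base]]
    if cumulative:
        # A cumulative level 1 holds every change since the level 0,
        # so only the latest one is needed on top of the base.
        return [days[base], days[target]]
    # A differential level 1 holds changes since the previous backup,
    # so every level 1 taken since the level 0 is needed.
    return days[base:target + 1]


print(backups_to_restore(SCHEDULE, "Thu", cumulative=False))
print(backups_to_restore(SCHEDULE, "Thu", cumulative=True))
```

For a restore to Thursday, the differential strategy needs the Sunday base plus four level 1 backups, while the cumulative strategy needs the Sunday base plus Thursday's level 1 only.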
Differential vs Cumulative
• Backup speeds:

Backup#  Type  Level  #blocks  Time (secs)  CPU (secs)
1        Base  0      778112   626          227.20
2        Diff  1      42375    312          82.93
3        Diff  1      42370    312          82.65
4        Diff  1      42369    312          82.45
5        Base  0      778112   628          226.09
6        Cumu  1      42371    314          80.61
7        Cumu  1      49605    315          83.70
8        Cumu  1      60176    321          85.33
Differential vs Cumulative
• Restore speeds:

Type          #Backup sets restored  Time (secs)  CPU (secs)
Base level 0  1                      626.67       210.85
Differential  3                      98.67        23.00
Base level 0  1                      629.33       209.21
Cumulative    1                      43.00        11.05

• Extra time on backup can save significant time on recovery!
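The trade-off can be put in numbers using the timings from the backup and restore tables in this section. The Python sketch below only does the arithmetic; the variable and function names are invented for illustration, and the figures apply to this test system only.

```python
# Timings in seconds, copied from the backup and restore tables in this section.
DIFF_BACKUPS = [312, 312, 312]   # three differential level 1 backups
CUMU_BACKUPS = [314, 315, 321]   # three cumulative level 1 backups
DIFF_INCREMENTAL_RESTORE = 98.67  # restoring 3 differential backup sets
CUMU_INCREMENTAL_RESTORE = 43.00  # restoring 1 cumulative backup set


def summarize():
    """Return (extra backup seconds, restore seconds saved) for cumulative."""
    extra_backup = sum(CUMU_BACKUPS) - sum(DIFF_BACKUPS)
    saved_restore = DIFF_INCREMENTAL_RESTORE - CUMU_INCREMENTAL_RESTORE
    return extra_backup, round(saved_restore, 2)


extra, saved = summarize()
print(f"Extra backup time for cumulative: {extra} s over three backups")
print(f"Incremental restore time saved:   {saved} s")
```

In these tests the cumulative strategy cost 14 seconds more across three backups and saved 55.67 seconds when restoring the incrementals.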
Backup and Restore Performance
• Backup and restore times can be influenced by:
  – Channel configuration
  – Size of memory buffers (read & write)
  – Speed of backup devices
  – Amount of data being backed up
  – Amount of block checking features enabled
  – Use of compression
Channel Configuration
• Match the number of channels to the number of backup devices
  – Manually allocate channels
  – Use automatic channel parallelism
• Avoid Media Management Layer (MML) multiplexing of backup sets
  – It increases restore times
• Leave some devices available for emergency restorations, so that these do not upset the other backup schedules
Channel Configuration
• Reducing filesperset can decrease the time taken by single-file restores:

Filesperset  BS Size (blks)  Restored file (blks)  Time (secs)  CPU (secs)
8            702320          97727                 132          39.42
4            658221          97727                 110          36.92
2            132773          97727                 82           29.92
1            97730           97727                 74           25.62
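One way to read the filesperset table is to compare the size of the backup set with the size of the one file being restored. The Python sketch below computes that ratio from the measured rows; `read_overhead` is a name invented here, and the interpretation (a larger backup set means more data to get through for the same restored file) is an assumption rather than something measured directly.

```python
# (filesperset, backup set size in blocks, restored file size in blocks, restore secs)
ROWS = [
    (8, 702320, 97727, 132),
    (4, 658221, 97727, 110),
    (2, 132773, 97727, 82),
    (1, 97730, 97727, 74),
]


def read_overhead(bs_blocks, file_blocks):
    """Blocks held in the backup set per block of the file being restored."""
    return bs_blocks / file_blocks


for filesperset, bs_blocks, file_blocks, secs in ROWS:
    ratio = read_overhead(bs_blocks, file_blocks)
    print(f"filesperset={filesperset}: backup set is {ratio:.2f}x the file, {secs} s")
```

With filesperset = 8 the backup set is roughly seven times the size of the restored file; with filesperset = 1 the two are almost the same size and the restore is the fastest of the four.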
Read and Write Memory Buffers
[Figure: Datafiles → Input buffers (4 per datafile) → Output buffers (4 per channel) → Backup device]
Size of Read Buffers
• Allocated according to the MAXOPENFILES channel parameter:

MAXOPENFILES          Buffer size
MAXOPENFILES ≤ 4      Each buffer = 1Mb; total buffer size for the channel is up to 16Mb
4 < MAXOPENFILES ≤ 8  Each buffer = 512Kb; total buffer size for the channel is up to 16Mb. The number of buffers per file depends on the number of files
MAXOPENFILES > 8      Each buffer = 128Kb, 4 buffers per file, so each file has a 512Kb buffer

• Let’s see how that looks in real life…
Size of Read Buffers
• Read buffer allocation for backups:

MAXOPENFILES  Buffer size (Kb)  #Buffers per file  Total buffer size (Mb)
2             1024              8                  16
4             512               8                  16
8             512               4                  16
10            128               4                  5

• Default values seem adequate, and they also limit the amount of memory used for input buffers
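The total column follows from the other three: files open × buffers per file × buffer size. The Python sketch below checks the observed rows, assuming the number of files open on the channel equals MAXOPENFILES (an assumption made here for illustration; `total_buffer_mb` is not an Oracle function).

```python
# (MAXOPENFILES, buffer size in Kb, buffers per file, reported total in Mb)
OBSERVED = [
    (2, 1024, 8, 16),
    (4, 512, 8, 16),
    (8, 512, 4, 16),
    (10, 128, 4, 5),
]


def total_buffer_mb(open_files, buffer_kb, buffers_per_file):
    """Total read buffer memory for one channel, in Mb."""
    return open_files * buffers_per_file * buffer_kb / 1024


for open_files, buffer_kb, buffers_per_file, reported_mb in OBSERVED:
    computed = total_buffer_mb(open_files, buffer_kb, buffers_per_file)
    print(f"MAXOPENFILES={open_files}: computed {computed} Mb, reported {reported_mb} Mb")
```

All four computed totals match the reported values, which also shows that going past eight open files reduced the read buffer memory per channel in this test.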
Size of Write Buffers
• Allocates four buffers per channel
  – Disk = 1Mb per buffer
  – SBT = 256Kb per buffer
• SBT is smaller due to the slower speed of tape devices
• Can see increased performance when increasing the size of tape buffers…

Total buffer size (Kb)  I/O Count  I/O Time (secs)
128                     60564      617.4
1024 (default)          7571       595.9
2048                    3786       505.3
Where is buffer memory allocated from?
• PGA if not using I/O slaves (use async I/O)
  – tape_asynch_io
  – disk_asynch_io
• Shared Pool if using I/O slaves (use if the OS does not support async I/O)
  – backup_tape_io_slaves
  – dbwr_io_slaves
• Large Pool if its size > 0 and using I/O slaves
Speed of Backup Devices
• Maximum speed of backup: min(disk read Mb/s, tape write Mb/s)
• Monitor v$backup_async_io / v$backup_sync_io for effective_bytes_per_second where type is INPUT or OUTPUT
  – If the transfer rate is slower than the device is capable of, look at OS-level data, CPU statistics, MML settings (compression?) and device settings (block size)
• Can slow down the backup to reduce the load on the I/O system:
  RMAN> configure channel device type sbt rate=1M;
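The min() rule gives a quick lower bound on backup duration. The Python sketch below applies it for a single channel, with an optional cap standing in for a configured rate limit. The function name and the sample figures are hypothetical and are not measurements from these tests.

```python
def estimated_backup_seconds(data_mb, disk_read_mb_s, tape_write_mb_s,
                             rate_limit_mb_s=None):
    """Best-case backup time: the slower of the two devices sets the pace."""
    speed = min(disk_read_mb_s, tape_write_mb_s)
    if rate_limit_mb_s is not None:
        # A configured rate limit caps the speed further.
        speed = min(speed, rate_limit_mb_s)
    return data_mb / speed


# Hypothetical figures: 6000 Mb to back up, disk reads 100 Mb/s, tape writes 40 Mb/s.
print(estimated_backup_seconds(6000, 100, 40))
print(estimated_backup_seconds(6000, 100, 40, rate_limit_mb_s=1))
```

If the observed effective_bytes_per_second is well below the slower device's rating, the bottleneck is probably elsewhere, which is where the OS, CPU and MML checks above come in.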
Amount of data being backed up
• Put static data into read-only tablespaces and back them up one time only
  – Make sure the backup is not purged from the MML catalog
• Use differential incrementals and monitor v$backup_datafile to identify files that do not change frequently
  – Reduce their backup frequency
• Avoid using datafiles with large amounts of free space
  – The whole datafile is scanned for a backup
Block Change Tracking
• Fast incremental backups introduced in 10g
• Uses a change tracking file to store bitmaps representing ranges of blocks in datafiles
• Size of tracking file ~1/30,000 the size of the database
• Overhead on database performance ~3% (in my TPCC tests)
• Performance gain for backups makes this bearable:

Fast Incrementals?  #Blocks in DB  #Blocks read  #Blocks in backup  Time (secs)
No                  404160         404160        36567              156
Yes                 404160         72832         37215              35
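The gain in the table can be expressed as two ratios: the fraction of the database that had to be read, and the resulting speedup. The Python sketch below computes both from the measured figures and adds a rough tracking file estimate from the ~1/30,000 ratio quoted above. The names are invented for illustration and the 300,000 Mb database size is hypothetical.

```python
# Figures copied from the fast incremental table in this section.
BLOCKS_IN_DB = 404160
WITHOUT_TRACKING = {"blocks_read": 404160, "secs": 156}
WITH_TRACKING = {"blocks_read": 72832, "secs": 35}

# Fraction of the database read when change tracking is enabled.
read_fraction = WITH_TRACKING["blocks_read"] / BLOCKS_IN_DB
# How many times faster the incremental backup completed.
speedup = WITHOUT_TRACKING["secs"] / WITH_TRACKING["secs"]


def tracking_file_size_mb(database_size_mb):
    """Rough tracking file size using the ~1/30,000 ratio quoted above."""
    return database_size_mb / 30000


print(f"Blocks read with tracking: {read_fraction:.1%} of the database")
print(f"Backup speedup: {speedup:.1f}x")
print(f"Tracking file for a 300,000 Mb database: ~{tracking_file_size_mb(300000):.0f} Mb")
```

In this test, change tracking cut the blocks read to about 18% of the database and made the incremental backup roughly 4.5 times faster.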
Amount of Block Checking Features Enabled
• Each type of block checking will increase time and CPU usage for backup and restoration:
  – Head and tail sanity check: makes sure key structures in the head match the tail
  – Block checksums: calculated and compared with the existing checksum
  – Logical structure checks: checks various block structures for consistency
• Tests showed time for database backup increased by ~1% and CPU usage by ~8%
  – BUT the extra checks confirm that the database is good on backup and then on restore
Backup Compression
• Backupset compression introduced in 10g
• Can reduce size of backupset by 80-90%
  – Saves space on backup media
  – Reduces amount of network traffic if the backup device is not local
• Increases CPU and time (as expected) for backup and restore
• Do NOT use along with MML compression
  – Time both types of compression and use the most suitable
Recovery Performance
• Recovery times can be influenced by:
  – Number of archivelogs/incrementals being applied
  – Number of datafiles needing recovery
  – Whether archivelogs are available on disk
  – Whether parallel recovery is used
  – General database performance
Number of archivelogs/incrementals being applied
• RMAN will choose to use incrementals over archivelogs
  – My tests showed restoring the incremental was ~17 times quicker than applying 20 archivelogs
  – Mileage will vary depending on backup / restore speeds, as previously discussed
• A previous slide showed a cumulative incremental restoring faster than differentials
• The higher the number of logfiles / incrementals required, the slower the recovery
Number of datafiles needing recovery
• Each data block that needs recovery must first be read into the buffer cache, and is then written back to disk by DBWR after redo is applied to it
• Reducing the number of files that are recovered reduces the overall work in the database and so speeds up recovery
  – Only restore and recover the files that NEED recovering
• If recovery is due to corruption, consider Block Media Recovery…
Block Media Recovery (BMR)
• RMAN will restore and apply recovery to the specified blocks only, leaving the rest of the datafile intact for normal use
• Significant reduction in recovery time compared with recovering the whole datafile, as long as few blocks are corrupt:

#Corrupt Blocks  Datafile recovery time (secs)  BMR Time (secs)
10               941                            145
99               925                            155
991              937                            219
5000             922                            616
10000            938                            1156

• Can be too much of a good thing!
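The table suggests a break-even point somewhere between 5000 and 10000 corrupt blocks, beyond which recovering the whole datafile is quicker. The Python sketch below estimates it by linear interpolation between the two measurements on either side. Linear growth between measurements is an assumption made here, `break_even_blocks` is an invented name, and the result applies to this test system only.

```python
# (corrupt blocks, datafile recovery secs, BMR secs) copied from the table above.
MEASUREMENTS = [
    (10, 941, 145),
    (99, 925, 155),
    (991, 937, 219),
    (5000, 922, 616),
    (10000, 938, 1156),
]


def break_even_blocks(rows):
    """Estimate the corrupt block count at which BMR stops being faster."""
    for (n0, df0, bmr0), (n1, df1, bmr1) in zip(rows, rows[1:]):
        gap0 = bmr0 - df0  # negative while BMR is faster
        gap1 = bmr1 - df1
        if gap0 < 0 <= gap1:
            # Interpolate linearly to where the gap crosses zero.
            return n0 + (n1 - n0) * (-gap0) / (gap1 - gap0)
    return None  # no crossover inside the measured range


print(f"Estimated break-even: ~{break_even_blocks(MEASUREMENTS):.0f} corrupt blocks")
```

On these figures the estimate lands at a little under 8000 corrupt blocks; below that BMR was faster, above it full datafile recovery was.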
Archivelogs available on disk?
• Avoid the RMAN restore times for archivelogs by keeping n days’ worth on disk
  – Depends on incremental strategy
  – Depends on available disk space
• Back up the most recent archivelogs to disk and then to tape at a later time
  – Take a backup of a backup (from 9i onwards)
Parallel Recovery
• By default Oracle will use a single process to carry out recovery, unless using parallel_automatic_tuning
  – Oracle will decide whether it is best to use parallel recovery and how many slave processes to use
• A single coordinator process reads the archivelogs
• Reading of data blocks and applying redo is split up amongst slave processes, each working on a range of blocks
• Will increase CPU usage and the need for DBWR to perform well
• Watch for waits on ‘PX Deq’ events
General Database Performance
• Recovery happens within the database, so a badly performing database will not help with recovery times
• Areas to look for improvement:
  – I/O → read and write intensive
  – DBWR performance → look for ‘free buffer waits’; use async I/O or DBWR slaves
  – CPU → make sure it doesn’t become starved during recovery; parallelism won’t help you!
Helpful views
• v$session_longops → shows currently running backup, restore and recovery operations with RMAN
• v$backup_async_io / v$backup_sync_io → show RMAN performance information
• v$session_wait → session wait information
• v$backup_set, v$backup_piece, v$backup_datafile etc. → show sizing information for backups
Summary
• Explained factors that influence the speed of:
  – Backups
  – Restorations
  – Recoveries
• Gave you something to think about when looking at backup, restore and recovery time windows
• Make sure you test any alterations with production volume FIRST!