Oracle Data Guard
Ensuring Disaster Recovery for Enterprise Data
Wei Hu
Oracle’s High Availability (HA) Solution Stack
Unplanned Downtime
• System Failure: Real Application Clusters (Continuous Availability for all Applications)
• Data Failure & Disaster: Data Guard (Zero Data Loss)
• Human Error: Flashback Query (Enable Users to Correct their Mistakes)
Planned Downtime
• System Maintenance: Dynamic Reconfiguration (Capacity on Demand without Interruption)
• Data Maintenance: Online Redefinition (Adapt to Change Online)
Oracle Data Guard Focus
• Data Failures & Site Disasters:
  – Data Protection
  – Data Availability
  – Data Recovery
  All 3 are important!
Data is the core asset of the enterprise!
• Also addresses human errors & planned maintenance
What Is Oracle Data Guard?
• Database software infrastructure that automates the creation and maintenance of a duplicate, or standby copy, of the production (or primary) database
• If the primary database becomes unavailable (disasters, maintenance), the standby database can be activated and can take over the data serving needs of the enterprise
Data Guard Architecture Overview
[Figure: Clients at the Primary Site and the Standby Site. Data Changes flow from the Primary Database to the Standby Database. Each database has a Broker Agent, coordinated by the Data Guard Broker.]
How Does It Work?
• As primary database is modified, redo data is propagated to standby databases
• Standby databases kept synchronized with primary
• Primary database is open and active; standby database is either in recovery or open read-only / read-write
• Standby database can be transitioned to the primary role as necessary
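As a minimal sketch of the flow above, redo shipping and recovery for a physical standby might be started as follows. The Oracle Net service name `standby_db` and the destination number 2 are hypothetical, and `SCOPE=BOTH` assumes the instance uses a server parameter file.

```sql
-- On the primary: add a remote archive destination that points at the standby.
-- 'standby_db' is a hypothetical Oracle Net service name.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2 = 'SERVICE=standby_db' SCOPE=BOTH;
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2 = ENABLE SCOPE=BOTH;

-- On the mounted physical standby: apply redo in the background as it arrives.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

-- On either database: check the current role.
SELECT DATABASE_ROLE FROM V$DATABASE;
```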
Data Guard Configuration
[Figure: Primary Site (Primary Database) with Standby Site A (Standby Database) and Standby Site B (Standby Database)]
• Managed as a single configuration
• Primary and standby databases can be Real Application Clusters or single-instance Oracle
• Up to nine standby databases supported in a single configuration
Oracle Data Guard Architecture
[Figure: the Production Database ships redo over the Network (Sync or Async Redo Shipping), with the Broker overseeing the configuration.
Physical Standby Database: Optional Delay, Redo Apply, Backup.
Logical Standby Database: Optional Delay, Transform Redo to SQL, SQL Apply, Open for Reports, Additional Indexes & MVs.]
Oracle Data Guard Process Architecture
[Figure: process diagram. Labels on the Primary Database side: Transactions, LGWR (Synchronous/Asynchronous), Online Redo Logs, ARCH, Archived Redo Logs, FAL. Transport: Oracle Net, Affirm/NoAffirm, (Synchronous). Labels on the Physical/Logical Standby Database side: RFS, Standby Redo Logs, ARCH, Archived Redo Logs, MRP/LSP, Transform Redo to SQL for SQL Apply, Backup / Reports.]
Data Guard Redo Apply
[Figure: Primary Database, Network (Sync or Async Redo Shipping), Optional Delay, Redo Apply on the Physical Standby Database, Backup, Data Guard Broker]
• Physical Standby Database is a block-for-block copy of the primary database
• Uses the database recovery functionality to apply changes
• Can be opened in read-only mode for reporting/queries
• Can also perform backup, offloading production database
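A physical standby that is applying redo is mounted rather than open, so opening it for reporting means pausing recovery first. The following is a minimal sketch of that sequence, run on the standby. The exact steps can vary by release.

```sql
-- Stop Redo Apply; redo received from the primary continues to accumulate on the standby.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

-- Open the standby for queries and reports.
ALTER DATABASE OPEN READ ONLY;

-- Later, once reporting sessions have ended, resume Redo Apply in the background.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```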
Data Guard SQL Apply
[Figure: Primary Database, Network (Sync or Async Redo Shipping), Optional Delay, Transform Redo to SQL and Apply on the Logical Standby Database, Continuously Open for Reports, Additional Indexes & Materialized Views, Data Guard Broker]
• Logical Standby Database is an open, independent, active database
  – Contains the same logical information (rows) as the production database
  – Physical organization and structure can be very different
  – Can host multiple schemas
• Can be queried for reports while logs are being applied via SQL
• Can create additional indexes and materialized views for better query performance
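As a rough sketch, SQL Apply on a logical standby might be started, checked, and stopped as follows. It assumes the logical standby has already been created and is receiving redo.

```sql
-- On the logical standby: start SQL Apply.
ALTER DATABASE START LOGICAL STANDBY APPLY;

-- See how far apply has progressed (all changes up to this SCN have been applied).
SELECT APPLIED_SCN FROM DBA_LOGSTDBY_PROGRESS;

-- Stop SQL Apply, for example before maintenance on the standby.
ALTER DATABASE STOP LOGICAL STANDBY APPLY;
```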
Standby Databases Are Not Idle
[Figure: Standby Server hosting the Standby Database, used for Read-Only / Read-Write Reporting and for Tape Backups]
Standby database can be used to offload the primary database, increasing the ROI
Cascaded Redo Log Destinations
• Standby database receives its redo data from another standby database and not from the original primary database
• Primary database sends a set of redo data to only selected standby databases and not to all standby databases
• Reduces the load on the primary system, and also reduces network traffic and use of valuable network resources around the primary site
[Figure: Primary Database sends Redo Data to a Physical Standby Database, which retransmits it to another Physical Standby Database]
Protection from Human Errors and Data Corruptions
[Figure: Primary Site (Production Database) and Standby Site (Standby Database) with Optional Delayed Apply]
• The application of changes received from the primary can be delayed at the standby to allow user errors to be detected before they affect the standby
• The apply process also revalidates the log records to prevent application of any log corruptions
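One way to set up a delayed apply is the `DELAY` attribute of the archive destination, which takes a value in minutes. In this sketch the service name `standby_db`, the destination number, and the 240-minute delay are illustrative assumptions.

```sql
-- On the primary: ask the standby to wait 240 minutes before applying each archived log.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2 = 'SERVICE=standby_db DELAY=240' SCOPE=BOTH;

-- On the physical standby, when the delay should be ignored (for example, before a failover):
-- stop the delayed apply and restart it without the delay.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY;
```

Redo is still shipped to the standby immediately. Only its application is postponed, so the delay does not by itself increase the amount of data at risk.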
Switchover and Failover
• Primary and Standby role transitions
• Switchover
  – Planned role reversal
  – No database reinstantiation required
  – Used for maintenance of OS or hardware
• Failover
  – Unplanned failure (e.g. disasters) of primary
  – Primary database must be reinstantiated
• Initiated using simple SQL / GUI interface
• Data Guard automates the processes involved
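The SQL interface mentioned above can be sketched as follows for a physical standby. This is an outline under simplifying assumptions rather than a complete procedure: instance restarts and the restart of Redo Apply on the new standby are omitted.

```sql
-- Switchover (planned). On the primary: confirm it can switch, then give up the primary role.
SELECT SWITCHOVER_STATUS FROM V$DATABASE;
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY;

-- On the chosen physical standby: take over the primary role.
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;

-- Failover (unplanned). On the surviving physical standby:
-- apply all redo received so far, then take over the primary role.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
```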
Failover Example
Flexible Data Protection Modes
Protection Mode      | Risk of Data Loss                            | Redo Shipment
Maximum Protection   | Zero Data Loss; Double Failure Protection    | Synchronous redo shipping to 2 sites
Maximum Availability | Zero Data Loss; Single Failure Protection    | Synchronous redo shipping
Maximum Performance  | Minimal data loss – usually 0 to few seconds | Asynchronous redo shipping
Balance cost, availability, performance, and transaction protection
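To see which mode a configuration is running in, the `V$DATABASE` view can be queried on the primary. `PROTECTION_MODE` reports the configured mode and `PROTECTION_LEVEL` the level currently in effect, which can be lower while a standby is unreachable.

```sql
-- On the primary: configured protection mode versus the level currently in effect.
SELECT PROTECTION_MODE, PROTECTION_LEVEL FROM V$DATABASE;
```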
Maximum Protection Mode
Protection Mode      | Risk of Data Loss                            | Redo Shipment
Maximum Protection   | Zero Data Loss; Double Failure Protection    | Synchronous redo shipping to 2 sites
• Highest level of data protection
• Configuration: LGWR SYNC, standby redo logs (SRLs)
• Enforces protection of every transaction
• If last standby is unavailable, processing stops at primary
• Good for financial systems where no data loss is acceptable
ALTER DATABASE SET STANDBY TO MAXIMIZE PROTECTION;
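The "LGWR SYNC, SRLs" configuration named on this slide might look like the following sketch. The group number, file path, size, and service name `standby_db` are hypothetical. Standby redo logs are normally sized to match the primary's online redo logs.

```sql
-- On the standby: add a standby redo log group (repeat for each group needed).
ALTER DATABASE ADD STANDBY LOGFILE GROUP 4 ('/u01/oradata/stby/srl04.log') SIZE 100M;

-- On the primary: ship redo with LGWR, synchronously, and wait for the
-- standby to confirm that the redo has been written to disk (AFFIRM).
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2 = 'SERVICE=standby_db LGWR SYNC AFFIRM' SCOPE=BOTH;
```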
Maximum Availability Mode
Protection Mode      | Risk of Data Loss                            | Redo Shipment
Maximum Availability | Zero Data Loss; Single Failure Protection    | Synchronous redo shipping
• Enforces protection of every transaction
• Configuration: LGWR SYNC; SRLs not required
• If last standby is unavailable, processing continues at primary
• When the standby becomes available again, synchronization with the primary is automatic
ALTER DATABASE SET STANDBY TO MAXIMIZE AVAILABILITY;
Maximum Performance Mode
Protection Mode      | Risk of Data Loss                            | Redo Shipment
Maximum Performance  | Minimal data loss – usually 0 to few seconds | Asynchronous redo shipping
• Highest level of performance
• Configuration: LGWR ASYNC, or ARCH
• Protects from failure of any single component
• Least impact on production system
• Useful for applications that can tolerate some data loss
ALTER DATABASE SET STANDBY TO MAXIMIZE PERFORMANCE;
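The two transport choices named on this slide could be expressed as alternative destination settings, sketched below with a hypothetical service name `standby_db`. Only one of the two would be used for a given destination. The attribute names follow the slide and may differ in later releases.

```sql
-- Option 1: LGWR ships redo asynchronously as it is generated.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2 = 'SERVICE=standby_db LGWR ASYNC' SCOPE=BOTH;

-- Option 2: ARCH ships each redo log only after it has been archived on the primary.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2 = 'SERVICE=standby_db ARCH' SCOPE=BOTH;
```

With ARCH transport, the redo in the current online log has not yet reached the standby, so the exposure after a failure is larger than with LGWR ASYNC.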
Automatic Gap Resolution & Resynchronization
• Network connectivity problems may cause gaps in the sequence of log files in the standby
• Data Guard automatically takes care of these gaps
  – Automatic Gap Handling
  – FAL (Fetch Archive Log) Gap Handling
Gap Resolution
• Automatic
  – An idle ARCH process on the primary ‘pings’ all enabled standbys on a regular basis to see if they are missing any redo data
  – If so, it sends them the missing redo data
• FAL
  – Gap discovered during apply process in physical standby
  – Based on the FAL_SERVER and FAL_CLIENT settings, the primary is notified and sends the missing redo data
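A minimal sketch of the FAL settings on a physical standby follows. The Oracle Net service names `primary_db` and `standby_db` are hypothetical, and `SCOPE=BOTH` assumes a server parameter file.

```sql
-- On the physical standby: the service to fetch missing logs from (FAL_SERVER)
-- and the service name the primary should use to send them back (FAL_CLIENT).
ALTER SYSTEM SET FAL_SERVER = 'primary_db' SCOPE=BOTH;
ALTER SYSTEM SET FAL_CLIENT = 'standby_db' SCOPE=BOTH;

-- On the physical standby: list any gap that is currently blocking recovery.
SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;
```

An empty result from `V$ARCHIVE_GAP` means no gap is currently holding up recovery.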
Oracle Data Guard Broker
• Distributed management framework that automates and centralizes the creation, maintenance, and monitoring of Data Guard configurations
• Management operations can be performed locally or remotely through the Broker's easy-to-use interfaces:
  – GUI-based Oracle Data Guard Manager
  – Data Guard command-line interface
Data Guard Broker Architecture
[Figure: Data Guard Manager and the Oracle Management Server (Job Service, Event Service, Security Service, Discovery Service, Repository). The Primary Database, Physical Standby Database, and Logical Standby Database each have an OEM Agent and a Data Guard Broker.]
Data Guard Manager
• Simple, easy-to-use management and monitoring interface
Local and Remote Standby Databases
• Oracle Data Guard configuration can support both local and remote standby databases
• Local standby database
  – Human error and data corruption protection
  – Appropriate for highest data protection modes
  – LAN links are cheap, reliable, have high bandwidth and low latency
  – Switchover operations are very fast
• Remote standby database
  – Best solution for disaster recovery
  – WAN links are generally more expensive, less reliable, have lower bandwidth and higher latency than LAN links
  – Suitable for highest performance asynchronous data protection mode
Usage Examples
Example A
[Figure: Primary Site A, Primary Site B, and Primary Site C, each with a Primary Database, share one Standby Site that hosts three Standby Databases]
• Standby machine must be powerful enough to support multiple production instances after switchover / failover
Example B
[Figure: Chicago and Dallas, each hosting a Primary Database and a Standby Database]
• Maximize primary and standby resources
Usage Examples
Example C
• Primary Site: Primary Database
• Standby Site A: Standby Database
  – Synchronous transport
  – LAN attached
  – Used to offload backups
  – First choice for switchover candidate
• Standby Site B: Standby Database
  – Synchronous transport
  – LAN attached
  – Used to offload reporting
• Standby Site C: Standby Database
  – Asynchronous transport
  – WAN attached
  – Delayed apply
  – Provides DR and data protection
Data Guard and RAC
• Data Guard and Real Application Clusters are complementary and should be used together for a Maximum Availability Architecture
• Real Application Clusters provides high availability
  – Provides rapid and automatic recovery from node failures or an instance crash
  – Provides increased scalability
• Data Guard provides disaster protection and prevents data loss
  – By maintaining transactionally consistent copies of primary database
  – Protects against disasters, data corruption and user errors
  – Does not require expensive and complex HW/SW mirroring
Data Guard and Streams
• Streams and Data Guard are independent features of Oracle Database Enterprise Edition, based on some common underlying technology
• Data Guard: Disaster Recovery & Data Protection
  – Transactionally consistent standby databases
  – Zero data loss
  – Automated switchover/failover
  – Various data protection modes
• Streams: Information Sharing/Distribution
  – Fine granularity and control over what is replicated
  – Bi-directional replication
  – Data transformations
  – Heterogeneous platforms
• Because of business requirements, customers may choose to use Streams for DR/HA, and Data Guard SQL Apply for information distribution
Financial Services Company Using Data Guard & Streams
[Figure: a Data Feed enters the Master Database. Streams, for information distribution, applies Data Transformation and feeds the Product Delivery Databases for Client Access. Data Guard, for DR, maintains a Physical Standby Database.]
Data Guard and Remote Mirroring
• Remote Mirroring is another way to protect enterprise data
• Host-based and storage-based
• Is a physical bit-for-bit copy
• The copy can be remote
• Is this a good substitute?
Data Guard and Remote Mirroring
• Better protection
  – Redo is validated logically
• Greater efficiency
  – Only redo is transferred instead of entire disk blocks (7x bandwidth savings, 27x fewer network I/Os)
• Cheaper
  – No reliance on specialized hardware
• Remote mirroring is useful for non-Oracle data
Why Oracle Data Guard?
1. Disaster Recovery & High Availability
  – Easy failover/switchover between primary and standby databases
2. Complete data protection
  – Enables zero data loss, safeguard against data corruptions
3. Efficient utilization of system resources
  – Standby databases can be used for reporting, backups, queries
4. Balance data availability against performance
  – Flexible data protection/synchronization modes
5. Automatic resynchronization after restoration of network connectivity
  – Automatic archive gap detection and resolution with no manual intervention
6. Centralized and simple management
  – Graphical interface for management and monitoring
Resources
• HA Portal on OTN: http://otn.oracle.com/deploy/availability/
• Maximum Availability Architecture (best practice recommendations on Data Guard + RAC configuration): http://otn.oracle.com/deploy/availability/htdocs/maa.htm
• Disaster Recovery page on OTN: http://otn.oracle.com/deploy/availability/htdocs/dr_overview.html
• Data Guard Technical White Paper on OTN: http://otn.oracle.com/deploy/availability/pdf/DG92_TWP.pdf
• Data Guard Technology Overview Presentation on OTN: http://otn.oracle.com/deploy/availability/pdf/DataGuardTechnologyOverview.pdf