Jonas Mason, Principal Architect
Oracle Data Guard has long been Oracle’s recommended solution for providing DR protection for Oracle instances. If your production Oracle databases are few, the effort to implement and maintain Data Guard might be manageable, though finding and retaining a qualified DBA can be difficult and expensive. If your shop is large with many Oracle databases, managing and coordinating Data Guard during failovers, switchovers, and failbacks can be onerous, and database recovery can proceed at a different rate than recovery of the application servers. Given the increased license costs and maintenance overhead associated with implementing Data Guard, businesses often ask us about other options.
Virtualization and array-based replication technologies provide alternatives to Data Guard that push DR protection further down the stack. This is good news for businesses, as they can embrace a single DR solution for both application and database servers that is managed by the infrastructure team. Instead of depending on a more expensive Oracle DBA resource for Data Guard tasks, you can depend on a less expensive Oracle DBA to confirm successful startup of an instance brought up on a DR server.
As with any DR solution, the business has to be clear on Service Level Agreements (SLAs) for Recovery Time Objective (RTO) and Recovery Point Objective (RPO). Making DR hardware and software decisions without having established these benchmarks is a waste of time and money. Staff resource availability on the weekend, and the complexity of your DR strategy, both have a direct impact on RTO and must be considered.
Array-based replication has been used to protect application servers at DR sites for years. Although application servers don’t generate the storage deltas that databases are capable of, the concept is no different: block changes are shipped from a volume or LUN in one datacenter to maintain an identical volume or LUN in another datacenter. As long as these blocks are applied in a consistent, write-ordered fashion on the DR volume, the instance can be started just as it would be at the primary site.
Volume, LUN, datastore, and VMDK configurations at the primary site are important to ensuring that the write consistency of datafiles is maintained and propagated to DR. Placing all datafiles in a single volume is simple and guarantees write consistency. When multiple devices are being replicated, as is the case with some ASM disk/disk group setups, a consistency group must be placed around these devices to maintain write consistency.
If you are considering DR for the first time after having established your primary datacenter, you may not have configured your storage appropriately for an array-based DR strategy. We encounter this quite a bit; storage devices underlying VMs often have to be reconfigured.
VMware vSphere Replication
VMware’s vSphere Replication replicates VMDKs underlying a VM to a remote DR site, providing an alternative to array replication.
Oracle datafiles on an NFS share can be unmounted from a production server and attached to a DR server. However, this isn’t a true DR solution unless the storage devices underlying the NFS mount are first replicated to the DR site, thereby ensuring that another copy exists.
If you have a physical production Oracle server with a single instance to protect, you will need a corresponding physical server for DR purposes. This method isn’t ideal, however, as it is expensive and incurs maintenance overhead at the standby site on a server that may sit idle. In this scenario, it is also imperative that you keep the two physical servers in sync in terms of patching and parameter changes.
To use this physical server for Oracle DR purposes without Data Guard, array replication is required. Volumes replicated to the DR site would need to be attached to the physical server.
In this scenario, licensing the Oracle DR server would be required, but you would save on the Data Guard license.
Virtualization, coupled with an NFS and/or array-based replication strategy, provides significant advantages over a physical server DR implementation. An entire VM running Oracle can be replicated to the DR datacenter along with other associated application servers on additional replicated volumes. Another approach would be to mount a replicated volume to a VM at the DR site.
This consistent approach to protecting virtual application and database servers increases infrastructure’s responsibilities and reduces DBA dependencies. The organization is also able to reduce standby server setup, cost, and maintenance with a simple, robust and scalable solution.
Physical hosts at the DR site for VMs running Oracle database software would require licensing. Failing to license these hosts, or any other servers to which you migrate or clone VMs running Oracle database software, can incur a significant cost liability, so we do not recommend proceeding without internal and external vetting.
This virtualized approach is advantageous in cases where several production database instances are linked together by database links and require synchronous failover.
DR Software Options
DR solutions provided by Veeam, Zerto, and VMware’s Site Recovery Manager integrate with the arrays, physical hosts, and virtual machines for DR management purposes. Our experience with these products has been excellent in terms of protecting production Oracle servers and their databases.
Data Guard handles block corruption well and guarantees that corrupt blocks are not shipped to the standby. Array-based replication ships block changes as they are written, but the best practice of using RMAN for backup purposes will flag corrupt blocks, so this isn’t a sticking point.
A potential downside of array-based replication is that it consumes more bandwidth than Data Guard, because block deltas are shipped instead of redo. One method of determining the change rate is to take a snapshot of a VM and determine the storage delta after 24 hours. Based on the storage delta captured, and your WAN bandwidth, you can determine if this will be an issue.
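The arithmetic behind that check can be sketched in a few lines of Python. This is a rough illustration rather than a sizing tool: the function names, the 500 GB delta, and the 100 Mbps link are hypothetical, decimal units are assumed (1 GB = 10^9 bytes), and the result is an average over the window, so it will understate what is needed during peak write periods.

```python
def required_mbps(delta_gb: float, window_hours: float = 24.0) -> float:
    """Average megabits per second needed to ship delta_gb within window_hours.

    Assumes decimal units: 1 GB = 10**9 bytes, 1 Mb = 10**6 bits.
    """
    if delta_gb < 0 or window_hours <= 0:
        raise ValueError("delta_gb must be >= 0 and window_hours must be > 0")
    bits = delta_gb * 1e9 * 8          # total bits changed in the window
    seconds = window_hours * 3600      # length of the window in seconds
    return bits / seconds / 1e6        # average megabits per second


def wan_utilization(delta_gb: float, wan_mbps: float,
                    window_hours: float = 24.0) -> float:
    """Fraction of the WAN link consumed, on average, by replication traffic."""
    if wan_mbps <= 0:
        raise ValueError("wan_mbps must be > 0")
    return required_mbps(delta_gb, window_hours) / wan_mbps


if __name__ == "__main__":
    # Hypothetical inputs: a 500 GB snapshot delta after 24 hours, 100 Mbps WAN.
    delta_gb = 500.0
    wan_mbps = 100.0
    print(f"Average rate needed: {required_mbps(delta_gb):.1f} Mbps")
    print(f"Average WAN utilization: {wan_utilization(delta_gb, wan_mbps):.0%}")
```

With these example inputs the average rate works out to roughly 46 Mbps, or a little under half of the link. A utilization figure approaching or exceeding 100% suggests that replication would fall behind and the RPO would slip; because writes are rarely spread evenly across the day, it is worth leaving headroom above the average.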
There are viable alternatives to Data Guard when it comes to protecting Oracle database servers and their hosted instances. For more detail and background on this topic, read the blogs below or contact us.