
The Oracle RAC Dilemma – Part I

House of Brick Staff

May 8, 2020
Oracle, VMware, vSphere

Dave Welch (@OraVBCA), Chief Evangelist

Part I: Four Criteria for Introducing or Keeping RAC

VMware vSphere High Availability (VMware HA) can provide a significant level of availability for many workloads, and it does so without the complexity, fragility, or cost of Oracle Real Application Clusters™ (RAC).

Oracle RAC can introduce significant complexity, expense, and risk into a system stack. Many shops experience worse net availability with RAC than with single-instance Oracle databases on VMware HA.

When is Oracle RAC the Right Choice?

Your organization may benefit from introducing, or maintaining, RAC if at least one of the following evaluation criteria applies.

You have an explicit SLA that requires the database be down for less than four minutes in an emergency
RAC’s rolling upgrade capability is needed
You have configured and/or programmed to RAC’s Oracle Notification Service (ONS)
The database load outstrips vSphere’s 128 vCPU limit per virtual machine

Let’s look at each of these scenarios more closely.

1) You have an explicit SLA that requires the database be down for less than four minutes in an emergency – VMware HA needs about four minutes to restart a virtual machine and have the database ready for connections. RAC can have a failed node’s transactions failed over to a surviving node and the RAC cluster un-paused in about one minute. A corollary criterion is whether applications can provide a scheduled maintenance window.

2) RAC’s rolling upgrade capability is needed – Because a RAC cluster involves multiple instances, many patches can be applied with no downtime to the database. However, major patch sets usually update the database data dictionary, forcing downtime for the entire RAC cluster. Oracle Critical Patch Updates also almost always require database downtime.

One of our many clients, a worldwide leader in its vertical, brought us in years ago for a statistical assessment of its various workloads’ need for RAC. They confessed that up to that point they had handed out RAC to business units based on request rather than metrics. Although they had assumed that their uptime requirements could not be met without RAC, the assessment found that 80% of their RAC implementations were unnecessary, because the available downtime windows were more than adequate for the required patching on single instance with VMware HA. Accordingly, those RAC implementations were reconfigured as single instance on VMware HA.

3) You have configured and/or programmed to RAC’s Oracle Notification Service (ONS) – ONS allows application middle tiers to take code branches in response to various RAC cluster notifications. Fast Connection Failover is also a capability of ONS. It and other capabilities can be leveraged by inserting code hooks and dependent logic and, in many cases, through middle-tier configuration. RAC provides superior capabilities for application stacks to monitor and respond to load and high availability events. However, applications that run on RAC and also leverage ONS appear to be the exception. Despite my enthusiasm for this capability, and my promotion of ONS in the Oracle University RAC classes House of Brick led years ago, I have only ever heard of two organizations leveraging it.

4) The database load outstrips vSphere’s 128 vCPU limit per virtual machine – This criterion applies only if the load still exceeds vSphere 6’s 128 vCPU per-VM limit after all appropriate performance tuning. In our informal observations, easily 98 percent of Tier 1 clients’ workloads fit within 128 vCPUs with scalability to spare. vSphere’s continuous march toward more per-VM compute power is rendering this criterion increasingly irrelevant.
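
The four criteria lend themselves to a simple screening pass over a workload inventory. The Python sketch below is illustrative only: the `Workload` fields and the `rac_criteria_met` function are hypothetical names introduced for this example, the sample workloads are made up, and the four-minute and 128 vCPU thresholds are simply the figures quoted in this post rather than values read from any VMware or Oracle API.

```python
from dataclasses import dataclass
from typing import List, Optional

# Planning figures quoted in this post; adjust them for your environment.
VMWARE_HA_RESTART_MINUTES = 4.0  # approx. time to restart the VM and accept connections
VSPHERE_MAX_VCPUS_PER_VM = 128   # per-VM limit cited for vSphere 6


@dataclass
class Workload:
    """Hypothetical inventory record for one database workload."""
    name: str
    sla_max_outage_minutes: Optional[float]  # None when there is no explicit SLA
    needs_rolling_upgrades: bool
    uses_ons: bool
    tuned_peak_vcpus: int  # demand after performance tuning


def rac_criteria_met(w: Workload) -> List[str]:
    """Return the criteria from this post that the workload satisfies."""
    met = []
    if (w.sla_max_outage_minutes is not None
            and w.sla_max_outage_minutes < VMWARE_HA_RESTART_MINUTES):
        met.append("explicit SLA under four minutes")
    if w.needs_rolling_upgrades:
        met.append("rolling upgrades needed")
    if w.uses_ons:
        met.append("configured or programmed to ONS")
    if w.tuned_peak_vcpus > VSPHERE_MAX_VCPUS_PER_VM:
        met.append("load exceeds per-VM vCPU limit")
    return met


if __name__ == "__main__":
    # Made-up sample workloads, for illustration only.
    erp = Workload("erp", sla_max_outage_minutes=15, needs_rolling_upgrades=False,
                   uses_ons=False, tuned_peak_vcpus=48)
    trading = Workload("trading", sla_max_outage_minutes=2, needs_rolling_upgrades=True,
                       uses_ons=False, tuned_peak_vcpus=96)
    for w in (erp, trading):
        reasons = rac_criteria_met(w)
        verdict = "consider RAC" if reasons else "candidate for single instance on VMware HA"
        print(f"{w.name}: {verdict} {reasons}")
```

An empty list suggests the workload is a candidate for single instance on VMware HA. Any entry is a reason to look closer, not a verdict.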

Stability and Data Corruption

A RAC cluster can become unstable with a configuration oversight in the wrong place. An unstable RAC cluster can reduce availability compared to single instance rather than increase it. That being said, in the hands of qualified delivery partners, a RAC cluster can always be stabilized.

It is worth noting that a clustered database is inherently at higher risk for data corruption than a single instance database. RAC provides no incremental data corruption protection, or corruption recovery mechanisms, compared to a single instance database. If data corruption occurs, the entire cluster may go down while the data is being repaired, and the database will definitely be down during a restore/recovery or database flashback operation.

Inherent Advantages of Single-Instance Oracle on VMware

Single-instance Oracle on VMware HA may be a reasonable alternative to RAC when considering high availability. Single-instance Oracle on VMware HA is:

Far less complex
Inherently capable of being more stable
Far more approachable for a wider array of less expensive technical staff
Considerably less expensive
Cloneable in vSphere (without working around shared storage)

VMware HA is not RAC

Single-instance Oracle on VMware HA is not equivalent to RAC, however. Industry discussions point to scenarios in which RAC offers capabilities beyond those of single-instance Oracle on VMware HA. The list below covers the scenarios not already addressed in this post:

Listener crash or accidental shutdown
Oracle instance crash or accidental shutdown
Listener IP failure
Oracle instance out of memory
Oracle session crash
ORA-600 errors
Deletion of the Oracle binaries
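
Some of these failures, a listener crash for example, can occur while the virtual machine itself keeps running, so a check at the service level is one way to notice them. As a minimal sketch of that idea, and not a feature of VMware HA or RAC, the Python function below tests whether anything is accepting TCP connections on a listener port. The function name `listener_reachable` is hypothetical. Port 1521 is the conventional default for the Oracle listener and may differ in your environment.

```python
import socket


def listener_reachable(host: str, port: int = 1521, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

A successful connect only shows that something is listening on the port. It says nothing about whether the instance behind the listener is open, so treat it as a first-line check.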

Bottom Line

We at HoB love the challenge of RAC. We’ve only ever encountered two kinds of RAC clusters: extremely stable and extremely unstable. We’ve never met a RAC cluster we couldn’t stabilize. Fifteen years ago, RAC stood relatively alone as the ultimate HA option. Now, for easily two thirds of the workloads for which RAC would once have been the only solution, we no longer recommend it, as single instance on VMware HA can handle them with far less complexity, operational overhead, and expense.

Coming soon:

The RAC Dilemma Part II: Four RAC Operational Best Practices
The RAC Dilemma Part III: The HA Feat RAC Will Never Pull Off

