
The Oracle RAC Dilemma – Part I

Oracle, VMware, vSphere

Dave Welch (@OraVBCA), Chief Evangelist

Part I: Four Criteria for Introducing or Keeping RAC

VMware vSphere High Availability can provide significant levels of high availability for many workloads. VMware HA does so without the complexity, fragility, or cost of Oracle Real Application Clusters™ (RAC).

Oracle RAC can introduce significant complexity, expense, and risk into a system stack. Many shops experience worse net HA with RAC than they would with single instance Oracle databases on VMware HA.

When is Oracle RAC the Right Choice?

Your organization may benefit from introducing, or maintaining, RAC if you can identify with at least one of the following evaluation criteria.

You have an explicit SLA that requires the database be down for less than four minutes in an emergency
RAC’s rolling upgrade capability is needed
You have configured and/or programmed to RAC’s Oracle Notification Service (ONS)
The database load outstrips vSphere’s 128 vCPU limit per virtual machine

Let’s look at each of these scenarios more closely.

1) You have an explicit SLA that requires the database be down for less than four minutes in an emergency – VMware HA needs about four minutes to restart a virtual machine and have the database ready for connections. RAC can have a failed node’s transactions failed over to a surviving node and the RAC cluster un-paused in about one minute. A corollary criterion is whether applications can provide a scheduled maintenance window.
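The comparison in this criterion can be sketched as a small check. This is only an illustration: the recovery times are the approximate figures quoted above, and the names `RECOVERY_MINUTES` and `options_meeting_sla` are hypothetical, not part of any Oracle or VMware tooling.

```python
from typing import List

# Approximate recovery times quoted in the text (minutes). Real values
# vary by environment and should be measured, not assumed.
RECOVERY_MINUTES = {
    "single instance on VMware HA": 4.0,  # VM restart until DB accepts connections
    "RAC": 1.0,                           # failover to a surviving node
}


def options_meeting_sla(max_outage_min: float) -> List[str]:
    """Return the options whose recovery time is under the SLA ceiling.

    The SLA is read as "down for less than max_outage_min minutes",
    so an option must recover in strictly less time than the ceiling.
    """
    return [
        name
        for name, minutes in RECOVERY_MINUTES.items()
        if minutes < max_outage_min
    ]


if __name__ == "__main__":
    for ceiling in (4.0, 10.0):
        print(f"SLA ceiling {ceiling} min:", options_meeting_sla(ceiling))
```

With a four-minute ceiling only RAC qualifies under these assumed numbers; with a looser ceiling both options qualify, and the simpler one becomes a candidate.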

2) RAC’s rolling upgrade capability is needed – Because a RAC cluster involves multiple instances, many patches can be applied with no downtime to the database. However, major patch sets usually update the database data dictionary, forcing downtime for the entire RAC cluster. Oracle Critical Patch Updates also almost always require database downtime.

One of our many clients, who leads their vertical worldwide, brought us in years ago for a statistical assessment of their various workloads’ need for RAC. They confessed that up to that point they had handed out RAC to business units based on request rather than metrics. Although they had assumed that their uptime requirements could not be met without RAC, the assessment determined that 80% of their RAC implementations were unnecessary: single instance on VMware HA met their uptime requirements, and the available maintenance windows were more than adequate for the required patching. Accordingly, those RAC implementations were reconfigured as single instance on VMware HA.

3) You have configured and/or programmed to RAC’s Oracle Notification Service (ONS) – ONS allows application middle tiers to take code branches in response to various RAC cluster notifications. Fast Connection Failover is also a capability of ONS. It and other capabilities can be leveraged by inserting code hooks and dependent logic and, in many cases, through middle tier configuration alone. RAC provides superior capabilities for application stacks to monitor and respond to load and high availability events. However, applications on RAC that actually leverage ONS appear to be the exception. Despite my enthusiasm for this capability, and my promotion of ONS in the Oracle University RAC classes House of Brick led years ago, I have only ever heard of two organizations leveraging it.
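To make "taking code branches in response to cluster notifications" concrete, here is a minimal sketch of the pattern in plain Python. It does not use any Oracle client library. The event names (`node_down`, `node_up`) and the handler functions are hypothetical placeholders, not actual ONS identifiers or payloads.

```python
from typing import Callable, Dict

# Hypothetical event names for illustration only; real ONS notifications
# have their own format, which is not reproduced here.
Handler = Callable[[str], str]


def on_node_down(node: str) -> str:
    return f"drain connections to {node} and retry in-flight work elsewhere"


def on_node_up(node: str) -> str:
    return f"rebalance connections to include {node}"


HANDLERS: Dict[str, Handler] = {
    "node_down": on_node_down,
    "node_up": on_node_up,
}


def handle_notification(event: str, node: str) -> str:
    """Pick the code branch for a cluster notification; ignore unknown events."""
    handler = HANDLERS.get(event)
    if handler is None:
        return f"ignore {event} from {node}"
    return handler(node)


if __name__ == "__main__":
    print(handle_notification("node_down", "rac1"))
    print(handle_notification("node_up", "rac1"))
```

The point of the sketch is the shape of the work: each notification type needs its own branch and tested logic in the middle tier, which may be part of why so few organizations take it on.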

4) The database load outstrips vSphere’s 128 vCPU limit per virtual machine – This criterion applies only if, after all appropriate performance tuning, the database load still outstrips vSphere 6’s 128 vCPU limit per virtual machine. In our informal observations, easily 98 percent of Tier 1 clients’ workloads fit within 128 vCPUs with scalability to spare. vSphere’s continuous march toward more per-VM compute power is rendering this criterion increasingly irrelevant.
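The four criteria above can be written as a simple checklist. This is a sketch under stated assumptions, not an assessment tool: the `Workload` fields and the `rac_criteria_met` function are hypothetical names, and the thresholds are the approximate figures given in this post.

```python
from dataclasses import dataclass
from typing import List, Optional

VMWARE_HA_RESTART_MIN = 4.0     # approximate restart time cited in the text
VSPHERE_MAX_VCPUS_PER_VM = 128  # vSphere 6 per-VM limit cited in the text


@dataclass
class Workload:
    name: str
    max_emergency_outage_min: Optional[float]  # None means no explicit SLA
    needs_rolling_upgrades: bool
    uses_ons: bool
    tuned_peak_vcpus: int  # vCPU demand after performance tuning


def rac_criteria_met(w: Workload) -> List[str]:
    """Return the RAC criteria a workload meets; an empty list means none."""
    met = []
    if (w.max_emergency_outage_min is not None
            and w.max_emergency_outage_min <= VMWARE_HA_RESTART_MIN):
        met.append("SLA requires less downtime than a VMware HA restart")
    if w.needs_rolling_upgrades:
        met.append("rolling upgrade capability is needed")
    if w.uses_ons:
        met.append("application is configured or programmed to ONS")
    if w.tuned_peak_vcpus > VSPHERE_MAX_VCPUS_PER_VM:
        met.append("load exceeds the per-VM vCPU limit")
    return met


if __name__ == "__main__":
    erp = Workload("erp", None, False, False, 32)
    print(erp.name, rac_criteria_met(erp) or "candidate for single instance on VMware HA")
```

A workload that returns an empty list is a candidate for single instance on VMware HA; any non-empty result is a reason to look at that criterion more closely, not an automatic verdict for RAC.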

Stability and Data Corruption

A RAC cluster can become unstable with a configuration oversight in the wrong place. An unstable RAC cluster can reduce availability compared to single instance rather than increase it. That being said, in the hands of qualified delivery partners, a RAC cluster can always be stabilized.

It is worth noting that a clustered database is inherently at higher risk for data corruption than a single instance database. RAC provides no incremental data corruption protection, or corruption recovery mechanisms, compared to a single instance database. If data corruption occurs, the entire cluster may go down while the data is being repaired, and the database will definitely be down during a restore/recovery or database flashback operation.

Inherent Advantages of Single-Instance Oracle on VMware

Single-instance Oracle on VMware HA may be a reasonable alternative to RAC when considering high availability. Single-instance Oracle on VMware HA is:

Far less complex
Inherently capable of being more stable
Far more approachable for a wider array of less expensive technical staff
Considerably less expensive
Cloneable in vSphere (without working around shared storage)

VMware HA is not RAC

Single-instance Oracle on VMware HA is not equivalent to RAC, however. Various IT industry conversations note RAC capabilities in excess of those offered by single instance Oracle on VMware HA. Beyond the points already addressed in this post, the failure scenarios raised in those conversations include:

Listener crash or accidental shutdown
Oracle instance crash or accidental shutdown
Listener IP failure
Oracle instance out of memory
Oracle session crash
ORA-600 errors
Deletion of the Oracle binaries

Bottom Line

We at HoB love the challenge of RAC. We’ve only ever encountered two kinds of RAC clusters: extremely stable and extremely unstable. We’ve never met a RAC cluster we couldn’t stabilize. Fifteen years ago, RAC stood relatively alone as the ultimate HA option. Today, we no longer recommend RAC for easily two thirds of the workloads for which it would once have been the only solution, because single instance on VMware HA can handle them with far less complexity, operational overhead, and expense.

Coming soon:

The RAC Dilemma Part II: Four RAC Operational Best Practices
The RAC Dilemma Part III: The HA Feat RAC Will Never Pull Off
