Oracle RDS Best Practices for Backup

Nick Walter, Principal Architect

I recently worked with a client who was using AWS RDS Oracle for the first time. They, like a lot of organizations familiar with managing Oracle databases in on-premises environments, had a fairly robust backup methodology in place already. Oracle tools like RMAN or Data Pump export are frequently used, in conjunction with offline or offsite storage for safety and encryption for security. Trying to adapt those same tools and techniques to an Oracle database in Amazon RDS was causing them a lot of grief. Luckily, I was able to help them stop trying to pound the square peg into the round hole and implement a whole new way of doing backups for RDS that offered the same levels of reliability and security.

Backup and restore methodologies differ significantly between on-premises Oracle databases and Oracle RDS databases. The traditional on-premises Oracle backup techniques and tools, such as file system cold backups or Oracle Recovery Manager (RMAN) backups, aren’t applicable to the RDS environment, as direct user access to the AWS-managed Oracle instance isn’t permitted. Instead, backups must be performed using AWS-provided tools. Luckily, RDS has built-in backup and restore tooling that makes it fairly simple to take consistent snapshot backups of the database to S3 storage, either on demand or on a schedule. If unspecified at creation, a random daily backup time will be assigned to an RDS database, with a default backup retention period of 1 or 7 days, depending on the database engine and the method of creation.

While these built-in backup tools are fairly handy, House of Brick’s best practice for any production database, RDS or not, calls for backup retention periods well in excess of 7 days. Organizations in highly regulated industries may need to retain monthly or annual backups over a period of several years in order to meet audit requirements, so we’ve had to counsel multiple clients on solutions beyond the built-in automated tools.

The first step for any production RDS database is to configure, at a minimum, an appropriate automated daily snapshot with the maximum allowed daily backup retention period of 35 days. It is important to use the built-in automatic backup tool because it’s the only tool that also backs up transaction logs, which allow for point-in-time recovery. The backup retention period selected for the automated backup limits how far back in time a point-in-time recovery is allowed. So, the maximum 35-day setting allows a point-in-time recovery of the database to any point in the last 35 days, which can be critical when dealing with logical corruption issues.
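As a minimal sketch of that first step, the function below sets the retention period through the RDS `modify_db_instance` API call. The helper name `set_backup_retention`, the example backup window, and the choice to apply the change immediately are illustrative assumptions rather than part of any particular client setup. The client is passed in as an argument so the function can be exercised without AWS credentials.

```python
MAX_RETENTION_DAYS = 35  # RDS maximum for automated backup retention


def set_backup_retention(rds_client, instance_id, days=MAX_RETENTION_DAYS,
                         window="06:00-06:30"):
    """Set the automated backup retention period (hypothetical helper).

    rds_client is expected to be a boto3 RDS client, for example
    boto3.client('rds', region_name='us-east-1').
    window is an assumed example value in UTC (hh24:mi-hh24:mi).
    """
    # A value of 0 would disable automated backups, so it is rejected here.
    if not 1 <= days <= MAX_RETENTION_DAYS:
        raise ValueError("retention must be between 1 and 35 days")
    return rds_client.modify_db_instance(
        DBInstanceIdentifier=instance_id,
        BackupRetentionPeriod=days,
        PreferredBackupWindow=window,
        ApplyImmediately=True,
    )
```

With a real client this would be called as `set_backup_retention(boto3.client('rds', region_name='us-east-1'), 'OracleRDStest1')`.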

Once the automated backups are established, thought should be given to the best method for retaining individual manual database snapshot backups for a longer period. Because the AWS RDS console and APIs expose a manual snapshot mechanism, all of the methods revolve around creating manual snapshots and putting a framework around managing them appropriately. The manual snapshots are stored in AWS S3 storage and are easily accessible to authenticated users. They can be retrieved from S3 and stored in whatever fashion, and for whatever period, an organization deems necessary. As an extreme example, House of Brick has seen an organization with strict regulatory requirements contemplate preserving RDS snapshots by simply downloading the manual snapshots from S3, writing them to encrypted tape storage, and shipping them to off-site secure storage. While probably overkill for most organizations, this example illustrates that Oracle RDS databases can be backed up with the same level of security and rigor as traditional on-premises Oracle databases.

The House of Brick best practice for proper backup retention is to use manual snapshots to store snapshots of the database in S3 on a monthly and annual basis. Using an encrypted S3 bucket is important for ensuring the security of the backup data. Because running these backups manually would be tedious and error-prone, using a scheduled Lambda function is highly recommended for this purpose. The following example Python 2.7 code illustrates a Lambda function that can be used for monthly snapshots of the instance OracleRDStest1 in the us-east-1 region.

import datetime

import boto3
import botocore

# Region and RDS instance identifiers to snapshot
region = 'us-east-1'
rds_instances = ['OracleRDStest1']


def lambda_handler(event, context):
    source = boto3.client('rds', region_name=region)
    for instance in rds_instances:
        try:
            # Timestamped label, e.g. 2019-01-31-06-00-00-monthly-snap
            timestamplabel = datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S') + "-monthly-snap"
            snapshot = "{0}-{1}-{2}".format("mysnapshot", instance, timestamplabel)
            # Create a manual snapshot; it is kept until explicitly deleted
            response = source.create_db_snapshot(DBSnapshotIdentifier=snapshot, DBInstanceIdentifier=instance)
            print(response)
        except botocore.exceptions.ClientError as e:
            raise Exception("Could not create snapshot: %s" % e)
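
The function above still needs a schedule to run on. One possible way to wire that up, sketched here under assumptions, is an EventBridge (CloudWatch Events) rule with a cron expression that targets the Lambda function. The rule name, target ID, and the 06:00 UTC run time are hypothetical choices, and `function_arn` is assumed to be the ARN of the deployed snapshot function.

```python
# EventBridge cron fields: minutes hours day-of-month month day-of-week year
MONTHLY_SCHEDULE = "cron(0 6 1 * ? *)"  # 06:00 UTC on the 1st of each month


def schedule_monthly_snapshot(events_client, lambda_client, function_arn,
                              rule_name="monthly-rds-snapshot"):
    """Create a rule that invokes the snapshot Lambda once a month.

    events_client and lambda_client are expected to be boto3 clients for
    the 'events' and 'lambda' services; rule_name is a hypothetical name.
    """
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=MONTHLY_SCHEDULE,
        State="ENABLED",
    )
    # Allow EventBridge to invoke the function, limited to this rule.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=rule_name + "-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule["RuleArn"],
    )
    # Attach the Lambda function as the rule's target.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "snapshot-lambda", "Arn": function_arn}],
    )
    return rule["RuleArn"]
```

The same schedule can also be configured from the Lambda console as a trigger; the code form is mainly useful when the setup needs to be repeatable across accounts.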

 

House of Brick also recommends configuring Lambda functions to clean up the monthly manual snapshots after they expire, usually after twelve months, while keeping at least one per year as an annual snapshot (to be retained up to the limit of an organization’s chosen or regulation-mandated retention period). Typically this is the last monthly snapshot of a calendar or fiscal year.
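The selection logic for such a cleanup can be sketched as a small pure function. This is only an illustration under stated assumptions: the function name `snapshots_to_delete` is hypothetical, months are approximated as 30 days, the annual snapshot is taken to be the last one of each calendar year, and no upper limit on annual retention is enforced. The input is assumed to be a list of dictionaries shaped like the entries returned by `describe_db_snapshots`.

```python
import datetime


def snapshots_to_delete(snapshots, now, keep_months=12):
    """Return identifiers of monthly snapshots that have expired.

    snapshots: list of dicts with 'DBSnapshotIdentifier' and
    'SnapshotCreateTime' (a datetime) keys.
    now: current time; it must be timezone-aware if the snapshot times
    are (boto3 returns timezone-aware values), naive otherwise.
    A snapshot is kept if it is newer than keep_months (approximated as
    30-day months) or if it is the last snapshot of its calendar year.
    """
    cutoff = now - datetime.timedelta(days=30 * keep_months)

    # The latest snapshot in each calendar year is the annual snapshot.
    annual = {}
    for snap in snapshots:
        created = snap["SnapshotCreateTime"]
        best = annual.get(created.year)
        if best is None or created > best["SnapshotCreateTime"]:
            annual[created.year] = snap
    annual_ids = set(s["DBSnapshotIdentifier"] for s in annual.values())

    return [
        s["DBSnapshotIdentifier"]
        for s in snapshots
        if s["SnapshotCreateTime"] < cutoff
        and s["DBSnapshotIdentifier"] not in annual_ids
    ]
```

In a Lambda handler, the input list would come from `describe_db_snapshots(DBInstanceIdentifier=..., SnapshotType='manual')`, and each returned identifier would be passed to `delete_db_snapshot(DBSnapshotIdentifier=...)`. Keeping the selection separate from the API calls makes it easy to dry-run the function against a snapshot listing before anything is deleted.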

Once appropriate retention is established, backup reliability and availability also need to be considered. It is possible for an entire AWS region to become unavailable, which has happened in the past. Keeping the backups in an S3 bucket in the same region as the RDS instance creates the potential nightmare scenario of simultaneously losing access to the database and all database backups. To combat this, House of Brick recommends mirroring the monthly, annual, and most recent daily backup to an encrypted S3 bucket in another region and on a different AWS account. The latter point is key, as it offers protection of the backup data from technical failure, human error, and malicious intruders. If the protected bucket is properly configured to allow versioned read/write access on a cross-account basis to the production account hosting the RDS instance, then the production account can never damage or delete retained backups, even if the production account is compromised to the root account level.
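As a rough illustration of the cross-region and cross-account idea, the sketch below uses the RDS snapshot copy and snapshot sharing API calls rather than direct S3 bucket operations, which is a substitution on my part and not necessarily how a given environment implements the mirroring. The function names, the `-dr` suffix, and the KMS key parameter are hypothetical. Sharing an encrypted snapshot also generally requires a customer-managed KMS key that the receiving account is permitted to use.

```python
def snapshot_arn(region, account_id, snapshot_id):
    """Build the ARN of a manual RDS snapshot."""
    return "arn:aws:rds:{0}:{1}:snapshot:{2}".format(
        region, account_id, snapshot_id)


def copy_snapshot_to_region(dest_rds_client, source_region,
                            source_account_id, snapshot_id, dest_kms_key_id):
    """Copy a manual snapshot into the region of dest_rds_client.

    dest_rds_client: boto3 RDS client created in the destination region.
    dest_kms_key_id: KMS key in the destination region (assumed to exist).
    """
    return dest_rds_client.copy_db_snapshot(
        SourceDBSnapshotIdentifier=snapshot_arn(
            source_region, source_account_id, snapshot_id),
        TargetDBSnapshotIdentifier=snapshot_id + "-dr",  # hypothetical suffix
        KmsKeyId=dest_kms_key_id,
        SourceRegion=source_region,
        CopyTags=True,
    )


def share_snapshot_with_account(rds_client, snapshot_id, backup_account_id):
    """Allow another AWS account to copy or restore the snapshot."""
    return rds_client.modify_db_snapshot_attribute(
        DBSnapshotIdentifier=snapshot_id,
        AttributeName="restore",
        ValuesToAdd=[backup_account_id],
    )
```

Once a snapshot is shared, the backup account would make its own copy so that the retained backup is owned by an account the production credentials cannot modify.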

Using all of these techniques allows RDS databases to have the same levels of secure backup protection as even the most rigorous on-premises backup practices.

As always, House of Brick is happy to talk shop with anyone running business-critical databases in a cloud environment. If you are concerned about getting bullet-proof backup and recovery processes in place for your cloud-based workloads, feel free to reach out.

Please note: this blog contains code examples provided for your reference. All sample code is provided for illustrative purposes only. Use of information appearing in this blog is solely at your own risk. Please read our full disclaimer for details.
