by Cameron Cameron, Senior Consultant
Amazon Relational Database Service (RDS) is a managed database platform-as-a-service (PaaS) from AWS that lets customers run one of several database products, such as Oracle, PostgreSQL, MySQL, Microsoft SQL Server, or MariaDB, without having to manage the operating system or handle routine database administration. Any of these database products could also be provisioned manually on an Elastic Compute Cloud (EC2) instance, but the RDS platform offers greatly simplified provisioning, administration, and operation for a small cost premium over EC2.
Amazon RDS Aurora, built for the cloud, is a variant of the standard RDS platform. Unlike standard RDS, it supports only a subset of the database engines: currently only MySQL and PostgreSQL. Many House of Brick customers struggle to understand the differences between regular RDS and Aurora RDS, so this article sheds some light on the key differences between RDS MySQL and RDS Aurora MySQL.
MySQL
MySQL is an open source relational database that has wide acceptance in the industry. MySQL has over 20 years of community-backed development and support. Amazon RDS supports MySQL Community Edition versions 5.5, 5.6, 5.7, and 8.0, which means the code, applications and tools you already use today can be used with Amazon RDS.
The MySQL Community Edition includes:
- SQL and NoSQL for developing both relational and NoSQL applications
- MySQL Document Store including X Protocol, XDev API, and MySQL Shell
- Transactional Data Dictionary with Atomic DDL statements for improved reliability
- Pluggable Storage Engine Architecture (InnoDB, NDB, MyISAM, etc.)
- MySQL Replication to improve application performance and scalability
- MySQL Group Replication for replicating data while providing fault tolerance, automated failover, and elasticity
- MySQL InnoDB Cluster to deliver an integrated, native, high-availability solution for MySQL
- MySQL Router for transparent routing between your application and any backend MySQL Servers
- MySQL Partitioning to improve performance and management of large database applications
- Stored Procedures to improve developer productivity
- Triggers to enforce complex business rules at the database level
- Views to ensure sensitive information is not compromised
- Performance Schema for user/application level monitoring of resource consumption
- Information Schema to provide easy access to metadata
- MySQL Connectors (ODBC, JDBC, .NET, etc.) for building applications in multiple languages
- MySQL Workbench for visual modeling, SQL development, and administration
AWS RDS
The RDS platform offers a fully managed layer of abstraction around a traditional database architecture. Inside RDS, the database platform is built just as one would build it manually in EC2: an EC2 instance is provisioned from an appropriate Amazon Machine Image (AMI), Elastic Block Store (EBS) storage is provisioned and attached to the instance, an appropriate Subnet Group and Security Group are attached to the instance, and so on. Architecturally, RDS offers little that is novel, but the operational benefits of this managed service are impressive. Using RDS allows a database platform to be provisioned in a secure and performant fashion with only a few clicks. Once provisioned, RDS automates the maintenance of the platform going forward, with backup/restore and patching all handled automatically.
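The same building blocks can also be expressed programmatically. The following is a minimal sketch, not a recommended configuration: it only builds the keyword arguments that boto3's RDS `create_db_instance` call accepts, and does not call AWS. The function name, instance class, storage size, retention period, and all identifiers are hypothetical values chosen for illustration.

```python
def build_rds_mysql_request(identifier, subnet_group, security_group_ids, master_password):
    """Build keyword arguments shaped for boto3's RDS create_db_instance call.

    All concrete values below are illustrative assumptions, not recommendations.
    """
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",                       # standard RDS MySQL, not Aurora
        "DBInstanceClass": "db.t3.medium",       # hypothetical instance size
        "AllocatedStorage": 100,                 # GiB of EBS-backed storage (hypothetical)
        "MasterUsername": "admin",               # hypothetical user name
        "MasterUserPassword": master_password,
        "DBSubnetGroupName": subnet_group,       # the Subnet Group mentioned above
        "VpcSecurityGroupIds": list(security_group_ids),  # the Security Group(s)
        "BackupRetentionPeriod": 7,              # days of automated backups (hypothetical)
        "AutoMinorVersionUpgrade": True,         # let RDS apply minor-version patches
    }


if __name__ == "__main__":
    request = build_rds_mysql_request(
        "demo-mysql", "demo-subnet-group", ["sg-0123456789abcdef0"], "change-me"
    )
    # With boto3 installed and credentials configured, the request could be sent as:
    #   boto3.client("rds").create_db_instance(**request)
    for key, value in request.items():
        if key != "MasterUserPassword":
            print(f"{key}: {value}")
```

Keeping the request as a plain dictionary makes it easy to review or version-control the settings before anything is actually provisioned.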
From the perspective of a MySQL database, running in an RDS environment is indistinguishable from running in a manually provisioned cloud, or in an on-premises environment. MySQL on RDS follows traditional database architectural principles:
- Database writes to “local” EBS storage volumes for both transaction logs and database datafiles
- Databases are all about I/O
– Committed transactions result in log records with before and after images being stored in a write-ahead log (WAL) on durable storage
– Checkpoint operations are accomplished by flushing modified pages to disk
– Performance limited by I/O bandwidth and I/O Operations Per Second (IOPS)
– Tuning requires increasing I/O bandwidth, or decreasing the number of I/Os
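The last two points can be made concrete with a little arithmetic. The sketch below assumes a simple model in which a volume is capped both by an IOPS limit and by a bandwidth limit, and achievable throughput is whichever cap is reached first. The function names and the limits used (3,000 IOPS, 125 MiB/s) are hypothetical; 16 KiB is the default InnoDB page size.

```python
def max_throughput_mib_s(iops_limit, io_size_kib, bandwidth_limit_mib_s):
    """Throughput ceiling in MiB/s under a simple two-limit model."""
    iops_bound_mib_s = iops_limit * io_size_kib / 1024
    return min(iops_bound_mib_s, bandwidth_limit_mib_s)


def bottleneck(iops_limit, io_size_kib, bandwidth_limit_mib_s):
    """Report which limit is reached first for a given I/O size."""
    iops_bound_mib_s = iops_limit * io_size_kib / 1024
    return "IOPS" if iops_bound_mib_s < bandwidth_limit_mib_s else "bandwidth"


if __name__ == "__main__":
    # Small random I/O (one 16 KiB page per operation) hits the IOPS cap first.
    print(max_throughput_mib_s(3000, 16, 125), bottleneck(3000, 16, 125))
    # Large sequential I/O (256 KiB per operation) hits the bandwidth cap first.
    print(max_throughput_mib_s(3000, 256, 125), bottleneck(3000, 256, 125))
```

Under this model, a workload dominated by small page-sized I/Os benefits from fewer I/Os or a higher IOPS limit, while a workload of large sequential I/Os benefits only from more bandwidth.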
AWS Aurora
Amazon Aurora extends the RDS platform by offering not only all the manageability features of traditional RDS, but also a special database-optimized storage subsystem. This specialized storage, built on NVMe SSDs, replaces the traditional EBS storage used by RDS and offers considerable performance benefits.
The Aurora storage subsystem offers the following benefits:
- Scales automatically to keep up with your applications.
- Makes six copies of your data, distributed across multiple locations and continuously backed up to Amazon S3.
– Offers greater than 99.99% availability
– Transparently recovers from storage failures
– Allows for instance failover that typically takes less than 30 seconds
– Provides the ability to quickly restore to a previous point-in-time
- Aurora can replicate data to multiple regions for fast local performance and disaster recovery.
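To illustrate why six copies help availability, the sketch below models the layout AWS has described publicly for Aurora: two copies in each of three Availability Zones, with writes requiring four of the six copies and reads requiring three. Those quorum sizes and the per-zone layout come from AWS's published descriptions rather than from this article, and the zone names and function names are hypothetical.

```python
AZS = ("az-a", "az-b", "az-c")  # hypothetical Availability Zone names
COPIES_PER_AZ = 2               # assumption: two copies per zone, six in total
WRITE_QUORUM = 4                # assumption: writes need 4 of 6 copies
READ_QUORUM = 3                 # assumption: reads need 3 of 6 copies


def surviving_copies(failed_azs=(), extra_failed_copies=0):
    """Count healthy copies after losing whole zones plus individual copies."""
    healthy_azs = [az for az in AZS if az not in failed_azs]
    healthy = COPIES_PER_AZ * len(healthy_azs)
    return max(healthy - extra_failed_copies, 0)


def can_write(copies):
    return copies >= WRITE_QUORUM


def can_read(copies):
    return copies >= READ_QUORUM


if __name__ == "__main__":
    scenarios = {
        "all healthy": surviving_copies(),
        "one AZ lost": surviving_copies(failed_azs=("az-a",)),
        "one AZ plus one copy lost": surviving_copies(("az-a",), 1),
    }
    for name, copies in scenarios.items():
        print(f"{name}: copies={copies} write={can_write(copies)} read={can_read(copies)}")
```

Under these assumptions, losing an entire zone still leaves enough copies to write, and losing a zone plus one more copy still leaves enough to read.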
Aurora reduces the I/O bottleneck by decoupling compute and storage operations. Instead of burdening the database instance with checkpointing tasks on “local” storage, Amazon Aurora uses a distributed storage fleet for continuous checkpointing. This distributed storage fleet helps Aurora outperform standard MySQL in terms of throughput, and also increases availability and durability.
Figure 1: I/O flow in Amazon Aurora storage node[1]
1. Receive log records, add them to an in-memory queue, and durably persist them
2. ACK to the database
3. Organize records and identify gaps in the log
4. Gossip with peers to fill in holes
5. Coalesce log records into new page versions
6. Periodically stage log and new page versions to S3
7. Periodically garbage-collect old versions
8. Periodically validate CRC codes on blocks
Notes:
- All steps are asynchronous
- Only steps 1 and 2 are in the foreground latency path
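The first five steps of the flow above can be sketched as a toy simulation. This is an illustrative model only, not Aurora's implementation: the `StorageNode` class and its methods are hypothetical, log sequence numbers are assumed to be consecutive integers starting at 1, an in-memory dictionary stands in for durable persistence, and the periodic S3 staging, garbage collection, and CRC validation steps are left out.

```python
class StorageNode:
    """Toy model of one storage node handling redo log records."""

    def __init__(self):
        self.log = {}         # LSN -> (page_id, change); stands in for durable storage
        self.pages = {}       # page_id -> ordered list of changes applied so far
        self.applied_lsn = 0  # highest LSN already coalesced into pages

    def receive(self, lsn, page_id, change):
        """Persist the record, then acknowledge: the only foreground work."""
        self.log[lsn] = (page_id, change)
        return "ACK"

    def gaps(self):
        """Identify LSNs missing below the highest LSN seen."""
        if not self.log:
            return []
        return [lsn for lsn in range(1, max(self.log) + 1) if lsn not in self.log]

    def gossip(self, peer):
        """Fill holes by copying missing records from a peer node."""
        for lsn in self.gaps():
            if lsn in peer.log:
                self.log[lsn] = peer.log[lsn]

    def coalesce(self):
        """Apply contiguous records in LSN order to build new page versions."""
        while self.applied_lsn + 1 in self.log:
            self.applied_lsn += 1
            page_id, change = self.log[self.applied_lsn]
            self.pages.setdefault(page_id, []).append(change)


if __name__ == "__main__":
    node_a, node_b = StorageNode(), StorageNode()
    records = [(1, "p1", "x=1"), (2, "p1", "x=2"), (3, "p2", "y=7")]
    for lsn, page_id, change in records:
        node_b.receive(lsn, page_id, change)
        if lsn != 2:  # simulate node_a never receiving record 2
            node_a.receive(lsn, page_id, change)

    print("gaps before gossip:", node_a.gaps())
    node_a.gossip(node_b)
    node_a.coalesce()
    print("pages after coalesce:", node_a.pages)
```

In this model the database waits only for `receive` to return; gap detection, gossip, and coalescing run afterwards in the background, which mirrors the note that only steps 1 and 2 are in the foreground latency path.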
According to AWS, Aurora offers five times the throughput of standard MySQL and performance on par with commercial databases, at one-tenth the cost. It should be noted that these numbers are AWS marketing claims. House of Brick has found the cost claims to be roughly correct when compared with Oracle Enterprise Edition deployments, but in our experience the performance advantage of Aurora in real-world scenarios is closer to 30%.
Pros and Cons
The regular RDS and Aurora RDS offerings for MySQL and PostgreSQL are more similar than they are different. When comparing the two platforms to select the best architecture for a particular application stack, the following should be taken into consideration.
- RDS offers a greater range of database engines and versions than Aurora RDS
- Aurora RDS offers superior performance to RDS due to the unique storage subsystem
- Aurora RDS offers superior scalability to RDS due to the unique storage subsystem
- The pricing models differ slightly between RDS and Aurora RDS, but Aurora RDS is generally a bit more expensive to implement for the same database workload
- Aurora RDS offers superior high availability to RDS due to the unique storage subsystem
Conclusion
Amazon Aurora RDS offers features and performance that make it competitive with commercial databases, but at a much more manageable price point. RDS Aurora MySQL can outperform RDS MySQL by up to five times in synthetic benchmark tests, and offers noticeable performance improvements in real-world workloads. There are no compatibility issues between RDS Aurora MySQL and RDS MySQL, since both are built using generally available MySQL open-source software. House of Brick has found Aurora RDS to be a superior platform in terms of performance, scalability, and availability for any application that can tolerate the more restrictive set of available versions.
[1] https://d1.awsstatic.com/events/reinvent/2019/REPEAT_Amazon_Aurora_storage_demystified_How_it_all_works_DAT309-R.pdf (slide 16)