Michael Stone, Lead Architect and CIO
Since June of 2014, with the first GA release of Red Hat Enterprise Linux (RHEL) v7, XFS has been the default file system. According to Wikipedia’s comparison of file systems, XFS was first introduced in 1994 by Silicon Graphics and can support a maximum volume size of 8 exbibytes (EiB). The maximum file size is the same as the maximum volume size. It was created to provide a highly performant file system that optimizes parallel writes in order to facilitate large-scale processing of graphical data. XFS is an extent-based file system, meaning it allocates storage in contiguous runs of blocks (extents), much as a database manages its storage allocation. It is also a journaled file system: it logs metadata changes before applying them, in a manner similar to a database's write-ahead log, so it can recover to a consistent state after a crash. Finally, it uses B+ trees to allocate and locate blocks of storage efficiently.
If none of that is particularly meaningful to you, just remember that it uses many of the same tricks as modern relational databases to maximize throughput to its underlying storage, and does so in a robust, recoverable manner. It can also be grown and defragmented online to minimize operational disruptions. For a full rundown of features, history, and availability in Linux, I refer you once again to Wikipedia.
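For readers who do want a feel for what "extent-based" means, here is a toy sketch in Python. It is not XFS's on-disk format or allocator; the function name and the block numbers are made up for illustration. It only shows why describing a file as a few (start, length) runs is more compact than tracking every block individually.

```python
from typing import List, Tuple

# Hypothetical representation: (starting block, number of contiguous blocks).
Extent = Tuple[int, int]


def blocks_to_extents(blocks: List[int]) -> List[Extent]:
    """Coalesce a sorted list of block numbers into (start, length) extents."""
    extents: List[Extent] = []
    for block in blocks:
        # Extend the last extent if this block directly follows it.
        if extents and extents[-1][0] + extents[-1][1] == block:
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            extents.append((block, 1))
    return extents


# A made-up file occupying two contiguous runs of 512 blocks each.
file_blocks = list(range(1000, 1512)) + list(range(4096, 4608))
extents = blocks_to_extents(file_blocks)
print(len(file_blocks), "block pointers vs", len(extents), "extents:", extents)
```

In this toy case, 1,024 individual block pointers collapse into two extents, which is the kind of saving that makes large, mostly contiguous files cheap to describe.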
The purpose of this post is to explore the applicability of XFS for database applications.
First, the file system of choice for Linux has traditionally been ext4, which continues to be a strong contender. However, as data sizes continue to grow, its maximum file size of 16 TiB and maximum volume size of 1 EiB may become limiting factors. As previously stated, XFS supports up to 8 EiB for both, making it a good choice for larger-scale data needs.
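To put those limits side by side, the short Python calculation below works out the ratios. It uses only the figures quoted above (16 TiB and 1 EiB for ext4, 8 EiB for XFS); the constant names are my own.

```python
# Binary units: 1 TiB = 2**40 bytes, 1 EiB = 2**60 bytes.
TIB = 2**40
EIB = 2**60

# Limits as quoted in this post.
EXT4_MAX_FILE = 16 * TIB
EXT4_MAX_VOLUME = 1 * EIB
XFS_MAX_FILE = 8 * EIB
XFS_MAX_VOLUME = 8 * EIB

volume_ratio = XFS_MAX_VOLUME // EXT4_MAX_VOLUME
file_ratio = XFS_MAX_FILE // EXT4_MAX_FILE

print(f"Max volume size: XFS is {volume_ratio}x ext4")
print(f"Max file size:   XFS is {file_ratio:,}x ext4")
```

The volume limit differs by a factor of 8, while the per-file limit differs by a factor of 524,288 (2^19). For a database that keeps very large individual data files, the file-size limit is likely to be the one that matters first.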
Another key consideration is, of course, integrity. XFS has been around since 1994 and is a proven, reliable file system. Oracle, Red Hat, SuSE, and most major Linux distributions support it. For these reasons, the risks associated with using XFS are comparable to those of ext4.
Next, we need to address the question of performance. Few will argue that databases usually perform best without a file system layer. However, managing raw disk volumes is often cumbersome, with high administrative overhead and risk. For that reason, most choose to run with a volume manager such as Linux’s LVM or Oracle’s ASM. These introduce very light layers that facilitate SAN tooling and reduce the management burden compared to raw volumes. Nonetheless, file systems bring their own benefits in operational simplicity, such as lower risk and simpler tooling.
Finally, many organizations are simply more comfortable when they can “see” their database storage using native tools. The good news is that both ext4 and XFS deliver excellent performance for database systems. When properly tuned, both add very little overhead compared to raw volumes while bringing valuable features to bear. There are several benchmarks online that compare XFS to ext4 with various RDBMS platforms and tools. Most of these show XFS performing at least comparably to ext4 and, in some cases, significantly better. Overall, the results suggest that XFS lives up to its design purpose of facilitating parallel writes with low latency, and that it behaves consistently up to the limits of the underlying hardware.
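If you would rather check the parallel-write behavior on your own hardware than rely on published numbers, a starting point might look like the Python sketch below. It is a minimal illustration and not a substitute for a proper benchmark tool: the function names, worker count, and chunk sizes are arbitrary assumptions, it relies on `os.pwrite` (available on Linux and other Unix systems), and the result will be influenced by the page cache and the storage underneath. Point `directory` at a path on the file system you want to exercise.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


def write_worker(path: str, offset: int, chunk: bytes, count: int) -> int:
    """Write `count` chunks into the worker's own region of the file."""
    fd = os.open(path, os.O_WRONLY)
    try:
        written = 0
        for i in range(count):
            # pwrite writes at an explicit offset, so workers do not
            # share or fight over a file position.
            written += os.pwrite(fd, chunk, offset + i * len(chunk))
        os.fsync(fd)  # Force the data to storage before returning.
        return written
    finally:
        os.close(fd)


def parallel_write(directory: str, workers: int = 4,
                   chunk_size: int = 64 * 1024,
                   chunks_per_worker: int = 16) -> Tuple[int, float]:
    """Write to disjoint regions of one file from several threads.

    Returns (total bytes written, elapsed seconds).
    """
    chunk = b"\0" * chunk_size
    region = chunk_size * chunks_per_worker
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [
                pool.submit(write_worker, path, w * region, chunk,
                            chunks_per_worker)
                for w in range(workers)
            ]
            total = sum(f.result() for f in futures)
        elapsed = time.perf_counter() - start
        return total, elapsed
    finally:
        os.unlink(path)  # Clean up the scratch file.


if __name__ == "__main__":
    total, elapsed = parallel_write(tempfile.gettempdir())
    print(f"wrote {total} bytes in {elapsed:.4f} s")
```

Running the same script against an ext4 mount and an XFS mount on otherwise identical storage gives a rough, like-for-like comparison. The default sizes here are far too small to say anything meaningful about a real database workload, so scale them up before drawing conclusions.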
The XFS file system supports a maximum volume size eight times that of ext4, and a far larger maximum file size. It performs at least as well in the general case, and in some cases (particularly with parallel workload patterns) it does significantly better. It is also long-established and widely supported in the Linux world. As such, XFS is a solid choice for databases that use file system storage.