Jim Hannan (@HoBHannan), Principal Architect
HugePages (also known as Large Pages) is something we at House of Brick are often asked about. There is significant information on the subject; however, it is scattered throughout MetaLink and various other resources. I have attempted to consolidate some of this information, and have added some of our experiences and best practices for implementing HugePages.
Linux Memory Kernel Parameters
Before implementing HugePages, a check of kernel memory settings should be performed. Two key settings for a successful implementation are shmall and shmmax.
Oracle makes use of one of the three memory management models to create the SGA during database startup, and it does this in the following sequence: First, Oracle attempts to use the one-segment model (most optimal). If this fails, it proceeds with the next one, which is the contiguous multi-segment model (contiguous smaller chunks). If that fails as well, it goes with the last option, which is the non-contiguous multi-segment model. This model can potentially cause memory fragmentation.
Reference: http://docs.oracle.com/cd/B12037_01/server.101/b10755/initparams166.htm
Oracle memory kernel parameters adjusted:
kernel.shmall = 1073741824
kernel.shmmax = 4398046511104
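The two values are related: kernel.shmall is counted in pages, while kernel.shmmax is counted in bytes. The following is a minimal sketch of the conversion, assuming a 4096-byte base page size (on a live system, `getconf PAGE_SIZE` reports the actual value). The function name shmall_to_bytes is ours, for illustration only.

```shell
#!/bin/sh
# Convert a kernel.shmall value (in pages) to bytes.
# Assumption: base page size of 4096 bytes unless a second argument is given.
shmall_to_bytes() {
    pages=$1
    page_size=${2:-4096}
    echo $(( pages * page_size ))
}

shmall_to_bytes 1073741824    # prints 4398046511104
```

With these settings, both parameters describe the same 4TB ceiling on shared memory.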
Operating System HugePages
Using HugePages is a technique that allows the operating system to allocate and manage memory using a larger-than-standard page size. This term is synonymous with “Large Memory Pages.” As of ESX 3.5, the hypervisor is enhanced to support large pages. Before 3.5, ESX Server used small pages to emulate large pages. The default page size for most systems is 4KB (4096 bytes). HugePages are typically 2MB or 4MB. Managing memory using a larger page size reduces the required number of system calls by a factor of roughly 500 or 1,000 (512 for 2MB pages, 1,024 for 4MB pages).
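To make the ratio concrete, here is a small sketch that counts the pages needed to map a memory region at each page size. The 24GB region is only an illustrative figure, and pages_for is a hypothetical helper name.

```shell
#!/bin/sh
# Number of pages needed to map a region; both arguments are in KB.
pages_for() {
    size_kb=$1
    page_kb=$2
    echo $(( (size_kb + page_kb - 1) / page_kb ))    # round up
}

region_kb=$(( 24 * 1024 * 1024 ))    # 24GB expressed in KB
pages_for "$region_kb" 4             # prints 6291456 (4KB pages)
pages_for "$region_kb" 2048          # prints 12288 (2MB pages)
```

The second count is 512 times smaller than the first, which is where the "factor of 500" comes from.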
Due to dramatic improvements in hardware and memory performance, the smaller page size has not traditionally been a significant problem. However, as memory capacity has the potential to exceed 1TB, and databases can effectively manage hundreds of GBs of memory, it becomes significant. HugePages were introduced to address this issue. The performance impact associated with the system calls relating to memory management is particularly noticeable when looking at very large memory allocations under a hypervisor, because memory operations must be protected. Protected system calls require a degree of serialization and can experience a measurable performance penalty under a hypervisor. Therefore reducing the number of calls has a high return.
A 14% performance improvement has been shown under load for an Oracle database with a 24GB SGA (shared memory pool) when using HugePages as compared with standard pages. A 40% improvement was measured for a system with a 96GB SGA with no other changes.
Under The Hood of HugePages
The CPU’s Memory Management Unit (MMU) keeps a cache of recently used virtual-to-physical address translations. This cache is called the Translation Lookaside Buffer (TLB), and it is stored close to the CPU. Without the TLB, every memory access would require a “Page Walk”, in which the page tables are traversed to find the physical memory address; this process causes higher latency. With the TLB, the translation for the required memory location is looked up in the cache first, and if it is found this is called a “TLB hit”. A “TLB miss” results in a Page Walk. (A page fault is a separate event: it occurs when the Page Walk finds no valid mapping for the address.) When your application uses HugePages, each TLB entry covers a larger memory range, allowing for more TLB hits. This reduces the time it takes to find data in memory and lowers the CPU cost.
Linux handles HugePages gracefully by reserving a pool of memory at system boot time. This amount of memory can be increased or decreased on the fly using sysctl, but an increase requires that the memory be available to be successful without a reboot. Using HugePages in Linux implies memory locking. User limits must accommodate locking of the entire memory segment for HugePages to be successful.
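When the pool is resized on the fly, the kernel may allocate fewer pages than requested if not enough contiguous memory is free, so it is worth comparing the request with the result. The sketch below shows one way to do that; pool_status is a hypothetical helper, and the live commands are shown only as comments because they require root.

```shell
#!/bin/sh
# Compare the requested HugePages pool size with what the kernel allocated.
pool_status() {
    requested=$1
    actual=$2
    if [ "$actual" -ge "$requested" ]; then
        echo "ok"
    else
        echo "short by $(( requested - actual )) pages"
    fi
}

# On a live system (as root), something like:
#   sysctl -w vm.nr_hugepages=22010
#   pool_status 22010 "$(cat /proc/sys/vm/nr_hugepages)"
pool_status 22010 22010    # prints: ok
pool_status 22010 18000    # prints: short by 4010 pages
```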
Database 11g HugePages
Enabling expanded Automatic Memory Management (AMM) in Database 11g prohibits use of Linux HugePages. This is true of all current Database 11g R1 and R2 releases and patches, including the latest (Database 11.2.0.2). Database 11g R1 expanded AMM to include the Program Global Area (PGA). AMM is not active by default in Database 11g; it is enabled via the parameters MEMORY_TARGET and MEMORY_MAX_TARGET. However, the Database 10g SGA_TARGET and SGA_MAX_SIZE parameters continue to work as they did previously.
AMM does not use standard System V-style IPC shared memory. Instead, AMM uses memory mapped files in /dev/shm. The drawback is that memory mapped files do not support HugePages. This is a fundamental problem with AMM if HugePages are desired.
Of the two features, it is strongly recommended that HugePages take priority, regardless of the workload characteristics and database instance memory size. In general, HugePages is a performance feature, whereas AMM is a configuration feature.
Database 11.2.0.2 introduced the initialization parameter USE_LARGE_PAGES to provide more HugePages-related instance control. Its values are TRUE, FALSE, or ONLY. The default is TRUE and causes Oracle to behave as it always has. Set the parameter to ONLY to enforce the use of HugePages and prevent instance startup if sufficient HugePages are not available.
Finally, although all versions of Oracle since version 7 have supported HugePages, Database 11.2.0.2 is the first release that provides any meaningful output in the alert log relating to Oracle’s use of them.
Metalink References:
HugePages on Oracle Linux 64-bit [ID 361468.1]
HugePages on Linux: What It Is… and What It Is Not… [ID 361323.1]
Shell Script to Calculate Values Recommended Linux HugePages / HugeTLB Configuration [ID 401749.1]
HugePages and Oracle Database 11g Automatic Memory Management (AMM) on Linux [ID 749851.1]
HugePages Implementation Example
Note: As stated previously, HugePages are not compatible with Oracle AMM.
alter system set sga_max_size = 44000M scope=spfile;
alter system set sga_target = 44000M scope=spfile;
alter system set pga_aggregate_target = 4000M scope=spfile;
alter system set pre_page_sga = TRUE scope=spfile;
Check to see if HugePages are currently configured:
grep Huge /proc/meminfo
If HugePages_Total is greater than zero, then this is the number of HugePages currently configured. HugePages are allocated in 2MB chunks. Next, calculate the number of HugePages needed for a 44GB SGA, and add ten pages for overhead.
(44000/2) + 10 = 22010
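The same arithmetic can be written as a small shell function so it is easy to rerun for a different SGA size. This is a sketch only: hugepages_needed is a hypothetical helper, and the defaults of 2MB pages and ten overhead pages simply mirror the figures used here. The Hugepagesize line in /proc/meminfo shows the page size on your system.

```shell
#!/bin/sh
# HugePages needed for an SGA: ceil(sga_mb / page_mb) + overhead pages.
# Defaults (2MB pages, 10 overhead pages) follow the example in the text.
hugepages_needed() {
    sga_mb=$1
    page_mb=${2:-2}
    overhead=${3:-10}
    echo $(( (sga_mb + page_mb - 1) / page_mb + overhead ))
}

hugepages_needed 44000    # prints 22010
```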
The setting for HugePages is configured in /etc/sysctl.conf. To check the current setting, run the following command:
grep nr_hugepages /etc/sysctl.conf
Check /etc/security/limits.conf to make sure that Oracle can lock shared memory. This number is expressed in KB. Set it to a number higher than the SGA size in KB (44000 MB is 45056000 KB).
grep oracle /etc/security/limits.conf
oracle   soft   memlock   50000000
oracle   hard   memlock   50000000
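A quick way to sanity-check the limit against the SGA size is sketched below. The helper name memlock_covers_sga is ours; it takes the limit in KB and the SGA size in MB, and treats "unlimited" as sufficient.

```shell
#!/bin/sh
# Succeeds when a memlock limit (KB) covers an SGA of the given size (MB).
memlock_covers_sga() {
    memlock_kb=$1
    sga_mb=$2
    [ "$memlock_kb" = "unlimited" ] && return 0
    [ "$memlock_kb" -ge $(( sga_mb * 1024 )) ]
}

# As the oracle user, `ulimit -l` reports the limit currently in effect, in KB.
if memlock_covers_sga 50000000 44000; then
    echo "memlock is sufficient"
else
    echo "memlock is too small"
fi
```

With the values used here, 50000000 KB exceeds the 45056000 KB required, so the check passes.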
Note: This section must be done as root.
Add or modify the following line in /etc/sysctl.conf:
vm.nr_hugepages=22010
Add or modify the following two lines to /etc/security/limits.conf:
oracle   soft   memlock   50000000
oracle   hard   memlock   50000000
Add or modify the following lines in /etc/sysctl.conf:
kernel.shmmax = 4398046511104
kernel.shmall = 1073741824
kernel.shmmni = 4096
The system must be rebooted to have the HugePages setting take effect. After the reboot, check the HugePages setting.
# grep Huge /proc/meminfo
HugePages_Total: 22010
HugePages_Free:  22010
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
If the settings are correct, start the database. Confirm HugePages are allocated with the command below.
grep Huge /proc/meminfo
HugePages_Total: 22010
HugePages_Free:      9
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
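To turn that output into a single number, the pages in use can be computed as HugePages_Total minus HugePages_Free. The sketch below reads /proc/meminfo-style text on standard input; hugepages_in_use is a hypothetical helper, and the sample input is the output shown above.

```shell
#!/bin/sh
# Report HugePages in use (Total - Free) from /proc/meminfo-style input on stdin.
hugepages_in_use() {
    awk '/^HugePages_Total:/ { total = $2 }
         /^HugePages_Free:/  { free = $2 }
         END { print total - free }'
}

# On a live system: hugepages_in_use < /proc/meminfo
hugepages_in_use <<'EOF'
HugePages_Total: 22010
HugePages_Free:      9
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
EOF
```

For the sample above this prints 22001, which at 2MB per page is 44002 MB, in line with the 44000 MB SGA.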