HugePages: A Reference

Posted September 6, 2012, 10:58 AM by Jim Hannan (@HoBHannan), Principal Architect

HugePages (also known as Large Pages) is something we at House of Brick are often asked about. There is a significant amount of information on the subject, but it is scattered throughout MetaLink and various other resources. I have attempted to consolidate some of this information and have added some of our experiences and best practices for implementing HugePages.

Linux Memory Kernel Parameters

Before implementing HugePages, check the kernel memory settings. Two key settings for a successful implementation are shmall and shmmax.

Oracle uses one of three memory management models to create the SGA during database startup, and it tries them in the following sequence. First, Oracle attempts to use the one-segment model, which is optimal; this requires shmmax to be at least as large as the SGA. If this fails, it proceeds with the contiguous multi-segment model (contiguous smaller chunks). If that fails as well, it falls back to the non-contiguous multi-segment model, which can potentially cause memory fragmentation.

Reference: http://docs.oracle.com/cd/B12037_01/server.101/b10755/initparams166.htm

Oracle memory kernel parameters adjusted:

kernel.shmall = 1073741824
kernel.shmmax = 4398046511104
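
As a rough check on how these two values relate: shmmax is expressed in bytes, while shmall is expressed in pages, so the shmall shown here is the shmmax shown here divided by the base page size. The sketch below assumes a 4096-byte page (`getconf PAGE_SIZE` reports the actual value on a given system); the function name `shmall_for` is illustrative only.

```shell
# shmall_for BYTES PAGE_SIZE
# Print the number of pages needed to cover BYTES of shared memory,
# rounded up. Illustrative helper, not part of any Oracle or Linux tool.
shmall_for() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# 4 TB shmmax with an assumed 4096-byte base page size
shmall_for 4398046511104 4096    # prints 1073741824
```

Setting both values this high simply removes them as a constraint; the amount of memory actually used is governed by the SGA size and the HugePages pool.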

Operating System HugePages

Using HugePages is a technique that allows the operating system to allocate and manage memory using a larger-than-standard page size. The term is synonymous with “Large Memory Pages.” As of ESX 3.5, the hypervisor supports large pages; before 3.5, ESX Server used small pages to emulate large pages. The default page size for most systems is 4KB (4096 bytes). HugePages are typically 2MB or 4MB. Managing memory using a larger page size reduces the number of pages to manage, and with it the required number of memory management system calls, by a factor of roughly 500 or 1,000 (512 for 2MB pages, 1,024 for 4MB pages).
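
To make the factor concrete, here is a minimal sketch that counts how many pages are needed to map a given amount of memory at each page size. The 1 GB figure and the function name `pages_for` are assumptions chosen for illustration.

```shell
# pages_for MEMORY_KB PAGE_KB
# Print how many pages of PAGE_KB are needed to map MEMORY_KB (rounded up).
pages_for() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# 1 GB = 1048576 KB
pages_for 1048576 4       # standard 4KB pages: prints 262144
pages_for 1048576 2048    # 2MB HugePages:      prints 512
```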

Due to dramatic improvements in hardware and memory performance, the smaller page size has not traditionally been a significant problem. However, as memory capacity has the potential to exceed 1TB, and databases can effectively manage hundreds of GBs of memory, it becomes significant. HugePages were introduced to address this issue. The performance impact associated with the system calls relating to memory management is particularly noticeable when looking at very large memory allocations under a hypervisor, because memory operations must be protected. Protected system calls require a degree of serialization and can experience a measurable performance penalty under a hypervisor. Therefore reducing the number of calls has a high return.

A 14% performance improvement has been shown under load for an Oracle database with a 24GB SGA (shared memory pool) when using HugePages as compared with standard pages. A 40% improvement was measured for a system with a 96GB SGA with no other changes.

Under The Hood of HugePages

The CPU’s Memory Management Unit (MMU) keeps a cache of recently used virtual-to-physical address translations. This cache is called the Translation Lookaside Buffer (TLB), and it is held close to the CPU. Without a TLB entry for an address, the translation must be resolved with a “page walk” through the page tables in memory, which adds latency. When the translation is found in the TLB, this is called a “TLB hit.” When it is not found, this is a “TLB miss,” which results in a page walk. (A page fault is a different event: it occurs when no valid mapping exists for the address.) When your application uses HugePages, each TLB entry covers a larger memory range, allowing for more TLB hits. This reduces the time it takes to find data in memory and lowers the CPU cost.
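
One way to see the effect is to compute how much memory a fixed number of TLB entries can cover. This is a sketch only: the 64-entry TLB is a hypothetical figure, not a measurement of any particular CPU, and `tlb_reach_kb` is an illustrative name.

```shell
# tlb_reach_kb ENTRIES PAGE_KB
# Print the amount of memory (in KB) that ENTRIES TLB entries can map
# when each entry covers one page of PAGE_KB.
tlb_reach_kb() {
    echo $(( $1 * $2 ))
}

# Hypothetical 64-entry TLB
tlb_reach_kb 64 4       # 4KB pages: prints 256    (256 KB)
tlb_reach_kb 64 2048    # 2MB pages: prints 131072 (128 MB)
```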

Linux handles HugePages gracefully by reserving a pool of memory at system boot time. The size of this pool can be increased or decreased on the fly using sysctl, but an increase succeeds without a reboot only if enough free memory is available. Using HugePages in Linux implies memory locking. User limits must accommodate locking of the entire memory segment for HugePages to be successful.

Database 11g HugePages

Enabling expanded Automatic Memory Management (AMM) in Database 11g prohibits use of Linux HugePages. This is true of all current Database 11g R1 and R2 releases and patches, including the latest (Database 11.2.0.2). Database 11g R1 expanded AMM to include the Program Global Area (PGA). AMM is not active by default in Database 11g; it is enabled via the parameters MEMORY_TARGET and MEMORY_MAX_TARGET. However, the Database 10g SGA_TARGET and SGA_MAX_SIZE parameters continue to work as they did previously.

AMM does not use standard System V-style IPC shared memory. Instead, AMM uses memory mapped files in /dev/shm. The drawback is that memory mapped files do not support HugePages. This is a fundamental problem with AMM if HugePages are desired.

Of the two features, it is strongly recommended that HugePages take priority, regardless of the workload characteristics and database instance memory size. In general, HugePages is a performance feature, whereas AMM is a configuration feature.

Database 11.2.0.2 introduced the initialization parameter USE_LARGE_PAGES to provide more HugePages-related instance control. Its values are TRUE, FALSE, or ONLY. The default is TRUE and causes Oracle to behave as it always has. Set the parameter to ONLY to enforce the use of HugePages and prevent instance startup if sufficient HugePages are not available.

Finally, although all versions of Oracle since version 7 have supported HugePages, Database 11.2.0.2 is the first release that provides any meaningful output in the alert log relating to Oracle’s use of them.

Metalink References:
HugePages on Oracle Linux 64-bit [ID 361468.1]
HugePages on Linux: What It Is… and What It Is Not… [ID 361323.1]
Shell Script to Calculate Values Recommended Linux HugePages / HugeTLB Configuration [ID 401749.1]
HugePages and Oracle Database 11g Automatic Memory Management (AMM) on Linux [ID 749851.1]

HugePages Implementation Example

Note: As stated previously, HugePages are not compatible with Oracle AMM.

alter system set sga_max_size = 44000M scope=spfile;
alter system set sga_target = 44000M scope=spfile;
alter system set pga_aggregate_target = 4000M scope=spfile;
alter system set pre_page_sga = TRUE scope=spfile;

Check to see if HugePages are currently configured:

grep Huge /proc/meminfo

If HugePages_Total is greater than zero, then this is the number of HugePages currently configured. In this example HugePages are allocated in 2MB chunks (see Hugepagesize in the output). Next, calculate the number of HugePages needed for a 44GB SGA (44000 MB), and add ten pages for overhead.

(44000/2) + 10 = 22010
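
The same arithmetic can be sketched as a small shell function. The function name `hugepages_needed` and its argument order are illustrative; the ten-page overhead is the figure used in this example, not a fixed rule.

```shell
# hugepages_needed SGA_MB HUGEPAGE_KB OVERHEAD_PAGES
# Print the number of HugePages required for an SGA of SGA_MB megabytes,
# rounded up to whole pages, plus OVERHEAD_PAGES extra pages.
hugepages_needed() {
    sga_kb=$(( $1 * 1024 ))
    echo $(( (sga_kb + $2 - 1) / $2 + $3 ))
}

# 44000 MB SGA, 2048 KB HugePages, 10 pages of overhead
hugepages_needed 44000 2048 10    # prints 22010
```

On a live system, the HugePage size in KB can be read with `awk '/^Hugepagesize:/ {print $2}' /proc/meminfo` rather than assumed.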

The setting for HugePages is configured in /etc/sysctl.conf. To check the current setting, run the following command:

grep nr_hugepages /etc/sysctl.conf

Check /etc/security/limits.conf to make sure that Oracle can lock shared memory. This number is expressed in KB. Set it to a number higher than the SGA size, which is 45056000 KB here (44000 MB expressed in KB).

grep oracle /etc/security/limits.conf
oracle    soft    memlock    50000000
oracle    hard    memlock    50000000
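
A minimal sketch of the comparison being made here, assuming the memlock value is in KB and the SGA size is in MB as in this example. The function name `memlock_ok` is illustrative.

```shell
# memlock_ok MEMLOCK_KB SGA_MB
# Succeed (exit status 0) if the memlock limit is at least as large as the SGA.
memlock_ok() {
    [ "$1" -ge $(( $2 * 1024 )) ]
}

# Values from this example: memlock 50000000 KB, SGA 44000 MB
if memlock_ok 50000000 44000; then
    echo "memlock is sufficient"
else
    echo "memlock is too low"
fi
```

To check the limit actually in effect, run `ulimit -l` in a new login session as the oracle user; it reports the maximum locked memory in KB.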

Note: This section must be done as root.

Add or modify the following line in /etc/sysctl.conf:

vm.nr_hugepages=22010

Add or modify the following two lines to /etc/security/limits.conf:

oracle    soft    memlock    50000000
oracle    hard    memlock    50000000

Add or modify the following three lines in /etc/sysctl.conf:

kernel.shmmax = 4398046511104
kernel.shmall = 1073741824
kernel.shmmni = 4096

Reboot the system so that the HugePages setting takes effect. After the reboot, check the HugePages setting.

# grep Huge /proc/meminfo
HugePages_Total:  22010
HugePages_Free:   22010
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

If the settings are correct, start the database. Then confirm that HugePages are allocated with the command below; HugePages_Free should drop once the SGA is in place.

grep Huge /proc/meminfo
HugePages_Total:  22010
HugePages_Free:      9
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
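
To turn that output into a single number, the following sketch subtracts HugePages_Free from HugePages_Total. The function name `hugepages_in_use` is illustrative, and the sample input is copied from the output shown here rather than read from a live system.

```shell
# hugepages_in_use
# Read /proc/meminfo-style text on stdin and print
# HugePages_Total minus HugePages_Free.
hugepages_in_use() {
    awk '/^HugePages_Total:/ {t = $2}
         /^HugePages_Free:/  {f = $2}
         END {print t - f}'
}

# Sample input from this example; on a live system use:
#   hugepages_in_use < /proc/meminfo
printf 'HugePages_Total:  22010\nHugePages_Free:      9\n' | hugepages_in_use    # prints 22001
```

Here 22001 pages of 2MB are in use, which is in line with the 44000 MB SGA. If HugePages_Rsvd is non-zero, those reserved pages are also committed to the instance even though they still count as free.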

1 Comment

  • anon says:

    Can you update this to show how the kernel settings were calculated and how oracle is affected by these parameters if you are planning to use huge pages for the SGA ? Is there a setting with 10g / 11g that tells it to use huge pages? Also, how to make sure you give enough to each to ensure that the database wont tell you it doesn’t have enough memory when starting up?

    Thanks!
