House of Brick Principal Architect
This week while going through some old VMware documentation, I discovered a reference to performance gains while using Windows Large-Page allocations. It is located on page 11 of the Performance and Scalability of Microsoft SQL Server on VMware document (http://www.vmware.com/files/pdf/perf_vsphere_sql_scalability.pdf). However, while it claims performance of SQL Server improves when Large Pages are enabled, the document is from the ESX 3.5 era. What happens to the performance on a modern system like vSphere 5.0 and SQL Server 2008R2? If there are benefits, is it worth the drawbacks? Talk to any seasoned SQL Server DBA and most will tell you to avoid Large Pages unless Microsoft explicitly tells you to enable it, but most do not have solid reasons or experience backing their opinions. Why the FUD?
What are Large Pages?
First, let’s talk about Windows memory. Windows uses the Virtual Address Space to manage its memory map. These Virtual Addresses are stored and managed by Windows in a structure called Page Tables. Each Virtual Address has a corresponding Page Table Entry in the Page Table. This table has to be searched when the OS references memory pages. To increase performance of this lookup process, the processor maintains a cache called the Translation Look-Aside Buffer (TLB). Memory is stored in the Virtual Address Space in two block sizes – Large and Small. For this conversation, I’m ignoring the Itanium platform, which has different block sizes. Small Pages are 4KB in size. Large Pages are 2MB in size. When Large Pages are enabled, the number of entries in the TLB is dramatically reduced. More translation is required to reference Small Pages than Large Pages, and a penalty in performance is felt.
Large Pages and VMware
VMware ESX and ESXi hosts can accommodate large page requests of a guest, and this feature has been supported and enabled by default since ESX 3.5. The host CPU must support hardware-assisted memory virtualization (i.e. Intel EPT Hardware Assist and AMD RVI Hardware Assist). VMware has published studies describing Large Page performance improvements of guest workloads at the following URLs.
VMware claims up to a 19% performance gain in SQL Server 2008 by enabling Large Pages in SQL Server and enabling hardware-assisted memory virtualization in the CPUs. Reference http://www.vmware.com/files/pdf/perf_vsphere_sql_scalability.pdf for more details.
Large Pages and SQL Server
Trace flag 834 allows SQL Server to use Large Pages for the Buffer Pool. A few restrictions apply.
- Windows Server and SQL Server must both be 64-bit.
- The service account that SQL Server is running under must have the “Lock Pages in Memory” privilege set.
Previously, Enterprise edition was required to support Large Pages natively. However, trace flag 845 on appropriately patched servers (reference MS KB 970070 for more details) allows Standard Edition SQL Servers to join the party.
To add the trace flag into the SQL Server startup parameters list, start the SQL Server Configuration Manager and add the following entry into the SQL Server instance Advanced Startup Parameters property: ;-T834.
Make sure to set the maximum buffer pool memory by configuring the ‘Maximum server memory’ setting to prevent this problem from ever occurring.
Enabling Large Pages might give a boost to performance but does not come without downsides. First, VMware’s Transparent Page Sharing (TPS) is rendered ineffectual for this virtual machine. A 2MB memory block is almost guaranteed to be unique, rather than a 4KB block, so do not count on TPS to save any memory on the host. If a host is overcommitted, this could reduce performance of other guest VMs because the balloon driver could be more aggressive in ballooning for memory reclamation. However, best practices dictate that a host not be overcommitted for business critical systems so this should be a nonissue.
Second, verify the boot time of your SQL Server when Large Pages are enabled. SQL Server must claim and zero out the memory it will be using before the service can completely start up. This process takes time. It can potentially take quite a while. This configuration can take a SQL Server startup time from a minute or two to over 30 minutes, depending on your environment. Be very careful that by enabling Large Pages you do not inadvertently violate your SLA during an unplanned outage. Remember, VMware HA requires a restart of a server on the new host if a host fails. This reboot time could be substantial if large amounts of memory are allocated.
Another problem can come if the service needs to restart and the memory it now requires to be contiguous is no longer that. If so, SQL Server will fail to start and you will see the following error in the ERRORLOG file.
Check back soon for our performance summary of SQL Server 2008 R2 on vSphere 5.0 Large Pages performance!
So, in summary:
Minimum guest memory allocation of 8GB
‘Lock Pages in Memory’ permissions granted to service account
64-bit Windows and SQL Servers only
Max SQL Server buffer pool memory setting must be configured
Potential measurable performance improvement (YMMV)
Potential significant delay in service startup time
Eliminates transparent page sharing for this VM on VMware host
SQL Server service restarts success rate could be an issue
Keep in mind that Large Pages is not a VMware feature – it’s a Windows feature. VMware simply extends it. Large Pages can be enabled whether the workload is physical or virtual, but all workloads should be virtualized, right?