House of Brick Principal Architect
VMware system administrators, listen up! Your DBAs need direct access into the VMware vCenter Server Console that manages the virtual machines underneath their database servers. I consider it a requirement for organizations to grant at least read-only access for performance statistics monitoring. Some organizations and their VMware administrators can be hesitant or fearful every time I bring up this topic (fear is the path of the dark side and we all know how that ended). In this blog post I will explain why you have nothing to fear and how this level of access can even lighten your workload.
Note: While I refer mostly to VMware as the hypervisor, these topics apply to all hypervisors on the market. The only things that change are the brands, the buzzwords, and where each technology lives in the product.
Fear, Uncertainty, and Doubt (FUD)
Database administrators, like most technologists, like to be in control of their environments. They also tend to be people who love numbers and metrics. Yet in most cases the server infrastructure underneath their databases is a black box, outside their control and their direct access, frequently because of organizational silos. Most DBAs are bothered by this lack of transparency, and that black box simply contributes to distrust between the silos. Virtualization adds yet another layer to their world, and that layer becomes yet another black box. Virtualization is also often thrust on the DBAs, sometimes without even consulting them, so the new layer is immediately distrusted and feared. When servers run on infrastructure that is hidden from the people responsible for them, the only certainty for the organization is the FUD surrounding those layers.
To the DBAs, this fear over virtualization will manifest in a number of ways. At the first signs of a performance problem anywhere in the environment — whether it is subjectively perceived or objectively measured — the DBAs will immediately blame and doubt the hypervisor. The ‘blame game’ goes back and forth between the silos, and this doubt can persist. The longer the doubt lingers, the more difficult it is to remove from the organization.
I’m sure you have seen this lingering doubt before:
The CIO that was burned by a failed virtualization attempt many years ago.
The DBA who read some of the misinformation available on the Internet and won’t let it go.
The mainframer who instinctively distrusts anything new.
Whatever the case or person may be, presenting access to the raw metrics underneath servers and then providing education on how to interpret the output will help the doubters resolve their own FUD at their own pace.
Granting access to vCenter performance statistics takes care of the most obvious task: getting performance data for their systems. In the event of a suspected performance problem, a DBA will dive through their database server performance metrics, Windows Perfmon, or other OS-level performance tools. Those tools stop at the guest operating system, though. With vCenter statistics in hand as well, the DBA can show when a problem lies underneath the database stack, which absolves them of any guilt, and gets a more complete picture of the performance of each component underneath their business-critical systems.
During a real-time investigation, vCenter statistics can be used to quickly see performance levels of these servers. These statistics almost always objectively show that the virtualization layer should not be suspected. The investigation will then continue down the system stack. These statistics can even be used to increase the speed of resolution by possibly highlighting obvious performance anomalies when compared against system baselines.
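As a rough illustration of comparing a live reading against a baseline, here is a minimal Python sketch. It assumes you have already exported a series of baseline samples for one metric (datastore latency in this example) from your monitoring tool of choice. The sample values, the function name, and the three-standard-deviation threshold are all assumptions made for the example, not anything vCenter provides or recommends.

```python
from statistics import mean, stdev

def is_anomalous(baseline, current, threshold=3.0):
    """Return True when `current` sits more than `threshold` standard
    deviations away from the mean of the baseline samples.

    The 3.0 default is an illustrative assumption, not a vendor guideline.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value is a deviation.
        return current != mu
    return abs(current - mu) / sigma > threshold

# Hypothetical datastore latency samples, in milliseconds.
baseline_latency_ms = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0, 5.0, 5.0]

print(is_anomalous(baseline_latency_ms, 5.5))   # within the normal range
print(is_anomalous(baseline_latency_ms, 25.0))  # far outside the baseline
```

A check like this does not say what is wrong. It only tells you which layer deserves a closer look first, which is the point of having the baseline.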
You do baseline your system stack, don’t you? Well, don’t you?
I rarely find an environment that maintains full, continuously running performance baselines. vCenter provides a very high-level running baseline of basic metrics for the physical servers and all of the virtual machines running on them.
Host CPU Usage
Host Datastore Latency
Host Disk Throughput
Host Network Throughput
The remaining capacity of these servers can be referenced very quickly and, most importantly, cataloged. Use vCenter to view the last six months of physical server performance statistics, and you can begin to predict when you will need to add more physical compute nodes to your VMware cluster. Back these predictions with hard evidence and you should have a much easier time budgeting for the purchase and getting it through the approval process.
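The kind of prediction described above can be sketched with a simple straight-line trend. The Python below is a minimal example under stated assumptions: the input is one average host CPU usage figure per month, growth is roughly linear, and 80% is the point at which you want new hardware in place. The function name, the sample figures, and the 80% threshold are all hypothetical.

```python
def months_until_threshold(monthly_usage_pct, threshold_pct=80.0):
    """Fit a least-squares line through monthly usage figures and estimate
    how many months after the last sample the trend reaches threshold_pct.

    Returns None when usage is flat or falling, since the trend then never
    reaches the threshold. Assumes linear growth, which is a simplification.
    """
    n = len(monthly_usage_pct)
    if n < 2:
        raise ValueError("need at least two monthly samples")
    xs = range(n)
    x_mean = (n - 1) / 2
    y_mean = sum(monthly_usage_pct) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean)
              for x, y in zip(xs, monthly_usage_pct))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    if slope <= 0:
        return None
    crossing_month = (threshold_pct - intercept) / slope
    # Months remaining, counted from the most recent sample.
    return max(0.0, crossing_month - (n - 1))

# Hypothetical six months of average host CPU usage, in percent.
usage = [50.0, 52.0, 54.0, 56.0, 58.0, 60.0]
print(months_until_threshold(usage))  # 10.0
```

Real workloads rarely grow in a straight line, so treat the result as a conversation starter for budgeting rather than a precise date.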
If you choose to monitor the VM-level statistics and performance trends as well, you can begin to better understand the workload patterns of your servers. Are some VMs only very busy at night while others are busy during a normal business day? Are some servers maxed out on CPU utilization only on particular days of the week? If you understand the workloads in your world, you will start to better understand your environment as a whole. To me, baselines are so critically important that I stress baselines on almost all of the virtualization projects that I work on, and almost half of my SQL Saturday ‘Health and Performance’ presentation is dedicated to this topic.
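To make the workload-pattern questions concrete, here is a small Python sketch that groups CPU samples by hour of day. It assumes you have exported timestamped CPU usage samples for a single VM. The sample data and the function names are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime

def average_by_hour(samples):
    """Group (timestamp, cpu_pct) samples by hour of day and return a
    dict mapping each hour to its mean CPU usage."""
    buckets = defaultdict(list)
    for timestamp, cpu_pct in samples:
        buckets[timestamp.hour].append(cpu_pct)
    return {hour: sum(vals) / len(vals)
            for hour, vals in sorted(buckets.items())}

def busiest_hour(samples):
    """Return the hour of day with the highest mean CPU usage."""
    averages = average_by_hour(samples)
    return max(averages, key=averages.get)

# Hypothetical samples for one VM: busy at 02:00, quiet at 14:00.
samples = [
    (datetime(2024, 1, 1, 2, 0), 90.0),
    (datetime(2024, 1, 2, 2, 0), 80.0),
    (datetime(2024, 1, 1, 14, 0), 20.0),
    (datetime(2024, 1, 2, 14, 0), 30.0),
]

print(average_by_hour(samples))  # {2: 85.0, 14: 25.0}
print(busiest_hour(samples))     # 2
```

Swapping `timestamp.hour` for `timestamp.weekday()` answers the day-of-week version of the same question.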
The biggest challenge with granting this access to your DBAs is organizational education. The responsibility to educate the DBAs on how to interpret these statistics lies within the organization. Infrastructure metrics are probably vastly different from the ones that DBAs are used to working with day-to-day.
Education is the key. Whenever I am able to grant the DBAs this level of access, I educate them before giving them the access. I educate them on topics including (but not limited to):
Common myths and misconceptions
Physical host resources (CPU, RAM, disk, network) and their shared nature
CPU Ready Time
Datastores and storage layer concepts
Memory, over-commitment, transparent page sharing (TPS), and ballooning
Resource pools and prioritization
Paradigm shifts such as vMotion and storage vMotion, DRS, VM isolation, snapshots, cloning and templates
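Of the topics in this list, CPU Ready Time is usually the one that needs a worked example, because the performance charts report it as a summation in milliseconds rather than as a percentage. The Python sketch below shows the conversion, assuming the 20-second sample interval used by the real-time charts. Dividing by the vCPU count to get a per-vCPU average is a common convention that I am assuming here, and the function and parameter names are hypothetical.

```python
def cpu_ready_percent(ready_ms, interval_s=20, vcpus=1):
    """Convert a CPU ready summation (milliseconds) into a percentage of
    the sample interval.

    interval_s: length of the chart's sample interval in seconds
                (20 is assumed here for real-time charts).
    vcpus:      divide by the vCPU count to get an average per vCPU
                (an assumed convention for VM-level summations).
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms * 100.0 / (interval_s * 1000.0 * vcpus)

# 1000 ms of ready time in a 20-second interval on a 1-vCPU VM.
print(cpu_ready_percent(1000))            # 5.0
# The same summation spread across a 4-vCPU VM.
print(cpu_ready_percent(1000, vcpus=4))   # 1.25
```

Longer chart rollups use longer sample intervals, so the same raw number means very different things on different charts. That is exactly the kind of detail worth teaching before handing over access.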
Educate the DBAs, grant them access to see what the virtualization layer does (and doesn’t do), and let them see the truth behind enterprise virtualization. Virtualization works, and works well, even for business-critical database servers. Educate everyone, eliminate the FUD, and grow your advocacy base. This group of internal advocates can only help you further the push towards total virtualization.