VSP3116 – VMware vSphere 5.0 Resource Management Deep Dive

For those coming here from Yellow Bricks, thanks for stopping by, and check out my other VMworld 2011 posts (many other sessions).

Frank Denneman & Valentin Hamburger. Denneman = rock star. Hamburger is a Technical Account Exec but has his VCAP-DCD, so he knows his stuff. No Q&A… if you have questions, go buy the book. 😉

Summary at the top = yes, this is Install and Configure class stuff but goes much deeper. Really good session…get the PowerPoint too.

  • What is Resource Entitlement?
    • dynamic task within the ESXi host
    • influenced by particular parameters
    • Dynamic Entitlement – Active use of CPU and memory, Level of contention
    • Static Entitlement – Shares, Reservations, Limits
    • Only matters when there is contention?
  • What is contention?
    • Short-term contention
      • Load Correlation — relationship between loads running in different machines. Ex. web server stack.
      • Load Synchronicity —  caused by load correlation but can also exist due to user activity. Ex. Login hours/boot storms/login storms/patch deployment.
    • Long-term contention
      • Serious over-provisioning
  • Life without DRS — oh, the agony. Let’s talk about shares, reservations, and limits (a PowerCLI sketch covering all three follows these notes).
    • Shares = relative priority – a relative value compared to siblings. The initial value is based on the virtual hardware config.
      • Levels = Low (1), Normal (2), High (4). Also Custom Shares.
      • Share value is diluted as more VMs are added.
    • CPU Reservations – VM-level
      • Guarantees certain level of resources to a VM
      • Influences HA admission control at Power On.
      • Not as bad as often referenced…doesn’t claim CPU when VM is idle (“refundable”)
      • Caveat = CPU reservation doesn’t always equal priority.
    • Memory Reservations – VM-level
      • Guarantees certain level of resources to VM and influences HA admission control
      • It *is* as bad as often claimed — “non-refundable” once allocated. Windows also zeroes out every bit of memory during startup, so the whole reservation gets backed right away… that rots.
      • Caveats = will drop consolidation ratio (big impact), may waste resources (idle memory can’t be reclaimed), introduces more complexity for capacity planning.
    • Limits – VM-level
      • Apply even when there are enough resources and no contention, so they restrict use of the available physical resources.
      • Often more harmful than helpful – the guest OS and applications size themselves according to configured memory (and don’t know about the limit).
      • Can have tremendous impact on performance.
      • Reducing configured CPU/memory is a far better approach than applying limits.
      • If a memory limit is active, ESXi provides the additional consumed memory from the VM’s vswap file (i.e. VM swapping occurs).
      • When a CPU limit is active, ESXi de-schedules the VM even if CPU resources are available.
  • Host-local scheduler — how all the things above get calculated and enforced.
    • CPU and memory scheduler – computes resource entitlement for powered-on virtual machines.
    • Admission control (related to HA) determines whether VMs can be powered on.
  • Without DRS….more contention and more complexity.
    • We have to do so many things manually….we really do NOT want to be here.
  • Life with DRS…it’s about clusters and resource pools.
    • Cluster = one big host. The local ESXi scheduler is still responsible for allocating resources to its child objects.
    • Resource Pools – easily model resources across multiple hosts to make possible multiple tiers/SLAs.
    • DO NOT use resource pools as a folder structure. Even with default settings they affect resource entitlement… and actually do so in an uncontrolled (seemingly “random”) way.
      • Personal note: I had a customer a while back where I had to undo this…..was causing weird performance issues.
    • Same share levels, same relativity to sibling resource pools.
    • Never create a VM outside of and next to a resource pool at the same level (the VM could get the same level of resources as the entire resource pool). A VM gets shares based on its virtual hardware configuration, which could be much higher than the resource pool’s.
      • Internally, a resource pool is treated as a 4-vCPU, 16 GB RAM VM for default shares.
    • If using Resource Pools, use them entirely….not halfway.
    • Resource Pool Shares
      • Share levels are easy but imprecise in showing priority… call it a guideline.
      • Cluster resources get divided between resource pools before being divided among their children.
  • Resource Pool Scenarios — how to estimate the right # of shares per resource pool in 4 steps…
    • Match defined performance SLAs to resource pools.
      • Ex. 70% Production, 20% Test, 10% Development.
    • Try to build a shares-per-VM model, then add the values together and apply the total to the resource pool.
      • Prod: 70 shares, Test: 20 shares, Dev: 10 shares.
    • Based on number of vCPUs, set the shares on the resource pools….this one could backfire.
      • Prod: 10 vCPUs – 700 shares
      • Test: 5 vCPUs – 100 shares
      • Dev: 20 vCPUs – 200 Shares
    • Introduce a scheduled task which sets the shares per resource pool, based on the number of VMs/vCPUs they contain (a rough sketch of such a script follows these notes).
      • Ex. PowerShell script which runs daily and takes corrective actions.
      • See Denneman’s blog for a script like this.
  • Reservations on a Resource Pool (RP) (a small PowerCLI example follows these notes)
    • Dynamic Entitlement is key, allocation differs based on utilization.
    • Refundable and shared when not needed.
    • Useful to grant resources which are required by the child VMs of the resource pool.
    • Can set VM-level reservations on a VM inside an RP — useful for Java VMs, SAP, etc. etc.
    • Overhead – you can’t control the memory overhead ESXi reserves for a VM; it depends on the size of the virtual machine.
      • VMkernel reserves memory to run any VM.
      • Non-shareable by design.
      • Include this during consolidation calculations and RP-level reservation sizing.
      • Overhead reservation is used when calculating HA slot sizes.
      • With vSphere 5 VM overhead is greatly reduced — small enough we can ignore it!!
    • Expandable Reservations
      • Used for reservations on child objects in the pool and for virtual machine memory overhead reservations; prevents a “configured contention” situation.
      • The search for unreserved resources is vertical, not horizontal – RPs never share resources at the same level.
  • Limits on Resource Pools — use with care.
    • Boundaries for resources; they can’t be expanded.
  • DRS Affinity Rules – constrain DRS’s automated vMotion decisions (a couple of PowerCLI examples follow these notes).
    • prevent/enforce running a pair of VMs on a single host.
    • prevent/enforce running a group of VMs on a group of hosts.
    • “Must run on” rules can’t be violated, even while HA is in progress.
    • “Should run on” rules can be violated if an HA event is in progress or if DRS/DPM would otherwise lose proper functioning.
    • Mandatory rules remain active even when DRS is disabled – big deal.
  • Distributed Power Management (DPM)
    • Part of DRS and the same scheduler — no DPM without DRS.
    • Same affinity rules.
    • Based on resource entitlement, DPM sets hosts to “standby” or wakes them up.
    • Uses IPMI, iLO or “Wake on LAN”.
    • Scheduled tasks can change the cluster’s power/DPM settings… I really like this: keep more capacity online during business hours and less during off hours (a sketch follows these notes).
    • DPM is very conservative and will do its best not to impact performance.
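
A few PowerCLI sketches to go with the notes above. None of this is from the session deck or the book; the VM names, pool names, and values are my own assumptions, so test before trusting any of it. First, the VM-level knobs (shares, a memory reservation, and a memory limit), assuming an existing Connect-VIServer session:

    # Shares: relative priority against siblings (High/Normal/Low map to 4/2/1 per unit)
    Get-VM "prod-db01" | Get-VMResourceConfiguration |
        Set-VMResourceConfiguration -CpuSharesLevel High -MemSharesLevel High

    # Memory reservation: guaranteed, and "non-refundable" once the guest touches the pages
    Get-VM "prod-db01" | Get-VMResourceConfiguration |
        Set-VMResourceConfiguration -MemReservationMB 4096

    # Memory limit: the guest still sizes itself to its configured memory, so anything
    # above the limit ends up coming from the vswap file. Shrinking configured memory
    # is usually the better fix.
    Get-VM "test-app01" | Get-VMResourceConfiguration |
        Set-VMResourceConfiguration -MemLimitMB 2048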
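
Next, a rough take on the scheduled-task idea from the resource pool scenarios: recalculate custom shares per pool from the vCPU count of its VMs, weighted by SLA tier. This is not Denneman's script (see his blog for the real one); the pool names and weights are assumptions.

    # Shares-per-vCPU weights per SLA tier (assumed pool names)
    $weights = @{ "Production" = 70; "Test" = 20; "Development" = 10 }

    foreach ($poolName in $weights.Keys) {
        $pool  = Get-ResourcePool -Name $poolName
        $vCpus = (Get-VM -Location $pool | Measure-Object -Property NumCpu -Sum).Sum
        if (-not $vCpus) { $vCpus = 0 }

        # Custom numeric shares override the Low/Normal/High levels
        $shares = $vCpus * $weights[$poolName]
        Set-ResourcePool -ResourcePool $pool `
            -CpuSharesLevel Custom -NumCpuShares $shares `
            -MemSharesLevel Custom -NumMemShares $shares | Out-Null

        Write-Host ("{0}: {1} vCPUs -> {2} shares" -f $poolName, $vCpus, $shares)
    }

Run it daily (or after every provisioning wave) and the per-pool shares stop drifting as VM counts change.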
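
For reservations on a pool itself, here is roughly what a pool with a memory reservation and an expandable reservation looks like; the cluster/pool names and sizes are made up, and remember to fold VM memory overhead into whatever number you pick:

    # Expandable lets the pool borrow unreserved capacity from its parent;
    # the search is vertical only, so sibling pools never share reservations.
    New-ResourcePool -Location (Get-Cluster "Cluster01") -Name "ProdPool" `
        -MemReservationMB 32768 -MemExpandableReservation $true `
        -CpuSharesLevel High -MemSharesLevel High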
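
The VM-to-VM affinity rules are easy to script as well. This only covers the keep-together/keep-apart pair rules (the VM-to-host group "run on" rules are configured separately), and the names are again made up:

    $cluster = Get-Cluster "Cluster01"

    # Keep a correlated pair on the same host...
    New-DrsRule -Cluster $cluster -Name "web-and-cache-together" `
        -KeepTogether $true -VM (Get-VM "web01", "cache01")

    # ...and keep redundant nodes apart
    New-DrsRule -Cluster $cluster -Name "separate-dc-nodes" `
        -KeepTogether $false -VM (Get-VM "dc01", "dc02")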
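
And finally the DPM scheduled-task trick: enable DPM for off hours, disable it for business hours. As far as I know Set-Cluster doesn't expose DPM directly, so this sketch drops down to the API object; the cluster name is an assumption and this deserves a lab test before it goes anywhere near production.

    $cluster = Get-Cluster "Cluster01"

    $spec = New-Object VMware.Vim.ClusterConfigSpecEx
    $spec.DpmConfig = New-Object VMware.Vim.ClusterDpmConfigInfo
    $spec.DpmConfig.Enabled = $true          # use $false in the business-hours task
    $spec.DpmConfig.DefaultDpmBehavior = [VMware.Vim.DpmBehavior]::automated

    # $true = merge this partial spec into the existing cluster configuration
    $cluster.ExtensionData.ReconfigureComputeResource_Task($spec, $true) | Out-Null

Wrap each variant in a script and let Windows Task Scheduler flip the setting on a schedule.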

Life with DRS is a lot easier and certain environments aren’t possible without it.
