VSP2884 What’s New in Performance for VMware vSphere 5.0

Summary at the top: incredibly dense session on vSphere 5 performance… fantastic session.

Getting right into it… I love that in a session…

  • Great performance maximums in vSphere 5 – 32 vCPUs per VM, 1,000,000+ IOPS per host, etc.
    • 92-97% of native performance up to 32 vCPUs
    • CPU Scheduler Improvements — performance increases of as much as 30%
    • vNUMA – allows NUMA-aware guest OSes to use underlying NUMA hardware more efficiently.
  • vCenter — 75% increase in management ops/minute.
  • HA cluster – configures 9x faster than 4.1 for a 32-host ESX cluster; 60% more VMs fail over in the same time.
  • Network I/O Control (NIOC)
    • Traffic management for Distributed vSwitch
    • Enhancements: user-defined network resource pools, a new host-based replication traffic type, and QoS tagging.
    • Example without NIOC showing how vMotion takes up bandwidth and impacts NFS, VM, & FT traffic.
    • Example with NIOC showing vMotion having no impact.
    • Can add QoS tags (e.g. 802.1p) so priorities carry end to end through the network environment (see the sketch below).
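
To make the NIOC bit concrete, here's a rough pyVmomi sketch that enables network resource management on a Distributed vSwitch and dumps its resource pools (shares, limit, and 802.1p priority tag per traffic type). The vCenter address, credentials, and the dvSwitch name "dvSwitch01" are placeholders, and the EnableNetworkResourceManagement / networkResourcePool names are written from my memory of the vSphere SDK, so treat the exact calls as assumptions.

```python
# Hedged sketch: enable NIOC on a vSphere Distributed Switch and list its
# network resource pools. Names and credentials below are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only: skip certificate checks
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
content = si.RetrieveContent()

# Find the distributed switch by name (assumed name: dvSwitch01).
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.DistributedVirtualSwitch], True)
dvs = next(s for s in view.view if s.name == "dvSwitch01")

# Turn on network resource management (NIOC) for the switch.
dvs.EnableNetworkResourceManagement(enable=True)

# Each traffic type (VM, vMotion, NFS, FT, the new host-based replication type,
# plus any user-defined pools) shows up as a resource pool with shares, a limit,
# and an optional 802.1p priority tag.
for pool in dvs.networkResourcePool:
    alloc = pool.allocationInfo
    print("%-25s shares=%-5s limit=%-6s priorityTag=%s"
          % (pool.name, alloc.shares.shares, alloc.limit, alloc.priorityTag))

Disconnect(si)
```
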
  • SplitRXMode
    • greatly reduces packet loss
    • new way of doing network packet receive processing in the vmkernel
      • splitting the cost of receive packet processing to multiple contexts
    • Before, with 24 VMs, you could see up to 40% packet loss; with SplitRxMode it's now less than 10%.
      • Enabled on a per-vNIC basis (sketch below).
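
Here's what that per-vNIC enablement looks like in practice: a minimal pyVmomi sketch that sets the ethernetX.emuRxMode advanced option ("1" turns SplitRx on) on a VM's first vNIC. The VM name, vCenter details, and the choice of ethernet0 are placeholders; as I recall it, the setting takes effect after the VM is power-cycled or the vNIC is reconnected.

```python
# Hedged sketch: turn on SplitRx mode for a VM's first vNIC by adding the
# ethernet0.emuRxMode = "1" advanced setting. Placeholders: vCenter host,
# credentials, and the VM name "web-vm-01".
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "web-vm-01")

spec = vim.vm.ConfigSpec(extraConfig=[
    vim.option.OptionValue(key="ethernet0.emuRxMode", value="1"),  # "0" disables
])
WaitForTask(vm.ReconfigVM_Task(spec))
print("SplitRx requested for ethernet0 on", vm.name)

Disconnect(si)
```
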
  • Multicast improvements for throughput and efficiency.
  • TCP/IP Stack Improvements — higher throughput with small messages & better IOPS scaling for software iSCSI
  • NetFlow — supported in vSphere 5.
    • Performance – monitor application performance over time
    • Capacity Planning
    • Visibility into Virtual Infrastructure Traffic
  • vMotion — 25-30% improvement over 4.1, another 50% better using 2 NICs (multi-NIC vMotion)
  • Storage vMotion — Live Migration & I/O Mirroring
    • Live Migration — uses I/O mirroring rather than dirty block tracking (sketch below).
      • Zero downtime maintenance.
      • Manual and automatic storage load balancing.
      • Live storage migration
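
Since the mechanics changed, here's a minimal pyVmomi sketch of kicking off a live Storage vMotion by relocating a powered-on VM's disks to another datastore with RelocateVM_Task. The VM name, datastore name, and connection details are placeholders.

```python
# Hedged sketch: live-migrate a VM's storage (Storage vMotion) to another
# datastore. Placeholders: vCenter host, credentials, VM and datastore names.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
content = si.RetrieveContent()

def find(vimtype, name):
    """Return the first inventory object of the given type with a matching name."""
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    return next(obj for obj in view.view if obj.name == name)

vm = find(vim.VirtualMachine, "web-vm-01")
target_ds = find(vim.Datastore, "ssd-datastore-02")

# A relocate that only changes the datastore is a Storage vMotion; in vSphere 5
# writes are mirrored to both copies while the move is in flight, so there is no
# repeated dirty-block copy pass.
spec = vim.vm.RelocateSpec(datastore=target_ds)
WaitForTask(vm.RelocateVM_Task(spec))
print("Storage vMotion of", vm.name, "to", target_ds.name, "complete")

Disconnect(si)
```
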
  • Memory Management
    • Last year, 4.1 added wide-NUMA support and memory compression.
    • This year, 5.0 offers:
      • vNUMA – exposes the underlying NUMA topology to the guest so it can use the fastest memory (local to the processor the VM is running on).
        • On by default for larger VMs — 8 vCPUs or more.
        • Not needed for smaller VMs as ESX scheduler already keeps smaller VMs in the same NUMA domain.
      • Host Cache – uses SSD storage as a cache location for VM memory pages swapped out under memory pressure.
    • New hierarchy in VMware’s memory overcommit technology (see the sketch below):
      • Transparent Page Sharing
      • Ballooning
      • Memory Compression
      • Host Cache (using SSDs)
        • Roughly 30% improvement in performance when used with other vSphere memory technologies.
      • The VM’s vswap file (as a last resort)
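
A handy way to see which of those reclamation tiers a host is actually leaning on is each VM's quickStats. Here's a rough pyVmomi sketch; connection details are placeholders, and I'm going from memory that ballooned/swapped memory report in MB and compressed memory in KB.

```python
# Hedged sketch: print per-VM memory reclamation counters so you can see whether
# ballooning, compression, or swapping is kicking in. Placeholders: vCenter
# host and credentials.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
for vm in view.view:
    qs = vm.summary.quickStats
    print("%-30s ballooned=%s MB  compressed=%s KB  swapped=%s MB"
          % (vm.name, qs.balloonedMemory, qs.compressedMemory, qs.swappedMemory))

Disconnect(si)
```
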
  • Storage Improvements
    • NFS support for Storage I/O Control
    • Provides:
      • Increased Storage performance
      • Limits performance fluctuations during periods of I/O congestion
      • Increases throughput & decreases storage latency.
    • Datastore clusters (see the sketch after this storage section)
      • Groups multiple datastores into a single unit.
      • VMs can be provisioned into the datastore cluster rather than into an individual datastore.
    • Storage DRS
      • Requires datastore clusters; load balances across the datastores within a cluster based on capacity and I/O.
      • What it does….
        1. Initial placement of VMs and VMDKs based on available space and I/O capacity.
        2. Load balancing between datastores in a datastore cluster via Storage vMotion based on storage space utilization.
        3. Load balancing via Storage vMotion based on I/O metrics, i.e. latency.
      • Affinity/Anti-Affinity Rules for both VMs and VMDKs.
    • VAAI — array integration
      • New primitives: thin provisioning monitoring and dead space reclamation.
      • The 3 primitives from last year deliver the big performance gains… but these new ones are still really big deals for performance.
      • Up to 95% time reduction! 😉
      • Also saves CPU and memory resources on ESX host (so more ESX scalability).
      • Dead Space Reclamation — space a VM is no longer using (like after a Storage vMotion) is given back to the array via a standard SCSI command (UNMAP).
    • 1,000,000 IOPS from a single vSphere server against an 8-engine VMAX array with a boatload of drives.
    • 10 GigE FCoE is now on par with 8 Gig FC
      • 4 vSphere hosts with dual 10 GigE Intel adapters
      • VNX Array
      • 16 streams of FCoE traffic
      • 100% sequential, 100% read, 1 MB IO size
      • 10 Gigabytes/second.
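
As promised above, here's a small pyVmomi sketch for the datastore cluster piece: it walks the inventory for StoragePod objects (that's what a datastore cluster is in the API) and prints each cluster's capacity, free space, and member datastores. Connection details are placeholders, and the summary property names are written from my memory of the 5.x SDK.

```python
# Hedged sketch: list datastore clusters (vim.StoragePod) with capacity/free
# space and their member datastores. Placeholders: vCenter host and credentials.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.StoragePod], True)
for pod in view.view:
    s = pod.summary
    print("Datastore cluster %s: %.0f GB free of %.0f GB"
          % (s.name, s.freeSpace / 2**30, s.capacity / 2**30))
    for ds in pod.childEntity:  # a StoragePod is a folder of datastores
        print("   member datastore:", ds.name)

Disconnect(si)
```
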
  • Tier 1 Applications
    • HPC Performance
      • Need high CPU counts, lots of memory, NUMA awareness, etc.
