CIM1331 Session Notes – Design and Deploy Optimized SQL Server on vSphere – VMware Partner Exchange

Wendy He presenting, worked with MS products for 15 years or so.

Summary = very good content, tendency to read from the slides but hard to argue with that given it was good stuff on the slides (getting the slide deck would be recommended).

More after the jump…

  • Tier 1 Workloads on vSphere – SQL Server of course.
    • Customers asking if vSphere can handle it — basically no concerns with current vSphere.
    • Scale Up – lots of 1-2 vCPU VMs you get 92% of native performance.
    • Scale Out – throughput increases linearly as add VMs.
      • Once over-committed get 1.5x improvement still (pretty cool….using up the unused CPU cycles).
    • EMC Study – 24 vCPU, 64 GB Memory VM on VNX 5700
      • Shows real impact of FAST VP and also FAST Cache – search for h8987 on Google to find out.
      • Started out around 7k IOPs and 30,00 TPM….by end of FAST VP and FAST Cache those more than tripled.
  • 3 Types of Workloads – different response to peak contention vs. sustained activity.
    • OLTP – lots of small queries,
    • Batch/ETL – lots of parallel threads
    • DSS
  • VM Resource Allocation
    • vCPU – start at minimum and grow when needed, use HotAdd
      • Avoid pCPU overcommit w/Tier 1 prod workloads
      • If do overcommit, wathc %RDY, %MLMTD & %CSTP
      • Watch NUMA node size of course.
    • vMem – right-size of course.
      • Don’t overcommit.
        • If you do, watch SWAP /MB: r/s w/s MCTLSZ
        • Basically we’re watching balloon driver stuff.
        • Set memory reservation to match provisioned memory.
      • vSphere 5 gives memory compression….this matters.
    • Storage
      • Gotta map out the I/O requirements.
      • MPIO please.
  • NUMA in vSphere 5
    • Same vNUMA stuff as in other sessions.
    • vNUMA is on by default for 8+ vCPU multi-core VMs.
      • I guess smaller than that we’re counting on hexa-core processors so it doesn’t matter.
      • For smaller VM’s can enable with numa,vcpu.min=4 in the .vmx file.
    • vNUMA is off by default and not supported if have CPU hot-add turned on.
      • Hmmm….wonder why this is….does contradict earlier recommendation….can’t do Hot-Add and vNUMA (basically adding resources on the fly vs. higher performance).
    • SQL Server is automatically NUMA aware.
  • Storage
    • Partition alignment — oh yes….stating this is the most important performance impact area.
    • When using VMFS, uses eagerzeroedthick (specifically for database and log files)
      • Screenshot for this shows it as “Thick Provision Eager Zeroed” in vSphere 5 GUI.
      • In vSphere 4 GUI you need to check the “Support clustering features such as Fault Tolerance” option.
    • Use multiple vSCSI adapters and evenly distribute target devices.
    • Shared storage of course….lots of spindles.
  • VMFS vs. RDM
    • We’re basically at performance parity — no real impact on IOPs.
    • RDM does enforce 1:1 mapping with LUNS…that could be helpful.
    • Max LUN limit of 256 is a deciding factor.
    • Need RDM for SQL Server failover clustering and/or storage vendor integration tools.
    • RDM also useful for migrating physical to virtual when the physical server already has its LUN on the SAN.
    • Database Layout
      • For Tier 1 maintain 1:1 mapping between VM and LUN (VMFS or RDM)
      • Same guidelines as physical – separate LUNs for data and redo/log
      • Must be joint layout between DBA, VMware, and Storage admins.
    • Some audience discussion – still benefit for RDM as greater than 2 TB individual disk (GPT of course)….vmdk’s on VMFS5 are still 2 TB – 512 bytes.
      • Also if have RDM in virtual mode have to be careful about Storage vMotion — prob keep in physical mode for safety.
  • SQL Server In-Guest Memory Tuning Practices
    • Use large pages in the guest – start SQL server w/Trace flag -T834
    • Use Lock Pages in Memory Privilege for SQL Server Service account — keeps OS from randomly paging SQL memory out.
    • Set “Max Server Memory” and “Min Server Memory”
      • SQL Max Memory = VMMem – Threadstack – OS Mem – VM Overhead
      • Threadstack = NumOfSQLThreads & ThreadStackSize
      • ThreadStackSize = look it up…matters a bit.
  • In-Guest Storage Best Practices
    • Same as regular SQL Server stuff listed here – http://technet.microsoft.com/en-us/library/cc966534.aspx
    • Plan for performance of course.
    • More on partition on alignment – could hit performance up to 50%.
      • Really “sector alignment”, Windows 2008 takes care of this for.
    • Pre-allocate auto-grow – if auto-grow happens everything SQL stops during that growth.
    • Use multiple data files for data and tempdb – start with 1 file per CPU core.
      • Affects how many threads SQL Serve starts up.
    • tempdb, data, log go on separate LUNs — different I/O characteristics for each. Also from a monitoring perspective helps pinpoint where a bottleneck is.
    • Routine maintenance please – index rebuilding/reorg, dbcc checkdb
  • In-guest SQL Server Monitoring
    • Perfmon – SQL server specific counters: SQLServer:**
    • VMware tools adds some specific counters to perfmon that come from the ESX host — 2 of them.
  • Host level Monitoring – normal stuff…client GUI, esxtop/resxtop.
  • Key Metrics to Monitor for ESX
    • CPU – %USED %RDY %SYS
    • Memory – Swapin, Swapout, MCTLSZ
    • Disk – READs/s WRITEs/s DAVG/cmd KAVG/cmd GAVG/cmd
    • MbRX/s MbTX/s PKTRX/s PKTTX/s %DRPRX %DRPTX
    • Audience question – what are best practices? I think this is vCenter Operations….hard to know what’s unique to the environment and not. Not sure I’d trust metrics from VMware frankly….it’s more about my environment.
  • Scale-out vs. Scale-out – traditional approach is Scale-Up
    • Scale-Up – multiple DB’s or instances per VM, fewer ESX VMs, Single point of failure, less resource isolation, Larger VM, SMP overheads, OS bottleneck especially for 32 bit.
      • Does save on licensing.
    • Scale-out – single DB per VM, Better SQL and workload isolation – can keep DSS and OLTP separate, Easier change management, DRS is more effective, faster migration/vMotion
      • Downtime windows = not having to coordinate maintenance windows with many departments/customers is a big deal.
      • Drawback = more licensing…..but if do Datacenter SQL Server license might be able to make the licensing costs work.
      • Also lets you do more with guaranteeing resources to DBA’s nervous about migrating from physical to virtual.
  • Licensing consolidation via faster processors and licensing underlying sockets when virtualized.
    • Showing a case study with 70% license savings.
    • Two likening models – Server/CAL or Per-Processor
      • If customer already owns CALs, Server-based licensing is usually cheaper.
      • Per-Processor licensing can be expensive
      • This is for SQL Server 2008 – 2012 will be different.
    • vMotion Impact
      • If running Standard, can vMotion every 90 days. You may need additional license to cover you for vMotion.
      • If Enterprise or Datacenter, unlimited mobility.
    • Consolidation Rules
      • Enterprise Edition – 4 VMs per license without SA, unlimited with SA.
      • Datacenter Edition – Unlimited VMs per license (per socket license)

Business-Level Approach – we should start by asking about availability, speed, etc. etc.

  • Do you need app availability awareness?
  • VMware HA with DB Mirroring can give higher than physical availability.
    • RTO in few seconds – very cool.
    • Protects against HW/SW failures and DB corruption.
    • VMs that die for whatever reason drop out and then can rejoin.
  • MS Failover Clustering
    • Shared disk with multiple instances of SQL server binary.
    • Can also pair this up with VMware HA.
    • See VMware KB 1037959
    • Seamless integration…very simple.
  • Big Matrix for everything Microsoft Clustering in VMware — see KB 1037957
    • Look this up….it matters but way too much to type.
    • Do need RDM disk for any shared volumes (how it’s always been with MSCS).
  • Rolling Patch Upgrades
    • OS and SQL Server rolling patch upgrade using standby VM
    • Can do this with very little downtime.
    • Stop SQL connections, shut down instance, remove vmdk’s, attach to another server, startup SQL, etc. etc.
    • App keeps trying to reconnect so if you’re fast enough the app doesn’t go down.
    • I’d recommend scripting this….but manual might work in a pinch.
  • SRM with Database Mirroring – there’s integration here.
    • Can also work with vSphere replication if you want to.

One thought on “CIM1331 Session Notes – Design and Deploy Optimized SQL Server on vSphere – VMware Partner Exchange

  1. Pingback: Think Meta » Session Notes Compendium – VMware Partner Exchange

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s