In high-performance SQL Server environments, how you "slice" your CPU resources is just as important as how many cores you have. We recently tackled a case where a customer was plagued by high SOS_SCHEDULER_YIELD and CXPACKET waits. The solution wasn't adding more power... it was restoring balance.
The Symptom: Heavy Waits and Uneven Load
The customer reported a sluggish system where CPU waits weren't just high; they were inconsistent. Upon investigation, we noticed that some NUMA nodes were working six times harder than others. This imbalance was causing a massive surge in:
- SOS_SCHEDULER_YIELD: Signalling that tasks were using up their full scheduler quantum (4 ms), voluntarily yielding the CPU, and then waiting in the runnable queue for another turn.
- CXPACKET: Indicating that parallel threads were stuck waiting for their "unbalanced" counterparts to catch up.
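To see where these two wait types stand on an instance, a minimal sketch against sys.dm_os_wait_stats is shown here. It assumes the caller has permission to view server state, and the numbers are cumulative since the last restart (or since the wait stats were last cleared), so they are most useful when compared between two points in time.

```sql
-- Cumulative totals for the two wait types discussed above.
SELECT
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms,   -- time spent runnable, waiting to get back on a CPU
    wait_time_ms - signal_wait_time_ms AS resource_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN (N'SOS_SCHEDULER_YIELD', N'CXPACKET')
ORDER BY wait_time_ms DESC;
```

For SOS_SCHEDULER_YIELD, most of the wait time is typically signal wait, since the task is only waiting for its next turn on the scheduler.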
The Investigation: The "Imperfect" VM Slice
The physical host was a powerhouse with 64 CPUs (arranged in a 2 x 32 layout). However, the Virtual Machine (VM) was allocated 60 CPUs, presumably to leave 4 cores for the hypervisor.
While reserving cores for the hypervisor seemed logical, it created a "math problem" for SQL Server’s Soft-NUMA:
- The VM presented 4 nodes of 15 CPUs.
- SQL Server’s automatic Soft-NUMA then split each 15-CPU node in two. Because 15 is odd, the result was 8 nodes with an uneven split: alternating between 8 and 7 CPUs per node.
This slight asymmetry, with nodes of different sizes, created "scheduling friction": SQLOS couldn't distribute work evenly across the nodes, which led to the massive disparity in wait times between them.
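One way to check for this kind of layout is to count schedulers per node. The query below is a sketch rather than the exact diagnostic used in this case; it relies only on documented columns of sys.dm_os_schedulers, and the task counts are a point-in-time snapshot, so sample it several times under load.

```sql
-- One row per node: how many schedulers it owns and how busy they are right now.
SELECT
    parent_node_id,
    COUNT(*)                  AS scheduler_count,
    SUM(current_tasks_count)  AS current_tasks,
    SUM(runnable_tasks_count) AS runnable_tasks   -- tasks queued for CPU time
FROM sys.dm_os_schedulers
WHERE status = N'VISIBLE ONLINE'                  -- skips hidden and DAC schedulers
GROUP BY parent_node_id
ORDER BY parent_node_id;
```

In a configuration like the one described, you would expect eight rows with scheduler_count alternating between 8 and 7.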

The Fix: Simplicity Over Complexity
Since the underlying VM configuration was already segmented into 4 nodes, we decided that the additional layer of Soft-NUMA was doing more harm than good. We recommended a return to a simpler, symmetrical topology.
We disabled Soft-NUMA using the following command:
ALTER SERVER CONFIGURATION SET SOFTNUMA OFF;
GO
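The setting only takes effect once the Database Engine service has been restarted. Afterwards, the active Soft-NUMA mode can be confirmed with a query along these lines, assuming SQL Server 2016 or later, where these columns exist in sys.dm_os_sys_info:

```sql
-- Reports the Soft-NUMA mode the running instance started with.
SELECT
    softnuma_configuration,        -- 0 = off, 1 = automatic, 2 = manual
    softnuma_configuration_desc,   -- OFF, ON, or MANUAL
    cpu_count,
    scheduler_count
FROM sys.dm_os_sys_info;
```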
The Results: Stability in Symmetry
After restarting the service to apply the change, the 60 CPUs were reorganized into 4 clean, identical nodes of 15 CPUs each. The impact was immediate:
- SOS_SCHEDULER_YIELD: Dropped by over 40%.
- CXPACKET: Decreased by approximately 12%.
- Load Distribution: The "6x difference" between nodes vanished, replaced by a balanced, even distribution of work.
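To confirm a layout like this on your own instance, sys.dm_os_nodes gives a per-node view. This is a minimal sketch; the filter on node_state_desc is only there to hide the node reserved for the dedicated admin connection (DAC).

```sql
-- One row per SQLOS node, excluding the DAC node.
SELECT
    node_id,
    memory_node_id,
    node_state_desc,
    online_scheduler_count,
    active_worker_count,
    avg_load_balance
FROM sys.dm_os_nodes
WHERE node_state_desc NOT LIKE N'%DAC%'
ORDER BY node_id;
```

With Soft-NUMA off in the scenario above, this should return four rows, each with an online_scheduler_count of 15.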
Key Takeaway
More features aren't always better. Soft-NUMA is a powerful tool for modern high-core CPUs, but if your VM's vCPU count doesn't divide cleanly into symmetrical nodes, it can create "ghost" bottlenecks. When in doubt, check your node alignment. Symmetry is often the key to balanced, predictable performance.