SQL: Session CPU wait count

Symptoms : The wait time for the SQL session CPU is high.

Impact : Medium

This metric represents the average number of queries that SQL reports that are in a wait state for CPU activity. It is an indication of degradation in server performance.

Expected behavior :

There is no standard metric for wait count. You should set your own thresholds after reviewing past behavior.

Threshold settings:

Metrics are accumulated over a specified time span.  Default settings are:

CRITICAL :  300
MEDIUM :  200
LOW : 100
Time span : more than 5 minutes

Possible causes

Poor coding   Priority : Medium
Best practices for coding should be strictly enforced, to ensure that all queries optimally consume and then release resources.
Recommended action :
Using AimBetter monitor, identify any queries that have remained open for an inordinate length of time. Recover the query plan, and see whether any improvements are available.

Index missing or corrupted  Priority : Medium
Missing or corrupted indexes can lead to queries working inefficiently and consuming resources.
Recommended action :
See our recommendations regarding memory optimization here.

Statistics not up to date Priority : Medium
This will cause the SQL engine choose the wrong index and that will cause eventually high CPU usage.
Recommended action :
See our recommendations regarding memory optimization here.

Background

There may be conditions in which SQL is showing very high levels of wait times for CPU, while CPU activity has not risen significantly or at all.  There are three different kinds of CPU wait to be accounted for:

SOS_SCHEDULER_YIELD and CPU (Signal Wait) which both measure when a session is ready but waiting for a CPU, and CPU (Consumed) which reports actual time spent scheduled on a CPU running instructions or waiting for main memory reads and writes (because thread waits are so short they don’t yield when reading and writing memory ).

SQL Server’s thread schedulers allocate a uniform timeslice of 4 milliseconds (referred to as a quantum). Threads requiring less than a quantum will relinquish the CPU and, if the system is not busy, will almost immediately reacquire it. Therefore a thread yielding and another task being assigned to a thread is a common occurrence, and a large count of SOS_SCHEDULER_YIELD waits is not significant in itself.

Since long-running queries will hit the 4 millisecond thread execution limit often, they contribute greatly to the occurrence of high SOS_SCHEDULER_YIELD wait counts. Remember that some tasks simply require more time than others – as long as everything is optimized correctly there is nothing inherently wrong with having long-running queries.

Using SOS_SCHEDULER_YIELD waits alone illustrate the point that many performance conditions cannot be diagnosed on the basis of single measurements from individual tools.  In a situation where there are long wait states as this metric indicates, it would be the combination of high CPU usage from the performance monitor and high signal waits that would be a primary clue. This is why using AimBetter’s alert function is the best mechanism, since all of the relevant factors are being taken into account.

A THREADPOOL wait should rarely be seen on a healthy system. A THREADPOOL wait occurs when there a no threads in the thread pool for assignment to a runnable task. This may occur if the configured maximum number of worker threads is too small for the workload. Before adjusting the configuration value, however, you should use the AimBetter history to find out whether this is a common condition or has only occurred during a rare period of exceptionally high usage. The maintenance of threads comes at a cost, and the thread maximum should not be adjusted for conditions that occur rarely.

CXPACKET waits can occur when SQL Server is trying to synchronize multiple threads for a query with a parallel execution plan. The response to CXPACKET waits depends on the query itself.

If most of the queries are waiting on CXPACKET, review the setting for “Max degree of parallelism” and “Cost degree of parallelism” and check if they are set as per best practice recommendations. Look for AimBetter alerts relating to SQL Exceptions, especially about any flaws in the query that might cause problems, such as inappropriate data types. If the query seems OK and the indexes seem suitable, it may be possible to reduce CXPACKET waits by adjusting the degree of parallelism.

Conclusion

Wait stats relating to CPU usage rarely provide a direct indicator of a single problem. At best, CPU wait stats can call attention to the range of possible causes, but only further investigation of the queries involved and the server workload can determine whether problems actually exist and, if so, what action is to be taken.