Symptoms : Average wait states for running queries is high, indicating poor performance.

Impact : Medium

AimBetter monitor is reporting on the average length of waits over the indicated time span.  Waits (which are pauses in the execution of a query) are not abnormal, since they are part of the normal SQL Server behavior but long wait times are generally indications of shortage or unavailability of resources.

Default settings are:

CRITICAL : 
MEDIUM :  100ms
LOW :
Time span : more than 15 minutes

Expected behavior : There is no standard metric for wait states, but an increase indicates problems with a key resource, such as CPU, memory, storage or network bandwidth.

Possible causes

Queries are waiting for data that depends on locked resources  Priority : High
Another transaction has locked and prevented this transaction from being committed
Recommended action : 
Where possible, use “WITH (NOLOCK)” in reading transactions. Re-prioritize processes according to needs.

All network bandwidth is consumed  Priority : High
Recommended action : 
Check what traffic occupies the bandwidth the most. Where possible, reschedule non-essential activity (e.g virus scans, backup etc.) to hours of lowest SQL demand.

Storage space is not available  Priority : High
Recommended action : 

Check available space – for more information, see our post about host disk space here.

Insufficient free memory  Priority : High
Insufficient RAM can force excessive page file swapping, which reduces overall speed to disk-access times and increases wait times dramatically.
Recommended action : 
Using AimBetter Observer, identify the processes that are occupying most RAM. If possible, achieve optimal balance between OS RAM and page file allocation. If no other possibility, add RAM.

Insufficient CPU resources Priority : High:

For more information, see our post on CPU utilization here and on CPU Process Load here.

Incorrect buffer pool allocation  Priority : Medium
As free memory is consumed, pages residing in the memory are swapped out to disk to be replaced by needed data. Wait times will increase substantially.
Recommended action :
Correct the buffer pool allocation, or add memory. See our explanation of page life expectancy here and buffer cache hit ratio here.

Locks or blocks due to incorrect code   Priority : Medium
Queries that run with a large number of iterations, or are susceptible to blocks and deadlocks, will hold onto memory and force page swapping. See our explanation of coding problems causing deadlocks here.
Recommended action :
Optimize code. Possible actions :

  • Redesign program to maximize use of indexed data.
  • Redesign table structures to match the requirements of the programs by building indexes.
  • Make use of temporary tables.

SQL not releasing memory to the operating system  Priority : Medium
SQL server is set to wrong maximum level.
Recommended action :
Optimize memory allocation.

Missing or corrupt indexes  Priority : Medium
Missing indexes will cause extensive data searching from disk, resulting in page swapping.
Recommended action :
See our explanation for missing/corrupt indexes here.

Background

Each query runs in a thread. A thread that is actively working is running until it needs to wait for a resource (or has consumed the allowed quantum of CPU time). It then is moved to a list of threads that are suspended. If the resource that a suspended thread was waiting for becomes available (or a CPU quantum becomes available), it becomes runnable and is put on the bottom of the runnable queue. The runnable thread at the top of the queue then resumes running whenever a running thread is suspended.

For many reasons, query execution threads may have to wait. In normal operations, waits are quite routine and do not indicate any problem.  Attention is needed if there is a substantial increase in average wait times suddenly.

There are a many kinds of waits and they all indicate different resources being waited for. For example, a PAGEIOLATCH_EX wait means a thread is waiting for a data page to be read into the buffer pool from disk. An LCK_M_X wait means a thread is waiting to be granted an exclusive lock on something.

Waits always occur – it’s the way the scheduling system works to optimize execution. SQL Server keeps track of the time that elapses between leaving and resuming a running state.  This is called the wait time. The time spent on the runnable queue is called the signal wait time and is a measure of how long the thread needed to wait for the CPU after the required resource became available. The resource wait time – called the suspended wait time –  is calculated by subtracting the signal wait time from the overall wait time. The values of these metrics are important gauges of performance.

The most common causes of waits, that typically account for more than 80% of total waits, are:

CXPACKET – when parallel query plans are trying to synchronize the query processor exchange iterator.
PAGEIOLATCH_XX – when a task is waiting on a latch for a buffer that is in an I/O request. Long waits may indicate problems with the disk subsystem.
ASYNC_NETWORK_IO – occurs on network writes when the task is blocked behind the network. Long waits may indicate problems with the network.
WRITELOG – while waiting for a log flush to complete.
BROKER_RECEIVE_WAITFOR – thread has issued a request to a Service Broker queue and is waiting for a message to enter the queue
MSQL_XP – a task is waiting for an extended stored procedure to end.
OLEDB – thread is waiting for data from an OLE DB Provider
BACKUPIO – a backup thread is waiting for a read from or write to a buffer of a backup file to complete. Long waits may indicate problems with the disk subsystem or network.
LCK_M_XX – when a task is waiting to acquire an Intent Exclusive (IX) lock.