Symptoms : Backup failed or ran slowly

Impact : Critical

When normal operations are interrupted and data must be restored, the older the most recent backup is, the longer and more complicated the recovery will be.

Expected behavior :

A SQL full backup should execute successfully at least once per day. Transaction log and differential backups should execute regularly in the intervening periods, often enough to minimize the time a restore would take.
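
As a rough illustration of the once-per-day expectation, the Python sketch below flags a database whose most recent full backup is missing or too old. The function name, the 24-hour default and the way the finish time is obtained are assumptions made for illustration; in practice the timestamp would come from wherever your backup history is recorded.

```python
from datetime import datetime, timedelta
from typing import Optional


def full_backup_is_stale(last_full_finish: Optional[datetime],
                         now: datetime,
                         max_age: timedelta = timedelta(hours=24)) -> bool:
    """Return True when no full backup exists or the latest one is older than max_age."""
    if last_full_finish is None:
        return True  # a database that was never backed up counts as stale
    return now - last_full_finish > max_age
```

A monitoring job could call this for each database and raise an alert whenever it returns True.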

Possible causes of failure

Hardware failure  Priority : Critical
Most of the causes and failure conditions for backup hardware are the same as for other kinds of hardware.
Recommended action :
Track the failure back through the device chain, starting from the source server (through the network if using remote devices) to the backup hardware. Repair/replace any faulty components, or shift backups onto different resources.

Network failure  Priority : High
Backing up over a network increases overall efficiency by reducing the number of backup devices. However, it also introduces another point of failure into the backup process.
Recommended action :
Check and restore network connections on both server and backup device. Replace any failed components. If necessary, shift backups onto local hard-wired resources.
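
A first connectivity check can be scripted. The sketch below only tests whether a TCP connection to the backup target can be opened; it says nothing about throughput or permissions. The function name is hypothetical, and the host and port to test depend on how your backup device is reached.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Running the check from the source server and again from a second machine helps show which side of the link has failed.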

No available disk space  Priority : High
Space on the drive where backups are stored has run out. A common cause is database growth: the backup now needs more space than is available. Another cause is creating a separate new backup file for every backup, so that multiple copies accumulate on the same backup drive.
Recommended action :
Monitor available drive space, relative to database sizes. If historical versions are retained on the backup server, delete the older backup files. If necessary, add drive space.
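
The two checks above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the 20% headroom factor and the "*.bak" file pattern are illustrative choices, and the pruning helper only lists candidates rather than deleting anything.

```python
import shutil
from pathlib import Path
from typing import List


def has_room_for_backup(backup_dir: str, expected_backup_bytes: int,
                        headroom: float = 1.2) -> bool:
    """Return True if free space on backup_dir's drive covers the expected backup plus headroom."""
    free_bytes = shutil.disk_usage(backup_dir).free
    return free_bytes >= expected_backup_bytes * headroom


def prune_candidates(backup_dir: str, keep_latest: int,
                     pattern: str = "*.bak") -> List[Path]:
    """List backup files older than the newest keep_latest, oldest first. Deletes nothing."""
    files = sorted(Path(backup_dir).glob(pattern), key=lambda p: p.stat().st_mtime)
    if keep_latest <= 0:
        return files  # nothing is protected, so every file is a candidate
    return files[:-keep_latest]
```

Reviewing the returned list before removing any file keeps the retention decision with the administrator.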

Possible causes of slow backup

Large size of backup  Priority : High
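In the case of a full backup, a common cause is database growth. In the case of a differential backup, a long gap since the previous full backup increases the size and the time to complete; in the case of a transaction log backup, the same applies to a long gap since the previous log backup.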
Recommended action :
This requires analysis and immediate action, preferably while the backup is still running, so that all available statistics and metrics can be captured.
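
One metric worth capturing while the backup is still running is its throughput, from which a remaining-time estimate follows. The sketch below is illustrative only: the function name is hypothetical, and it assumes the throughput observed so far stays constant, which may not hold on a busy server.

```python
def estimate_remaining_seconds(total_bytes: int, bytes_done: int,
                               elapsed_seconds: float) -> float:
    """Estimate the seconds left, assuming the throughput so far stays constant."""
    if bytes_done <= 0 or elapsed_seconds <= 0:
        raise ValueError("need some progress and elapsed time to estimate throughput")
    throughput = bytes_done / elapsed_seconds  # bytes per second so far
    return max(total_bytes - bytes_done, 0) / throughput
```

For example, a backup that has written 25 of 100 units in 10 seconds is estimated to need another 30 seconds.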

Slow network  Priority : High
Recommended action :
When backing up across the network, contention and bottlenecks can arise at many points along the path. Review our posts on the Network Latency and Network Jitter metrics.

Background

Backups usually comprise at least two elements. The first is a point-in-time copy of the primary data, taken on a repeating cycle (daily, weekly or monthly). This is followed by backups of the data changed since that copy (differential backup) or of the transaction logs (transaction log backup).
If primary data storage is lost or becomes unusable, the data can be restored from the full backup and then brought up to date to the point where access was lost, by applying the most recent differential backup, the subsequent transaction log backups, or both if available.
Backups should execute as quickly as possible, to avoid clashes where one backup has not yet completed when another is scheduled to start. In addition, the backup process is highly I/O intensive and will affect normal SQL Server response times while it runs.
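
The scheduling clash described above can be checked ahead of time. The following Python sketch, with a hypothetical function name and a single assumed duration for every backup, reports each pair of consecutive start times where the earlier backup would still be running when the later one is due.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def find_overruns(starts: List[datetime],
                  expected_duration: timedelta) -> List[Tuple[datetime, datetime]]:
    """Return (start, next_start) pairs where a backup would still be running at next_start."""
    ordered = sorted(starts)
    overruns = []
    for current, following in zip(ordered, ordered[1:]):
        if current + expected_duration > following:
            overruns.append((current, following))
    return overruns
```

An empty result means the schedule leaves enough room for the assumed duration.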
There are three different types of backup for SQL databases, which should be used in combination. This blog does not deal with the differences between ordinary backups and snapshots.
Full database backup – provides a complete copy of the database at a single point in time, to which the database can subsequently be restored.
Differential backup – performs the same operations as a full backup, but captures only the data that has changed or been added since the previous full backup. It is cumulative: each successive differential backup after a full backup includes all the data stored in the previous differential backup plus subsequent changes. It therefore grows as more data is changed or added, until the next full backup.
Transaction log backup – performed in a sequence (the log chain), with each link capturing changes since the prior transaction log backup. Subsequent full or differential backups do not break the log chain; the next transaction log backup continues from the last transaction log backup, not from the last full backup.
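
To show how the three types combine at restore time, here is a simplified Python sketch that picks which backups to apply, and in what order, to reach a target point in time. The Backup class and restore_sequence function are hypothetical, and the sketch orders backups by finish time only; it is not how SQL Server itself tracks the log chain.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Backup:
    kind: str            # "full", "diff" or "log"
    finished: datetime   # when the backup completed


def restore_sequence(backups: List[Backup], target: datetime) -> List[Backup]:
    """Pick the backups to restore, in order, to bring the database up to target."""
    ordered = sorted(backups, key=lambda b: b.finished)
    fulls = [b for b in ordered if b.kind == "full" and b.finished <= target]
    if not fulls:
        raise ValueError("no full backup at or before the target time")
    base = fulls[-1]  # latest full backup not after the target

    # Differentials are cumulative, so only the most recent one is needed.
    diffs = [b for b in ordered
             if b.kind == "diff" and base.finished < b.finished <= target]
    chain = [base] + diffs[-1:]

    # Apply every log backup after that point, stopping at the first one
    # that reaches or passes the target time.
    last_point = chain[-1].finished
    for log in (b for b in ordered if b.kind == "log" and b.finished > last_point):
        chain.append(log)
        if log.finished >= target:
            break
    return chain
```

With a full backup at 00:00, differentials at 06:00 and 12:00, and log backups every two hours, a target of 15:00 selects the full backup, the 12:00 differential, and the 14:00 and 16:00 log backups.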

Restoring data from a backup copy may be required in the following scenarios:
Logical corruption: – Data can become corrupted through application software bugs, storage software errors, or hardware failure such as a server crash.
Human error: – An administrator may delete a file or directory, or a user could erase a set of emails or even records from an application.
Hardware failure: – Failure scenarios can include hard disk drive (HDD) or flash drive failure (multiple failures can cause data loss even when RAID is used), server failure or storage array failure.
Catastrophic Hardware loss: – Possibly the worst scenario is an event such as fire that renders hardware inoperable and permanently unrecoverable.