SQL Server Disaster Recovery

Introduction

After completing the process of installing and configuring Microsoft SQL Server, any database administrator is entitled to sit back and take a well-deserved break. However, it's important to remember that the database administration process does not end when your database is humming along at a comfortable pace answering user requests. You considered a multitude of issues during the database design process -- performance, access controls, replication strategies, normalization and indexing to name a few.
Have you taken the time to develop a contingency plan to cover unforeseen catastrophes that might threaten your database in the future?

Disaster recovery is all-too-often ignored until it's too late. Busy database administrators can easily push this task onto the back burner with promises to themselves that they'll "take care of backups right after I finish this big project." However, take a moment and consider the value of your data to your organization? What would be the result if an hour's worth of database changes were lost? A day's worth? What about a complete loss of the database?

More than likely, your answers to the previous questions illustrate the need for a comprehensive disaster recovery plan. They'll also form the starting point for the development of a plan tailored to the unique needs of your organization. In the remainder of this section, we'll look at the other factors to consider when developing a strategy for your organization.
The next two sections will examine two prominent disaster recovery mechanisms offered by Microsoft SQL Server (and many other databases) -- database backups and transaction logs.

The first step in developing a solid disaster recovery plan is to develop an idea of what constitutes an acceptable loss for your organization.
First, consider the impact of losing data stored in your database. Would you be able to recover from the loss of an hour's worth of data? If you're managing a human resources database, chances are that you could deal with this situation by instructing your personnel to reenter data entered during that period. If you're running the database supporting a financial institution, the loss of an hour's data could bring the instant wrath of clients and industry regulators along with significant monetary losses. Protection against these types of losses is provided by backups of the database and use of transaction logging. We'll discuss these in further detail.

Second, consider the loss of access to the database itself. What would be the ultimate result if your end users were not able to access information for an extended period of time. The loss of access to our human resources database would likely result in frustration, but minimal loss to the business. On the other hand, if doctors at a hospital were unable to access test results and laboratory findings in a timely manner it could result in the loss of life or limb. If you find that these issues are a concern in your organization, you may want to consider the use of failover clustering to ensure the continuous availability of your database.
This concept will be discussed in a future article.

Once you've determined the level of acceptable loss for your organization and received buy-in from the users you support and your management, it's time to begin developing a strategy to minimize the impact of a catastrophic event on your database. Our first step is to develop a comprehensive backup strategy. We'll discuss this in the next section of this article. Read on!

Microsoft SQL Server 2005 offers two primary database backup mechanisms -- full backups and differential backups. You'll need to weigh the costs and benefits of each when scheduling backups of your organization's data.

Full backups do exactly what the name implies. They store a full copy of the database on any supported backup media. SQL Server 2005 supports the Microsoft Tape Format (MSTF) for backups to tape, disks and named pipes.

Full backups can be performed without taking the database offline, but they consume a large amount of system resources and may have a noticeable impact on database response times.

Differential backups are designed to compensate for the large amount of time required to perform a full backup. They utilize a similar mechanism to store a copy of only the data that has changed since the last full backup. Most databases contain a large amount of information that changes infrequently. In these cases, differential backups will consume significantly fewer resources than a full backup and may even be able to process without significantly impacting database performance.

It is critical to remember that differential backups only contain data modified since the last full backup. In the event of a database failure, a differential backup alone will be useless. For example, imagine that the XYZ Company performs a full backup each Friday at the close of business and differential backups on Monday through Thursday evenings.

If the database were to fail on a Wednesday morning, the database administrator would first restore data from the previous Friday's full backup and then apply Tuesday evening's differential backup to bring the database to its most recently backed-up state.

Once again, the exact mix of full and differential backups depends upon a variety of factors unique to your organization.

Be sure to consider the length of time required to perform the backups and the impact that might have on your database performance. If your database is only used during business hours, you may be able to schedule a full backup to run each evening after the office closes. On the other hand, if you are supporting a 24-hour operation, you may only be able to sneak in a full backup on a slow Sunday afternoon and use nightly differential backups as a supplement.

Other technologies new to SQL Server 2005 are partial backups (which ignore read-only filegroups, copy-only backups (which allow you to create a backup file without disrupting the differential backup cycle), and mirrored backups (which allow you to have multiple backup sets for added redundancy).

When developing a disaster recovery plan you should also consider the storage of database backup media. If you backup to tape and then store the tapes in the server room, they're not going to do you any good in the event of a fire that destroys the server room. It's best to keep a copy of the tapes in an accessible, secure offsite location. If you're located in an area prone to earthquakes, hurricanes or other natural disasters, you may also wish to consider sending a copy of your backups to a location in another city.



We've discussed the methods used to backup the contents of your database. Ideally, we'd be able to perform a full database backup each time the database was altered. Unfortunately, this is a pipe dream for most database implementations -- the large amount of time and resources required to perform a backup are prohibitive and limit us to scheduling differential backups on a more infrequent basis -- perhaps daily or even hourly depending upon the circumstances. However, SQL Server does provide a mechanism to compensate for database changes made during the period of time between backups -- the transaction log. We'll discuss transaction log backups in the next section of this article. Read on!

In the previous section of this article we explored the data protection provided by full and differential backups. In most circumstances, the cost of performing these large-scale backups prohibits us from doing them during high-use periods, such as business hours. However, the failure to perform backups during business hours could result in the loss of an entire day's data if the database fails before the backup is performed. Fortunately, SQL Server provides us with the capability to backup transaction logs -- a feature that provides an efficient mechanism to decrease the amount of potential data loss.

During the course of normal operations, SQL Server utilizes a transaction log to track all of the modifications performed within a database. This log ensures both that the database is able to recover when abruptly interrupted (such as a loss of power) and that users are able to undo (or "rollback" in database lingo) the results of a database transaction. Unfinished transactions are maintained in the log before they are permanently stored (or "committed") in the database. The transaction log backup functionality enables us to store copies of the database's transaction log on a backup device. These backups generally consume far fewer resources than a full or differential database backup and therefore are suitable for even high-use periods.

Each time the transaction log is backed up, SQL Server removes all of the committed transactions in the log and writes them to the backup media. Due to this incremental process, transaction logs are not cumulative and we must maintain a complete set of transaction logs reaching back to the most recent full or differential backup. For example, let's return to the organization described in the previous section of this article. Recall that the XYZ Company performed full backups at the close of business every Friday and differential backups every Monday through Thursday evening. Let's take this strategy a step further and include hourly backups of the transaction log during business hours. Assume a database failure at 11:05AM Wednesday. Under the previous strategy, we used Friday's full backup and Tuesday's differential backup to restore the database to it's state at the close of business Tuesday. This resulted in a loss of two hours of data (9-11AM Wednesday). Using the new strategy that incorporates transaction log backups, we can apply the 9AM and 10AM transaction log backups after the Tuesday differential backup to restore our database to it's state at 11AM Wednesday. Now we've managed to restore all but five minutes of database activity, clearly a vast improvement.

The previous example highlights the importance of a comprehensive backup strategy. XYZ Corporation's judicious use of database and transaction log backups results in a disaster recovery model where no more than an hour's data should be lost after a catastrophic event with a minimal impact on system performance. You can take these examples and modify them to meet the requirements you defined for your organization after reading the first section of this article.

Now that you've been introduced to the basic concepts behind SQL Server disaster recovery, it's time to begin developing a disaster recovery plan for your database. Remember to involve both your internal customers and management team in this process. If you're stumped and need advice or just want to run your plan by your peers, stop by our forum and join in the ongoing discussion.