We have all heard the talk and warnings about disaster recovery before: “back up your data to our cloud, or else…”. Thankfully, this is not one of those “the sky is falling” articles! But before we get into it, take a moment to ask yourself this one simple question: “If your business were about to be destroyed by fire (flood, falling meteors, etc.), and you had one minute to save one file, what would it be?”
Your answer is probably not that picture of Fluffy the cat, but most likely your payroll data or customer order list. A Disaster Recovery Plan defines not only which data you will save first, but also what will be available during planned and unplanned downtime and how much downtime your business can survive.
Complexity vs. Costs
When you create your Disaster Recovery plan, you’ll need to weigh the trade-offs between complexity and cost. What data can you afford to be without? For how long? What data, if lost, would destroy your business forever? Understand, this is a “you get what you pay for” solution: more complex DR plans carry greater costs. Like insurance, you’ll be glad you have it when you need it!
Backups or Business Continuity
Let’s take a moment to define these terms:
Backup: this involves making a copy of your files. Maybe it’s your users’ home drives, or your Accounting files or CAD drawings. A backup copies files so they can be restored if deleted or corrupted.
Business Continuity: this is creating a second environment which mirrors your production datacenter. You can “fail over” to it to get your systems up and running in the event of a complete catastrophe (think fire, flood, theft, alien attack, etc.).
When deciding on a Disaster Recovery Plan, the first step is to determine which solution is most appropriate for your business. In many cases a combination of backups and Business Continuity may be what you need: a backup of files plus a failover solution to minimize downtime should disaster strike.
Four parts of a disaster recovery plan
Before we start getting into the nitty-gritty, let’s be clear: you are far more likely to lose a file or even a server than an entire office or datacenter. This is why any DR plan should feature both a quick recovery component, to get a file back or restore a single failed server, and a full site recovery solution.
Planning for a disaster involves a lot of forethought. To get started, begin thinking about the following:
- Recovery Point Objective (RPO). RPO defines how much data you are willing to lose. For instance, if you are backing up the Accounting database every night at 8pm and the database file becomes unusable at 5pm the next day (just as you are processing payroll), you can only recover from the previous night, so any work done today is lost.
- Recovery Time Objective (RTO). RTO defines how long you are willing to be down. Depending on your business, you might decide that you can lose up to two days of business operation, or maybe you can only tolerate two hours of downtime. A shorter recovery time means higher costs, so this is the time to consider your options carefully.
- Regulatory constraints. Is your business subject to regulatory compliance? How will you make sure you are covered?
- Critical data. Which data is critical to your business? What are the dependencies between different areas of the business?
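To make the RPO arithmetic concrete, here is a minimal sketch in Python that works through the nightly-backup example above. The calendar dates and the four-hour objective are made-up values for illustration only; plug in your own schedule and targets.

```python
from datetime import datetime, timedelta


def worst_case_data_loss(last_backup: datetime, failure: datetime) -> timedelta:
    """Everything written between the last good backup and the failure is lost."""
    if failure < last_backup:
        raise ValueError("failure must not precede the last backup")
    return failure - last_backup


def meets_objective(actual: timedelta, objective: timedelta) -> bool:
    """True when the measured loss (or downtime) stays within the objective."""
    return actual <= objective


# Backup at 8pm, database unusable at 5pm the next day (dates are arbitrary).
last_backup = datetime(2024, 1, 1, 20, 0)
failure = datetime(2024, 1, 2, 17, 0)

loss = worst_case_data_loss(last_backup, failure)
print(loss)  # 21:00:00

# Hypothetical target: the business decides it can lose at most 4 hours of work.
rpo = timedelta(hours=4)
print(meets_objective(loss, rpo))  # False
```

In this example a nightly backup leaves you exposed to 21 hours of lost work, so a four-hour RPO would call for more frequent backups or replication. The same `meets_objective` check works for RTO if you pass in measured downtime instead.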
Create a Plan
Be cautious when vendors boast that “you can recover in the cloud” quickly; it is never that simple! If, after evaluating and planning your Disaster Recovery scenario, you decide that a quick failover is the right solution, then there is a lot of prep work that has to be done:
- The cloud environment has to be prepped. This typically means taking the following steps:
  - Creating a dedicated connection from your local datacenter to the cloud provider’s datacenter. This can be a site-to-site VPN, or, if you need fast connections to replicate data in near-real-time, then consider ExpressRoute for Azure.
  - Setting up your Active Directory (all of your computer and user accounts and passwords) to replicate to the failover site. Without this, the servers that you want to replicate will not be able to function, and your DR site will be pretty much useless.
  - Setting up database replication. If you are using Microsoft SQL Server, then creating an Always On availability group between your local SQL server and a virtual server at the failover site will allow your database-dependent applications to work when you do fail over.
  - Configuring email replication. If you are not already on Microsoft Office 365 for email (and why not?), but are using on-premises Exchange, you should configure a Database Availability Group (DAG) to replicate mailbox databases to your DR site. Load balancers can help with automatic failover, or you can manually activate the mailbox servers in the DR site as needed.
  - Using Azure Files with Azure File Sync for your shared files. Azure Files allows you to create a file share in the cloud and sync it to your local server. This way users on your network can still access the files normally, but the files are also available for remote users to connect to and, more importantly, will be available from your failover network. Office 365’s SharePoint Online and OneDrive for Business provide a great location to store users’ home drive files.
- Create a Recovery Plan. OK, you have a disaster at your main site and want to fail over. What needs to happen and in what order? Azure Site Recovery (ASR) gives you the ability to create a recovery plan that helps to automate failover in the correct steps. You choose which server(s) to fail over and in what order, and can make the necessary changes to their network settings so they function in this new environment.
- Backups: Sure, you have now failed over to this remote datacenter, and everything is working, but you still need to back up your data just as you were doing before. Do not overlook this critical step! Users could still inadvertently delete those Accounting files during a failover event.
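The “in what order?” question is really a question about dependencies, and it helps to write those down before you build anything. The sketch below is not Azure Site Recovery’s API; it is just a plain Python illustration that uses the standard library’s `graphlib` module (Python 3.9+) to turn a dependency map into a safe startup order. The workload names and their dependencies are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical workloads, each mapped to the things it needs running first.
dependencies = {
    "network-link": set(),
    "active-directory": {"network-link"},
    "sql-database": {"active-directory"},
    "file-shares": {"active-directory"},
    "email": {"active-directory"},
    "accounting-app": {"sql-database", "file-shares"},
}


def failover_order(deps: dict) -> list:
    """Return a startup order in which every workload follows its dependencies.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(deps).static_order())


order = failover_order(dependencies)
for step, workload in enumerate(order, start=1):
    print(step, workload)
```

However your DR tooling expresses it, the idea is the same: the network link and Active Directory come up first, and the applications that depend on databases and file shares come up last. If the sorter reports a cycle, that is a sign the dependencies need untangling before a real failover.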
Test, Test, Test (and train)
Often companies will create a plan, and then leave it on the shelf. They don’t fully test the plan or consider multiple scenarios. When a disaster hits, whether it’s cybercrime or a hurricane or a rogue sprinkler system, the plan fails. The New York Stock Exchange had a plan before Hurricane Sandy, but they didn’t follow it when disaster hit. That meant closing the stock exchange for two days.
Testing your plan two to three times a year is a good way to make sure the plan is up to date and that failover works as expected. You certainly don’t want to find out that things are not as promised after it is too late!
Your resources and business needs will change over time. This includes your location, personnel, and data. If your business goals change, so should your plan.
Once you have a plan in place, you’ll need to train all personnel. For a higher chance of success, ensure that senior management endorses the plan. Be sure to encourage training for all employees.
Get help to create a plan
A cloud solution can help you find a good balance between cost and complexity. With Azure Site Recovery, you can create disaster recovery plans in the Microsoft Azure portal. The disaster recovery plans can be as simple or as advanced as your business requirements demand.
Your IT partner is there to help you with all stages of strategy, planning, and implementation. If they are not up to the challenge (not asking all the right questions), then reach out for help!
You can also download the free Azure Backup and DR eGuide for more information.