AWS redundancy, DR set up no piece of cake

It is now affordable -- and recommended -- for IT operations of all shapes and sizes to back up data and workloads in the cloud, but recreating a data center requires a careful approach.

Before the cloud era, only a few organizations -- generally the biggest and best-funded -- could afford to have a second data center for business continuity or disaster recovery. The costs of hardware, space and personnel were prohibitive for others.

With the cloud, however, adding capacity is relatively easy and dramatically less expensive. With multiple cloud locations and AWS availability zones, enterprises can build applications that are more scalable and available than traditional on-premises apps. However, turning AWS redundancy into a fully functioning secondary off-site data center is not necessarily easy.

For example, planning for traditional disaster recovery (DR) is often direct: duplicate the infrastructure and apply some method of replication to copy data from the primary site to the secondary site, noted Ahmed Abdalla, a cloud architect at ADAPTURE, an IT consulting company. In a highly virtualized environment, cloud DR is equally simple. IT teams replicate VMs in an on-premises data center to an infrastructure as a service provider, like AWS. These VMs can be powered on in the event of a declared disaster.

"This requires a replication suite that supports virtual-to-cloud migration, such as Veeam or Zerto, as well as network configurations to appropriately shift traffic," Abdalla said.

The use of cloud services for DR has multiple facets. Business risks, such as single points of failure, systems outages or even full data center outages, can be mitigated through cloud-based DR. Traditional on-premises disaster recovery requires significant investment in storage, server and network architecture to replicate production infrastructure, or an acceptable subset of production.

"Using cloud-based disaster recovery, much of that upfront capital expenditure is reduced, as servers may only be turned on as needed, and the underlying network is only a charge to the customer based on use," Abdalla said.

As businesses create applications in a paradigm in which underlying infrastructure is less important, entire sections of applications, availability regions in the cloud or on-premises data centers can be added to or removed from the resource pool and used dynamically.

Netflix uses this methodology for testing AWS redundancy with its Simian Army. Entire sections of the Netflix infrastructure may be randomly and arbitrarily deleted, but AWS redundancy and DR are built into the software, which is expected to recover autonomously.

"Over the next few years, I believe this is the level of business automation and systems orchestration that will be commonplace in the industry," Abdalla said. While businesses must still develop and test DR and business continuity plans, the underlying technology to support those business needs will change drastically.

But don't underestimate the challenges of setting up duplicate infrastructure in the cloud.

"To have a robust system with disaster recovery and business continuity and to then move that from one on-premises or colocation facility to Amazon, or from one Amazon region to another, is really not that simple," said Laith Al-Saadoon, a cloud architect at AWS premier partner, CorpInfo. "There are a great many complexities to take into consideration."

Those factors include the need to replicate the system from a given hypervisor or a physical server to a cloud-based image. Although there are tools that can help, it remains a tricky proposition.

The other big issue is continuous replication to the DR environment. "How do you control that so it is low overhead and not impacting production or saturating your network -- or sacrificing performance?" Al-Saadoon asked.

There also are technical issues like encryption and compression, and especially deduplication. "You should make sure you are not sending a bit of data you have already sent before," he said.
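
As a rough illustration of the deduplication idea Al-Saadoon describes, the Python sketch below hashes fixed-size chunks and only transmits a chunk the first time its digest is seen. The class name `DedupSender`, the 4 KB chunk size and the in-memory digest set are assumptions made for this example; commercial replication tools use their own, more sophisticated schemes.

```python
import hashlib


def chunks(data: bytes, size: int = 4096):
    """Yield fixed-size chunks of data (the last one may be shorter)."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


class DedupSender:
    """Hypothetical sender that remembers which chunks it has already shipped."""

    def __init__(self):
        self.seen = set()      # digests of chunks the DR site already holds
        self.bytes_sent = 0    # payload bytes that actually crossed the wire

    def send(self, data: bytes, size: int = 4096):
        """Return (digest, payload) pairs; payload is None for repeat chunks."""
        wire = []
        for chunk in chunks(data, size):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest in self.seen:
                # Already replicated: send only the reference, not the data.
                wire.append((digest, None))
            else:
                self.seen.add(digest)
                self.bytes_sent += len(chunk)
                wire.append((digest, chunk))
        return wire


sender = DedupSender()
payload = b"A" * 4096 + b"B" * 4096 + b"A" * 4096
first_pass = sender.send(payload)
second_pass = sender.send(payload)  # nothing new, so no payload bytes are sent
```

In this sketch the third chunk of `payload` repeats the first, so only two of the three chunks carry data on the first pass, and the second pass sends digests alone.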

Data center duplication marches on

Fortunately, the process of setting up and duplicating data centers, or at least big parts of data centers, is evolving quickly.

"Once you have a production environment in AWS, it is far simpler to create a multi-region disaster recovery or business continuity system, because you have services like S3 [Simple Storage Service] for cross-region replication and EBS [Elastic Block Store] and RDS [Relational Database Service] snapshots that you can send from one region to another," Al-Saadoon said. With AWS, the barriers to duplicating to and within the cloud are lower. However, while it may seem deceptively simple, it is not as simple as pushing a button. Enterprises seek DR or business continuity efforts that can bring a business back into operation in less than 10 minutes, but that goal is hard to achieve.
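
One way to picture sending an EBS snapshot from one region to another is a small wrapper around the EC2 `copy_snapshot` call, sketched below in Python. It assumes the boto3 SDK; the helper name `copy_snapshot_to_region`, the region names and the snapshot ID are placeholders invented for this example, and a real setup would also need to handle encryption keys, tagging and waiting for the copy to complete.

```python
def copy_snapshot_to_region(dest_ec2_client, snapshot_id, source_region,
                            note="DR copy"):
    """Ask the destination region's EC2 client to pull a snapshot copy.

    dest_ec2_client is expected to behave like a boto3 EC2 client created
    for the destination (DR) region. Returns the ID of the new snapshot.
    """
    response = dest_ec2_client.copy_snapshot(
        SourceRegion=source_region,
        SourceSnapshotId=snapshot_id,
        Description=f"{note} of {snapshot_id} from {source_region}",
    )
    return response["SnapshotId"]


# Example usage (requires boto3 and AWS credentials; values are placeholders):
#
#   import boto3
#   dest = boto3.client("ec2", region_name="us-west-2")
#   new_id = copy_snapshot_to_region(dest, "snap-0123456789abcdef0", "us-east-1")
```

The copy is requested from the destination side, which is why the client is created for the DR region and the source region is passed as a parameter.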

Enterprises looking for business continuity and disaster recovery through AWS should evaluate its capabilities. For example, an IT team can replicate RDS and EBS snapshots and Amazon Machine Images. Those capabilities may or may not meet an enterprise's recovery time and recovery point objectives.
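
A back-of-the-envelope check can show whether a snapshot-based approach is likely to meet those objectives. The Python sketch below uses a deliberately simplified model, and every name and number in it is an assumption for illustration: worst-case data loss is taken as the snapshot interval plus the time to copy a snapshot to the DR region, and recovery time as the time to restore and start the workload there.

```python
from dataclasses import dataclass


@dataclass
class RecoveryPlan:
    """Hypothetical snapshot-based DR plan; all durations are in minutes."""
    snapshot_interval_min: float  # time between successive snapshots
    copy_time_min: float          # time for a snapshot to reach the DR region
    restore_time_min: float       # time to restore and start the workload

    def worst_case_rpo(self) -> float:
        # A failure just before the next copy lands loses up to one full
        # interval of changes plus the in-flight copy time.
        return self.snapshot_interval_min + self.copy_time_min

    def estimated_rto(self) -> float:
        return self.restore_time_min


def meets_objectives(plan: RecoveryPlan, rto_target_min: float,
                     rpo_target_min: float) -> bool:
    """True only if both the RTO and RPO targets are satisfied."""
    return (plan.estimated_rto() <= rto_target_min
            and plan.worst_case_rpo() <= rpo_target_min)


# Hourly snapshots, 15-minute cross-region copy, 30-minute restore.
plan = RecoveryPlan(snapshot_interval_min=60, copy_time_min=15,
                    restore_time_min=30)
fits_relaxed_targets = meets_objectives(plan, rto_target_min=45,
                                        rpo_target_min=120)
fits_ten_minute_rto = meets_objectives(plan, rto_target_min=10,
                                       rpo_target_min=120)
```

Under these made-up figures the plan satisfies a 45-minute recovery time target but not the sub-10-minute recovery that some enterprises seek.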

AWS also provides multiple availability zones for production workloads, which may be enough for many organizations looking for AWS redundancy.

This was last published in June 2016

Join the conversation

How do you ensure business continuity in AWS?

Alan,

Great article and I fully agree with everything said, with one small exception: The statement further down the article by Al-Saadoon that “it is not as simple as pushing a button.”

My response to this is “it depends”:

DO IT YOURSELF
If anybody wants to do this themselves, it is very complex to orchestrate. Reason being that AWS regions are highly isolated constructs, but now you must manage across them and reliably orchestrate a failover in case of disaster. Absolutely non-trivial.

CLOUD SERVICE – SAP HANA
My company Ocean9 specializes in enterprise workloads on AWS and Azure. And – to stay on topic – one of our key selling points is the “one-click” failover for DR among regions. Without redundant cloud infrastructure, hence IaaS cost neutral, we can achieve an overall availability of 99.50% with very low RTO and RPO; this is far more than most enterprises require.

For critical business seasons like Xmas for a retailer, we can additionally configure two or three redundant systems and achieve 99.99% availability.

Not sharing to market ourselves, just to help educate what is possible in the cloud today. Just in case, our URL is www.ocean9.io.

Best,

Swen Conrad

In a word, Zerto. Zerto recovers applications, not servers. It replicates in near real time with an RPO of typically around 10 seconds, and recovers applications in about 15 minutes. It is transparent to most applications (clusters being an exception) and outperforms most database replication. Zerto is also a fully managed environment, so metrics and status are easy to view, and failovers literally require only clicking the 'easy' button. The only complexity of note is the requirement to update DNS with the IP addresses of the new failover site. But if your apps use FQDNs instead of the very bad practice of hard-coded IP addressing, they are available as soon as DNS re-registers the FQDNs.