Load balancing is an effective way to improve the availability and scalability of certain applications. Amazon Web Services Elastic Load Balancing reroutes traffic to ensure that systems function despite server failures or outages. The following are some basic guidelines for using AWS load balancing to maintain availability, as well as information on how to combine it with Auto Scaling to further stabilize AWS workloads.
Load balancers help maintain availability by routing traffic away from failed devices. They also support scalability by distributing work to servers that have capacity for additional tasks. A load balancer sits between client devices and back-end servers, spreads incoming requests across those servers and then routes the results back to clients.
In a simple case, a load balancer directs 20% of all incoming traffic to each of five back-end servers. If one of the five servers fails, the load balancer detects this and stops routing traffic to the failed server. The four remaining servers then each get 25% of the workload. A more sophisticated load balancer might monitor the load on each server and steer incoming work away from heavily loaded servers toward those with idle capacity.
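The arithmetic in this example can be sketched in a few lines of Python. This is a simplified simulation, not how AWS implements its load balancers; the `Backend` class, the server names and both helper functions are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical back-end server record used only for this sketch."""
    name: str
    healthy: bool = True
    active_requests: int = 0


def traffic_shares(backends):
    """Split traffic evenly across healthy servers; returns name -> fraction."""
    healthy = [b for b in backends if b.healthy]
    if not healthy:
        return {}
    share = 1 / len(healthy)
    return {b.name: share for b in healthy}


def pick_least_loaded(backends):
    """A more sophisticated choice: the healthy server with the fewest active requests."""
    healthy = [b for b in backends if b.healthy]
    if not healthy:
        raise RuntimeError("no healthy back-end servers available")
    return min(healthy, key=lambda b: b.active_requests)


servers = [Backend(f"server-{i}") for i in range(1, 6)]
shares_before = traffic_shares(servers)   # five servers, 20% each

servers[2].healthy = False                # server-3 fails
shares_after = traffic_shares(servers)    # four servers, 25% each

servers[0].active_requests = 12           # server-1 is busy
servers[1].active_requests = 3            # server-2 has idle capacity
servers[3].active_requests = 7
servers[4].active_requests = 9
next_target = pick_least_loaded(servers)  # picks server-2
```

The even split models the simple case, while `pick_least_loaded` models the load-aware behavior described in the last sentence of the paragraph.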
Load balancing supports different sets of protocols to meet different needs. Basic load balancing can use HTTP from the client to the load balancer and from the load balancer to the back-end server. Secure communications between client devices and a website can use HTTPS between the client and the load balancer and HTTP from the load balancer to the back-end server; in that setup, the load balancer handles encryption and decryption. If end-to-end encryption is required, use HTTPS both from client to load balancer and from load balancer to back-end server.
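The three protocol combinations can be summarized as a small lookup. The mode names and the `select_mode` helper below are hypothetical labels for this sketch and are not ELB settings.

```python
# Hypothetical mode names. Each maps to
# (client-to-load-balancer protocol, load-balancer-to-server protocol).
LISTENER_MODES = {
    "plain": ("HTTP", "HTTP"),
    "tls_offload": ("HTTPS", "HTTP"),   # load balancer encrypts and decrypts
    "end_to_end": ("HTTPS", "HTTPS"),
}


def select_mode(encrypt_client_side: bool, encrypt_backend_side: bool) -> str:
    """Pick one of the three combinations described in the text."""
    if not encrypt_client_side and not encrypt_backend_side:
        return "plain"
    if encrypt_client_side and not encrypt_backend_side:
        return "tls_offload"
    if encrypt_client_side and encrypt_backend_side:
        return "end_to_end"
    # HTTP in front with HTTPS behind is not one of the cases covered here.
    raise ValueError("combination not covered by the three cases described")


offload_protocols = LISTENER_MODES[select_mode(True, False)]
full_protocols = LISTENER_MODES[select_mode(True, True)]
```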
AWS Elastic Load Balancing features
For high availability in a cloud environment, services must continue to run even if a server fails or connectivity to an availability zone is lost. Admins can register Amazon Elastic Compute Cloud (EC2) instances with a load balancer that's configured to perform health checks, typically an HTTP request to port 80. When an instance fails its health check, AWS Elastic Load Balancing (ELB) reroutes traffic to other instances registered to the same load balancer. EC2 instances can reside in multiple availability zones; if a single availability zone is unavailable, traffic can be routed to registered instances in another availability zone.
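The routing behavior described here can be modeled with a short simulation. This is a minimal sketch under stated assumptions: the `Instance` class, the instance IDs, the zone names and the two threshold constants are all hypothetical, and the sketch makes no calls to AWS.

```python
from dataclasses import dataclass

# Assumed thresholds for this sketch: consecutive results needed to change state.
UNHEALTHY_THRESHOLD = 2
HEALTHY_THRESHOLD = 2


@dataclass
class Instance:
    """Hypothetical registered instance; not an AWS API object."""
    instance_id: str
    zone: str
    consecutive_failures: int = 0
    consecutive_successes: int = 0
    in_service: bool = True


def record_check(instance, passed):
    """Update an instance's state from one health check result."""
    if passed:
        instance.consecutive_successes += 1
        instance.consecutive_failures = 0
        if instance.consecutive_successes >= HEALTHY_THRESHOLD:
            instance.in_service = True
    else:
        instance.consecutive_failures += 1
        instance.consecutive_successes = 0
        if instance.consecutive_failures >= UNHEALTHY_THRESHOLD:
            instance.in_service = False


def routable(instances, unavailable_zones=frozenset()):
    """Instances that may receive traffic: in service and in a reachable zone."""
    return [i for i in instances
            if i.in_service and i.zone not in unavailable_zones]


fleet = [
    Instance("i-a1", "zone-a"),
    Instance("i-a2", "zone-a"),
    Instance("i-b1", "zone-b"),
]

# i-a1 fails two checks in a row and is taken out of rotation.
record_check(fleet[0], False)
record_check(fleet[0], False)
after_failure = [i.instance_id for i in routable(fleet)]

# zone-a becomes unreachable; only zone-b instances remain routable.
after_zone_loss = [i.instance_id for i in routable(fleet, {"zone-a"})]
```

Requiring several consecutive failures before removing an instance is a design choice in this sketch that avoids reacting to a single dropped request.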
AWS recommends using two subnets in different availability zones to ensure high availability. As with other AWS resources, it's important to configure security controls to meet your needs. With AWS load balancing, you will likely need to create a security group whose rules govern the flow of traffic to the load balancer.
On its own, AWS ELB can improve the availability of applications. Admins can also use AWS load balancing in conjunction with AWS Auto Scaling not only to distribute workloads but also to monitor and adjust compute resources as needed.
Auto Scaling adjusts the number of servers according to the workload and uses templates, known as launch configurations, to create EC2 instances. It also uses a policy, known as a scaling plan, to determine the conditions that trigger a change in EC2 capacity. Auto Scaling manages groups of instances, which can also be registered to a load balancer. Combining load balancing and Auto Scaling can help improve both availability and scalability.
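The idea of a policy that triggers capacity changes can be illustrated with a simple threshold rule. The sketch below is an assumption-laden simplification: the `ScalingPolicy` class, its field names and the CPU thresholds are hypothetical and do not reflect the actual Auto Scaling API.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy for this sketch; field names are not AWS parameters."""
    min_size: int
    max_size: int
    scale_out_cpu: float  # add an instance when average CPU % exceeds this
    scale_in_cpu: float   # remove an instance when average CPU % drops below this


def desired_capacity(current, avg_cpu, policy):
    """Return the instance count the group should move to."""
    if avg_cpu > policy.scale_out_cpu:
        target = current + 1
    elif avg_cpu < policy.scale_in_cpu:
        target = current - 1
    else:
        target = current
    # Never go outside the group's configured bounds.
    return max(policy.min_size, min(policy.max_size, target))


policy = ScalingPolicy(min_size=2, max_size=6, scale_out_cpu=70.0, scale_in_cpu=30.0)

busy = desired_capacity(3, 85.0, policy)     # heavy load: grow from 3 to 4
quiet = desired_capacity(2, 10.0, policy)    # light load, but already at the minimum
capped = desired_capacity(6, 95.0, policy)   # heavy load, but already at the maximum
```

In a combined setup, each instance added this way would also be registered with the load balancer so that it starts receiving a share of the traffic.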
This approach is not suitable for all applications, however. If your back end depends on an application that is not easily distributed among servers, such as certain relational databases, then load balancing may not help improve availability. In such a case, a failover server may be a better option. Another option is to use an AWS-managed database service, such as Amazon Relational Database Service (RDS), to avoid some database management overhead.