Organizations building applications in AWS must plan for application scalability. Scalability affects an application's performance and the customer experience, as well as what the organization pays for AWS. Fortunately, AWS has tools and built-in options for effective Auto Scaling. As with most tools, however, some use cases may not be fully addressed. Organizations may need to develop additional internal tools that extend the AWS functionality to meet business needs.
For most application providers, when the application gets additional traffic or use, the number of instances or servers needed to process business functions increases. In the same manner, if the application experiences slower periods during certain hours or days of the week, instances can be removed. During planning, organizations need to determine what type of scaling to use based on use case scenarios, cost, performance impacts and infrastructure requirements.
In this article, I'll discuss AWS Auto Scaling and elastic load balancing as well as gaps or obstacles an organization may encounter and ideas for handling them.
Auto Scaling in AWS
The Auto Scaling Web service tool in AWS allows organizations to define the required capacity in advance and set individualized conditions under which a group of instances scales automatically based on the configured values and demand. An organization configures the values at which scaling up or down occurs. For example, an organization sets Auto Scaling to kick in when average CPU utilization reaches 65%, so instances are added based on need. An organization also sets Auto Scaling to remove instances when average CPU utilization drops to 10%. Instances are added and removed automatically based on the configured settings. However, it's important to consider configuring a cool-down period, which tells Auto Scaling to wait for a specified time after a scaling action before adding or removing more instances, so the previous change has time to take effect. Essentially, Auto Scaling is a Web service users can configure to launch or terminate cloud instances within AWS based on user-defined configuration settings.
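The threshold and cool-down behavior described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not how AWS implements Auto Scaling; the class name, function name, cool-down length and instance limits are all hypothetical, and only the 65% and 10% thresholds come from the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalingPolicy:
    scale_out_cpu: float = 65.0   # add an instance at or above this average CPU %
    scale_in_cpu: float = 10.0    # remove an instance at or below this average CPU %
    cooldown_seconds: int = 300   # assumed value; the article does not give one
    min_instances: int = 1        # assumed floor
    max_instances: int = 10       # assumed ceiling


def decide(policy: ScalingPolicy, avg_cpu: float, current: int,
           now: int, last_action_at: Optional[int]) -> int:
    """Return the instance count after one evaluation of the policy."""
    # During the cool-down window, take no action at all.
    if last_action_at is not None and now - last_action_at < policy.cooldown_seconds:
        return current
    if avg_cpu >= policy.scale_out_cpu:
        return min(current + 1, policy.max_instances)
    if avg_cpu <= policy.scale_in_cpu:
        return max(current - 1, policy.min_instances)
    return current


policy = ScalingPolicy()
print(decide(policy, avg_cpu=70.0, current=2, now=1000, last_action_at=None))  # scales out
print(decide(policy, avg_cpu=70.0, current=2, now=1000, last_action_at=900))   # still cooling down
```

Without the cool-down check, two evaluations in quick succession could each add an instance before the first one has started taking traffic.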
Auto Scaling also works if an organization sets the exact number of instances to run. By creating an Auto Scaling group with the desired number of instances, organizations control the number of instances running at all times. Auto Scaling automatically adds or removes instances in the group based on the configured capacity values. Auto Scaling provides the ability to dynamically and predictably scale server resources based on application use. Setting the proper configuration values requires tracking application response over time, either through load testing scenarios or by monitoring the live application. In either case, an organization needs to accurately determine high and low values for Auto Scaling to get the most effective user response results.
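As a rough sketch of what holding a group at a fixed size means, the function below compares the running instances with the desired number and reports what would have to be launched or terminated. The function name and instance IDs are hypothetical, and the choice of which surplus instances to drop is arbitrary here; AWS applies its own rules.

```python
from typing import List, Tuple


def reconcile(running_ids: List[str], desired_capacity: int) -> Tuple[int, List[str]]:
    """Return (number_to_launch, ids_to_terminate) needed to reach desired_capacity."""
    surplus = len(running_ids) - desired_capacity
    if surplus > 0:
        # Too many instances: drop the last ones in the list (arbitrary choice).
        return 0, running_ids[-surplus:]
    # Too few (or exactly enough): launch the shortfall, terminate nothing.
    return -surplus, []


print(reconcile(["i-a", "i-b"], 4))         # two more instances needed
print(reconcile(["i-a", "i-b", "i-c"], 2))  # one instance to terminate
```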
Elastic load balancing in AWS
When enabled in AWS, elastic load balancing automatically distributes incoming application traffic across healthy server instances. In order to increase fault tolerance in an application, instances are grouped into separate Availability Zones so that application traffic spreads equally across available, healthy server instances.
Elastic load balancing runs a health check against instances and only routes traffic to healthy instances across Availability Zones. The tool itself is actively monitored within AWS. A distinct benefit of using the elastic load balancing option is that the tool manages load balancing and scaling without manual intervention, which frees up organizational resources to be used elsewhere.
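The idea of routing only to healthy instances can be illustrated with a small simulation. The article does not describe the algorithm elastic load balancing actually uses, so the round-robin rotation here is an assumption, as are the instance IDs, zone names and function names.

```python
from itertools import cycle
from typing import Dict, List


def healthy_targets(instances: List[dict]) -> List[dict]:
    """Keep only the instances whose last health check passed."""
    return [inst for inst in instances if inst["healthy"]]


def route(requests: List[str], instances: List[dict]) -> Dict[str, str]:
    """Assign each request to a healthy instance in round-robin order."""
    targets = healthy_targets(instances)
    if not targets:
        raise RuntimeError("no healthy instances available")
    rotation = cycle(targets)
    return {req: next(rotation)["id"] for req in requests}


instances = [
    {"id": "i-a", "zone": "zone-1", "healthy": True},
    {"id": "i-b", "zone": "zone-2", "healthy": False},  # failed its health check
    {"id": "i-c", "zone": "zone-2", "healthy": True},
]
print(route(["r1", "r2", "r3"], instances))  # i-b never receives traffic
```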
An important planning decision concerns the health checks that the elastic load balancing service automatically runs against instances. An organization must set the interval for the health checks and account for the time it takes to re-route traffic when an instance fails them. For example, if an organization sets the health check to run every 30 seconds and requires 10 consecutive successful health checks, then it will take approximately 300 seconds before traffic is routed to a new or recovering instance. Additionally, consider what the failure threshold value needs to be. If the failure threshold is set to 4 and the health check runs every 30 seconds, then it will take about 120 seconds before the failing instance is taken out of service and its traffic is re-routed.
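The timing in that example is simply the check interval multiplied by the threshold. A short Python helper (the function name is hypothetical) makes it easy to try other settings during planning:

```python
def seconds_to_state_change(interval_seconds: int, threshold: int) -> int:
    """Approximate time for `threshold` consecutive health checks to complete."""
    return interval_seconds * threshold


# Values from the example: checks every 30 seconds.
print(seconds_to_state_change(30, 10))  # healthy threshold of 10 -> 300
print(seconds_to_state_change(30, 4))   # failure threshold of 4 -> 120
```

This is an approximation; it ignores the time each check takes to respond and where in the interval the failure actually happens.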
Elastic load balancing and Auto Scaling are interoperable, so it may be worth considering combining the two and taking advantage of both functionalities to increase application scalability.
Handling additional needs
Although AWS's scaling tools have proven effective in the majority of use cases, some companies have found the need to develop custom tools to add needed functionality. One example is Netflix's internally developed tool, Scryer, which supplements its use of Auto Scaling.
Scryer enhances Auto Scaling by providing an answer to the following gaps identified by Netflix:
- Rapid spike in demand
- Outages followed by a "retry storm"
- Variable traffic patterns by time of day
Organizations may find that developing a tool that runs complementary to AWS’s scaling tools is necessary to handle different business needs not currently addressed within the AWS tool set.
All organizations producing applications need to ensure those applications perform and scale in order to satisfy customers. After all, Internet application customers expect high rates of performance combined with security and functionality. In order for all the different options to come together into a seamless, fast, secure and positive customer experience, scaling must be planned.