Performance testing tools spoil app deployment surprises

Pre-deployment testing is critical to maintain functionality. Developers can use Amazon CloudWatch or other tools to measure KPIs and manage performance.

In the world of continuous integration, unit tests are crucial to ensure you don't break anything when adding functionality....

It's also important to perform full, end-to-end testing and track key metrics to ensure you maintain functionality without hurting performance. The key to pre-deployment testing is to ensure the test environment is as close to production as possible. Admins should also be certain that they have a full suite of unit tests, so there is no regression of functionality.

If you're using any form of Agile development -- Scrum, GTD or anything else -- you're likely familiar with unit testing and recognize how essential it is to continuous integration. Most mainstream programming languages have at least one framework for unit testing.

In Python, for example, pytest (formerly run as py.test) allows you to write the tests however you want. With Node.js, you might want to use a combination of Mocha and Should. Whatever path you choose, be sure to run tests after every commit is pushed to your repository. If a test fails, you will know you can't deploy that branch.
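As a minimal sketch of what such a test file might look like in Python: the function cart_total and its discount rule are hypothetical, invented only to give the tests something to check. pytest collects any function whose name starts with test_, so running it against this file would execute all three tests.

```python
# Hypothetical application code and its tests, kept in one file for brevity.

def cart_total(prices, discount=0.0):
    """Return the sum of prices after applying a fractional discount (0.0 to 1.0)."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return round(sum(prices) * (1.0 - discount), 2)


def test_total_without_discount():
    assert cart_total([10.0, 5.5]) == 15.5


def test_total_with_discount():
    assert cart_total([100.0], discount=0.25) == 75.0


def test_rejects_invalid_discount():
    # Plain try/except keeps the file runnable even without pytest installed;
    # with pytest available, pytest.raises(ValueError) is the more usual idiom.
    try:
        cart_total([1.0], discount=1.5)
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

A CI job would typically run the test command after each push and block deployment of the branch if the command exits with a nonzero status.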

There are plenty of hosted pre-deployment testing tools available, but most have rigid structures around them, or support only certain languages. Jenkins is an established way to run jobs based on triggers, and it can run tests written in any language. Jenkins fits with most workflows and can send email alerts if something fails. It will even show graphs of tests that worked -- and ones that didn't.

Measuring KPIs

Key performance indicators (KPIs) are simple, quantitative metrics used to determine the quality of new code. For example, companies may want to measure how long it takes to log into an application, search for content and click on that content. An organization that runs an e-commerce system, meanwhile, might want to know how long the checkout process takes and how many clicks a buyer must make to get through to the purchase point.

Once KPIs are determined, developers can use AWS tools such as Amazon CloudWatch to measure them. CloudWatch enables IT teams to automatically monitor server-level metrics, such as server load and Elastic Load Balancing performance. They can also upload custom metrics to CloudWatch using the CloudWatch API, which means anything recorded from code can become a KPI. CloudWatch alarms can then alert developers when a KPI metric crosses a threshold they define.
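Uploading a custom KPI might look roughly like the following sketch, which assumes the boto3 SDK and its CloudWatch put_metric_data call. The namespace "MyApp/KPI", the metric name "CheckoutTime", the "Environment" dimension and the run_checkout function are all hypothetical placeholders. The client is passed in as a parameter so the timing and payload logic can be exercised without AWS credentials.

```python
import time
from datetime import datetime, timezone


def build_datum(metric_name, value_ms, environment):
    """Build one entry for the MetricData list that put_metric_data expects."""
    return {
        "MetricName": metric_name,
        "Dimensions": [{"Name": "Environment", "Value": environment}],
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(value_ms),
        "Unit": "Milliseconds",
    }


def timed_ms(func, *args, **kwargs):
    """Run func and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, (time.perf_counter() - start) * 1000.0


def publish_kpi(client, namespace, datum):
    """Send a single datum; client is assumed to be boto3.client('cloudwatch')."""
    client.put_metric_data(Namespace=namespace, MetricData=[datum])


# Usage (needs boto3 and AWS credentials, so it is left commented out):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   _, elapsed = timed_ms(run_checkout)  # run_checkout is hypothetical
#   publish_kpi(cloudwatch, "MyApp/KPI",
#               build_datum("CheckoutTime", elapsed, "staging"))
```

Tagging each datum with an environment dimension is one way to keep staging and production measurements of the same KPI separate, so a pre-deployment run can be compared against live numbers.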

Be sure to test your application under high levels of load. How many requests can a single instance handle? How does the latency increase at 10,000 hits per second compared with only a few hits per second?
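To make the latency question concrete, here is a small sketch that calls a function at different concurrency levels and reports median and 95th-percentile latency. It uses only the Python standard library, and fake_request is a stand-in for a real HTTP call to the system under test. A thread pool in one process will not come close to 10,000 hits per second; a dedicated load-testing tool would be needed for that, so treat this as an illustration of the measurement rather than a load generator.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def measure_latencies(request_fn, total_requests, concurrency):
    """Call request_fn total_requests times across `concurrency` threads.

    Returns a sorted list of per-call latencies in seconds.
    """
    def one_call(_):
        start = time.perf_counter()
        request_fn()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return sorted(pool.map(one_call, range(total_requests)))


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(len(sorted_values) * pct / 100))
    return sorted_values[rank - 1]


def fake_request():
    """Placeholder for a real request to the application being tested."""
    time.sleep(0.01)


if __name__ == "__main__":
    for concurrency in (1, 8):
        latencies = measure_latencies(fake_request, 40, concurrency)
        print(concurrency, percentile(latencies, 50), percentile(latencies, 95))
```

Comparing the percentiles across concurrency levels shows how latency degrades as load rises, which is usually more informative than a single average.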

Third-party performance testing tools such as New Relic offer built-in support for measuring KPIs through "key transactions." The idea is to identify the most important or most common task and then measure the performance of that task and track the effects of each deployment on it.

It's also possible to dump statistics to your logs and use Loggly, Splunk or a similar tool to generate graphs over time. The important thing is to monitor performance of a new system before you make it live.
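One way to dump statistics to logs is to write each sample as a single line of JSON, a format that log-analysis tools can generally parse and graph. The sketch below uses Python's standard logging module; the logger name "kpi" and the field names (kpi, duration_ms, ts, release) are assumptions chosen for illustration, not a format any particular tool requires.

```python
import json
import logging
import time

logger = logging.getLogger("kpi")


def format_kpi(name, duration_ms, **fields):
    """Render one KPI sample as a single-line JSON string."""
    record = {
        "kpi": name,
        "duration_ms": round(duration_ms, 1),
        "ts": int(time.time()),  # Unix timestamp in seconds
    }
    record.update(fields)  # extra context, e.g. a release identifier
    return json.dumps(record, sort_keys=True)


def log_kpi(name, duration_ms, **fields):
    """Emit the KPI sample through the standard logging pipeline."""
    logger.info(format_kpi(name, duration_ms, **fields))
```

Including a release identifier in every line makes it possible to plot the same KPI before and after a deployment.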

Remember: KPIs don't need to be set in stone. If a customer complains about how long a particular task takes to perform, such as downloading a PDF, you could add that to your list of KPIs. The goal is to measure what end users care about and what affects your bottom line.

Planning selective rollouts

A common way to test a new update is to selectively roll it out to clients one at a time. This is similar to A/B testing, but instead of trying to see how customers react to a different layout, you're trying to determine how customers react to the new code. It's difficult to predict how someone will use an application, so selective, rolling deployments often help minimize the effect of changes.

Google often releases new features selectively to small groups and slowly enables them for all users. This may be done randomly, or it could be based on when a user signed up. It may not matter how a test group is selected as long as there's feedback from that group if something goes wrong.
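A common way to pick a stable test group is to hash a user identifier into a fixed number of buckets and enable the feature for the first N percent. The sketch below illustrates that general technique; it does not describe how Google or any specific service does it, and the function names and the feature name used in the example are hypothetical.

```python
import hashlib


def rollout_bucket(user_id, feature):
    """Map a user to a stable bucket in the range 0-99 for a given feature."""
    key = f"{feature}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest[:8], 16) % 100


def is_enabled(user_id, feature, percent):
    """True if the user falls inside the first `percent` buckets."""
    return rollout_bucket(user_id, feature) < percent


if __name__ == "__main__":
    users = [f"user-{n}" for n in range(1000)]
    enabled = sum(is_enabled(u, "new-checkout", 10) for u in users)
    print(f"{enabled} of {len(users)} users see the new code")
```

Because the bucket depends only on the user and the feature name, a user keeps the same assignment across requests, and raising the percentage only ever adds users to the group. Including the feature name in the hash means different features land on different subsets of users, so the same people are not always first.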

If you're using AWS Elastic Beanstalk or AWS CodeDeploy, set up separate environments or groups of servers for different clients. This allows IT teams to roll out changes to one environment while keeping the existing code on an old environment. Try splitting users into multiple environments so you don't always update the same group. Users who are particularly cautious should be kept in an environment that is updated last.

This was last published in March 2015

What pre-deployment performance testing tools do you use with AWS?
We do not currently use CloudWatch. We’re in the early stages of incorporating SOASTA into our continuous delivery pipeline for pre-deployment performance testing. We’ve successfully completed a down and dirty POC to show that our pipeline does work and that we can include SOASTA in the pipeline for one of our AWS-based projects, and we are now working to create a viable, sustainable implementation.

We use AWS CloudWatch and have not needed anything else thus far, although I know there's a growing market.

We use AWS CloudWatch to make sure key performance measures are reached during testing phases. This helps us determine what metrics to use in production.

In the article you said, "Admins should also be certain that they have a full suite of unit tests, so there is no regression of functionality."  Typically I have heard of Developers or Testers working on unit tests, not Admins.  I wonder what you had in mind?

Also, I noticed you mentioned that load testing was important, but you didn't mention what sort of data you want to get from such a test.  Are you trying to see where the system breaks, what size AWS instance to use or to verify the current user load won't break new code?  I get that this could almost be a separate article, but it seems like it would have been helpful to at least mention.