Developers can run into challenges when it comes to application testing, particularly for applications that use serverless computing. But it's a necessary effort to stay ahead of the competition.
Dev teams can use canary deployment, a form of production deployment, to slowly test live changes on production systems. Think of it like inviting a small number of users to participate in a beta test, but without requiring them to opt in -- or even know that a test is ongoing.
To test code, it's best to actually run it in production. But rather than switch all traffic over to the new code immediately, canary deployments enable developers to slowly route portions of the traffic to it and test it before the migration finishes. This process makes it easier and faster to catch issues, and it means that a bad release might only impact 10% of your customers -- instead of 100%.
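The routing idea can be illustrated with a small simulation. This is only a sketch of weighted routing in general, not how AWS implements it; the function names and the "stable"/"canary" labels are made up for the example.

```python
import random


def choose_version(canary_percent: float, rng: random.Random) -> str:
    """Return "canary" for roughly canary_percent of calls, else "stable"."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    # rng.random() is uniform in [0, 1), so the comparison is true
    # for about canary_percent% of requests.
    return "canary" if rng.random() * 100 < canary_percent else "stable"


def simulate(requests: int, canary_percent: float, seed: int = 0) -> dict:
    """Count how many simulated requests land on each version."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    counts = {"stable": 0, "canary": 0}
    for _ in range(requests):
        counts[choose_version(canary_percent, rng)] += 1
    return counts


if __name__ == "__main__":
    print(simulate(10_000, 10))
```

With 10,000 simulated requests and a 10% canary, roughly 1,000 requests reach the new code, which is the population that would be exposed to a bad release.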
At re:Invent 2017, Amazon introduced two ways to manage canary deployments with AWS Lambda. Lambda's traffic shifting feature lets developers point an alias at two function versions and specify the percentage of requests that each should receive. This allows developers to use canary-style deployments without using Amazon API Gateway. Developers can specify a gradual rollout of the function, such as shifting 10% of the traffic every 10 minutes.
Developers can also execute a canary deployment directly through an option in API Gateway. They can set up a canary in any given stage of a deployment to route some percentage of traffic to the canary's endpoints.
Canary deployments with API Gateway
Developers can create canary deployments through API Gateway on individual stages of the API. Create a canary deployment on a stage to get started, and then specify the percentage of requests to direct to the canary deployment; the remainder continues to go to the current version.
Additionally, you can choose to override only stage variables, which are name-value pairs defined as configuration attributes, without changing any code. For example, if a stage variable contained the name of a DynamoDB table for data storage, a developer could slowly shift over to a different DynamoDB table by creating a canary deployment and overriding that stage variable.
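As a rough sketch of the stage variable override, the canary can be thought of as seeing the stage's variables with the overrides layered on top. The variable name `tableName` and the table names are hypothetical. The `canary_settings` helper mirrors the shape that, to my understanding, the API Gateway deployment API accepts for canary settings; treat the key names as an assumption and check them against the API reference before use.

```python
def effective_stage_variables(stage_vars: dict, overrides: dict) -> dict:
    """Variables the canary sees: stage values, replaced by any overrides."""
    merged = dict(stage_vars)  # copy so the production values stay untouched
    merged.update(overrides)
    return merged


def canary_settings(percent_traffic: float, overrides: dict) -> dict:
    """Build a canary settings payload (key names are an assumption)."""
    if not 0 <= percent_traffic <= 100:
        raise ValueError("percent_traffic must be between 0 and 100")
    return {
        "percentTraffic": float(percent_traffic),
        "stageVariableOverrides": dict(overrides),
        "useStageCache": False,
    }


if __name__ == "__main__":
    production = {"tableName": "orders-v1", "logLevel": "info"}  # hypothetical
    overrides = {"tableName": "orders-v2"}                       # hypothetical
    print(effective_stage_variables(production, overrides))
    print(canary_settings(10, overrides))
```

Here 10% of requests would read and write the hypothetical `orders-v2` table, while the other 90% keep using `orders-v1`, with no code change on either side.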
After creating the canary, developers deploy the API to the stage with the canary enabled. When initiating this deployment, AWS informs developers that this deployment will be pushed to the canary, not the production API.
Developers can then choose to manually increase traffic on the canary. After they fully test the deployment, they can promote the canary deployment to production, which sets it as the production endpoint.
Lambda traffic shifting
Use Lambda traffic shifting to handle deployments that aren't behind API Gateway. Not every trigger is an HTTP event: an email might come in via Simple Email Service, or events might come directly into Lambda through Simple Notification Service, Kinesis Streams or another trigger. These events still need to be handled with care, and Lambda traffic shifting is the best option for canary deployments with this application model.
Lambda traffic shifting works with the AWS Serverless Application Model (SAM) to automatically route traffic according to deployment preferences, so developers don't have to configure the deployment via a visual interface, as they do in Amazon API Gateway. For example, a deployment preference of type Linear10PercentEvery10Minutes pushes 10% of all traffic to the new version and increases that share by 10% every 10 minutes.
There are three types of deployment: linear, which shifts traffic in multiple equal stages; canary, which uses just two stages, with a pause after the first stage before the migration completes; and all at once, which updates everything immediately.
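The three deployment types can be compared by writing out the traffic schedule each one produces. The sketch below is illustrative only: the function names and the `kind` strings are invented for the example. The `routing_config` helper shows the weighted-alias payload that, as far as I know, Lambda's alias update call takes under `RoutingConfig`.

```python
def traffic_schedule(kind: str, percent: int = 100, interval_minutes: int = 0):
    """Return [(minute, percent_on_new_version), ...] for a deployment type."""
    if kind == "all_at_once":
        return [(0, 100)]
    if not 0 < percent < 100 or interval_minutes <= 0:
        raise ValueError("need 0 < percent < 100 and interval_minutes > 0")
    if kind == "canary":
        # Two stages: a small first shift, a pause, then everything.
        return [(0, percent), (interval_minutes, 100)]
    if kind == "linear":
        steps, minute, shifted = [], 0, percent
        while shifted < 100:
            steps.append((minute, shifted))
            minute += interval_minutes
            shifted += percent
        steps.append((minute, 100))  # final step completes the migration
        return steps
    raise ValueError(f"unknown deployment kind: {kind}")


def routing_config(new_version: str, percent: float) -> dict:
    """Weighted-alias payload sending `percent` of traffic to new_version."""
    return {"AdditionalVersionWeights": {new_version: percent / 100}}


if __name__ == "__main__":
    for minute, pct in traffic_schedule("linear", 10, 10):
        print(minute, pct, routing_config("2", pct))
```

A linear rollout of 10% every 10 minutes takes ten steps and reaches 100% at the 90-minute mark, whereas a canary of 10% for 5 minutes has only two steps.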
Developers should also add alarms to halt a bad deployment if something does go wrong. Alarms can reference any CloudWatch Alarm, such as a check for function errors or custom metrics. An example of a custom alarm might be to include response status codes to ensure there's not a high number of 5xx-level error codes being sent to customers. Configure alarms in the deployment preferences:
Alarms:
  - !Ref AliasErrorMetricGreaterThanZeroAlarm
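The kind of check a custom 5xx alarm encodes can be sketched in a few lines. In a real deployment, CloudWatch evaluates the alarm; this sketch only illustrates the decision, and the 1% threshold is a hypothetical value chosen for the example.

```python
def five_xx_ratio(status_codes: list) -> float:
    """Fraction of responses whose status code is in the 5xx range."""
    if not status_codes:
        return 0.0  # no traffic yet, so nothing to alarm on
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return errors / len(status_codes)


def should_halt(status_codes: list, max_ratio: float = 0.01) -> bool:
    """True when the 5xx share exceeds the (hypothetical) threshold."""
    return five_xx_ratio(status_codes) > max_ratio


if __name__ == "__main__":
    healthy = [200] * 99 + [500]
    failing = [200] * 95 + [503] * 5
    print(should_halt(healthy), should_halt(failing))
```

Note that an empty sample never triggers a halt here, which is one reason low traffic during a deployment makes a canary less informative.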
Finally, developers can add pre- and post-deployment hooks, which benefit applications monitored by outside systems, like New Relic. These simple Lambda functions can notify users that a deployment is in progress or handle a more complex task, like setting up additional monitoring resources or spinning up automated tests.
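A hook function might look roughly like the sketch below. It assumes that the traffic shifting is driven by AWS CodeDeploy, that the hook event carries `DeploymentId` and `LifecycleEventHookExecutionId`, and that the hook reports its result through the CodeDeploy client's `put_lifecycle_event_hook_execution_status` call. Verify those details against current AWS documentation. The client is passed in rather than created, so the sketch runs without AWS credentials; `make_pre_traffic_hook` and `run_checks` are hypothetical names.

```python
def make_pre_traffic_hook(codedeploy_client, run_checks):
    """Build a Lambda handler that runs checks and reports the outcome.

    codedeploy_client: an object exposing
        put_lifecycle_event_hook_execution_status(...), e.g. a CodeDeploy client.
    run_checks: a zero-argument callable returning True when tests pass.
    """

    def handler(event, context):
        try:
            status = "Succeeded" if run_checks() else "Failed"
        except Exception:
            # A crashing check should stop the deployment, not hang it.
            status = "Failed"
        codedeploy_client.put_lifecycle_event_hook_execution_status(
            deploymentId=event["DeploymentId"],
            lifecycleEventHookExecutionId=event["LifecycleEventHookExecutionId"],
            status=status,
        )
        return status

    return handler
```

Injecting the client also makes the hook easy to exercise with a stub in automated tests before it ever gates a real deployment.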
When to cage the canary
Canary deployments make a lot of sense for back-end logic, such as email processing or logging analytics. But they are not a silver bullet, and they don't make sense for all scenarios. For example, if you want to know if the traffic going to the deployment works properly, you must have:
- a large amount of traffic at the time of deployment;
- tests and alerts for when something goes wrong; and
- proper rollback policies in the canary deployment model.
Note that a canary deployment applies the traffic percentage to each request independently, so it might route a single customer to the new code when they log in, but to the old code when they do a search. This means that the deployment must be backwards-compatible with the old code. For example, you can't serve up an application front end that uses an API that doesn't exist on the old code. Canary deployment can also create confusion for some end users, as features might be inconsistent from page to page.
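To get a feel for how often one customer sees both versions, the chance can be worked out directly. The sketch assumes every request is routed independently with the same canary fraction, which is a simplification.

```python
def prob_mixed_versions(canary_fraction: float, requests: int) -> float:
    """Probability that a user's requests hit both old and new code.

    Assumes independent routing per request: subtract the chance that
    every request hits the canary and the chance that none do.
    """
    if not 0 <= canary_fraction <= 1 or requests < 1:
        raise ValueError("need 0 <= canary_fraction <= 1 and requests >= 1")
    p = canary_fraction
    return 1 - p ** requests - (1 - p) ** requests


if __name__ == "__main__":
    # 10% canary, a session of 5 requests
    print(round(prob_mixed_versions(0.10, 5), 4))  # 0.4095
```

Under that assumption, with a 10% canary, about 41% of five-request sessions touch both versions, so mixed experiences are common rather than an edge case.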
If you're testing a minor, mostly unnoticeable change or a bug fix, a canary deployment might be just what you need. But stick to standard deployments for major feature releases or backwards-incompatible changes.