LAS VEGAS -- Amazon continuously rolls out new discounting programs and AWS cost management tools in an appeal to customers' bottom lines and as a hedge against mounting competition from Microsoft and Google.
Companies have grappled with nasty surprises on their AWS bills for years, with the reasons attributed to AWS' sheer complexity, as well as the runaway effect on-demand computing can engender without strong governance. It's a thorny problem with a solution that can come in multiple forms.
To that end, the cloud giant released a number of new AWS cost management tools at re:Invent, including Compute Optimizer, which uses machine learning to help customers right-size their EC2 instances.
At the massive re:Invent conference here this week, AWS customers discussed how they use both AWS-native tools and their own methods to get the most value from their cloud budgets.
Ride-sharing service Lyft has committed to spend at least $300 million on AWS cloud services between the beginning of this year and the end of 2021.
Lyft, like rival Uber, saw a hockey stick-like growth spurt in recent years, going from about 50 million rides in 2015 to more than 350 million a few years later. But its AWS cost management needed serious work, said Patrick Valenzuela, engineering manager.
An initial effort to wrangle AWS costs resulted in a spreadsheet, powered by a Python script, that divided AWS spending by the number of rides given to reach an average figure. The spreadsheet also helped Lyft rank engineering teams according to their rate of AWS spending, which had a gamification effect as teams competed to do better, Valenzuela said in a presentation.
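As a rough illustration of that first-generation approach, the two calculations described above (spend divided by rides, and teams ranked by spend) might look like the sketch below. It is not Lyft's actual script: the team names, dollar figures and ride count are invented for the example, and the function names are hypothetical.

```python
from collections import defaultdict


def cost_per_ride(total_spend_usd, rides):
    """Average AWS spend per ride over some period."""
    if rides <= 0:
        raise ValueError("rides must be positive")
    return total_spend_usd / rides


def rank_teams_by_spend(line_items):
    """Sum spend per team and return (team, total) pairs, highest spender first."""
    totals = defaultdict(float)
    for team, cost_usd in line_items:
        totals[team] += cost_usd
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical billing line items: (team, cost in USD). Not real Lyft data.
line_items = [
    ("maps", 1200.0),
    ("pricing", 800.0),
    ("maps", 300.0),
    ("dispatch", 500.0),
]

total_spend = sum(cost for _, cost in line_items)
print(cost_per_ride(total_spend, 1_000_000))
print(rank_teams_by_spend(line_items))
```

A ranking like this is what would feed the kind of team leaderboard Valenzuela described; a real version would also need to map each billing line item to an owning team, which this sketch takes as given.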
Within six months, Lyft managed to drop the AWS cost-per-ride figure by 40%. But it needed more, such as fine-grained data sets that could be probed via SQL queries. Other factors, such as discounts and the cost of AWS Reserved Instances, weren't always reflected transparently in the AWS-provided Cost and Usage Reports used to build the spreadsheet.
Lyft subsequently built a second-generation tool that included a data pipeline fed into a data warehouse. It created a reporting and dashboard layer on top of that foundation. The results have been promising. Earlier this year, Lyft found it was spending 50% less on reads and writes for its top 25 DynamoDB tables and had also cut spending related to Kubernetes container migrations by 50%.
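To give a sense of what probing cost data with SQL can look like, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for a data warehouse. The table name, columns and rows are assumptions made up for the example; Lyft's actual schema and tooling were not described in this detail.

```python
import sqlite3


def top_resources_by_cost(conn, service, limit):
    """Return (resource, total_cost) rows for one service, most expensive first."""
    return conn.execute(
        """
        SELECT resource, SUM(cost_usd) AS total
        FROM cost_line_items
        WHERE service = ?
        GROUP BY resource
        ORDER BY total DESC
        LIMIT ?
        """,
        (service, limit),
    ).fetchall()


# Hypothetical schema and rows standing in for warehouse data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cost_line_items "
    "(service TEXT, resource TEXT, operation TEXT, cost_usd REAL)"
)
conn.executemany(
    "INSERT INTO cost_line_items VALUES (?, ?, ?, ?)",
    [
        ("DynamoDB", "rides", "read", 40.0),
        ("DynamoDB", "rides", "write", 60.0),
        ("DynamoDB", "drivers", "read", 30.0),
        ("EC2", "i-hypothetical", "compute", 500.0),
    ],
)
conn.commit()

# Top 25 DynamoDB tables by combined read and write cost.
print(top_resources_by_cost(conn, "DynamoDB", 25))
```

Grouping by resource rather than by account or service is what makes a "top 25 tables" view possible, which a single averaged cost-per-ride figure cannot provide.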
"If you want to learn more about AWS, I recommend digging into your bill," Valenzuela said.
AWS cost management a perennial issue
While there are plenty of cloud cost management tools available in addition to the new AWS Compute Optimizer, some AWS customers take a proactive approach to cost savings rather than relying on historical analysis to spot and shed waste, as Lyft did in the example presented at re:Invent.
Privately held mapping data provider Here Technologies serves 100 million motor vehicles and collects 28 TB of data each day. Companies have a choice in the cloud procurement process -- one being to force teams through rigid sourcing activities, said Jason Fuller, head of cloud management and operations at Here.
"Or, you let the builders build," he said during a re:Invent presentation. "We let the builders build."
Still, Here had developed a complex landscape on AWS, with more than 500 accounts that collectively spun up more than 10 million EC2 instances a year. A few years ago, Here began a concerted effort to adopt AWS Reserved Instances in a programmatic manner, hoping to squeeze out waste.
Reserved Instances carry contract terms of up to three years and offer substantial savings over on-demand pricing. Here eventually moved nearly 80% of its EC2 usage into Reserved Instances, which gave it about 50% off the on-demand rate, Fuller said.
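The arithmetic behind those two figures can be sketched as follows. This is a simplification, not Here's own calculation: it assumes usage outside Reserved Instances pays the full on-demand rate, treats every reservation as fully used, and ignores upfront payments. The function name is hypothetical.

```python
def blended_savings(reserved_share, reserved_discount):
    """Fraction saved versus paying on-demand rates for all usage.

    reserved_share: fraction of usage covered by Reserved Instances (0 to 1).
    reserved_discount: discount on that usage versus on-demand (0 to 1).
    Assumes the remaining usage is billed at the full on-demand rate.
    """
    if not (0.0 <= reserved_share <= 1.0 and 0.0 <= reserved_discount <= 1.0):
        raise ValueError("inputs must be fractions between 0 and 1")
    return reserved_share * reserved_discount


# Roughly 80% coverage at roughly 50% off, the approximate figures Fuller cited.
print(f"{blended_savings(0.80, 0.50):.0%}")
```

Under those assumptions, the overall EC2 bill would come in roughly 40% below an all-on-demand bill. That figure is illustrative only and was not reported by Here.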
The results have been impressive. During the past three-and-a-half years, Here saved $50 million and avoided another $150 million in costs, Fuller said.
Salesforce is another heavy user of AWS. It signed a $400 million infrastructure deal with AWS in 2016, and the companies have since partnered in other areas. Through its 2017 acquisition of Krux, Salesforce now offers Audience Studio, a data management platform that collects and analyzes vast amounts of audience information from various third-party sources. It's aimed at marketers who want to run more effective digital advertising campaigns.
Audience Studio handles 200,000 user queries per second, supported by 2,500 Elastic MapReduce clusters on AWS, said Alex Estrovitz, director of software engineering at Salesforce.
"That's a lot of compute, and I don't think we'd be doing it cost-effectively without using [AWS Spot Instances]," Estrovitz said in a re:Invent session. More than 85% of Audience Studio's infrastructure uses Spot Instances, which are made up of idle compute resources on AWS and cost up to 90% less than on-demand pricing.
But Spot Instances are best suited for jobs like Audience Studio's, where large amounts of data get parallel-processed in batches across large pools of instances. Spot Instances are ephemeral; AWS can shut them down on short notice when it needs the capacity for other customers' jobs. However, customers like Salesforce can buy Spot Instances based on their application's degree of tolerance for interruptions.
Salesforce has achieved 48% savings overall since migrating Audience Studio to Spot Instances, Estrovitz said. "If you multiply this over 2,500 jobs every day, we've saved an immense amount of money."