CAMBRIDGE, Mass. -- Enthusiasts of Amazon Web Services met here this week to discuss everything from the architectural decisions that helped them achieve scale on AWS to the largely homegrown tools that have bailed them out of sticky situations.
The running theme at the meetup was the do-it-yourself nature of Amazon Web Services (AWS), which is only magnified when there are thousands of instances under management. Customers on a panel said they often relied on their own imaginations and coding skills to make large deployments work.
For example, Amazon's Relational Database Service (RDS) is not an option for Acquia Inc. -- an open source software company providing support for Drupal -- in a 6,000-instance AWS deployment.
This is because Acquia's staff needs to make fine-grained configuration changes to databases that RDS doesn't support. Some customers of Woburn, Mass.-based Acquia have standalone database servers, for example, while others combine a Web service and database on one box.
Some customers need custom entries in the MySQL configuration file. Acquia also runs various types of replication verification and monitoring on its database data.
"We can do whatever we want to it, whereas RDS gives you a certain set of configurability," said Acquia's Senior Architect Barry Jaspan.
Automation and geographic redundancy are key
How large-scale clouds are designed from the ground up can also be a major factor in their success, the panelists said, especially since there's no one "right" way to do things in the cloud.
For example, Stackdriver Inc., a Boston-based maker of a tool that performs application monitoring and management in AWS, processes 600 million messages per day from sensors in the cloud. The company has incorporated three completely separate data "pipelines" so that a failure is less likely to affect the whole infrastructure, according to Joey Imbasciano, cloud platform engineer at Stackdriver.
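The panel did not describe how Stackdriver's pipelines are built, so the following Python sketch is only an illustration of the general idea of isolating failures. It assumes, hypothetically, that messages are partitioned by a source ID and that each pipeline has its own handler; the names `pipeline_for`, `process_all` and the hash-based routing are inventions for this example, not Stackdriver's design. Only the count of three comes from the article.

```python
import zlib
from collections import defaultdict

NUM_PIPELINES = 3  # the article mentions three pipelines; everything else is assumed


def pipeline_for(source_id: str) -> int:
    """Map a source to a pipeline with a stable hash (crc32 is the same on every run)."""
    return zlib.crc32(source_id.encode("utf-8")) % NUM_PIPELINES


def process_all(messages, handlers):
    """Route each (source_id, payload) pair to its pipeline and contain failures there.

    `handlers` is a list of NUM_PIPELINES callables, one per pipeline (hypothetical).
    Returns two dicts keyed by pipeline index: processed results and recorded failures.
    """
    processed = defaultdict(list)
    failed = defaultdict(list)
    for source_id, payload in messages:
        idx = pipeline_for(source_id)
        try:
            result = handlers[idx](payload)
        except Exception as exc:
            # A broken pipeline only loses its own share of the traffic.
            failed[idx].append((source_id, str(exc)))
        else:
            processed[idx].append(result)
    return dict(processed), dict(failed)
```

In this sketch, a handler that raises affects only the messages routed to its pipeline, while the other two keep processing their share.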
On the other hand, Acquia's Jaspan said a holistic approach can be beneficial when it comes to testing systems.
"All of the things our customers do, we implement tests for, not by isolating individual components and testing them, but testing the whole system [as] the customers will use it," he said. "I would say without that we would be totally dead, just because of the number of things that can go wrong that we would never know until months after we made a release."
Here, again, the test system is a homegrown application, loosely based on Ruby Test Unit.
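Acquia's test application was not shown, so the sketch below only illustrates the distinction Jaspan draws between testing components in isolation and testing the system as a customer would use it. It uses Python's built-in `unittest` module as a stand-in for the xUnit style that Ruby's Test::Unit also follows. The classes `FakeDatabase` and `FakeSite` are hypothetical toys, not Acquia code.

```python
import unittest


class FakeDatabase:
    """Hypothetical stand-in for a database server."""

    def __init__(self):
        self.rows = {}

    def save(self, key, value):
        self.rows[key] = value

    def load(self, key):
        return self.rows[key]


class FakeSite:
    """Hypothetical stand-in for a customer-facing site backed by the database."""

    def __init__(self, db):
        self.db = db

    def publish(self, title, body):
        self.db.save(title, body)
        return "/" + title.lower().replace(" ", "-")

    def view(self, title):
        return "<h1>%s</h1><p>%s</p>" % (title, self.db.load(title))


class PublishAndViewSystemTest(unittest.TestCase):
    """Exercises the whole path a customer would follow, not one component."""

    def setUp(self):
        # Wire the parts together the way a deployment would.
        self.site = FakeSite(FakeDatabase())

    def test_customer_can_publish_then_view(self):
        url = self.site.publish("Hello World", "first post")
        self.assertEqual(url, "/hello-world")
        self.assertIn("first post", self.site.view("Hello World"))
```

The test never calls `FakeDatabase` directly. A regression in either class, or in how the two are wired together, would surface through the same customer-level check.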
Proper design can sometimes come as the result of experiencing a pitfall, according to Greg Arnette, co-founder and chief technology officer of Sonian Inc., a cloud email archiving service provider based in Dedham, Mass. Sonian processes 20 million documents per day from 12,000 business customers on cloud infrastructure that includes Amazon, Microsoft's Azure, IBM SmartCloud and multiple OpenStack clouds.
Sonian started running on Amazon six years ago -- before the kinks had been worked out with Elastic Compute Cloud (EC2) and Elastic Block Store (EBS) performance -- and found out the hard way that this combination of services wasn't going to suit its needs.
"We had to go through a number of different architectural implementations to get to a point where now we feel really confident of reliable processing, no loss of data, and efficiency in economics that work well for our business model," Arnette said. "Right now we have pure stateless processing from a customer's email server directly onto [Amazon Simple Storage Service] with no middleman process at all in the way."
Sonian can also fail over between Amazon regions.
"It took several years of hard work to figure out how to make all that work correctly, at scale," Arnette said.
Amazon did not respond to requests for comment.