Launching instances in AWS is simple; getting an optimal configuration is not. Without sufficient evidence about CPU, I/O and memory use, it can be difficult to know if you're overspending on servers. Under-performing servers are easy to spot because application performance suffers. But spotting poor application performance doesn't explain why the app isn't meeting your needs. There are two ways to address this: Use a performance monitoring service or conduct in-house experiments.
Cloud monitoring services feature specialized tools for collecting and analyzing performance data from Elastic Compute Cloud (EC2) instances. Administrators don't have to monitor their instances or write custom scripts to collect data or analyze logs. They do, however, have to give a third party read access to performance data about the servers. Some companies may have policies against this type of data exchange.
Such monitoring services can also give admins forecasts of future costs based on past requirements. Companies that are comfortable designing regression models and writing code in languages such as R or SAS can do this type of monitoring and forecasting in-house.
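As a rough illustration of the in-house route, the sketch below fits a straight line to past monthly spend and extends it forward. It is written in Python rather than R or SAS, the spend figures are made up, and the function names `fit_line` and `forecast` are hypothetical; a real forecast would likely need seasonality and more data than this.

```python
from statistics import mean


def fit_line(ys):
    """Ordinary least-squares fit of y = intercept + slope * x for x = 0, 1, 2, ..."""
    xs = range(len(ys))
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return intercept, slope


def forecast(ys, months_ahead):
    """Extend the fitted line past the last observed month."""
    intercept, slope = fit_line(ys)
    first_future_month = len(ys)
    return [intercept + slope * (first_future_month + i) for i in range(months_ahead)]


# Hypothetical monthly EC2 spend in dollars (illustrative values only).
monthly_cost = [1000.0, 1100.0, 1200.0, 1300.0]
print(forecast(monthly_cost, 2))
```

With spend rising by a steady amount each month, the forecast simply continues that trend; noisier data would produce a line that averages through the fluctuations.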
Cloud management tools such as Cloudyn and CloudCheckr can analyze performance and cost. Cloudyn, which works with AWS, Google and OpenStack, offers slightly different tools depending on which cloud you monitor. However, all of Cloudyn's monitoring options include use and price simulations as well as performance analytics.
CloudCheckr works specifically with AWS environments and offers customizable resource use and cost reports. The tool also analyzes EC2 instances, Simple Storage Service and Amazon Kinesis. CloudCheckr features an alert system that covers more than 300 criteria, including alerts for any Elastic Block Store (EBS) volumes that don't have snapshots, idle EC2 instances and Elastic Load Balancers with fewer than two healthy instances.
DIY cloud monitoring and experimenting
IT teams that want to tune a few isolated instances or aren't ready to bring in a third-party monitoring service can use some simple do-it-yourself methods for AWS app monitoring. The key to gathering useful information is to conduct experiments.
As with scientific experiments, try different combinations of variables. If, for instance, you want to find the smallest AWS instance to meet response time requirements, run the same simulated load on a variety of instance types while collecting data from each instance. You can use either AWS monitoring tools, operating system tools or a combination of the two.
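One way to compare the results of such an experiment is sketched below: given response times recorded under the same simulated load on each instance type, pick the cheapest type whose 95th-percentile latency meets the target. The latency samples and hourly prices are placeholders, not real measurements or AWS prices, and the helper names are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a list of numbers."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def cheapest_passing(results, target_ms, pct=95):
    """Return the lowest-priced instance type that meets the latency target, or None."""
    passing = [
        (run["hourly_price"], name)
        for name, run in results.items()
        if percentile(run["latencies_ms"], pct) <= target_ms
    ]
    return min(passing)[1] if passing else None


# Placeholder data: same load test repeated on three instance types.
results = {
    "t3.small": {"hourly_price": 0.02, "latencies_ms": [180, 220, 250, 400, 520]},
    "t3.medium": {"hourly_price": 0.04, "latencies_ms": [120, 130, 140, 150, 190]},
    "m5.large": {"hourly_price": 0.10, "latencies_ms": [80, 85, 90, 95, 110]},
}
print(cheapest_passing(results, target_ms=200))
```

Using a high percentile rather than the average keeps a handful of slow requests from being hidden by many fast ones.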
Amazon CloudWatch is a cloud-based monitoring tool that tracks performance across numerous AWS offerings, including CloudFront, DynamoDB, ElastiCache, EBS, Redshift, Kinesis and Elastic MapReduce. CloudWatch metrics cover most resources, from CPU use to disk reads.
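For example, CPU use for a single EC2 instance could be pulled with the AWS SDK for Python (boto3) along these lines. This is a minimal sketch: it assumes boto3 is installed and AWS credentials are configured, the instance ID is a placeholder, and the helper names `cpu_query` and `fetch_cpu` are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def cpu_query(instance_id, hours=1, period_s=300):
    """Build the arguments for a CloudWatch get_metric_statistics call."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_s,  # seconds per datapoint
        "Statistics": ["Average", "Maximum"],
    }


def fetch_cpu(instance_id):
    """Fetch recent CPU datapoints, oldest first (needs boto3 and credentials)."""
    import boto3  # imported lazily so cpu_query works without the SDK

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**cpu_query(instance_id))
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])


if __name__ == "__main__":
    # Placeholder instance ID; replace with one from your own account.
    print(cpu_query("i-0123456789abcdef0"))
```

Running the same query against each instance in an experiment gives comparable CPU figures without logging in to the servers.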
One limitation with CloudWatch is that you have to specify what you want to monitor before actually monitoring it. An alternative is to interactively monitor an application through command-line utilities. If an unusual event occurs, such as a spike in disk I/O, you can use OS tools to delve into details of the event.
Most Linux distributions include a variety of command-line tools; other tools can be installed. Here are some to keep in mind for system monitoring:
- top -- displays running processes in real time, along with CPU and memory use, cache size and buffer size.
- htop -- similar to top but includes shortcut keys and vertical and horizontal process views.
- iotop -- specifically monitors disk I/O.
- iostat -- shows storage input and output statistics.
- nethogs -- monitors network activity per application in real time.
- nmon -- monitors CPU, memory, disk and network use. Online mode displays the data in real time; capture mode stores monitoring data in CSV format.
Many OS tools can save data to output logs for later analysis. Cloud-based and OS-based tools can help IT teams identify the optimal configuration. Because AWS instance types come in fixed configurations, admins may settle on an instance type with the right CPU resources but more memory than is needed, or vice versa. Remember: It's important to have some excess capacity in case of peak demands. Keeping CPU use below 80% isn't unreasonable and leaves room for compute spikes. Use the same principle when sizing I/O and RAM.
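The headroom rule comes down to simple arithmetic, sketched below. The 80% ceiling comes from the guideline above; the sample values and function names are hypothetical.

```python
def required_capacity(peak_demand, max_utilization=0.8):
    """Capacity needed so that peak demand stays at or below the utilization ceiling."""
    return peak_demand / max_utilization


def has_headroom(samples_pct, ceiling_pct=80.0):
    """True if every logged utilization sample stays below the ceiling."""
    return max(samples_pct) < ceiling_pct


# Hypothetical peak demand of 3.2 vCPUs' worth of work.
print(required_capacity(3.2))

# Hypothetical CPU utilization samples (percent) taken from a saved log.
print(has_headroom([35.0, 62.5, 78.0, 41.0]))
```

The same two checks apply to I/O and RAM by swapping in throughput or memory figures for the CPU numbers.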