Identifying unused or unimportant alerts in AWS can be time-consuming and difficult, especially when an IT team creates hundreds of Amazon CloudWatch alarms.
IT teams must operate lean and efficiently. A single operations team needs to be able to automate provisioning and configuration for hundreds, or even thousands, of machines. But, if done incorrectly, automation can create time-consuming tasks of its own. Events such as automated alerts, which are intended to provide visibility into the environment, can instead create clutter and confusion.
The AWS Command Line Interface (AWS CLI) can help reduce the time and effort it takes admins to manually identify and remove AWS alerts.
Amazon CloudWatch alarms alert administrators when a metric falls outside of preconfigured levels. AWS administrators can create CloudWatch alarms and use them with a variety of AWS services, including Amazon Simple Notification Service, Auto Scaling, AWS CloudTrail and Identity and Access Management. But while Amazon CloudWatch alarms can help admins detect underutilized Elastic Compute Cloud instances, for example, receiving too many alerts can be distracting. Finding and turning off noncritical alerts frees up valuable time.
There are three Amazon CloudWatch alarm states: OK, INSUFFICIENT_DATA and ALARM. The INSUFFICIENT_DATA state means the alarm does not have enough data to evaluate its metric, which helps admins identify unused or unimportant alerts.
For example, if an IT team creates alarms on metrics whose dimensions include an Auto Scaling group name and later deletes that group, the associated CloudWatch alarms will move to the INSUFFICIENT_DATA state because their data points are unknown. In this instance, the following AWS CLI command pinpoints which alarms are in the INSUFFICIENT_DATA state:
$ aws cloudwatch describe-alarms --state-value "INSUFFICIENT_DATA"
This command also provides useful information, such as the alarm name, its state, the reason for that state and the actions that receive notifications. Based on the state reason value, a developer can determine whether or not to delete certain alarms.
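The full describe-alarms output can be long, so it may help to trim it down to the two fields used for the decision. The sketch below wraps the same command in a shell function and uses the AWS CLI's global --query and --output options to print one alarm per line as name and state reason, separated by a tab. The function name list_insufficient_alarms is hypothetical, and the sketch assumes the alarms of interest are metric alarms (the MetricAlarms list in the response).

```shell
# Hypothetical helper: print "AlarmName<TAB>StateReason" for every metric
# alarm that is currently in the INSUFFICIENT_DATA state.
# Assumes the AWS CLI is installed and credentials/region are configured.
list_insufficient_alarms() {
  aws cloudwatch describe-alarms \
    --state-value INSUFFICIENT_DATA \
    --query 'MetricAlarms[].[AlarmName,StateReason]' \
    --output text
}
```

After loading the function into the current shell, an admin could run it and review the list before deciding what to remove:

$ list_insufficient_alarms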
In the following example, the state reason tells admins why the alarm lacks data:
Insufficient data: three data points were unknown.
To delete that alarm, the admin would use the following AWS CLI command:
$ aws cloudwatch delete-alarms --alarm-names awsec2-CPU-UTIL-HIGH
This command eliminates the alarm that sends alert notifications.
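When many alarms need to go, deleting them one at a time gets tedious, and a typo can remove the wrong one. The following is a minimal sketch of a cautious bulk delete, not a definitive procedure. It reads alarm names from standard input, one per line, and only prints what it would delete unless the DRY_RUN variable is set to 0. The function name delete_alarms_from_stdin and the DRY_RUN variable are hypothetical names introduced for this example.

```shell
# Hypothetical helper: delete the CloudWatch alarms named on stdin,
# one alarm name per line. Blank lines are skipped.
# By default nothing is deleted; set DRY_RUN=0 to perform the deletion.
delete_alarms_from_stdin() {
  while IFS= read -r name; do
    [ -n "$name" ] || continue
    if [ "${DRY_RUN:-1}" = "0" ]; then
      # </dev/null keeps the CLI from consuming the list being read.
      aws cloudwatch delete-alarms --alarm-names "$name" </dev/null
    else
      printf 'would delete: %s\n' "$name"
    fi
  done
}
```

For example, an admin could keep the reviewed names in a file (here a hypothetical alarms-to-delete.txt), check the dry-run output first, and then repeat the command with DRY_RUN=0:

$ delete_alarms_from_stdin < alarms-to-delete.txt
$ DRY_RUN=0 delete_alarms_from_stdin < alarms-to-delete.txt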