This content is part of the Essential Guide: Cloud data warehouse guide: Using Redshift, rival platforms

Amazon RedShift performance monitoring strategies

The RedShift platform tackles a lot of cluster and database administration, but doesn't do it all. Some AWS tools can fill the monitoring void.

Amazon Redshift is a data warehouse platform built on a PostgreSQL foundation. Amazon Web Services handles much...

of the cluster- and database-software administration, making the platform easy to use. But that doesn't mean it's hands-off. Admins must monitor clusters and ensure database performance requirements are met. And AWS provides some key monitoring tools to help detect potential performance problems.

Amazon Redshift features two types of data warehouse performance monitoring: system performance monitoring and query performance monitoring. The goal of system monitoring is to ensure you have the right amount of computing resources in place to meet current demand. Query monitoring, on the other hand, is designed to help identify database code that's dragging and still meet end-user reporting requirements.

Monitoring system performance and health

AWS CloudWatch is a system performance monitoring tool that provides metrics on system health, as well as more detailed information such as CPU and storage use. The most basic feature of CloudWatch is a HealthStatus check that works like a server connectivity test (i.e., ping). CloudWatch sends a query to a cluster and responds with either a 'healthy' or 'unhealthy' diagnosis.

CPU utilization metrics can help determine appropriate cluster sizing. For example, if CPU utilization is consistently high -- above 80% for extended periods of time -- consider resizing the cluster. At the same time, if use is consistently low -- below 40% -- you could downsize the cluster. Relational databases often are CPU-intensive, particularly during join operations, so it's important to track CPU metrics.

These metrics can also help identify ways to optimize data load procedures. If some data load jobs perform CPU-intensive operations, schedule them outside of heavy query times. The elasticity of cloud allows IT teams to add and remove servers, as needed. However, if there is a clear pattern of demand and jobs can be adjusted to even out demand, it will save some administrative overhead.

CloudWatch also monitors read/write IOPs, which is helpful when using provisioned IOPs. If performance is slow for end users, admins may want to provision more for a particular load. Similarly, if actual IOPs are much lower than provisioned, it could be a signal that IT is overpaying for unused IOPs. Other metrics include storage disk utilization, read/write throughput, read/write latency and network throughput.

Load performance monitoring

System performance monitoring is just one piece of maintaining healthy clusters. Query and load performance monitoring is important, particularly for heavily used queries. A large number of users often utilize a small number of queries; fine-tuning these queries is worth the effort.

A Query Detail description is available for Amazon RedShift queries. This includes a SQL statement that shows how the SQL was translated into a sequence of procedural steps, such as reading an index, reading data blocks or executing a loop to join records.

Keep an eye out for unexpected or missing steps. For example, if you expect the database to use an index but it does a full table scan, perhaps the index was mistakenly dropped. Load Summary and Load Detail reports are also available and both detail information about specific loads at particular points in time. Information gathered includes query IDs, number of errors, SQL statements that were run, and the time it took to run a query.

Amazon Redshift has made finding all of this information easy with direct access from the control console; it is important to note that query and load data is available through the Redshift console. The CloudWatch interface enables you to view multiple metrics at the same time. This is helpful when looking for points of synchronous stress on the system, such as operations that rely heavily on both read and compute resources. Admins may also change the time ranges displayed to fine-tune what time period you want to monitor, as well as change measurement periods (e.g., change from measuring in five-minute to 10-minute increments).

RedShift helps lessen the load on data warehouse administrators. As good as it is, it still requires knowledgeable administrators to monitor both systems performance and query performance metrics.

About the author:
Dan Sullivan holds a Master of Science degree and is an author, systems architect and consultant with more than 20 years of IT experience. He has had engagements in advanced analytics, systems architecture, database design, enterprise security and business intelligence. He has worked in a broad range of industries, including financial services, manufacturing, pharmaceuticals, software development, government, retail and education. Dan has written extensively about topics that range from data warehousing, cloud computing and advanced analytics to security management, collaboration and text mining.

Dig Deeper on AWS database management