Software developers can spend a lot of time tuning applications to improve performance. If reading data from Amazon S3 is too slow, developers can use Amazon Elastic Block Store volumes or Provisioned IOPS to reach a guaranteed level of I/O performance. If the problem stems from limited CPUs or memory, developers can scale up to a larger instance. But what do you do when none of these options is enough and you're still burdened with slow performance on AWS? It might be time to turn to Amazon ElastiCache.
The first task when diagnosing an application performance issue in AWS is to find the bottleneck. If your application is CPU-bound, increase the size of the instance you use or add servers to your cluster to reduce the load on each server. If the problem stems from repeatedly reading data from disks or solid-state storage devices, use a cache. A cache is an in-memory storage mechanism that reduces the number of reads that have to go to disk.
Reading from disks, and even from solid-state drives that emulate disk protocols, is significantly slower than reading from memory. With a cache, once you read data from disk, you can keep it in memory and retrieve it again when needed. The first time data is read, the system incurs the usual latency of reading from persistent storage. After that, however, reads are served at in-memory speed.
The Amazon ElastiCache service addresses the need for this type of caching. ElastiCache users can choose between two software caches: Memcached and Redis. While Memcached is widely used, Redis is a full-featured key-value data store. This tip focuses solely on Redis.
Using Redis for cached data
Suppose an enterprise profiled an application's performance and discovered it's spending too much time reading the same data from the database. This is problematic when developers have to wait for long-running queries that run multiple times, which happens when different users issue the same queries. This could mean there's a DBA with some free time on her hands. If that's the case, ask that DBA to tune the database-caching mechanism or de-normalize the table. More likely, however, there's another solution.
Storing results of long-running queries in memory is one solution. When the application tries to retrieve potentially cached data, it queries the cache. If the cache contains data, then the data is returned. If there is no data in the cache, it's retrieved, returned to the calling function, and stored in the cache for future use. To implement this model of caching, you need a way to uniquely identify different queries and store the results.
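The read path described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: `get_or_load`, `load_from_db` and `InMemoryCache` are hypothetical names, and the expiry time is an added assumption that the article does not prescribe. `InMemoryCache` is only a stand-in so the sketch runs without a server; it mimics the `get(key)` and `set(key, value, ex=seconds)` calls offered by the redis-py client, which could be passed in instead.

```python
import json
import time


class InMemoryCache:
    """Hypothetical stand-in for a Redis client (get/set with optional expiry)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._store[key]  # entry has expired, treat it as a miss
            return None
        return value

    def set(self, key, value, ex=None):
        expires_at = time.monotonic() + ex if ex is not None else None
        self._store[key] = (value, expires_at)


def get_or_load(cache, key, load_from_db, ttl_seconds=300):
    """Return the cached result for key, or load it, cache it and return it."""
    cached = cache.get(key)
    if cached is not None:
        # Cache hit: skip the long-running query entirely.
        return json.loads(cached)
    # Cache miss: run the query, then store the result for future callers.
    result = load_from_db()
    cache.set(key, json.dumps(result), ex=ttl_seconds)
    return result
```

The first call for a given key pays the full query cost; later calls with the same key are answered from memory until the entry expires. Choosing an expiry is a trade-off between freshness and hit rate, and the 300-second default here is arbitrary.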
Redis is a key-value data store. The key is a string that uniquely identifies a value. For simple queries, such as retrieving a column value from a table using a primary key, use a convention such as "<table name>:<primary key>:<column name>". Redis also supports a wide range of value types, including strings, lists, sets, hashes and sorted sets. These allow for richly structured data types. Result sets from queries with multiple rows and columns could be stored as JSON or XML objects, for example.
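As a rough sketch of these conventions, the helpers below build a key in the "<table name>:<primary key>:<column name>" form and serialize a multi-row result set to JSON so it can be stored as a string value. All function names are hypothetical. The hashed `query_key` is one possible way to identify arbitrary queries, not something the naming convention above requires, and it assumes values are passed as parameters rather than embedded in the SQL text.

```python
import hashlib
import json


def column_key(table, primary_key, column):
    """Build a key such as 'customers:42:email' for a single column value."""
    return f"{table}:{primary_key}:{column}"


def query_key(sql, params=()):
    """Build a key for an arbitrary query from its text and parameters.

    Whitespace is collapsed so that formatting differences map to the same key.
    """
    normalized = " ".join(sql.split())
    payload = json.dumps([normalized, list(params)])
    return "query:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def encode_rows(rows):
    """Serialize a result set (list of dicts) to a JSON string for storage."""
    return json.dumps(rows, sort_keys=True)


def decode_rows(text):
    """Turn a stored JSON string back into a list of dicts."""
    return json.loads(text)


# Example: a single column value and a two-row result set.
email_key = column_key("customers", 42, "email")
stored = encode_rows([{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}])
```

JSON keeps the example simple. A Redis hash would be an alternative when individual fields of a row need to be read or updated separately.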
Amazon ElastiCache can run on a range of server node types. The smallest supported node is the cache.t2.micro, with 0.555 GB of RAM and low to moderate network performance. The largest node it supports is the high-memory cache.r3.8xlarge, with 32 CPUs, 237 GB of RAM and 10 Gigabit network performance.
The number of read/write operations per unit of time and the amount of data the cache stores determine the optimal node size for your needs. Pricing ranges from $0.017 to $3.64 per hour per node. ElastiCache nodes are not backed up by default. If you want to back up the cache, specify backup parameters under Configure Advanced Settings in the Amazon ElastiCache console.
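To put the hourly prices in perspective, the short calculation below converts them to an approximate monthly figure. It is only a back-of-the-envelope sketch: `monthly_cost` is a hypothetical helper, the 30-day month is an assumption, and the rates are the two endpoints quoted above, which may not match current pricing.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month of continuous use


def monthly_cost(hourly_rate, node_count=1):
    """Approximate monthly cost in dollars for nodes that run all month."""
    return round(hourly_rate * HOURS_PER_MONTH * node_count, 2)


smallest = monthly_cost(0.017)  # about $12.24 for one node at the lowest rate
largest = monthly_cost(3.64)    # about $2,620.80 for one node at the highest rate
```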
About the author:
Dan Sullivan holds a master of science degree and is an author, systems architect and consultant with more than 20 years of IT experience. He has had engagements in advanced analytics, systems architecture, database design, enterprise security and business intelligence. He has worked in a broad range of industries, including financial services, manufacturing, pharmaceuticals, software development, government, retail and education. Dan has written extensively about topics that range from data warehousing, cloud computing and advanced analytics to security management, collaboration and text mining.