Most modern applications depend on a database tier for aggregating, querying and managing data. Deploying a database on top of AWS requires monitoring and management capabilities that make it easier to scale applications up, reduce the impact of failures and keep costs down.
AWS has done a good job of improving its core database capabilities and the automated management and monitoring features associated with them. At the same time, alternatives to native AWS database services are improving their management features. Enterprises may want to take a closer look at the tradeoffs in features and manageability between the native and third-party database services available on AWS.
Understand system requirements
It is critical to think hard about the application requirements before the software development team starts building, said Sebastian Stadil, founder and CEO at Scalr. It's generally easier to make architectural changes before starting. While it's always difficult to evaluate the scale of a system, it's rarely difficult to evaluate the consistency guarantees that are required.
For example, Netflix uses AWS to host much of its data in the Apache Cassandra database. In the grand scheme of things, it's more important for Netflix to be online (available) than remember exactly where the last movie watched was stored (consistent).
Organizations need to think about whether availability and scalability really are the most important considerations. Many implement applications that need strong consistency on a NoSQL platform optimized for availability, which then has to be tweaked to guarantee that consistency. A better approach for these types of applications would be to go with a traditional relational database like PostgreSQL, which can provide the required consistency more effectively.
Conversely, many enterprises make the mistake of choosing a transactional relational database that provides consistency but is not highly available for an application, such as a recommendation system, where availability matters more. In these cases, the enterprise would have been better off using a database like Cassandra from the start.
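The availability-versus-consistency tradeoff can be made concrete with the quorum rule commonly used to reason about replicated stores such as Cassandra: with N replicas, a read that contacts R replicas is guaranteed to see the latest write that was acknowledged by W replicas only when R + W > N. The sketch below is illustrative only. The function names are hypothetical and it models the arithmetic, not any particular database driver.

```python
def quorum(n: int) -> int:
    """Smallest majority of n replicas."""
    return n // 2 + 1


def reads_see_latest_write(n: int, r: int, w: int) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


def tolerated_failures(n: int, r: int, w: int) -> int:
    """Replicas that can be down while reads and writes both still succeed."""
    return n - max(r, w)


if __name__ == "__main__":
    n = 3  # assumed replication factor
    for label, r, w in [("ONE/ONE", 1, 1),
                        ("QUORUM/QUORUM", quorum(n), quorum(n)),
                        ("ONE/ALL", 1, n)]:
        print(label,
              "consistent:", reads_see_latest_write(n, r, w),
              "tolerated failures:", tolerated_failures(n, r, w))
```

With three replicas, reading and writing at a single replica keeps the system up through two failures but gives no overlap guarantee, which suits a workload like the Netflix example. Requiring a quorum on both sides guarantees overlap at the cost of tolerating only one failure.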
Another good practice is to consider using native AWS services to replace traditional database architectures. Dayal Gaitonde, director of engineering at Appirio, said that when developers migrate apps from traditional environments to AWS, they commonly set up their own database servers and clusters.
Amazon RDS offers the database as a managed service, which provides a nice alternative to running MySQL, PostgreSQL, Oracle or Microsoft SQL Server on self-managed servers, Gaitonde said. This is because RDS includes a number of monitoring and management features for taking care of things like upgrades, patches and failure detection. However, it does not remove all database administrator responsibilities. He said enterprises still need to configure backups, determine the requirements of the database and specify things like memory and storage.
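As a rough illustration of the decisions that remain with the team, the sketch below collects the settings Gaitonde mentions (instance size, storage and backups) into a request dictionary. The key names follow the keyword arguments of boto3's `create_db_instance` call. The helper name `build_rds_request` and all sample values are hypothetical, and nothing here contacts AWS. A real request would also need master credentials and network settings, which are omitted.

```python
def build_rds_request(identifier: str, engine: str, instance_class: str,
                      storage_gib: int, backup_days: int,
                      multi_az: bool = False) -> dict:
    """Assemble the sizing and backup choices RDS leaves to the user."""
    # RDS accepts 0 to 35 days of automated backup retention; 0 disables backups.
    if not 0 <= backup_days <= 35:
        raise ValueError("backup_days must be between 0 and 35")
    if storage_gib <= 0:
        raise ValueError("storage_gib must be positive")
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": engine,                      # e.g. "mysql" or "postgres"
        "DBInstanceClass": instance_class,     # determines memory and CPU
        "AllocatedStorage": storage_gib,       # in GiB
        "BackupRetentionPeriod": backup_days,  # in days
        "MultiAZ": multi_az,                   # standby in a second zone
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    params = build_rds_request("app-db", "postgres", "db.m5.large",
                               storage_gib=100, backup_days=7, multi_az=True)
    for key, value in params.items():
        print(f"{key}: {value}")
```

Assuming boto3 is installed and credentials are configured, a dictionary like this could be extended with the missing fields and passed as `boto3.client("rds").create_db_instance(**params)`.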
AWS does include a number of database services, like SimpleDB and DynamoDB, that come with built-in management and monitoring functionality. These services are provided in such a way that scaling up is relatively trivial and can reduce the need for dedicated services. But Kelly Stirman, director of products at MongoDB, a NoSQL database vendor, noted that other NoSQL databases tend to provide the much richer feature set required for many use cases, which is the reason they are growing in popularity.
This could change as AWS improves its underlying functionality. For example, AWS just added support for JSON data models on top of DynamoDB, which has been one of the compelling differentiators for alternatives like MongoDB. But at the moment, the AWS services have much more stringent limits on document size and more limited text search and geospatial capabilities than third-party alternatives, Stirman said.
Balancing scalability with functionality
One of the main challenges with third-party NoSQL databases like MongoDB has been the difficulty in provisioning new instances, updating software or scaling them up. For example, 150 steps are typically required to configure a MongoDB cluster running on 12 servers. Automation features recently added to MongoDB's MMS can reduce this manual work to a few clicks. "With MMS you don't need to understand the internals of how it works, and you can do it in minutes instead of hours," Stirman said.
Decisive, an Internet advertising management service that runs on top of AWS, used MMS to simplify automatic deployment and scale up new MongoDB instances for a new API for creating and optimizing mobile advertising campaigns. Ryan Witt, CTO of Decisive, said they chose MongoDB because of the flexibility of the data model. MongoDB makes it possible to gather a wide variety of data about a campaign, such as the number and types of users that are seeing ads and their click-throughs.
While AWS provides some basic monitoring and management capabilities for the databases running on top of it, these are not detailed enough for many needs, Witt said. MMS allows Decisive to measure incoming disk I/O traffic and latency on the client side. MMS can also help to assess the health of the database cluster, which makes it easier to troubleshoot problems when they occur. As the write and read loads get higher, MMS reports where these are occurring and provides guidance on the best way to scale up the system to address these challenges.
At the same time, AWS database services like Redshift are sufficient and cost-effective for many use cases like data warehousing. "For certain types of work like enterprise analytics, Redshift works great," Witt said. "We will often store events on MongoDB, and once we no longer want to store it, we will move it to Redshift. But then you tend to pay for that in terms of flexibility."