LAS VEGAS -- Amazon continues to fill gaps in its AWS database services for customers with distributed systems...
in an increasingly competitive market among cloud providers.
Updates to Amazon Aurora and DynamoDB -- some available now and others in the pipeline -- are intended to address consistency issues and avoid downtime for applications spread across multiple regions. AWS also added a graph database service for workloads that draw on highly connected sources, such as social media streams, and introduced querying capabilities to make better use of unstructured data stored on the platform.
A feature called Multi-Master, now in preview for Aurora, the MySQL- and PostgreSQL-compatible relational database, and DynamoDB, the NoSQL database offering, aims to improve reliability and avoid errors when requests arrive in rapid succession.
In Aurora, Multi-Master can add up to 15 read replicas across three availability zones to accommodate millions of reads per second. If a master or availability zone fails, the new feature provides subsecond failover to accommodate high throughput and high availability.
"The multi-write feature has been an ask for all of the database engines forever ... it's an issue of consistency," said Sean Finnerty, executive director of healthcare and life sciences, security and compliance at REAN Cloud, an AWS consultancy in Herndon, Va.
Braze, a global consumer engagement company, processes a third of a trillion data points a month on AWS database services. It will move to Aurora for a new application, but wants to shard workloads because of the 64 TB maximum of disk space. The company is excited about Multi-Master's potential, but it may have to do application-based sharding if it can't go past that disk limit, said Jonathan Hyman, CTO at Braze.
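For readers unfamiliar with the term, application-based sharding means the application itself decides which database cluster holds a given record, so that no single cluster has to exceed its storage ceiling. The following is a minimal sketch of that idea, not a description of how Braze does or would implement it. The function names, the choice of a user ID as the shard key and the endpoint hostnames are all hypothetical.

```python
import hashlib

# Hypothetical cluster endpoints; in practice each would be a separate
# database cluster that stays under the per-cluster storage limit.
SHARD_ENDPOINTS = [
    "cluster-0.example.internal",
    "cluster-1.example.internal",
    "cluster-2.example.internal",
]


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and so would not route consistently.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def endpoint_for(user_id: str) -> str:
    """Return the (hypothetical) cluster endpoint that owns this user."""
    return SHARD_ENDPOINTS[shard_for(user_id, len(SHARD_ENDPOINTS))]


if __name__ == "__main__":
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, "->", endpoint_for(uid))
```

Simple modulo routing like this reshuffles most keys whenever the number of shards changes, which is one reason teams tend to prefer a database that scales without it.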
Manu Mahajan, a development manager for a risk management company in New Jersey, said his team is in the early stages of a move to AWS database services. They use Microsoft SQL Server and IBM Db2, and they are interested in doing a lift-and-shift to Elastic Compute Cloud, followed by a move to Amazon Relational Database Service and ultimately onto Aurora. That last step, however, wouldn't have been feasible prior to some of these updates.
"We're looking to take it slow, but we were having this challenge of doing multiregion, active-active replication. So, now that Aurora is giving that out of the box is really good news," he said.
In DynamoDB, Multi-Master enables applications to conduct reads and writes in the region where the application is in use. Spreading masters across AWS global data centers also means applications will continue to run even if an entire region goes down. AWS also added on-demand backup capabilities it said won't affect performance on DynamoDB. And in 2018, it will add a feature to restore a 35-day-old backup to protect against data loss from application errors.
Another addition, Amazon Aurora Serverless, eliminates the need to provision servers to run the database. The product is suited for workloads that only run intermittently or face unexpected spikes in traffic. The MySQL-compatible version is expected to be available in the first half of 2018, with PostgreSQL support coming in the back half of the year.
Not everyone will want to swap out the existing Aurora for the serverless version. Braze, for example, has a fixed amount of read capacity, and scaling up and down isn't a clear benefit, Hyman said.
Other cloud providers have added services to address database consistency for applications with global footprints, with Google Cloud Spanner and Microsoft Cosmos DB going on sale earlier this year.
Amazon issues new offerings and features at a dizzying pace, but often the innovations are incremental, said Merv Adrian, an analyst with Gartner. That may not be easily visible to observers who aren't keeping close tabs on these updates, but they're often piled on top of one another to build out multiple competitive offerings.
One new addition to AWS database services, Neptune -- a fully managed graph database currently in limited preview -- does push into new territory that other players in the market will have to tackle.
"While they don't leap out ahead with Neptune, they demonstrate again that they respond quickly to market opportunities," Adrian said.
Query capabilities for data storage services
In addition to database updates, a new querying capability in Simple Storage Service (S3) and Glacier was a highlight at the AWS re:Invent user conference here this week.
S3 Select and Glacier Select let users run simple SQL expressions to pull only the data they need from an object, rather than retrieve the entire object to search it.
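To make the idea concrete, here is a rough sketch of what such a request can look like from Python. It assumes the `select_object_content` call in the boto3 SDK as it was later released, which may differ from the preview described here. The bucket name, object key and column names are hypothetical, and the object is assumed to be a CSV file with a header row.

```python
def build_select_request(bucket: str, key: str, expression: str) -> dict:
    """Build keyword arguments for boto3's S3 select_object_content call.

    Assumes a CSV object whose first row holds the column names.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


def run_select(request: dict) -> str:
    """Send the request and return the matching rows as CSV text.

    Requires boto3 and valid AWS credentials, so it is not called here.
    """
    import boto3  # imported lazily so the builder works without the SDK

    client = boto3.client("s3")
    response = client.select_object_content(**request)
    chunks = []
    # The response payload is an event stream; only Records events
    # carry the selected data.
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


# Hypothetical bucket, key and columns, for illustration only.
request = build_select_request(
    "example-bucket",
    "telemetry/2017.csv",
    "SELECT s.radar_id, s.reading FROM S3Object s WHERE s.radar_id = 'R1'",
)
```

The filter runs on the service side, so only the matching rows cross the network.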
Braze had ingested data into Hadoop, but it's building a better data lake in S3, Hyman said. AWS Glue, the extract, transform, load service introduced last year, hasn't lived up to the promise, and the process of moving data from S3 to Redshift can be a major pain, he said.
S3 Select, currently in preview, supports CSV and JSON formats, but Braze uses Avro. Amazon has a history of adding services with limited capabilities and then expanding them over time. Braze can't use S3 Select now, but it sees potential if AWS adds Avro support.
"If Amazon can fix that problem for us, we'll look to put it in Avro and just read it and get instant ROI," he said.
NASA Jet Propulsion Laboratory sees potential to archive less expensively through Glacier Select.
"If I want to know what the telemetry of a particular radar has been over the past 10 years, being able to look at that data without taking it all out is a huge benefit," said Tom Soderstrom, IT CTO.
Trevor Jones is a senior news writer with SearchCloudComputing and SearchAWS.