Amazon Web Services includes storage services such as Simple Storage Service (S3), Elastic Block Store (EBS), Glacier and DynamoDB. These four storage services give users a variety of storage options for use with other services, such as the AWS Elastic Compute Cloud (EC2) computing service.
Associated with the storage services are multiple regions that contain multiple availability zones where data and applications reside. The rules about where data can reside, and whether it can be moved from region to region, vary among storage services and impact the choice of storage services for applications.
Customers who move some or all of their IT responsibility to the Amazon Web Services (AWS) environment must weigh a large number of parameters when selecting the appropriate storage service(s) for their applications. These parameters include, but are not limited to: amount of storage needed, storage bandwidth, level of availability, level of durability, use of regions, and the charges for storage, data transfer and deletions.
Planning for AWS storage use is extremely important and consists of two phases:
- An initial planning phase during which you determine which AWS storage service(s) will best satisfy your application requirements.
- Ongoing planning after deployment on AWS storage.
Planning with respect to AWS storage use is ongoing because application needs may change, you may be adding new applications, and Amazon makes frequent changes to its storage services, including pricing.
Selecting the most appropriate AWS storage service for an application is a challenge for many customers because each service usually differs in bandwidth, level of availability, level of durability, use of regions, and charges for storage, data transfer and deletions.
This article summarizes the characteristics of the four AWS storage services and offers tips on selecting among them based on application needs. A short description of each service is provided below to help users decide which services will work best for them.
Simple Storage Service
Amazon Simple Storage Service (S3) is an object storage service used in a large variety of applications for creating, retrieving and deleting objects. S3 is appropriate for unstructured data objects where the data is considered a string of bits. Every S3 object has a unique URL. S3 gives an organization the opportunity to offload some or all of its storage infrastructure onto AWS.
S3 is the most flexible storage service that AWS provides. It supports numerous use cases, including archiving and backing up an organization's critical data. Most users do not access S3 objects directly via the AWS API (application programming interface). Instead, objects are usually accessed through a higher-level tool or application that provides an easier-to-use interface for manipulating them. S3 objects can be accessed from outside AWS via URLs over the Internet, as well as from inside AWS by other services such as EC2.
Individual users tend to use S3 as secure, location-independent storage, and commonly use it to back up local files. Companies use S3 to store user manuals, company videos, presentations and similar content.
With S3, you specify the region in which a bucket is created. The bucket contains objects (each limited to 5 terabytes in size) that can be written, read, deleted and listed. Multiple buckets can be created in a region, with an unlimited number of objects per bucket. Different AWS regions can be chosen to meet requirements for regulatory compliance, performance, service-level agreements, cost and redundancy. S3 also provides security mechanisms, including authentication and encryption. Unlike EBS volumes, S3 objects can be accessed from anywhere, even though they are created in a specific region.
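To make the bucket, region and object relationship concrete, here is a minimal sketch of how an object's URL can be assembled from those three pieces. It assumes the virtual-hosted-style URL layout; the function name, bucket name and key are hypothetical and used only for illustration. Whether the URL can actually be fetched depends on the permissions set on the bucket and object.

```python
from urllib.parse import quote


def s3_object_url(bucket: str, region: str, key: str) -> str:
    """Build a virtual-hosted-style URL for an S3 object.

    Assumption: the bucket uses the standard regional endpoint.
    """
    # Percent-encode the key, but keep "/" so key prefixes stay readable.
    encoded_key = quote(key, safe="/")
    return f"https://{bucket}.s3.{region}.amazonaws.com/{encoded_key}"


# Hypothetical bucket and key, for illustration only.
print(s3_object_url("example-manuals", "us-west-2", "guides/user manual.pdf"))
```

The bucket's region appears in the host name, while the key identifies the object inside the bucket, which is why the same URL works from any location.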
You pay for S3 storage by the gigabyte. Cost varies by AWS region and, in the U.S. Standard region, ranges from $0.03 per gigabyte per month for the first terabyte to $0.0275 per gigabyte per month for storage beyond 5,000 terabytes. There are also charges for AWS API requests. S3 data transfer pricing is based on the amount of data moved into and out of S3; transferring data into S3 and deleting objects are both free. S3 pricing also varies depending on whether you select standard or reduced redundancy storage. Reduced redundancy trades a lower level of durability for a lower cost per gigabyte.
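As a rough illustration of the first pricing tier quoted above, the following sketch estimates a monthly S3 storage bill for up to one terabyte. It covers storage only (no request or transfer-out charges), and it assumes that a terabyte is counted as 1,024 gigabytes. The function and constant names are hypothetical, and the rates for the intermediate tiers are not given here, so the sketch refuses larger amounts rather than guessing.

```python
from decimal import Decimal

# Rate taken from the text: U.S. Standard region, first terabyte.
S3_FIRST_TIER_RATE = Decimal("0.03")  # USD per GB per month
# Assumption: one terabyte is counted as 1,024 GB.
FIRST_TIER_LIMIT_GB = 1024


def s3_first_tier_monthly_cost(stored_gb: int) -> Decimal:
    """Estimate the monthly storage charge for data within the first tier."""
    if stored_gb < 0:
        raise ValueError("stored_gb must not be negative")
    if stored_gb > FIRST_TIER_LIMIT_GB:
        raise ValueError("amount exceeds the first pricing tier")
    return S3_FIRST_TIER_RATE * stored_gb


# 500 GB of backups: 500 x $0.03 = $15.00 per month.
print(s3_first_tier_monthly_cost(500))
```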
Elastic Block Store
Amazon Elastic Block Store (EBS) is a volume-based, network-attached, block-level storage service for use with AWS EC2 instances. EBS volumes are attached to EC2 instances to provide additional, persistent storage with a life span separate from that of the instance. When an instance is terminated, attached volumes are detached from it and can be attached to another instance, provided that instance is in the same availability zone within the same region. Unlike S3 objects, EBS volumes are updatable.
EBS provides standard volumes and provisioned IOPS volumes. Standard volumes are designed for applications with moderate I/O requirements. Provisioned IOPS volumes offer storage with consistent and low-latency performance, and are designed for applications with I/O-intensive workloads, such as databases.
EBS volume storage is persistent storage that can be attached to and detached from running EC2 instances. EBS volumes, however, are only accessible from within the availability zone in which they were created. Furthermore, they can be accessed only by the EC2 instance to which they are attached.
EBS storage can be used like a formatted disk drive. Formatting requires file system software on the EC2 instance to which the volume is attached. Once the volume is mounted, the operating system can read from and write to it. Any EC2 instance that needs to mount and use a volume must be located in the same availability zone as the volume. Snapshots, which are stored in S3, make it possible to use EBS volume data across regions.
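The availability zone rule described above can be expressed as a simple check. This is only a sketch of the constraint, not a call into any AWS API; the class names, function name and identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    volume_id: str
    availability_zone: str


@dataclass(frozen=True)
class Instance:
    instance_id: str
    availability_zone: str


def can_attach(volume: Volume, instance: Instance) -> bool:
    """An EBS volume can only be attached to an instance in its own zone."""
    return volume.availability_zone == instance.availability_zone


# Hypothetical identifiers, for illustration only.
data_volume = Volume("vol-example", "us-east-1a")
same_zone = Instance("i-example-1", "us-east-1a")
other_zone = Instance("i-example-2", "us-east-1b")

print(can_attach(data_volume, same_zone))   # True
print(can_attach(data_volume, other_zone))  # False
```

Note that the second instance is in the same region but a different availability zone, and the check still fails. Moving the data there would require going through a snapshot.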
EBS standard volume pricing is $0.05 per gigabyte per month in the U.S. East region. AWS also charges $0.05 per 1 million I/O requests to standard EBS volumes. Provisioned IOPS volumes start at $0.125 per gigabyte per month, and snapshot storage in S3 costs about $0.095 per gigabyte per month.
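A worked example may help show how the storage and I/O charges for a standard volume combine. This sketch assumes that the $0.05 I/O charge applies per one million requests, which is not spelled out in every price listing, so treat that unit as an assumption. The function and constant names are hypothetical.

```python
from decimal import Decimal

# U.S. East rates quoted in the text.
EBS_STORAGE_RATE = Decimal("0.05")         # USD per GB per month
# Assumption: the I/O charge is billed per one million requests.
EBS_IO_RATE_PER_MILLION = Decimal("0.05")  # USD per 1,000,000 I/O requests


def ebs_standard_monthly_cost(provisioned_gb: int, io_requests: int) -> Decimal:
    """Estimate the monthly charge for a standard EBS volume."""
    storage_cost = EBS_STORAGE_RATE * provisioned_gb
    io_cost = EBS_IO_RATE_PER_MILLION * Decimal(io_requests) / Decimal(1_000_000)
    return storage_cost + io_cost


# 100 GB volume with 20 million I/O requests in a month:
# storage 100 x $0.05 = $5.00, I/O 20 x $0.05 = $1.00, total $6.00.
print(ebs_standard_monthly_cost(100, 20_000_000))
```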
Glacier storage services
Glacier storage is designed for long-term backup and archival storage where retrieval times of three to five hours are acceptable, at a cost per gigabyte much lower than that of S3. Glacier pricing also depends on the region, but it can be as much as 90% cheaper than S3.
Because Glacier storage is backup/archival storage, in most cases there is no need for fast retrieval, and as a result, the cost per gigabyte is low ($0.01 per gigabyte per month in the U.S. East region). There is no cost for data transferred into Glacier, but the fee for data transferred out is $0.120 per gigabyte.
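The two Glacier charges quoted above behave differently: storage is billed every month, while the transfer-out fee applies only when data is pulled back. The sketch below keeps them separate. It uses only the U.S. East rates from the text and ignores any other fees; the function and constant names are hypothetical.

```python
from decimal import Decimal

# U.S. East rates quoted in the text.
GLACIER_STORAGE_RATE = Decimal("0.01")        # USD per GB per month
GLACIER_TRANSFER_OUT_RATE = Decimal("0.120")  # USD per GB transferred out


def glacier_monthly_storage_cost(archived_gb: int) -> Decimal:
    """Recurring monthly charge for keeping data in Glacier."""
    return GLACIER_STORAGE_RATE * archived_gb


def glacier_transfer_out_cost(retrieved_gb: int) -> Decimal:
    """One-time charge for transferring data out of Glacier."""
    return GLACIER_TRANSFER_OUT_RATE * retrieved_gb


# Keeping a 2,000 GB archive costs $20.00 per month.
print(glacier_monthly_storage_cost(2000))
# Pulling 100 GB back out costs $12.00, more than half a month of storage.
print(glacier_transfer_out_cost(100))
```

The comparison shows why Glacier suits data that is written once and rarely read: frequent retrievals can quickly outweigh the low storage rate.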
DynamoDB storage service
Amazon DynamoDB is the newest AWS storage service. It is a NoSQL database service designed for high scalability and predictable performance. DynamoDB is intended to reduce the administrative burden of scaling distributed databases. It is also very good for key-value storage and provides highly scalable, high-performance storage based on tables indexed by data values referred to as "keys." DynamoDB storage is dispersed across availability zones. You pay a flat hourly rate for DynamoDB starting at $0.0065 per hour.
When choosing among these services, pay attention to security, flexibility, performance and scalability. Determine which AWS storage services complement each other and how you can combine them to save money.
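The idea of a table indexed by keys can be illustrated with a small in-memory stand-in. This is not the DynamoDB API and involves no AWS calls; the class, its method names and the sample items are hypothetical and serve only to show the access pattern of writing and reading whole items by key.

```python
from typing import Any, Dict, Optional


class KeyValueTable:
    """In-memory stand-in for a table whose items are indexed by one key."""

    def __init__(self, key_name: str) -> None:
        self.key_name = key_name
        self._items: Dict[Any, Dict[str, Any]] = {}

    def put_item(self, item: Dict[str, Any]) -> None:
        # The key attribute must be present; it decides where the item lives.
        if self.key_name not in item:
            raise KeyError(f"item is missing key attribute {self.key_name!r}")
        self._items[item[self.key_name]] = dict(item)

    def get_item(self, key: Any) -> Optional[Dict[str, Any]]:
        # Lookup by key is direct; there is no scan over other items.
        return self._items.get(key)


# Hypothetical table and items, for illustration only.
sessions = KeyValueTable(key_name="session_id")
sessions.put_item({"session_id": "abc123", "user": "alice", "cart_items": 3})

print(sessions.get_item("abc123"))
print(sessions.get_item("missing"))  # None
```

Items other than the key attribute need no fixed schema, which is part of what makes this style of storage convenient for session data, user profiles and similar lookups.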