One of the first challenges of a cloud migration is choosing the storage service that will house your data. Amazon Simple Storage Service is frequently used for object storage in Amazon Web Services.
The book Amazon Web Services in Action, written by Andreas and Michael Wittig and published by Manning Publications, takes readers through a step-by-step breakdown of how to use bedrock Amazon Web Services (AWS) products, including Elastic Compute Cloud (EC2), Elastic Beanstalk and Simple Storage Service (S3). The book contains four chapters on storing data in the cloud, explaining how to use different types of databases and instance stores.
Amazon S3 provides object storage for AWS users, with a variety of storage classes for different business needs. S3 offers a standard tier, archiving through Amazon Glacier and a recently added tier for data that is accessed infrequently. While S3 is the most popular object storage option in the cloud, AWS users can turn to a variety of database options, including Amazon Relational Database Service, DynamoDB and Redshift.
This excerpt, from Chapter 7 of the publication, gives the reader an S3 tutorial, describing use cases and features. The publisher also offers a discount to SearchAWS readers interested in purchasing Amazon Web Services in Action. Enter the code "wittig4iq" at checkout to receive a 40% discount.
Back in the old days, data was managed as files in a hierarchy of folders; the file was the representation of the data. In an object store, data is stored as objects. Each object consists of a globally unique identifier (GUID), some metadata, and the data itself, as Figure 1 illustrates. An object's GUID is also known as its key; because the GUID is unique across the system, it can be used to address the object from different devices and machines in a distributed system.
The separation of metadata and data allows clients to work only with the metadata for managing and querying data. You only have to load the data if you really need it. Metadata is also used to store access-control information and for other management tasks.
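The separation of key, metadata, and data described above can be sketched in a few lines of Python. This toy in-memory store (class and method names are our own illustration, not part of any AWS SDK) shows how a client can query metadata without ever loading the data itself:

```python
import uuid

class ObjectStore:
    """A toy object store: each object is a GUID key, metadata, and raw data."""

    def __init__(self):
        self._objects = {}  # GUID -> (metadata dict, bytes)

    def put(self, data, metadata):
        key = str(uuid.uuid4())  # the globally unique identifier acts as the key
        self._objects[key] = (metadata, data)
        return key

    def head(self, key):
        """Return only the metadata -- the data itself is not loaded."""
        return self._objects[key][0]

    def get(self, key):
        """Load the actual data, only when it is really needed."""
        return self._objects[key][1]

store = ObjectStore()
key = store.put(b"<html>...</html>", {"content-type": "text/html", "owner": "alice"})
print(store.head(key)["content-type"])  # prints "text/html" without touching the data
```

The `head`/`get` split mirrors how object stores let you manage and filter objects by metadata cheaply, deferring the expensive data transfer.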
The Amazon S3 object store is one of the oldest services on AWS. It's a typical web service that lets you store and retrieve data in an object store via an API reachable over HTTPS.
The service offers unlimited storage space and stores your data in a highly available and durable way. You can store any kind of data, such as images, documents and binaries, as long as the size of a single object doesn't exceed 5 TB. You pay for every GB you store in S3, and you also incur minor costs for every request and for transferred data. As Figure 2 shows, you can access S3 via HTTPS using the Management Console, the command-line interface (CLI), SDKs and third-party tools to upload and download objects.
S3 uses buckets to group objects. A bucket is a container for objects. You can create up to 100 buckets, each of which has a globally unique name. By unique we really mean unique -- you have to choose a bucket name that isn't used by any other AWS customer in any region, so we advise you to prefix your buckets with your domain name (such as com.mydomain.*) or your company name. Figure 3 shows the concept.
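A small helper can combine that prefixing advice with S3's bucket-naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; beginning and ending with a letter or digit). The function name below is our own, purely for illustration:

```python
import re

# Encodes the core S3 bucket-naming rules: 3-63 characters, lowercase
# letters, digits, dots, and hyphens, starting and ending with a letter
# or digit. (Global uniqueness can only be checked against S3 itself.)
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")

def prefixed_bucket_name(prefix, name):
    """Build a bucket name prefixed with a domain or company name
    (e.g. com.mydomain.assets) and validate it against the naming rules."""
    bucket = f"{prefix}.{name}"
    if not BUCKET_NAME_RE.match(bucket):
        raise ValueError(f"invalid bucket name: {bucket}")
    return bucket

print(prefixed_bucket_name("com.mydomain", "assets"))  # com.mydomain.assets
```

Validating names locally avoids a round trip to S3 just to learn that a bucket name is malformed; uniqueness, however, is only known at creation time.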
Typical use cases are as follows:
- Backing up and restoring files with S3 and the help of the AWS CLI;
- Archiving objects with Amazon Glacier to save money compared to Amazon S3;
- Integrating Amazon S3 into applications with the help of the AWS SDKs to store and fetch objects such as images; and
- Hosting static Web content that can be viewed by anyone with the help of S3.
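As a small illustration of the static-hosting use case above, a publicly readable object can be addressed with a predictable HTTPS URL. The sketch below builds a virtual-hosted-style object URL; note that the exact endpoint format has varied over time and by region, so treat this as an assumption to verify against the current S3 documentation:

```python
from urllib.parse import quote

def object_url(bucket, key, region="us-east-1"):
    """Build the virtual-hosted-style HTTPS URL for an S3 object.
    Anyone can fetch this URL if the object is publicly readable."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"

print(object_url("com.mydomain.website", "index.html"))
# https://com.mydomain.website.s3.us-east-1.amazonaws.com/index.html
```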
Editor's note: This Amazon S3 tutorial is an excerpt from Amazon Web Services in Action, authored by Andreas and Michael Wittig, published by Manning Publications, September 2015, ISBN 978-1617292880.
For source code, sample chapters, the online author forum and other resources, go here.