
When should data normalization occur in Amazon DynamoDB?

Not sure when to normalize data in Amazon DynamoDB? Follow these three examples to learn more about the normalization process and how it can play a role in this AWS database.

Ofir Nachmani, I am OnDemand

Published: 31 Jul 2018

For most admins, data normalization is a key concept that comes to mind when they think of a relational database. But for users of Amazon DynamoDB -- which follows a NoSQL, nonrelational database model -- normalization could also play a role.

In a relational database, normalization helps ensure data integrity, as it involves structuring data in such a way that it's not stored multiple times. To achieve this, admins store data in different tables and connect those tables via relationships.

Data normalization and DynamoDB

While the normalization process is largely associated with relational databases, there are exceptions. For example, a type of NoSQL database called a wide-column store uses tables, rows and columns. To accommodate unstructured data, the names and formats of the columns in these databases can vary from row to row. In essence, a wide-column store is a two-dimensional key-value store. Amazon DynamoDB, which AWS describes as a key-value and document database, is organized in a similar way: Tables hold items, and each item can carry its own set of attributes.

So, when would data normalization make sense with Amazon DynamoDB? Here are three examples.

1. To store large items

The maximum item size in Amazon DynamoDB is 400 KB, a limit that counts attribute names as well as values. To store items larger than this, admins can either put the data in an S3 bucket and keep a pointer to the object in DynamoDB, or use a separate, normalized DynamoDB table. If they use the DynamoDB table, they can break the larger items into smaller chunks and then organize relationships between those chunks so the application can re-create the original item.
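The chunking approach can be sketched in plain Python. This is a minimal illustration, not an AWS-provided method: the attribute names (item_id, chunk_no, chunk_count, data) and the 350 KB chunk size are assumptions chosen to leave headroom under the 400 KB limit, and the dictionaries stand in for the items an application would write to a chunk table.

```python
from __future__ import annotations

# Assumption: 350 KB per chunk leaves headroom under the 400 KB item limit
# for the key attributes and attribute names stored alongside the data.
CHUNK_BYTES = 350 * 1024


def split_into_chunk_items(item_id: str, payload: bytes,
                           chunk_bytes: int = CHUNK_BYTES) -> list[dict]:
    """Break one large payload into several small items.

    item_id would act as the partition key and chunk_no as the sort key
    (both names are hypothetical).
    """
    total = max(1, -(-len(payload) // chunk_bytes))  # ceiling division
    return [
        {
            "item_id": item_id,
            "chunk_no": i,
            "chunk_count": total,
            "data": payload[i * chunk_bytes:(i + 1) * chunk_bytes],
        }
        for i in range(total)
    ]


def reassemble(chunk_items: list[dict]) -> bytes:
    """Re-create the original payload from its chunk items."""
    ordered = sorted(chunk_items, key=lambda c: c["chunk_no"])
    if not ordered or len(ordered) != ordered[0]["chunk_count"]:
        raise ValueError("one or more chunks are missing")
    return b"".join(c["data"] for c in ordered)


payload = bytes(1_000_000)  # a 1 MB payload, too large for a single item
chunks = split_into_chunk_items("report-42", payload)
restored = reassemble(chunks)
```

Because every chunk shares the same item_id, an application could fetch all of the chunks for one item with a single query on that key and then pass the results to reassemble.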

2. To handle frequent data updates

Admins provision DynamoDB tables with capacity units for read and write operations. One write capacity unit covers one write per second for an item up to 1 KB in size; one read capacity unit covers one strongly consistent read per second, or two eventually consistent reads per second, for an item up to 4 KB in size. If an organization constantly updates data, it will quickly consume the provisioned write units and will need to raise the limits -- which isn't cheap -- to avoid throttling and performance issues.
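To make the capacity arithmetic concrete, here is a small sketch based on the rounding rules in the AWS documentation: writes are metered in 1 KB steps, and reads are metered in 4 KB steps, with eventually consistent reads costing half as much. The function names are illustrative and not part of any AWS SDK.

```python
import math


def write_capacity_units(item_bytes: int) -> int:
    """Units consumed by one write: item size rounded up to the next 1 KB."""
    return max(1, math.ceil(item_bytes / 1024))


def read_capacity_units(item_bytes: int, strongly_consistent: bool = True) -> float:
    """Units consumed by one read: item size rounded up to the next 4 KB.

    An eventually consistent read costs half of a strongly consistent one.
    """
    units = max(1, math.ceil(item_bytes / 4096))
    return units if strongly_consistent else units / 2


# A 3,500-byte item costs 4 units to write but only 1 unit to read.
write_cost = write_capacity_units(3500)
read_cost = read_capacity_units(3500)
```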

In some situations, an application might become slow or even totally unreachable as a result. To reduce that risk, keep frequently updated fields in smaller, normalized items rather than in one large item, because Amazon DynamoDB calculates the write capacity an update consumes from the size of the entire item, not from the portion of the item that changes.
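The effect of item size on update cost can be illustrated with a rough comparison. The sketch below assumes string-only items and estimates their size as the UTF-8 length of attribute names plus values; the attribute names and sample values are hypothetical.

```python
import math


def approx_item_bytes(item: dict) -> int:
    """Rough size estimate for a string-only item: names plus values in UTF-8."""
    return sum(len(k.encode("utf-8")) + len(str(v).encode("utf-8"))
               for k, v in item.items())


def write_units(item_bytes: int) -> int:
    """Write units per update: item size rounded up to the next 1 KB."""
    return max(1, math.ceil(item_bytes / 1024))


# One large item: every update to last_seen is billed for the whole item.
combined = {
    "user_id": "u-1",
    "bio": "x" * 20_000,
    "last_seen": "2018-07-31T12:00:00Z",
}

# Normalized: the frequently updated field lives in its own small item.
hot = {"user_id": "u-1", "last_seen": "2018-07-31T12:00:00Z"}

combined_cost = write_units(approx_item_bytes(combined))
hot_cost = write_units(approx_item_bytes(hot))
```

In this example, each update to the combined item would consume about 20 write units, while the same update against the small item would consume one.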

3. To match expected application behavior

If admins can separate their application data into frequently accessed and infrequently accessed tables, they can normalize the data along those lines and save money by giving each table its own read and write capacity unit configuration. This separation isn't easy to achieve in most modern web and mobile applications, but admins can monitor how an application uses Amazon DynamoDB to find opportunities to optimize performance and cut costs.
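As a sketch of what separate capacity configurations might look like, the snippet below builds two table definitions in the shape of DynamoDB's CreateTable request. The table names, the pk key attribute and the capacity numbers are all hypothetical, and nothing here calls AWS; with boto3 installed and credentials configured, each dictionary could be passed to a DynamoDB client's create_table call as keyword arguments.

```python
def table_spec(name: str, read_units: int, write_units: int) -> dict:
    """Build keyword arguments in the shape of a CreateTable request."""
    return {
        "TableName": name,
        "KeySchema": [{"AttributeName": "pk", "KeyType": "HASH"}],
        "AttributeDefinitions": [{"AttributeName": "pk", "AttributeType": "S"}],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Frequently accessed data gets generous capacity (numbers are illustrative).
hot_table = table_spec("sessions_hot", read_units=200, write_units=100)

# Rarely accessed data gets a minimal allocation.
cold_table = table_spec("profiles_cold", read_units=5, write_units=1)


def provisioned_write_units(*specs: dict) -> int:
    """Total write units provisioned across the given table definitions."""
    return sum(s["ProvisionedThroughput"]["WriteCapacityUnits"] for s in specs)


total_write_units = provisioned_write_units(hot_table, cold_table)
```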

Of course, planning is key to any good database design, so an organization should review the official AWS documentation on relational modeling in Amazon DynamoDB before it changes a table's layout, to avoid rework that turns out to be unnecessary.
