For most admins, data normalization is a key concept that comes to mind when they think of a relational database.
But for users of Amazon DynamoDB -- which follows a NoSQL, nonrelational database model -- normalization could also play a role.
In a relational database, normalization helps ensure data integrity, as it involves structuring data in such a way that it's not stored multiple times. To achieve this, admins store data in different tables and connect those tables via relationships.
Data normalization and DynamoDB
While the normalization process is largely associated with relational databases, there are exceptions. For example, there is a type of NoSQL database -- called a wide-column store -- that uses tables, rows and columns. To accommodate unstructured data, the names and formats of the columns in these databases can vary from row to row. In essence, a wide-column store NoSQL database -- including Amazon DynamoDB -- is a two-dimensional key-value store.
So, when would data normalization make sense with Amazon DynamoDB? Here are three examples.
1. To store large items
The maximum item size in Amazon DynamoDB is 400 KB. To store items larger than this, admins can either use an Amazon S3 bucket or a separate, normalized DynamoDB table. With the DynamoDB table approach, they can break a large item into smaller chunks and then use relationships between those chunks to re-create the item in an application.
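The chunking approach can be sketched in a few lines of plain Python. This is a minimal illustration, not DynamoDB API code: the attribute names (`item_id`, `chunk_no`, `data`) and the chunk size are hypothetical choices an admin would tailor to their own table design.

```python
# Illustrative sketch: split a large payload into chunk items that each
# fit under DynamoDB's 400 KB item limit, then reassemble them.
CHUNK_SIZE = 350 * 1024  # stay safely under the 400 KB cap, leaving room for keys


def split_item(item_id: str, payload: bytes, chunk_size: int = CHUNK_SIZE):
    """Break a large payload into chunk items keyed by (item_id, chunk_no)."""
    return [
        {"item_id": item_id, "chunk_no": i, "data": payload[start:start + chunk_size]}
        for i, start in enumerate(range(0, len(payload), chunk_size))
    ]


def reassemble(chunks):
    """Re-create the original payload from its chunks, in order."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["chunk_no"]))
```

Each chunk would be written as its own item in the normalized table, with `item_id` as the partition key and `chunk_no` as the sort key, so a single query can fetch and reassemble the whole payload.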
2. Frequent data updates
Admins provision DynamoDB tables with capacity units for read and write operations. A write capacity unit is defined as one write per second for an item up to 1 KB in size. If an organization constantly updates data, it will quickly consume the provisioned write units and will need to increase that capacity -- which isn't cheap -- to avoid performance issues.
In some situations, an application might become slow or even unreachable as a result. If this is the case, update normalized data -- the smaller, necessary fields stored as their own items -- rather than large, denormalized items, because Amazon DynamoDB calculates write consumption based on the entire item's size, not just the portion of the item being updated.
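The arithmetic behind this advice is straightforward. The sketch below assumes the standard rule that a write consumes one capacity unit per 1 KB of item size, rounded up; the 200 KB item and 0.5 KB field sizes are hypothetical.

```python
import math


def write_units(item_size_bytes: int) -> int:
    """Write capacity units consumed: one unit per 1 KB of item size, rounded up."""
    return max(1, math.ceil(item_size_bytes / 1024))


# Updating one small field inside a 200 KB denormalized item still
# costs the full item size on every write:
full_item_cost = write_units(200 * 1024)

# The same field stored as its own 0.5 KB normalized item:
normalized_cost = write_units(512)
```

Under these assumptions, each update of the denormalized item consumes 200 write units, while the normalized item consumes just one -- a 200x difference for the same logical change.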
3. Expected application behavior
If admins can organize their application data into separate tables for frequently accessed versus rarely accessed data, they can apply data normalization and save money with different read and write capacity unit configurations for each table. This isn't easy in most modern web and mobile applications, but admins can monitor how an application uses Amazon DynamoDB to help optimize performance and cut costs.
Of course, planning is key to any good database design, so an organization should review official AWS documentation on relational modeling in Amazon DynamoDB before it makes any major changes to a table's layout.