We're currently working on an automated mechanism, so our Amazon EBS volumes will be backed up on a daily basis. We know how to create snapshots, but we're concerned about the size of some of them. If we have a ton of data, we'll have a significant increase in the invoice for each backup (since we will be charged on the size). But here's our question: If our backup is incremental and we're only uploading the modified data, where's our original data?
I think the confusion here is about how incremental backups work. A typical backup schedule involves something like one full backup per week, with daily incremental backups in between. Amazon EBS backs up data at the block level, and that includes its incremental backups, so here's how that works.
- Your first backup must be a full one, which stores all blocks in a compressed format in Amazon Simple Storage Service (S3).
- Your next backup can be incremental, which stores only the blocks of data that changed since the first backup. These are also kept in S3.
- Any backup after that can also be incremental, which again stores only the blocks that changed since the previous incremental backup.
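Let's say you now restore your third backup. The restore first has to load your last full backup (Backup 1), then apply the changes from Backup 2, then those from Backup 3, layering each set of changes on top of the last full backup. So your original data is still there, in the first backup, and with EBS snapshots the service does this bookkeeping for you, so any snapshot can be used to restore the complete volume. This is similar to how many version control systems operate: they store the original and then just the incremental changes each time a new commit is created.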
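To make the chain of backups concrete, here is a minimal sketch of block-level full and incremental backups in plain Python. It is only an illustration of the idea, not how Amazon EBS is implemented; the function names and the four-block "volume" are hypothetical, and it assumes the volume keeps a fixed number of blocks.

```python
from typing import Dict, List

# A backup maps block index -> block contents (hypothetical representation).
Blocks = Dict[int, bytes]


def full_backup(volume: List[bytes]) -> Blocks:
    """Store every block of the volume."""
    return {i: data for i, data in enumerate(volume)}


def incremental_backup(volume: List[bytes], previous: List[bytes]) -> Blocks:
    """Store only the blocks that differ from the previously backed-up state."""
    return {i: data for i, data in enumerate(volume) if data != previous[i]}


def restore(chain: List[Blocks]) -> List[bytes]:
    """Replay the full backup first, then each incremental backup in order."""
    merged: Blocks = {}
    for backup in chain:
        merged.update(backup)  # later backups overwrite earlier blocks
    return [merged[i] for i in range(len(merged))]


# Three days of a tiny four-block volume; one block changes each day.
day1 = [b"A", b"B", b"C", b"D"]
day2 = [b"A", b"B2", b"C", b"D"]
day3 = [b"A", b"B2", b"C", b"D3"]

backup1 = full_backup(day1)               # all four blocks
backup2 = incremental_backup(day2, day1)  # only block 1
backup3 = incremental_backup(day3, day2)  # only block 3

restored = restore([backup1, backup2, backup3])
```

Restoring the chain reproduces the day-three volume even though Backup 2 and Backup 3 each hold a single block. The unchanged blocks come from Backup 1.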
The advantage of incremental backups like this is that if you have a system that does not change very much, only what has changed needs to be stored again. Those changes are usually small, so they require significantly less storage space and less time to upload.
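A rough back-of-the-envelope comparison shows the size of that saving. The numbers below (a 500 GiB volume, 2 percent of blocks changing per day, 30 daily backups) are made-up assumptions for illustration, and the sketch ignores compression and any pricing details.

```python
from typing import Tuple


def stored_gib(volume_gib: int, daily_change_pct: int, days: int) -> Tuple[float, float]:
    """Return (GiB stored with a full backup every day,
    GiB stored with one full backup plus daily incrementals)."""
    full_every_day = float(volume_gib * days)
    # One full backup, then only the changed blocks on each following day.
    incremental = volume_gib + volume_gib * daily_change_pct * (days - 1) / 100
    return full_every_day, incremental


# Hypothetical workload: 500 GiB volume, 2% daily change, 30 days of backups.
full_total, incremental_total = stored_gib(volume_gib=500, daily_change_pct=2, days=30)
print(f"full every day: {full_total} GiB, incremental: {incremental_total} GiB")
```

Under these assumptions the incremental schedule stores 790 GiB in total, against 15,000 GiB for a full backup every day.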
The disadvantage of incremental backups is that the restore time will almost always be longer, because the full volume has to be rebuilt from the last full backup plus each batch of changes.
I strongly recommend that you adjust your Amazon EBS backup policy to keep at least one full backup per month and to keep your daily backups as incremental backups. You can also clean out all but one backup for each prior month to trim the amount of storage you need to keep long term. When you delete an EBS snapshot, only the blocks that no other snapshot still references are removed, so the snapshots you keep remain restorable.
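One way to express such a retention policy is sketched below in plain Python. It only decides which snapshot dates to keep and makes no AWS calls; the function name, the seven-day window for daily backups, and the choice to keep the earliest snapshot of each month are all assumptions you would adapt to your own policy.

```python
from datetime import date, timedelta
from typing import Dict, List, Set, Tuple


def snapshots_to_keep(snapshot_dates: List[date], today: date,
                      daily_days: int = 7) -> Set[date]:
    """Keep every snapshot from the last `daily_days` days, plus the
    earliest snapshot of each month among the older ones."""
    cutoff = today - timedelta(days=daily_days)
    keep: Set[date] = set()
    first_of_month: Dict[Tuple[int, int], date] = {}
    for d in sorted(snapshot_dates):
        if d > cutoff:
            keep.add(d)  # recent daily snapshot
        else:
            # Remember only the first (earliest) snapshot seen for this month.
            first_of_month.setdefault((d.year, d.month), d)
    keep.update(first_of_month.values())
    return keep


# Hypothetical history: one snapshot per day from 2024-01-01 to 2024-03-10.
all_dates = [date(2024, 1, 1) + timedelta(days=i) for i in range(70)]
keep = snapshots_to_keep(all_dates, today=date(2024, 3, 10))
delete = sorted(set(all_dates) - keep)
```

With this sample history, the policy keeps the last seven daily snapshots plus one snapshot each for January, February and the early part of March, and marks the rest for deletion.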
Related Q&A from Chris Moyer