New Backup Format
In the new backup format, backup data is always stored on backup storage as a data container, regardless of the backup type. This keeps backup plans completely independent of each other: every backup plan is a separate entity that delivers its backup data to a separate location on backup storage, which avoids interference between plans.
Backup data is divided into blocks, and the block is the main operating entity. During an upload to the cloud, blocks are combined into parts whose size can vary. The part size depends on the upload speed and on backup storage provider limitations. Uploading by parts makes it possible to resume an upload after a backup is interrupted.
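The grouping of blocks into parts can be illustrated with a minimal sketch. It is not the product's implementation: the function names, the fixed part size limit, and the "count of uploaded parts" used for resuming are all assumptions made for illustration. In the real format the part size varies with upload speed and provider limits.

```python
from typing import List


def group_blocks_into_parts(block_sizes: List[int], part_size_limit: int) -> List[List[int]]:
    """Group consecutive block indices into parts of at most part_size_limit bytes.

    Hypothetical helper: a block larger than the limit is placed in a part of its own.
    """
    parts: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for index, size in enumerate(block_sizes):
        # Close the current part if adding this block would exceed the limit.
        if current and current_size + size > part_size_limit:
            parts.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        parts.append(current)
    return parts


def remaining_parts(parts: List[List[int]], uploaded_part_count: int) -> List[List[int]]:
    """Return the parts that still have to be sent after an interruption (assumed model)."""
    return parts[uploaded_part_count:]


# Five 4-byte blocks with a 10-byte part limit.
parts = group_blocks_into_parts([4, 4, 4, 4, 4], part_size_limit=10)
print(parts)                      # [[0, 1], [2, 3], [4]]
print(remaining_parts(parts, 1))  # [[2, 3], [4]]
```

Because completed parts are tracked as whole units in this sketch, an interrupted upload only needs to resend the parts that were not finished, not the entire backup.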
The key features of the backup format are:
- Grandfather-Father-Son (GFS) Retention Policy
- Restore Verification
- Immutability (BETA)
- Client-Side Deduplication
- Consistency Checks
- Synthetic Backup for file-level, image-based, and VMware backups
- Modified Block Tracking for Image-Based Backup
- Restore on Restore Points
- The number of requests to storage is reduced significantly
- Uploading by data parts enables continued upload in case of network issues
- Any characters (emoji, 0xFFFF, etc.) and extra-long filenames are supported
- Filename encryption out of the box (one password per generation)
- Real full backup for file-level backups
- Fast synchronization (reduced number of objects in backup storage)
- Plan configuration is always included in a backup
- Backup logs are backed up along with backup data
- Object size is now limited to approximately 2PB regardless of the storage provider limitations
- Fast purge (reduced number of objects on backup storage, deletion of the whole generation database)
- Password Hint
- Faster backup and restore for a large number of small files
- Lower costs for a large number of small files (does not apply to S3 Standard-IA with its 128 KB limit)
Currently, the new backup format is supported for the following backup types:
- File backup
- Image-based backup
- VMware backup
- Hyper-V backup
Terms and Definitions
This section defines several new terms and entities that are used throughout the rest of the documentation.
A Backup Plan determines the configuration of the backup data sent to a backup destination. The configuration contains a number of parameters:
- Backup Data
- Retention Policy
- Backup Plan Run Schedule
A Bunch is the representation of a backup plan in the main database. A Bunch is linked to a directory in the database, which in turn is linked to a destination. The destination can be modified. A Bunch is always unique within the cloud folder and the plan type. This approach makes it easy to delete data on cloud storage, since all backup content is stored in one directory.
A Generation is a complete, self-contained data set sufficient for data restoration. In other words, a generation is a full backup together with its chain of incremental backups for a specific backup plan.
A Restore Point is a partial data set for restore. A full-fledged restore point contains at least one file or directory. If a restore point does not contain any file or directory, it is considered empty but successful, and it can still contain blocks used by subsequent runs. A valid Restore Point guarantees a correct restore of the backed-up data. Conversely, an invalid Restore Point does not contain a complete data set for restore, but it can still contain blocks that are used for restore from other Restore Points.
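The relationship between full backups, incremental backups, and generations can be sketched as follows. This is an illustrative model only: the `Generation` class, the run identifiers, and the `"full"` / `"incremental"` labels are assumptions, not the structure of the actual backup database.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Generation:
    """Hypothetical model: one full backup plus the incremental runs that follow it."""
    full: str
    incrementals: List[str] = field(default_factory=list)


def group_into_generations(runs: List[Tuple[str, str]]) -> List[Generation]:
    """Group chronologically ordered (run_id, kind) pairs into generations.

    Each full backup starts a new generation; each incremental backup is
    appended to the chain of the most recent generation.
    """
    generations: List[Generation] = []
    for run_id, kind in runs:
        if kind == "full":
            generations.append(Generation(full=run_id))
        elif kind == "incremental":
            if not generations:
                raise ValueError("incremental backup without a preceding full backup")
            generations[-1].incrementals.append(run_id)
        else:
            raise ValueError(f"unknown backup kind: {kind}")
    return generations


runs = [("r1", "full"), ("r2", "incremental"), ("r3", "incremental"), ("r4", "full")]
for generation in group_into_generations(runs):
    print(generation.full, generation.incrementals)
```

In this model, every generation can be restored or deleted without touching any other generation, which matches the "self-contained" property described above.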
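The restore point states described above can be summarized in a small sketch. The `RestorePoint` fields and the `describe` function are hypothetical names chosen for illustration, and the completeness flag is assumed to be computed elsewhere; this is one possible reading of the definitions, not the product's logic.

```python
from dataclasses import dataclass


@dataclass
class RestorePoint:
    """Hypothetical model of a restore point."""
    item_count: int    # number of files and directories in the restore point
    block_count: int   # number of data blocks stored with the restore point
    is_complete: bool  # assumed flag: the data set is complete for restore


def describe(restore_point: RestorePoint) -> str:
    """Classify a restore point according to the definitions in this section."""
    if not restore_point.is_complete:
        # Incomplete data set; its blocks may still serve other restore points.
        return "invalid"
    if restore_point.item_count == 0:
        # Successful run without files or directories; blocks may serve later runs.
        return "empty"
    return "valid"


print(describe(RestorePoint(item_count=3, block_count=10, is_complete=True)))   # valid
print(describe(RestorePoint(item_count=0, block_count=5, is_complete=True)))    # empty
print(describe(RestorePoint(item_count=2, block_count=4, is_complete=False)))   # invalid
```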