Commvault

Data Protection Methods



Data Protection Overview

Protecting data is critical to safeguarding important information against corruption or loss. Considering a broad range of disaster scenarios is essential when designing a data protection strategy. These scenarios can range from a deleted file or corrupted virtual machine to a crashed server or a complete data center disaster. The more situations you plan for, the better prepared you are if one occurs.

Commvault® software provides several methods to help you achieve data protection. Each method affects the Recovery Point Objective (RPO) and Recovery Time Objective (RTO), which determine the best data backup plan. Understanding Commvault's data protection methods and their strengths and benefits results in an optimal architectural design that meets your data protection needs.



Data Protection Methods

Commvault® software provides the following data protection methods for your Windows Servers, applications, data and systems:

  • Traditional Backups
  • Archiving
  • Snapshots
  • Replication
  • Block-Level Backups


Traditional Backups

Traditional backups to tape or disk protect data by backing up each object to protected storage. One advantage of traditional backups is that each protected item is a complete, separate copy written to separate media. When using tape media, the backup is also portable.

Many modern backup solutions incorporate traditional backups to disk storage, which is then replicated to a Disaster Recovery (DR) site. For example, Commvault deduplication and DASH Copy use traditional backups with scheduled replication, where only changed blocks are transmitted to the DR location.
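
To illustrate the changed-block idea behind this kind of copy, the following Python sketch compares block hashes against what the DR copy already holds and transmits only the blocks that differ. It is a simplified illustration only; the block size and hashing scheme are assumptions, not Commvault's DASH Copy implementation.

  import hashlib

  BLOCK_SIZE = 128 * 1024  # illustrative block size; the real block size may differ

  def split_blocks(data, size=BLOCK_SIZE):
      """Split a byte stream into fixed-size blocks."""
      return [data[i:i + size] for i in range(0, len(data), size)]

  def changed_blocks(source, dr_copy_hashes):
      """Return only the blocks whose hashes are not already present at the DR site."""
      to_send = []
      for block in split_blocks(source):
          digest = hashlib.sha256(block).hexdigest()
          if digest not in dr_copy_hashes:      # block already exists at DR, so skip it
              to_send.append((digest, block))
      return to_send

  # Example: after the first copy, only blocks changed since then are transmitted.
  first = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
  dr_hashes = {hashlib.sha256(b).hexdigest() for b in split_blocks(first)}
  second = b"A" * BLOCK_SIZE + b"C" * BLOCK_SIZE       # one block modified
  print(len(changed_blocks(second, dr_hashes)))        # -> 1 block to transmit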

Traditional backups are usually slower than some modern protection technologies. Time-consuming backups can have a negative effect on Service Level Agreements (SLAs). This performance bottleneck is more severe when millions of items, such as large file repositories, require protection.

While traditional backups have both advantages and drawbacks, this method remains the most common and cost-effective approach because it is a reliable data protection solution.

Illustration of a traditional backup




Archiving

Data Archiving is not technically a data protection technology, but it can be used to improve SLAs. Archiving removes infrequently accessed data from production disks and moves it to less expensive secondary storage. The archived data can also be recalled by end users or Commvault® administrators.

Illustration of Commvault® archiving




Snapshots

Snapshots are logical point-in-time views of source volumes that are created almost instantaneously. This allows for shorter RPOs, since snapshots can be taken more frequently throughout the day. A snapshot alone is not truly considered a DR protection strategy, since the protected data is not physically moved to separate media.
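
The following Python sketch shows, in a simplified copy-on-write style, why a snapshot can be created almost instantly: nothing is copied at snapshot time, and a block's original contents are preserved only when production later overwrites it. This is a conceptual illustration, not how any particular storage array or Commvault feature is implemented.

  class Volume:
      """A toy volume made of numbered blocks."""
      def __init__(self, blocks):
          self.blocks = dict(enumerate(blocks))

  class Snapshot:
      """Copy-on-write snapshot: created instantly, preserves old blocks only when they change."""
      def __init__(self, volume):
          self.volume = volume
          self.preserved = {}          # original contents of blocks overwritten after snapshot time

      def read(self, index):
          # A preserved block gives the point-in-time view; otherwise the live block is unchanged.
          return self.preserved.get(index, self.volume.blocks[index])

  def write_block(volume, snapshots, index, data):
      """Before overwriting, save the old block contents for every active snapshot."""
      for snap in snapshots:
          snap.preserved.setdefault(index, volume.blocks[index])
      volume.blocks[index] = data

  vol = Volume([b"jan", b"feb", b"mar"])
  snap = Snapshot(vol)                       # taken instantly, no data copied yet
  write_block(vol, [snap], 1, b"NEW")        # production keeps changing
  print(vol.blocks[1], snap.read(1))         # b'NEW' b'feb'  (point-in-time view preserved)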

Advanced snapshot technologies allow for data to be mirrored or vaulted to separate physical disks, which can be located at off-site DR locations. Snapshot technologies are used to meet strict SLA requirements, but are considerably more expensive to implement because dedicated hardware is required. Commvault's Continuous Data Replicator (CDR) is a software-based snapshot and replication technology, which is a cost-effective alternative to hardware snapshots.

For supported hardware and for CDR, IntelliSnap® technology is used to create and manage snapshots.


Illustration of snapshots




Replication

Replication technology reproduces block or object changes from a source volume to a destination volume. Replication methods use synchronous or asynchronous replication to synchronize source and destination volumes in a one-to-one, one-to-many (fan-out), or many-to-one (fan-in) strategy. Replicating production data provides fast SLAs for high availability; replicating backup or snapshot data provides a more complete DR solution.

If corruption occurs at the source, it may be replicated to the destination. Therefore, replication should be used along with point-in-time snapshots.


Illustration of a replication scenario




Block-Level Backups

Block-level backups use a Commvault® driver and the Volume Shadow Copy Service (VSS) to quiesce the volume and take a software snapshot of it. A block-level backup protects data much more quickly than a traditional object-level backup. Indexes can optionally be generated for granular recovery, or a live browse operation can virtually mount the volume to provide full or granular recovery.
Block-level backups are recommended for volumes with a large number of small files.
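
The following back-of-the-envelope Python calculation illustrates why per-object overhead dominates when millions of small files are protected and why reading the volume at the block level avoids that cost. All figures (file count, per-object overhead, throughput) are assumed values for illustration only, not Commvault measurements.

  files = 5_000_000                 # small files on the volume (assumed)
  avg_size_kb = 8                   # average file size (assumed)
  per_file_overhead_s = 0.002       # open/stat/index cost per object (assumed)
  throughput_mb_s = 200             # sustained read throughput (assumed)

  data_gb = files * avg_size_kb / 1024 / 1024
  object_level_hours = (files * per_file_overhead_s + data_gb * 1024 / throughput_mb_s) / 3600
  block_level_hours = (data_gb * 1024 / throughput_mb_s) / 3600   # reads blocks, no per-file cost

  print(f"{data_gb:.0f} GB, object-level ~ {object_level_hours:.1f} h, "
        f"block-level ~ {block_level_hours:.1f} h")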

Illustration of block-level backup




Index vs. Non-Index vs. Live Browse Based Jobs

Commvault® software provides several methods to back up and restore data that has been lost, accidentally deleted, corrupted, or made inaccessible. The following methods offer different benefits depending on your situation:

  • Index-based jobs
  • Non-index-based jobs
  • Live browse-based and deferred index-based jobs

Index-Based Jobs

For many agents, Commvault® software provides granularity during restores. For example, you can use the file system agent to recover a single deleted file without restoring the entire drive. To restore that single file, Commvault software indexes the backup jobs. The index contains information about all objects protected in the job, as well as the chunk information needed to locate each object in the library. A data protection job for such agents is referred to as an index-based job.
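
As a simplified illustration of what a job index makes possible, the following Python sketch maps each protected object to the chunk that holds it, so a single object can be located without reading the whole job. The field names and layout are illustrative and do not reflect Commvault's actual index format.

  # Illustrative index: object path -> where its data lives in the library.
  job_index = {
      "C:\\users\\alice\\report.docx": {"chunk": "CHUNK_0001", "offset": 0,      "size": 52_133},
      "C:\\users\\alice\\photo.jpg":   {"chunk": "CHUNK_0001", "offset": 52_133, "size": 810_502},
      "C:\\users\\bob\\notes.txt":     {"chunk": "CHUNK_0002", "offset": 0,      "size": 1_024},
  }

  def granular_restore(path):
      """Locate a single object in the library without reading the entire job."""
      entry = job_index[path]
      print(f"Restore {path}: read {entry['size']} bytes "
            f"from {entry['chunk']} at offset {entry['offset']}")

  granular_restore("C:\\users\\bob\\notes.txt")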


Index based data protection job workflow



Non-Index-Based Agents

Other agents, such as database agents, may only support a full database recovery and do not rely on indexes. These are referred to as non-index-based agents. However, not all database agents are non-index based. For instance, the Oracle agent offers granularity that requires its protection jobs to be indexed.


Non-index based data protection job workflow



Live Browse-Based and Deferred Index-Based Jobs

A few agents offer the capability to either run an indexed job or defer the indexing process until the browse operation. These are referred to as live browse-based jobs and deferred index-based jobs.

Live Browse Jobs

Live browse allows some agents to mount the backup from the library and display its content in the console. An example of a live browse-based agent is the Virtual Server Agent (VSA), which, when configured to do so, mounts the VM from backups to recover granular files and folders in guest VMs.

Deferred Index Jobs

Another option for some agents is to defer the creation of a backup job's index. This is referred to as a deferred-index job. An example is an IntelliSnap® snapshot, where index creation can be deferred to a later time. In this case, the snapshot created during the backup job is mounted at the scheduled time and the index is generated. This is useful for completing the backup quickly and moving index creation outside peak hours.


Deferred index (Live Browse) based data protection job workflow





Backup Types

Commvault® software provides the following backup types for protecting data:

  • Full
  • Incremental
  • Differential
  • Synthetic Full
  • DASH Full

Full

A full backup protects all data within a subclient each time it is run, providing the most complete level of protection. It also provides the fastest recovery time, since all data is contained within a single job.
Full backups require the most storage space and take the longest time to complete.


Full backup process



Incremental

Incremental backups protect all data that has been added or modified since the last successful backup operation. Based on the average incremental rate of change and growth, incremental backups should remain consistent in size. An incremental backup is considered a dependent operation, because it depends on previous incremental backups and on the full backup that started the cycle.
For a full restore of data, the full and all incremental backups are required.
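
The following Python sketch illustrates that dependency: a full restore needs the most recent full plus every incremental that followed it. The job structure shown is illustrative only.

  # Illustrative only: which jobs a full restore needs when the cycle uses incrementals.
  cycle = [
      {"id": 101, "type": "full"},
      {"id": 102, "type": "incremental"},
      {"id": 103, "type": "incremental"},
      {"id": 104, "type": "incremental"},
  ]

  def jobs_needed_for_restore(jobs):
      """A full restore replays the full backup plus every incremental that followed it."""
      latest_full = max(job["id"] for job in jobs if job["type"] == "full")
      return [job["id"] for job in jobs if job["id"] >= latest_full]

  print(jobs_needed_for_restore(cycle))   # [101, 102, 103, 104]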


Incremental backup process


Differential

A differential job backs up all data that has been modified or added since the last full backup. The first differential job following a full backup contains just the changes made since the full backup completed. As the cycle progresses and more differential backups run, each job grows larger and requires more storage, because every differential backs up all data changed or added since the full. Restores are slower than from a full backup alone, but faster than with incremental jobs, since only the full and the most recent differential are required for a complete restore.

Another advantage of differential jobs is that modified data is stored redundantly throughout the cycle as each differential completes. This can potentially limit data loss if a differential job is lost or damaged.
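
The following Python sketch uses assumed numbers (a 100 GB full and roughly 5 GB of change per day) to contrast how incremental and differential job sizes and restore chains grow across a cycle. The figures are purely illustrative.

  # Illustrative numbers only: 100 GB full, ~5 GB of new/changed data per day, 6 days.
  full_gb, daily_change_gb, days = 100, 5, 6

  incremental_sizes = [daily_change_gb] * days                             # each job holds one day of changes
  differential_sizes = [daily_change_gb * d for d in range(1, days + 1)]   # each job holds everything since the full

  print("incrementals:", incremental_sizes, "-> restore needs the full + all", days, "jobs")
  print("differentials:", differential_sizes, "-> restore needs the full + the latest job only")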


Differential backup process



Synthetic Full

A synthetic full backup synthesizes a full backup operation by copying previously backed up data into a new full backup job. It works by using the image file from the most recent backup. The image file contains a list of all objects that existed at the time the backup operation ran. The synthetic full uses the image to determine which objects require protection and copies those objects from previous backup jobs into a new synthetic full backup. No data is backed up from the production client, which can reduce the time required to generate the full backup compared with a traditional full backup.

For synthetic full backups to work properly, an initial full must be run, which provides the foundation on which the synthetic full backups are based. Incremental backups must be run after the initial full and after each subsequent synthetic full to ensure all required objects are in protected storage. When the synthetic full runs, it copies all required objects into a new synthesized full backup, which then becomes the foundation for the next synthetic full backup.
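
The following Python sketch illustrates the synthesized-full idea: the image lists the objects that still exist and which earlier job holds the latest protected version of each, and the new full is assembled entirely from those jobs. The structures are illustrative only and are not Commvault's on-disk format; no data is read from the production client.

  previous_jobs = {
      "FULL_1": {"a.txt": b"v1", "b.txt": b"v1"},
      "INCR_2": {"b.txt": b"v2", "c.txt": b"v1"},
  }
  # The image from the most recent backup lists the objects that still existed,
  # and which job holds the latest protected version of each.
  image = {"a.txt": "FULL_1", "b.txt": "INCR_2", "c.txt": "INCR_2"}

  def synthesize_full(image, jobs):
      """Copy the current version of every object from earlier jobs into a new full."""
      return {name: jobs[job_id][name] for name, job_id in image.items()}

  synthetic_full = synthesize_full(image, previous_jobs)
  print(sorted(synthetic_full))        # ['a.txt', 'b.txt', 'c.txt'] - foundation for the next cycle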

Synthetic Full key points:

  • Synthetic full backups are useful for large volumes or Exchange mailbox backups where many objects require protection or when the production client has very short operation windows.
  • Synthetic full backups work best on non-deduplicated disk storage. When using Commvault deduplication, use DASH Full backups instead of traditional synthetic full backups.
  • Using synthetic full backups on appliance-based deduplication devices can have a negative impact on performance. In some cases, performance can be slower than running regular full backups.

If you use a third-party deduplication solution, test this option before implementing it.


Synthetic full backup process




Copyright © 2021 Commvault | All Rights Reserved.