Commvault

eDiscovery Overview


eDiscovery is the process of proactively or reactively conducting content searches on information within an environment. It is most commonly used during litigation to discover information relevant to an investigation. The scope of an investigation is defined by custodians (individuals directly or indirectly related to the case), date ranges, and the information contained within messages and documents. The eDiscovery process requires several steps to identify, preserve, review, and produce responsive information.


eDiscovery investigation high level concept





The eDiscovery Process and Commvault® Software

Understanding the eDiscovery process using Commvault software is essential for a complete investigation. Communication between IT and legal teams is critical during all phases of the process.


The eDiscovery process using Commvault software includes the following steps:


  • Identify data for investigation including custodians, relevant data types, and date ranges.
  • Preserve data using IT holds or litigation holds.
  • Content index data.
  • Create a legal review workflow process.
  • Conduct basic and advanced searches for relevant information.
  • Move relevant items to Review sets for deeper analysis including tagging and adding comments.
  • Move responsive items to legal holds or export sets.
  • Produce responsive items for investigation as CAB, PST, NSF and HTML files.
  • Release non-relevant data from legal holds.


Identify data for investigation

The first part of identifying data is knowing the type and location of the data. This is primarily Email but can also include documents, which may be on servers or personal computers. The legal team should communicate as much information as possible to the IT team so they can quickly identify the location of the data to be preserved. Along with the type of data that must be preserved, the date range and relevant custodians must also be provided.
In modern environments, it is quite common for user data not to be in central locations. Commvault features such as end user desktop and laptop agents extend the reach of what data can be identified. An understanding of what users are doing with their data can assist when an investigation arises.
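
The scope that the legal team hands to IT (custodians, data types, and date range) can be pictured as a small data structure. The following is a minimal illustrative sketch in Python, not a Commvault API. The `Scope` and `Item` classes, their fields, and the `in_scope` helper are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Scope:
    """Hypothetical investigation scope supplied by the legal team."""
    custodians: frozenset
    data_types: frozenset   # for example {"email", "document"}
    start: date
    end: date


@dataclass(frozen=True)
class Item:
    """Hypothetical description of one protected message or file."""
    owner: str
    data_type: str
    created: date


def in_scope(item: Item, scope: Scope) -> bool:
    """Return True when the item matches custodian, data type, and date range."""
    return (
        item.owner in scope.custodians
        and item.data_type in scope.data_types
        and scope.start <= item.created <= scope.end
    )


# Assumed example values, not taken from any real case.
scope = Scope(
    custodians=frozenset({"alice", "bob"}),
    data_types=frozenset({"email"}),
    start=date(2019, 1, 1),
    end=date(2020, 12, 31),
)
```

An item is in scope only when all three criteria match, which is why an incomplete custodian list or date range from the legal team leads directly to data being missed during preservation.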


Preserve data

Once data is identified it must be preserved. In certain cases, especially when it is unclear what data must be preserved, an IT legal hold is implemented. This can be accomplished in several ways:


  • Disabling data aging operations for storage policy copies, clients, or the entire CommCell® environment.
  • Using Reference Copy to preserve files and Email messages on certain production systems. Reference Copy configuration is typically implemented by Commvault administrators with guidance from legal teams.
  • Using Case Manager to preserve files and Email messages owned by custodians. It is important to note that file ownership is tied to system ownership, which means that Case Manager is suitable for managing end user data such as laptops and desktops. Server data such as file shares and home folders can be included, but it is not tied to a specific custodian. Filter criteria such as file types and keywords can be used to determine which server data is preserved in the case. Case Manager implementation is typically handled by legal teams.


A critical point in the data preservation process is that data is managed independently from standard corporate retention policies. A standard policy of 30 to 60 days may be used for normal business operations, which would not be adequate for an investigation that may span multiple years. Disabling data aging is a temporary method to preserve data but can come at a significant cost in extra storage requirements. Reference Copy and Case Manager physically copy data to an alternate storage location, providing a more efficient long-term storage solution.
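
The gap between standard retention and the length of an investigation is simple date arithmetic. The sketch below assumes a simplified, purely day-based retention rule; real retention rules may include other criteria. The function names and the example dates are hypothetical.

```python
from datetime import date, timedelta


def aging_date(protected_on: date, retention_days: int) -> date:
    """Date on which data becomes eligible for aging under a day-based rule."""
    return protected_on + timedelta(days=retention_days)


def needs_hold(protected_on: date, retention_days: int, needed_until: date) -> bool:
    """True when standard retention expires before the investigation is done."""
    return aging_date(protected_on, retention_days) < needed_until


# Assumed example: data protected on 1 January 2021, investigation
# expected to need it until 1 January 2023.
protected_on = date(2021, 1, 1)
needed_until = date(2023, 1, 1)

# With a 60-day policy the data would age out long before the case ends.
hold_required = needs_hold(protected_on, 60, needed_until)
```

When `needs_hold` is true, one of the preservation methods listed above has to be applied before the aging date is reached.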


Content index data

Content indexing is required to conduct full content searches for Email messages and files. In some environments content indexing is an ongoing process. This allows investigations to be conducted with minimal communication with IT teams, although it is still critical to check with Commvault administrators to ensure all indexing operations are up-to-date based on the scope of the investigation. Content indexes can exist for the entire retention time of the data or indexes can be pruned prior to data exceeding retention. Any indexable data that exists in a Commvault environment can retroactively be content indexed. It is always important to establish the data types, custodians, and date ranges to ensure all required data is preserved and content indexed.
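
To illustrate why full content searches depend on an index, the following toy inverted index maps each word to the items that contain it. It is a conceptual sketch only and does not represent how the Commvault index engine works. The function names and sample messages are hypothetical.

```python
from collections import defaultdict


def build_index(documents: dict) -> dict:
    """Map each lowercase word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            # Strip simple trailing/leading punctuation from each token.
            index[word.strip(".,;:!?")].add(doc_id)
    return index


def search(index: dict, *words: str) -> set:
    """Return the IDs of documents that contain every given word."""
    hits = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*hits) if hits else set()


# Hypothetical sample data.
documents = {
    "msg-1": "Quarterly budget review attached.",
    "msg-2": "Lunch on Friday?",
    "msg-3": "Budget cuts announced for the project.",
}
index = build_index(documents)
budget_hits = search(index, "budget")
```

Items that were never passed through `build_index` cannot be found by `search`, which mirrors the point above: data that is preserved but not content indexed must be indexed before it can be searched.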


Legal Review Workflow Process

Although searches can immediately be conducted, a good practice is to establish a workflow process. This includes who will be investigating and at what stage they will be actively involved. At this point a workflow can be established by defining query sets and review sets. Multiple sets can be defined and permissions can be assigned to ensure a secure workflow process.


Conduct basic and advanced searches for relevant information

At the start of the review process, basic and advanced queries can be crafted to begin identifying relevant information. Queries are then modified to remove non-relevant information and narrow the scope of search results. Multiple queries can be crafted and saved to query sets to simplify the process and divide responsibilities when multiple legal team members are involved. It is critical that all queries are saved in a query set to ensure a complete and defensible investigation.
In some cases, relevant items are immediately exported or placed in legal hold retention policies. This is common for basic investigations or in early case assessment situations where items are to be exported and presented to others or analyzed using third-party software.
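
The idea of refining a broad query and saving every query to a query set can be sketched as follows. This is an illustrative model only; it does not use the Compliance Search query language, and the `Query` and `QuerySet` classes are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Query:
    """Hypothetical saved query: required keywords plus keywords to exclude."""
    name: str
    include: frozenset
    exclude: frozenset = frozenset()

    def matches(self, text: str) -> bool:
        words = set(text.lower().split())
        # All required words present and none of the excluded words present.
        return self.include <= words and not (self.exclude & words)


@dataclass
class QuerySet:
    """Hypothetical container that records every query run for the case."""
    name: str
    queries: list = field(default_factory=list)

    def save(self, query: Query) -> None:
        self.queries.append(query)


# Hypothetical sample messages.
messages = [
    "merger timeline draft",
    "merger newsletter digest",
    "team lunch menu",
]

broad = Query("merger-broad", frozenset({"merger"}))
narrow = Query("merger-no-newsletter", frozenset({"merger"}), frozenset({"newsletter"}))

case_queries = QuerySet("case-queries")
case_queries.save(broad)
case_queries.save(narrow)

broad_hits = [m for m in messages if broad.matches(m)]
narrow_hits = [m for m in messages if narrow.matches(m)]
```

Keeping both the broad and the narrowed query in the set preserves a record of how the result list was reduced, which supports the defensibility point made above.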


Move relevant items to Review sets

Once queries are crafted, relevant items can be moved to one or more review sets. Review sets provide a more granular method of investigation where comments can be added to items and items can be tagged. Multiple review sets can be created in a cascading manner when multiple levels of investigation are required. For example, the primary review set can be used for a high-level investigation to identify and tag potentially relevant items. These items can then be moved to another review set for a deeper analysis. This also is used to create a workflow where different individuals at different phases of the investigation can analyze items for relevance, non-relevance, or attorney-client privilege.
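
The cascading review set workflow described above can be modeled in a few lines. The sketch is conceptual and not based on any Commvault API. The `ReviewItem` and `ReviewSet` classes, the `promote` method, and the tag name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewItem:
    """Hypothetical item under review, carrying tags and reviewer comments."""
    item_id: str
    tags: set = field(default_factory=set)
    comments: list = field(default_factory=list)


@dataclass
class ReviewSet:
    """Hypothetical review set holding items keyed by ID."""
    name: str
    items: dict = field(default_factory=dict)

    def add(self, item: ReviewItem) -> None:
        self.items[item.item_id] = item

    def promote(self, tag: str, target: "ReviewSet") -> int:
        """Move every item carrying `tag` into `target`; return how many moved."""
        moving = [i for i in self.items.values() if tag in i.tags]
        for item in moving:
            target.add(item)
            del self.items[item.item_id]
        return len(moving)


first_pass = ReviewSet("first-pass")
second_pass = ReviewSet("deep-review")

for item_id in ("msg-1", "msg-2", "msg-3"):
    first_pass.add(ReviewItem(item_id))

# High-level review: tag and comment on potentially relevant items.
first_pass.items["msg-1"].tags.add("potentially-relevant")
first_pass.items["msg-1"].comments.append("Mentions the disputed contract.")
first_pass.items["msg-3"].tags.add("potentially-relevant")

# Hand the tagged items to the next review level.
moved = first_pass.promote("potentially-relevant", second_pass)
```

Tags and comments travel with the item, so reviewers at the second level see the reasoning from the first pass.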


Move responsive items to legal holds or export sets

Throughout the investigation process, relevant items are reviewed to determine if they are responsive – all items within the scope of the investigation. These items can be moved to a legal hold or an export set. Both operations create physical copies of items and it is important to note that depending on how many items are included, this process can take some time.
A legal hold, in the context of the Commvault Compliance Search interface, copies items to a separate physical location and a specific retention is placed on these items. Legal hold retention policies are defined by the Commvault administrator with the assistance of legal teams. Legal hold retention policies can be named based on a case, retention terms such as 5-year hold, or any other naming convention required by the organization.
A common practice is to have a legal hold policy using infinite retention. This guarantees the preservation of data for the life of the investigation. Once the investigation is closed, the legal hold data can be released.
Export sets take selected items and immediately export them to a file format such as CAB, PST, NSF or HTML. The export process may take some time to complete depending on how many items must be copied to the export file. The export file can then be downloaded directly to the local computer. It is important to note that placing items into an export set does not change the retention of items in Commvault protected storage.


Produce responsive items

Once all items are placed in a legal hold or an export file, they can be exported outside of the Commvault environment.


Release data from legal holds

Legal holds, implemented by IT, can be released if it is determined that all responsive information has been produced or the investigation is closed. If it is uncertain whether all information has been gathered, it may be necessary to maintain the legal holds. Note that Commvault has additional features including Commvault OnePass archiving, storage to cloud, and SILO storage which can be implemented to hold on to data for extended periods of time.




Proactive and Reactive Investigations

When a case is initiated, it could be conducted as a reactive or proactive investigation. The differences between these will determine how data is identified, preserved, and indexed; as well as which Commvault eDiscovery tools provide the most efficient methods to process the case.


Reactive Investigation

A reactive investigation is typically a case where data is preserved in storage but is not indexed, or the indexes have been pruned. An example of this would be a harassment case involving a former employee and a manager. The scope of the investigation consists of Email messages from the relevant custodians dating back two years. The Email messages have been preserved but they have not been content indexed. The Commvault administrators will need to run content indexing jobs on the older data for the legal teams to conduct their searches to identify relevant items for the investigation.


Commvault® tools for a Reactive Investigation

In order to conduct a reactive investigation, the jobs the data resides in must be retained in Commvault storage. The jobs are picked or re-picked for content indexing. This may require Commvault administrators to place an IT legal hold on the jobs until it is known what data requires preservation and legal teams properly preserve all relevant information. Once the jobs are content indexed, there are several tools that can be used to identify and preserve relevant information.


  • Create a case using Case Manager to identify custodians, data types, and keywords to preserve data by copying relevant case items to a separate physical location.
  • Use Reference Copy to identify data types and keywords to preserve data by copying relevant items to a separate physical location. Note that data in a Reference Copy would require a separate content indexing job to make the data searchable in the Compliance Search interface.
  • Conduct searches using the Compliance Search interface and move relevant items into a legal hold policy.


Proactive Investigation

A proactive investigation is when custodians and data types are known to legal teams during an ongoing investigation. This allows a proactive preservation to occur by identifying, isolating and preserving relevant data into a secure physical location. An example of a proactive investigation would be the collection of all data relevant to a new product that is being developed. The preservation of data would include everyone involved with the product development as well as any Email messages and documents that contain the name or patent information of the product.
Another method for proactive investigation is identifying various custodians and risk levels. For example, corporate executives will have all Email messages preserved for seven years and content indexed for the duration of the preservation. Other users will have their Email messages preserved for three years but not have their messages content indexed. Note that if an investigation is required for these users, their data can be reactively content indexed for the investigation within the three-year period the data is being preserved.
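
The risk-level example can be expressed as a small lookup. The retention and indexing values come from the example above; the tier names, the `Tier` class, and the `search_status` helper are hypothetical and only illustrate the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """Hypothetical risk tier: how long data is kept and whether it is indexed."""
    retention_years: int
    indexed: bool


# Values taken from the example in the text; the tier names are assumptions.
TIERS = {
    "executive": Tier(retention_years=7, indexed=True),
    "standard": Tier(retention_years=3, indexed=False),
}


def search_status(tier_name: str, age_years: float) -> str:
    """Describe what is needed before data of a given age can be searched."""
    tier = TIERS[tier_name]
    if age_years > tier.retention_years:
        return "not preserved"
    if tier.indexed:
        return "searchable now"
    return "preserved; needs reactive content indexing"


executive_status = search_status("executive", 5)
standard_status = search_status("standard", 2)
```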


Commvault® Tools for a Proactive Investigation

There are several methods which can be used for proactive preservation of data:

  • Disable data aging on specific end user systems. This preserves the data even if the data has not been content indexed.
  • Create a case using Case Manager to identify custodians, data types, and keywords to preserve data by copying relevant case items to a separate physical location. This method requires data to be content indexed.
  • Use Reference Copy to identify data types and keywords to preserve data by copying relevant case items to a separate physical location. If keyword searches are not used, this method does not require content indexing. Note that data in a Reference Copy would require a separate content indexing job to make the data searchable in the Compliance Search interface.
  • Conduct searches using the Compliance Search interface and move relevant items into a legal hold policy. This method requires data to be content indexed.
  • Create multiple subclients, isolating custodians within each subclient and direct the subclients to various storage policies corresponding to retention requirements. Subclients can be selected to be content indexed or skipped for indexing.
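
The content indexing requirements stated in the list above can be summarized as a decision helper. This is a reading aid only: the function name and method labels are hypothetical, and the logic encodes nothing beyond what the list states.

```python
def available_methods(content_indexed: bool, uses_keywords: bool) -> list:
    """List the preservation methods usable for the given situation.

    content_indexed: whether the data has already been content indexed.
    uses_keywords:   whether preservation criteria rely on keyword searches.
    """
    # These two methods do not depend on content indexing.
    methods = ["Disable data aging", "Subclients mapped to storage policies"]

    # Reference Copy needs an index only when keyword searches are used.
    if content_indexed or not uses_keywords:
        methods.append("Reference Copy")

    # Case Manager and Compliance Search both require content indexed data.
    if content_indexed:
        methods += ["Case Manager", "Compliance Search to legal hold"]

    return methods


unindexed_keyword_case = available_methods(content_indexed=False, uses_keywords=True)
```

For data that has not been content indexed and where keywords matter, only the first two methods apply until an indexing job has run.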




Responsibilities for Legal and IT Teams During an Investigation

When using Commvault software and features during an investigation, it is critical for IT and legal teams to communicate. The responsibilities will shift during the various phases but it is important to note that no investigation is static. The scope may change based on evidence discovered during reviews. Additional data may need to be preserved and indexed. In some cases, this data may be in cold storage and adequate time must be provided to IT teams to make the data available for review.


During the beginning phases of an investigation (identify and preserve), Commvault administrators must work with legal teams to determine the scope of the investigation. Once relevant information has been identified and preserved, the legal teams take over, digging into the information to review and produce responsive information for the investigation. This provides a separation of powers and reduces IT responsibilities during the legal processes (review, analyze, and produce). If legal teams discover evidence that may affect the scope of the investigation, they must communicate with IT to ensure additional data is available for the legal teams to search.


eDiscovery IT and legal responsibilities high level concept



Preservation Methods

There are two methods for preserving data:


  • Proactive preservation
  • Reactive preservation


Proactive Preservation

Proactive preservation is used to identify, preserve and content index data that is actively being protected and retained. As new data is protected by Commvault software, indexing jobs can run to make the data immediately searchable. This provides a big advantage as legal teams can work more autonomously without the need to check with IT teams to content index data.


Reactive Preservation

Any indexable job in Commvault storage can be retroactively indexed. This is useful when conducting investigations that require searches on older data. Depending on the capacity of the index engine, content indexes may not be retained for as long as the data itself. For example, an investigation requires searches on data that is five years old. The retention on the data is seven years but the content index retention is only set to three years. The Commvault administrator can re-pick the five-year-old jobs to be content indexed. Once the indexing process is complete, legal teams can conduct searches using the Compliance Search interface.
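
The example above reduces to comparing the age of the data with two retention periods. The sketch below is a hypothetical helper, not a Commvault function, and it assumes the data was content indexed when it was first protected.

```python
def indexing_action(age_years: float,
                    data_retention_years: float,
                    index_retention_years: float) -> str:
    """Decide what is needed before data of a given age can be searched."""
    if age_years > data_retention_years:
        # The data itself is past retention, so there is nothing left to index.
        return "data aged out; nothing to index"
    if age_years > index_retention_years:
        # Data is still retained but its content index has been pruned.
        return "re-pick jobs for content indexing"
    return "index still available"


# Numbers from the example: five-year-old data, seven-year data retention,
# three-year content index retention.
action = indexing_action(age_years=5, data_retention_years=7, index_retention_years=3)
```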


The following table details responsibilities for both legal teams and IT teams during the Identification, Preservation, Review, and Production phases.





Identification Phase

IT Responsibilities

  • Coordinate with legal team regarding custodians, search scope, and data types.
  • IT must assess how the relevant information is currently being managed in the Commvault environment. All data within the legal team's defined scope must be included in Commvault's protected environment.

Legal Responsibilities

  • Legal must define the scope of the search including date ranges, relevant custodians, and types of data required (Email, document types).
  • Present this information to IT so they can begin preparing data for collection, preservation, and content indexing.


Preservation Phase

IT Responsibilities

  • Coordinate with the legal team regarding length of investigation.
  • An IT legal hold is required if processing and analysis of relevant data may continue beyond the scope of standard retention policies.
  • Configure subclients to define all relevant data (if necessary) and direct them to a Content Indexing enabled storage policy.
  • Configure a new or existing CI enabled storage policy to content index relevant subclient data.
  • Configure a legal hold storage policy for use by the legal team.

Legal Responsibilities

  • Legal teams must determine the length of the investigation.
  • Provide IT with the length of time data must be preserved so they can assess current data retention and destruction policies and determine whether an IT legal hold will be required.
  • If the data is going to be processed and analyzed within the currently defined data retention policies, then the legal team can perform any legal holds if required. Coordinate with IT so they can define legal hold policies within the Commvault software that will be used by the legal team.
  • If the investigation may extend beyond the scope of standard retention policies, IT must place data into IT legal hold.


Review Phase

IT Responsibilities

  • Security can be defined to permit certain users to have rights to search custodian data. Coordinate with the legal team to determine security requirements based on each member of the legal team.
  • IT can define Reference Copy policies and schedule them to run, which can be used for an ongoing investigation where new data must be collected daily. This can be beneficial in ongoing investigations where custodians are being actively monitored or if additional data is discovered after the initial searches.
  • Tags can be created by the legal team or a list can be provided to IT to create in the CommCell Console.
  • When jobs are submitted to legal hold or export, these jobs may take some time to run and there is the possibility of object failures if data cannot be retrieved from backups and archives. It is essential to monitor jobs and to configure alerts and reports. Determine what type of reports and alerts are required and which legal team members should receive them.

Legal Responsibilities

  • Conduct initial queries using basic or advanced search. Refinements to queries should be made to eliminate non-relevant data. Use the advanced search options to exclude non-relevant messages, date ranges, file types, and keywords. Strong knowledge of search and query language should be obtained through Commvault training.
  • When data is moved to a review set, ensure as much non-relevant data as possible has been excluded from queries. The purpose of the review set is to process documents and messages individually, and to comment on and tag items for relevance and follow-up.
  • Case Manager can be used by legal teams to proactively identify and preserve items. Items for custodians, data types, and review set items can be preserved for ongoing investigations.
  • Once all relevant data has been processed, items can be moved to a legal hold or export set.
  • When jobs are submitted to legal hold or export, the jobs may take some time to complete and there is the possibility of object failures if data cannot be retrieved from backups and archives. Coordinate with IT so they can set up all required alerts and reports for the legal team to receive.


Production Phase

IT Responsibilities

  • Once all relevant data has been processed, it is important for IT to remove any IT legal holds to ensure data protection complies with standard data retention and destruction policies.

Legal Responsibilities

  • Data can be exported to CAB files, PST (Exchange), or NSF (Lotus Notes).
  • Once all relevant data has been processed, coordinate with IT so they can remove any IT legal holds to comply with defined data retention and destruction policies.




Effective Data Preservation

Commvault software provides several methods for preserving data. It is important to note, from a legal and compliance perspective, that the physical separation of compliance data from normal backup data is essential. Consider locking down specific items for an investigation that may last five years on a disk array which is also storing normal backup data. These disks are working hard, backing up new data, running restore operations, auxiliary copy jobs, and pruning old data off. The potential for disk failures and data loss could result in losing months of investigative work. Commvault software provides several methods of locking data in place and also copying data to separate physical locations.


In-place preservation methods:

  • IT legal holds implemented by disabling data aging operations.
  • Case Manager cases where custodian data is not associated with a storage policy.


Physical copy preservation methods:

  • Reference Copy implemented from the CommCell® console.
  • Case Manager cases where custodian data is associated with a storage policy.
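
The two lists above amount to a simple classification, sketched here as a lookup function. The method labels and the `has_storage_policy` flag are hypothetical names chosen for illustration; the mapping itself is taken directly from the lists.

```python
def preservation_kind(method: str, has_storage_policy: bool = False) -> str:
    """Classify a preservation method as 'in place' or 'physical copy'."""
    if method == "it_legal_hold":
        # Implemented by disabling data aging; data stays where it is.
        return "in place"
    if method == "reference_copy":
        return "physical copy"
    if method == "case_manager":
        # Depends on whether custodian data is associated with a storage policy.
        return "physical copy" if has_storage_policy else "in place"
    raise ValueError(f"unknown preservation method: {method}")


case_with_policy = preservation_kind("case_manager", has_storage_policy=True)
```

For long-running investigations, the methods that return "physical copy" are the ones that achieve the separation from normal backup data recommended above.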






Copyright © 2021 Commvault | All Rights Reserved.