Separating the wheat from the chaff with eDiscovery
By AdvocateDaily.com Staff
In the final instalment of a two-part series on effective eDiscovery, Jason Bell-Masterson explores the review, predictive coding and document preparation phases.
To identify and produce relevant data as part of a lawsuit, investigation, or arbitration, the eDiscovery process often involves collecting gigabytes, or in some cases terabytes, of data, which can be both expensive and time-consuming to review, says Jason Bell-Masterson, director at the Toronto branch of the legal technology company Epiq.
Focusing on what’s pertinent at each stage, by eliminating extraneous information before review, can make the eDiscovery project more efficient and increase the likelihood of staying within prescribed deadlines and budget, Bell-Masterson tells AdvocateDaily.com.
“Filtering, search terms, de-duplication, and the later application of technology-assisted review tools, email threading, and near-duplicate detection can all help to zero in on specific information being sought, fine-tuning the data set,” he says.
Crafting relevant search terms, he says, is a delicate mix of art and science. The idea is to narrow the information so the results are not overly broad, without missing key material that may be useful, Bell-Masterson says.
“We’ll do multiple iterations of search terms, which involves trying one set and seeing what it returns and then trying another,” he says.
The search can be tailored by narrowing the terms or combining them and using proximity language, such as a key phrase or select words to provide more context, Bell-Masterson says.
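As a rough illustration of the proximity idea, the Python sketch below keeps only documents in which two terms appear within a set number of words of each other. The function name `within`, the sample documents and the five-word window are all hypothetical, chosen for illustration; commercial review platforms have their own search syntax, which this does not attempt to reproduce.

```python
import re


def tokenize(text):
    # Lowercase the text and split it into simple word tokens.
    return re.findall(r"[a-z0-9']+", text.lower())


def within(text, term_a, term_b, max_distance):
    # True if term_a and term_b occur within max_distance words of each other.
    words = tokenize(text)
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(i - j) <= max_distance for i in pos_a for j in pos_b)


# Hypothetical sample documents, invented for this example.
documents = {
    "doc1": "The contract was terminated after the breach was discovered.",
    "doc2": "Contract attached. Lunch on Friday? Also, the fence has a "
            "breach near the gate and needs repair.",
}

# Keep documents where "contract" appears within five words of "breach".
hits = [doc_id for doc_id, text in documents.items()
        if within(text, "contract", "breach", 5)]
```

Both documents contain both words, but only doc1 uses them close together, so a proximity search returns fewer and more relevant hits than a plain keyword match would.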
The filtered data can then be migrated into a review platform.
“Historically, you put a bunch of lawyers in a room on computers and go through every document,” much as in a paper review, to determine what’s relevant and privileged, he says.
“The more data you have, the more extensive it’s going to be, the longer it’s going to take, the less likely you’re going to end up on time or under budget.”
Bell-Masterson says analytic tools, which are constantly being improved, are useful at this stage to cut back on the time and cost to review the documents.
Email threading connects the messages in a conversation and identifies the most complete version of each thread, so reviewers can follow the exchange in sequence.
Predictive coding or technology-assisted review analyzes documents that reviewers have already coded and applies that coding across the set, ranking the material and showing the documents most likely and least likely to contain responsive material, he says.
“So you might just pick the top 20 per cent of the collection, for example, to review with some confidence that it will return 80 to 90 per cent of the relevant data,” Bell-Masterson says. “This process will give you most of the information you need to make an informed decision about the case itself.”
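The arithmetic behind that kind of cutoff can be sketched in a few lines of Python. This is only an illustration: the scores and relevance labels below are invented, the function name `recall_at_cutoff` is hypothetical, and the small sample is not meant to reproduce the 80 to 90 per cent figure quoted above.

```python
def recall_at_cutoff(scored_docs, fraction):
    # scored_docs: list of (score, is_relevant) pairs.
    # Returns the share of all relevant documents that fall within
    # the top `fraction` of the collection when ranked by score.
    ranked = sorted(scored_docs, key=lambda d: d[0], reverse=True)
    cutoff = int(len(ranked) * fraction)
    total_relevant = sum(1 for _, relevant in ranked if relevant)
    if total_relevant == 0:
        return 0.0
    found = sum(1 for _, relevant in ranked[:cutoff] if relevant)
    return found / total_relevant


# Hypothetical model scores and reviewer labels for ten documents.
sample = [
    (0.95, True), (0.90, True), (0.70, False), (0.55, False), (0.40, True),
    (0.35, False), (0.20, False), (0.15, False), (0.10, False), (0.05, False),
]

# Reviewing only the top 20 per cent (two documents) finds two of the
# three relevant documents in this made-up sample.
recall_top_20 = recall_at_cutoff(sample, 0.20)
```

In practice the relevance labels for the unreviewed portion are unknown, so recall has to be estimated, typically by sampling, rather than computed directly as it is here.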
In part one of the series, Bell-Masterson discusses budgetary limitations and how to lay out objectives and cull the information to create a focused data set.