Part 1 of this three-part article discussed the terminology of TAR: keyword search (both simple and advanced), conceptual search, and predictive coding. Part 2 will drill into two specific statistical methodologies commonly used in TAR, support vector-based and concept-based search, and will describe how they work, what they do and how they can be used. Part 3 will focus on the practical: What do the cases say? How can parties use TAR in a defensible way? How can TAR help to achieve proportionality?
“Concept-based” and “support vector-based” TAR
Thankfully, lawyers need not be trained statisticians who fully understand the theoretical minutiae behind TAR applications; however, it is important and helpful to understand certain broad categories of statistical TAR methodology. In both methods discussed below, a subject matter expert or “SME,” who has strong knowledge of the case, reviews an initial set of documents and makes a yes/no determination on whether each document is relevant. (The term “relevant” is used rather than “responsive” because the data set may not have been filtered for date ranges and other criteria specific to the case.) The SMEs need not be law firm partners, and in fact need not be lawyers at all, although senior members of the trial team who perform the initial review may find that they gain early visibility into the strength of the client's case, allowing early strategic analysis.
Additional review is performed until the system “stabilizes” (meaning the system has learned all it can in order to accurately predict relevance). The entire data set is then run through the algorithm, which scores or ranks the documents. At this point, lawyers and their eDiscovery experts use the ranking information to make decisions about the documents and their review.
Concept-based TAR systems translate the meaning of words used in context within a set of documents into mathematical models based on a conceptual index. Once the model has been built for a document set, a “find more like these” algorithm finds documents that are similar in conceptual content to those found in the original set. TAR applications using conceptual search engines require first that the legal team (lawyers and experts) establish the statistical parameters - the desired confidence level and confidence interval - followed by the random selection of documents for humans to review and determine relevance. (The confidence level, expressed as a percentage, describes how reliably the sampling method captures the true value: a 95-percent confidence level means that if the sampling were repeated 100 times, roughly 95 of the resulting intervals would contain the true percentage. The confidence interval, or margin of error, is the range around the sample estimate itself. For example, if a sample showed that 10 percent of the documents were relevant, a confidence interval of +/- 2 percent at a 95-percent confidence level would mean the true figure lies between 8 percent and 12 percent. A narrower confidence interval requires a larger sample size. For TAR, a confidence interval of +/- 2 percent is common.)
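The sample size implied by a chosen confidence level and confidence interval can be computed with the standard formula for estimating a proportion. This is a generic statistical sketch, not any vendor's implementation; the z-scores and the conservative 50-percent proportion default are textbook assumptions.

```python
import math

def sample_size(confidence_level: float, margin_of_error: float,
                proportion: float = 0.5) -> int:
    """Minimum random-sample size for estimating a proportion.

    Standard normal approximation: n = z^2 * p * (1 - p) / e^2.
    proportion = 0.5 is the most conservative (largest-sample) assumption.
    """
    # z-scores for the most common confidence levels
    z_scores = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}
    z = z_scores[confidence_level]
    n = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    return math.ceil(n)

# 95-percent confidence level with a +/- 2 percent interval, as cited above:
print(sample_size(0.95, 0.02))  # -> 2401 documents to review
```

Note how quickly the required sample grows as the interval narrows: relaxing to +/- 5 percent needs only 385 documents, while +/- 2 percent needs 2,401.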
The SME reviews the documents that have been randomly selected by the application. Once the SME has reviewed that batch, another SME or SME team validates the results by sampling documents that have been categorized and overturning the relevance decision where appropriate; the algorithm is re-run until the desired confidence level is met.
Concept-based search engines are a clear improvement over using search terms alone. These systems return more potentially content-relevant documents without the limitations of Boolean logic, and false positives can be suppressed through document seeding (presenting the system with documents that are known to be relevant). The ranking that is obtained from the categorization process measures how closely the documents resemble the exemplars provided to the system. If you are trying to find documents that closely resemble each other, a concept-based TAR application may be for you.
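Commercial conceptual indexes rely on proprietary techniques (often variants of latent semantic analysis), but the core “find more like these” idea can be illustrated with a much simpler stand-in: cosine similarity over term-frequency vectors. The exemplar and mini-corpus below are invented for illustration and are not any product's actual algorithm.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Cosine similarity between two documents' term-frequency vectors."""
    a, b = Counter(doc_a.lower().split()), Counter(doc_b.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# A document known to be relevant (a "seed" or exemplar)
exemplar = "merger negotiation confidential term sheet"
corpus = {
    "doc1": "draft term sheet for the merger negotiation",
    "doc2": "lunch order for the office party",
}

# "Find more like these": rank the corpus by similarity to the exemplar
ranked = sorted(corpus, key=lambda d: cosine_similarity(exemplar, corpus[d]),
                reverse=True)
print(ranked)  # doc1 ranks above doc2
```

A real conceptual index operates on concepts rather than raw terms, so it would also match documents that use different words for the same idea; the ranking step, however, works on the same principle.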
On the plus side, concept-based TAR engines are normally embedded within review platforms, and thus can easily be deployed if a new set of documents arrives during the review and advanced technology is warranted. Conversely, early use of the technology outside of the review platform is nearly impossible, making it difficult to use concept-based TAR for culling. In addition, some experts criticize concept-based TAR because it tends to take longer to reach system stability - and thus to begin the actual review - than SVM-based TAR (described below). This is largely because concept-based TAR requires the post-categorization “overturn correction” workflow described above in order to reach the desired confidence level. System stability for concept-based TAR systems is normally reached after review of 10,000 to 20,000 documents (8 to 15 days of training). Finally, with concept-based TAR, not all documents are scored: documents that fit into neither the “relevant” nor the “not relevant” conceptual space are left marked “uncategorized.”
Support vector-based TAR
“Support Vector Machine” (SVM) methodology is also a well-established “predictive analytics” statistical modeling methodology, and is widely used in a variety of industrial applications. The SVM approach automatically establishes the key benchmarks of recall and precision for a given population early in the process. (Recall asks “what percent of the relevant documents were retrieved by the algorithm?” Precision asks “what percent of a given subset of documents are relevant?” The two typically trade off against each other: tuning a system for higher recall tends to lower its precision, and vice versa.)
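Recall and precision are simple set calculations. The sketch below uses invented document IDs to make the two definitions concrete.

```python
def recall(retrieved: set, relevant: set) -> float:
    """What fraction of all relevant documents did the algorithm retrieve?"""
    return len(retrieved & relevant) / len(relevant)

def precision(retrieved: set, relevant: set) -> float:
    """What fraction of the retrieved documents are actually relevant?"""
    return len(retrieved & relevant) / len(retrieved)

relevant = {1, 2, 3, 4, 5}       # ground-truth relevant documents
retrieved = {1, 2, 3, 6, 7, 8}   # documents the algorithm flagged

print(recall(retrieved, relevant))     # 3 of 5 relevant found -> 0.6
print(precision(retrieved, relevant))  # 3 of 6 flagged are relevant -> 0.5
```

Casting a wider net (flagging more documents) can only raise recall, but each extra false positive drags precision down, which is the trade-off described above.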
The system first presents a randomly selected set of documents to the SME for review and calculates an initial score. Then, the SME reviews several much smaller sets of documents that are selected by the system to optimize its learning. During this latter phase, the algorithm includes some documents that were previously presented to the SME in order to check consistency, as well as documents that are similar to previously selected documents. Behind the scenes, the TAR system is generating a set of weighted attributes to include and a subset of weighted attributes to exclude, which will result in optimal recall and precision outcomes. At a certain point, using indicators supplied by the TAR engine, the system is said to have “stabilized.” It then applies a “relevance score” to all of the documents, resulting in a ranking of all documents from the most relevant to the least relevant.
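At scoring time, a trained linear SVM reduces to a weighted sum over a document's features: positive weights are the "include" attributes, negative weights the "exclude" attributes. The weights and documents below are invented for illustration; a real system learns the weights from the SME's review decisions.

```python
# Hypothetical learned weights: positive terms pull a document toward
# "relevant," negative terms push it away (the "exclude" attributes).
weights = {"merger": 2.0, "term": 1.5, "sheet": 1.2,
           "lunch": -1.8, "party": -1.5}
bias = -0.5

def relevance_score(doc: str) -> float:
    """Linear SVM-style score: w . x + b over bag-of-words features."""
    tokens = doc.lower().split()
    return sum(weights.get(t, 0.0) for t in tokens) + bias

docs = {
    "doc1": "draft term sheet for the merger",
    "doc2": "lunch order for the office party",
}

# Score the whole collection and rank from most to least relevant
ranking = sorted(docs, key=lambda d: relevance_score(docs[d]), reverse=True)
print(ranking)
```

Because every document gets a score, the whole collection can be ranked end to end, unlike the concept-based approach, which may leave some documents uncategorized.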
Proponents of the SVM approach appreciate being able to use it outside of the review platform to limit the volume of ESI that must be loaded for review. One clear advantage is that SVM systems typically take from three to five days to train, with a review of 2,500 to 3,500 documents.
Use Cases and Workflows
TAR applications using either the concept-based or support vector-based algorithm can be used with strong success at several stages of a case. The specific workflow chosen depends on case variables and the legal team's objectives. However, we believe these workflows, appropriately documented, strengthen the results and ultimately the “defensibility” of a party's production. For example, early in the case, an SVM-based TAR can be used on documents collected from the most important custodians to find important documents as part of a settlement risk analysis, or to identify an initial set of keywords to use in negotiations with opposing counsel to develop a search term protocol. Or, with a tight time frame in which to review a large document set, an SVM-based TAR can be used to rank documents in order of likelihood of relevance; the ranking can be used to make decisions about review priority. While preparing for a deposition, a conceptual search-based TAR can be used to identify documents that relate to important issues of the case; if the review platform allows for a further honing or “pivot” on the deponent and date, the document collection can be further refined. Finally, upon completion of a review, counsel can use TAR to conduct quality assurance, setting up a discrepancy matrix that compares the relevance designations of the review team to the system's relevance scores.
|                               | Review team: responsive | Review team: not responsive | Total  |
| TAR algorithm: responsive     | 3,048                   | 2,531                       | 5,579  |
| TAR algorithm: not responsive | 1,576                   | 40,495                      | 42,071 |
| Total                         | 4,624                   | 43,026                      | 47,650 |
In this hypothetical example, the table shows there were 3,048 documents the review team and the TAR application agreed were responsive and would be included in the production. There were 40,495 documents the review team and the TAR application agreed were not responsive and would not be produced. The scope of the QA effort could then be focused on the two subsets where the TAR system and the review team disagreed on the determination (the 2,531 documents that the review team marked as not responsive but which the TAR program scored as responsive, and the 1,576 documents that the review team marked as responsive but which the TAR program scored as not responsive). These 4,107 documents were submitted to a senior reviewer for second-pass review and verification. The senior reviewer found that almost 1,500 of these documents were in fact responsive, increasing the responsive set from 4,624 to 6,000. The lead members of the legal team comfortably concluded that a reasonable effort had been made to identify responsive documents.
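A discrepancy matrix like the one above can be produced mechanically by tallying the (review team, TAR) pair of calls for each document. The per-document pairs below are synthesized from the counts in the hypothetical example rather than drawn from a real matter.

```python
from collections import Counter

# Synthesized per-document calls: (review team call, TAR algorithm call),
# using the counts from the hypothetical example above.
pairs = (
    [("responsive", "responsive")] * 3048
    + [("not responsive", "responsive")] * 2531
    + [("responsive", "not responsive")] * 1576
    + [("not responsive", "not responsive")] * 40495
)

# Tally each (team, TAR) combination into the discrepancy matrix
matrix = Counter(pairs)

# The QA focus: every document where the two determinations disagree
disagreements = sum(n for (team, tar), n in matrix.items() if team != tar)
print(matrix[("responsive", "responsive")])  # 3048 agreed responsive
print(disagreements)                         # 4107 documents for second-pass QA
```

In practice the pairs would come from exported review-platform data (one row per document), but the tallying and disagreement logic is the same.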
Lawyers who understand basic TAR algorithm types, even at a high level, can better choose the tool that is right for the job at hand, whether it is early assessment of data, culling, or focusing on particularized data sets for depositions and other specific uses. Part 3 of this article will discuss what courts are saying (and not saying) about TAR.
Copyright © 2019 Legal IT Professionals. All Rights Reserved.