Newswire

PRO Partners

To SQL or Not To SQL: The Question for Litigation Support Databases

Richard RuyleThe available database solutions for the litigation support market have radically evolved in the last decade.  Many newer solutions use relational database technologies, specifically SQL (Structured Query Language).  This discussion provides some insight into database solutions for litigation support and whether SQL should be the only choice. While there are many variations of SQL, for brevity, this is limited to Microsoft SQL Server.  I am and have been a proponent of SQL Server for a variety of solutions, but as with any technology solution, it is only the right solution if it is chosen for the right reason and environment.

Historical Perspective
Ten years ago, most cases were stored in somewhat proprietary database products.  Most of those products began to stress under the increasing volume of records produced by electronic discovery, compared to the relatively small number of paper/imaged documents.  Modern solutions migrated to SQL based platforms, some for the right reasons, others because it was believed that large scale cases required it.  These legacy databases of the past reached their limit because they were originally designed a decade or two ago.  Most were never redesigned for modern use in order to avoid major conversions as well as to preserve backward compatibility. That said, companies who invested in modernizing their flat file systems for today’s disk storage, memory, and operating systems handle large volume cases with excellent performance; as in all products, it is about design and implementation.

Advantages of SQL Server
SQL Server is a highly scalable and robust relational platform requiring sophisticated configuration and maintenance to run properly.  It is much more than a database server; it also performs functions similar to an application server, a file server, and an operating system. A server based platform such as SQL Server requires an experienced DBA, sophisticated monitoring, maintenance, and backup; generally outside the control of the application software.  Conversely, a file based database usually can be maintained by litigation support personnel and normal IT personnel (file backup), since most of the database internal operations are controlled by the software itself.

If the database solution sought requires load balancing, sophisticated relational querying, access to the data from outside the application, or needs to be browser based; then SQL is definitely the right choice.  SQL servers, web servers, and files servers can be scaled using common standards and technology and the knowledge to do so is widely available in today’s Microsoft based world. There are many trained personnel available and more are entering the market each day. While file based database applications can certainly ‘build in’ remote access, load balancing, querying, and access to data using scripting languages, this requires proprietary, complex, specialized infrastructure handling, and additional training, which makes SQL a better alternative.

Advantages of Flat-File Database
If the key requirement is performance and support for large volume cases, a well-developed file based database can match, and in many cases, outperform SQL Server in litigation document databases.  Before all the SQL developers and DBA’s shout their objections, there is a valid reason for this statement.  Most litigation document databases perform far more data reads than writes; closer to a “write-once-read-many” paradigm than to large scale transaction based processes.  Database writes occur when data is loaded during an import, documents are issue tagged, notes are entered, lists are created, and images are annotated.  Those write operations pale in comparison to read operations like searching, review, and production.  SQL excels at large volume transactions and relational querying; but a properly designed file system database will soar for review, even on large cases, because they are optimized for reading data in non-transactional related methods.  A database engine does not have to be SQL to perform the same type of searches and filtering that most people associate with relational platforms.

Comparing SQL and Flat-File
There are two fundamental engines at the core of SQL Server, the Relational Engine and the Storage Engine, providing the foundation for data access and data storage.  Ultimately, SQL Server stores its data to disk (just like flat file systems) in an MDF file and the logs in an LDF file. The disk storage algorithms are very similar across SQL and non-SQL database platforms with performance, stability, and reliability resulting from the underlying architecture. One interesting note: SQL Server uses indexes to improve the performance of querying and other operations.  These indexes are stored on disk using a B-Tree algorithm http://msdn.microsoft.com/en-us/library/ms177443.aspx.  That name should sound familiar to many of us; the core algorithm was invented in 1972 http://en.wikipedia.org/wiki/B-tree.  While the methods have been improved over the years; at its core, this is the basis for most high speed storage on disk.  SQL server data storage uses a modified version of OLE DB (although this will change in the near future).  Paradigms that work rarely change; they are just improved.

SQL Server resides between the client and the database file(s) and handles all the querying, reads, writes, caching as well as a multitude of other operations.  Anytime there is an abstraction layer between the client and server data, there is additional overhead. The incorrect tuning of a SQL Server can ultimately result in a solution that becomes unusable.  A file based system is designed so each client accesses shared data from the network storage, so the performance is based upon disk and server access as well as the design of the database.  Both SQL and file based systems are affected by data storage speed and capability. SQL Server is a fantastic and sometimes amazing solution; but if the requirements do not dictate the need, then a quality file based database is a much better choice, eliminating the complexity and the need for specialized hardware and personnel while performing equally as well.  A decision should be based on features, technology, and the company that created the application instead of just the platform or data structure.

When to Choose a SQL Based Solution

  1. You have a competent DBA
  2. Most of your applications are already SQL based (which means you should already have a competent DBA)
  3. You need to scale to hundreds or thousands of users utilizing load balancing and/or redundancy
  4. Your application is hosted and/or browser based and accesses the data remotely
  5. You absolutely need to query or access the data outside of the application,.e.g. custom reports, data mining, integrated data migration, etc.

The right solution comes from a firm understanding of business requirements, infrastructure, industry, and current technology.  SQL or non-SQL solutions can be equally viable. The key determining factors should be product features, environment and infrastructure match, reliability, performance, and the reputation of the company that builds it. As full disclosure, IPRO Tech develops our Enterprise Solutions, Allegro, eCapture and Eclipse, using SQL as the database engine.  We also offer a non-SQL database solution with our Eclipse SE product.

About the author
Richard Ruyle has been involved in software development for vertical markets for over twenty years, including complex database solutions. He currently serves as CTO/CIO for IPRO Tech which has been developing leading edge technology solutions for the litigation support industry since 1990. For more information on IPRO, visit www.iprotech.com.
 

Copyright © 2023 Legal IT Professionals. All Rights Reserved.

Media Partnerships

We offer organizers of legal IT seminars, events and conferences a unique marketing and promotion opportunity. Legal IT Professionals has been selected official media partner for many events.

development by motivus.pt