February 26th, 2009 — eDiscovery
George Socha released the 78-page EDRM search guide draft v. 1.14 on February 6th for public review and comment. The guide is meant as educational commentary for legal professionals and litigation software and service providers, including guidelines for developing “appropriate and effective search methods.” The guide is the work of 18 people from 15 organizations – the boldfaced company names I picked out from the document changes include Autonomy-Zantaz, Clearwell Systems and Vivisimo, as well as a number of law firms and others.
The guide offers a step-by-step scenario of litigation response involving two fictitious companies and their alliterative employees (e.g. Alex Arnold of Alpha Corporation and Bonnie Benson of Beta Corporation) embroiled in an intellectual property dispute. This example is used to illustrate the workflow that goes along with the EDRM. It was instructive and sometimes entertaining, for those of us who read mystery novels and are interested in the chase, anyway.
It also includes an in-depth primer on search methodologies and syntax, which tickled my librarian bone. This section would be at home in a library school curriculum, right down to the search query treatment of diacritics. Of course the flip side of this is that average users might find it too technical. Overall it’s a good read for anyone interested in search and information retrieval. Later sections deal with search documentation and validation of results.
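For the curious, here is a minimal sketch of one common way a search tool might treat diacritics: folding accented characters to their base letters before comparing query and document. This is my own illustration, not something taken from the guide, and the function names `fold_diacritics` and `matches` are hypothetical.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return text with combining accent marks removed (assumed folding rule)."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep everything else.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query: str, document: str) -> bool:
    """Case-insensitive, diacritic-insensitive substring match."""
    return fold_diacritics(query).casefold() in fold_diacritics(document).casefold()


if __name__ == "__main__":
    print(fold_diacritics("résumé"))
    print(matches("Muller", "Herr Müller signed the agreement"))
```

Whether folding is the right choice depends on the matter at hand: it raises the odds of finding a name typed without its accent, but it also treats words that differ only by an accent as the same term.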
I found the guide to be comprehensive in addressing the litigation response process and strategies for defensibility. It is straightforward in listing the actions that professionals in each role (general counsel, outside counsel, custodians, IT personnel, witnesses) must take, and overall it achieves a good balance of outlining the legal process and the resulting steps taken with each fictitious party’s ESI. I think it elegantly addresses one of the big criticisms of e-discovery products, references and approaches – namely that they are not always a happy marriage of IT and legal expertise, or do not encourage collaboration and mutual understanding between these two groups.
Rob Robinson of Orange LT has logged some early commentary on the guide here. Like Rob, I was a little surprised that there was not a section for judicial rulings regarding search. On the other hand it’s nice to read practical advice that is not mired in legal-speak. There are certainly plenty of other sources that provide history and explanation, and with the rapid succession of decisions being handed down lately it might be difficult to keep it current for the sake of the guide.
February 5th, 2009 — eDiscovery
I heard from veterans that this year’s LegalTech New York was smaller than last, but I can’t say that knowledge made it any less intimidating for a first-timer. Several in the booths told me that despite the lower numbers, the quality of customer was going up – there were fewer tire-kickers and swag-grabbers and more substantial customer prospects. An encouraging sign in a down economy.
Not surprisingly, in the booths and in the conference halls one of the biggest themes was cost. This jibes with a key finding from our December report on e-discovery and e-disclosure: basically, that costs are out of control. Another of our projections, the move of e-discovery in-house at corporations, was a concurrent theme, cited as one of the best means of reducing that spending. Vendors seem to be moving further leftward in the Electronic Discovery Reference Model (EDRM), towards the earliest stages of data creation, in order to capture more of the revenue, also as we reported.
The YouTube town hall meeting gave good insight into what issues are important to the legal community in dealing with the challenges of e-discovery. Some of the hot-button issues:
- Search methods in review, their transparency and defensibility.
- International e-discovery considerations such as cultural differences, data privacy and the importance of Unicode in multilingual review.
- Monica Bay, editor-in-chief of Law Technology News (LTN), made points in a YouTube question about the jargon involved in vendor claims (some of which resonated with my own experience) – namely that the same ten terms of jargon are used by all vendors to describe their considerably different products, and these are often not well understood by potential buyers.
In my experience, terms such as “concept search” and “early case assessment” can be confusing: they vary greatly in definition and execution from vendor to vendor, yet many vendors now offer them with minimal explanation.
To build on that point, it’s a common complaint in any market that vendors are “selling what they have,” versus what the customer needs, but in such a critical area as e-discovery this can be downright dangerous. The consumer needs to be armed with information and expertise in order to make an informed choice – probably one of the reasons that service providers and consultants remain some of the most trusted entities in the field.
The hottest topic NOT discussed in the panels and sessions (at least the ones I attended) was the Autonomy-Interwoven acquisition and what it will mean for the market, about which 451 subscribers can learn more here and here.
Overall the show brought together some of the best minds in the industry for a slightly dizzying wealth of legal and market information. Let me not forget a big “thank you” to the several vendors who met with me to discuss their products and views on the market landscape over the three days. Here’s looking forward to next year.
For additional perspective, see the excellent coverage from Rob Robinson of Orange LT, Whit Andrews at Gartner, and Sean Doherty at Law.com.
January 29th, 2009 — eDiscovery
It might be overshadowed by the ramp-up to LegalTech, but a big project of the Sedona working group on e-discovery will be kicking off later in February, the 2009 Text Retrieval Conference (TREC) Legal Track. For going on 18 years, TREC has been a workshop for encouraging research in information retrieval. The three-year-old legal track is organized through the National Institute of Standards and Technology and co-coordinated by Jason Baron, director of litigation at the National Archives.
Participating teams work with a test case to evaluate the most effective search protocols for finding relevant documents in e-discovery, i.e. what finds the responsive documents best? Concept search, expert human reviewers, Boolean keyword search using “x and y or z” or other methods? I spoke with Jason Baron briefly about this year’s TREC, which will switch test collections from the tobacco litigation “Master Settlement Agreement” repository to the Enron collection of email and attachments. He believes it will increase participation by reducing the need to search OCR’d documents.
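To see why a bare “x and y or z” is a shaky thing to build a search protocol on, here is a small sketch of the two ways that query can be read. The three documents and the search terms are invented for illustration; they have nothing to do with the TREC collections, and the helper names are hypothetical.

```python
def terms(text: str) -> set[str]:
    """Split a document into a set of lowercase terms (naive whitespace split)."""
    return set(text.lower().split())


# A made-up, three-document collection.
docs = {
    "d1": "patent license draft",
    "d2": "patent royalty",
    "d3": "license renewal",
}


def search(predicate) -> list[str]:
    """Return the sorted ids of documents whose term set satisfies the predicate."""
    return sorted(doc_id for doc_id, text in docs.items() if predicate(terms(text)))


# Reading 1: (patent AND royalty) OR license
and_first = search(lambda t: ("patent" in t and "royalty" in t) or "license" in t)

# Reading 2: patent AND (royalty OR license)
or_first = search(lambda t: "patent" in t and ("royalty" in t or "license" in t))

if __name__ == "__main__":
    print("AND binds first:", and_first)
    print("OR binds first: ", or_first)
```

The same three words return different result sets depending on how the operators are grouped, which is one reason a documented query syntax matters when search results have to be defended.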
Most of the participants in past years have not been vendors, but a few we know of are H5 and Clearwell Systems. It’s certainly a worthy goal to find the most effective methods and work towards improving standards in how we approach sifting through legal documents for relevancy. You’d think more vendors would be willing to put their tools to the test for the greater good, “walking the walk” if you will. Instead their absence is a reminder that e-discovery is still a wild west scenario with no “standards sheriff” in town.
How likely is it that vendors will put their products to the test in an attempt to back up claims and find better methods in review and analysis of Electronically Stored Information for trial? We hope that more of them will answer the call. It’s a good time to prove your chops if you’re in e-discovery – anyone going to LegalTech next week will be able to attest to the dizzying number of vendors making the market ripe for consolidation, which we have already partly seen through the high volume of acquisitions in the last few years. It remains to be seen how many of the smaller fish will still be left swimming at year end given the economic climate and fierce competition. Proving their mettle and working toward the common good wouldn’t hurt any.
I will be at LegalTech from February 2-4 talking to vendors about their products and their own methods for finding responsive documents. If you would like to get in touch or schedule a meeting, I can be reached here.
August 11th, 2008 — eDiscovery
We’ve knuckled down on our upcoming e-discovery report – thanks for the many responses to the blog post. Even though e-discovery has been around for years, the current market activity proves the party’s just beginning – and it’s going to be a barn-burner.
As we saw a few weeks ago with Interwoven’s acquisition of Discovery Mining (log-in required), SaaS is emerging as a viable option, and may be a playing-field leveler for smaller vendors. Also last week, PwC and Iron Mountain / Stratify announced a strategic e-discovery partnership. Then yesterday IBM released its new E-discovery Manager for its content management and email archiving platforms – 451 clients can expect upcoming coverage.
There are a number of established tier-one players. But where there are large players, there is room for smaller alternatives. And because some large vendors, consultants and services firms are playing catch-up in the boom, there is still plenty of potential for acquisition, particularly because the e-discovery process covers several disparate areas of technology: email archiving, storage, records management, search and text analysis. In this race, the market is just hitting its stride, there are probably too many vendors vying for the business and the players shift frequently.
The latest Socha-Gelbmann survey results bear this out. They’ve swelled the ranks of “Top electronic discovery software providers” from 11 to 15 overall. Autonomy and Clearwell have reached the top tier, along with incumbents FTI Consulting, Guidance Software, Inc. and LexisNexis. Of the second tier, 2006 winners Cataphora, DocuLex, ISYS and Oracle are out entirely, Attenex (recently acquired by FTI) and CT Summation are down from tier one, and Epiq Systems, iConect and Symantec are first-timers to the list. Third tier is all new for the category: AccessData Corporation, Equivio, Kazeon Systems, Inc., Kroll Ontrack and MetaLINCS (owned by Seagate) – note that many of these were previously present on other best-of’s for service or specific software type.
Socha-Gelbmann does offer the caveat that “anyone who makes buying decisions primarily on these rankings is a fool,” although we haven’t seen the quote included in many vendor press releases.
So how then do software purchasers choose a vendor, and what does it mean for the market? We plan to address these questions in our upcoming e-discovery report, in which we’ll be looking at a number of users, vendors and service providers with a range of experience and across sectors, keeping our collective eye on new developments and a view of where the market is headed from here.
What we can tell you as a preview is that it’s exciting to watch such a dynamic market. Now that more companies are becoming familiar with the demands of storing, managing, searching and producing Electronically Stored Information (ESI), they’re no longer buying nick-of-time service. The new standards of the amended Federal Rules of Civil Procedure (FRCP) are not a one-time inconvenience, but require a legally-defensible methodology and the speed to produce on-demand in a number of days. Users are investing in long-term plans for all types of litigation. IT is developing comprehensive strategies for approaching various ESI repositories. Preventive measures are available for monitoring ESI distribution in potentially litigious areas – stopping trouble before it starts in high-litigation operations. These developments are reflected in corporate structure, where IT and legal have more overlap and greater cross-functionality.
Stakes are high, the time-frames for discovery are short – one services exec told us “fast in this business is FAST” – the competition is crowded, and the need is ubiquitous. We’re looking forward to continuing the conversation with many of you – and if you have yet to get in touch, please do so.
July 23rd, 2008 — Archiving, Content management, Search
We’ve been covering the e-discovery big guns and usual suspects here at The 451 Group in one way or another for about five years now. But we’re looking to get more systematic about it, partly in preparation for a long-form market overview of this sector to come this fall. There is certainly no shortage of vendors targeting this market, as anyone attending the LegalTech conference this year would tell you.
We currently have several analysts looking at this market from different angles: Nick and Katey cover the search and text analytics vendors, Simon and Henry keep track of storage and archiving, and Kathleen looks after records management and content management aspects.
But with this approach, we wonder who we’re missing. Where are the up-and-comers? Are there any start-ups or new emerging companies you’ve had your eye on? Let us know in the comments or via email so we can make sure our e-discovery coverage is more comprehensive.
June 20th, 2008 — Archiving, Search, Text analysis
You’ve had Nick’s take, now here’s mine, with a little overlap – great minds think alike, right?
We were not expecting the 40 attendees for the pre-conference workshops during prime Sunday TV viewing time. Seth Grimes laid out “Text Analytics for Dummies,” while Nick gave a market overview. But the attendance (and the long Q&A sessions) were good indicators of user enthusiasm and the desire for real, practicable advice about the field.
Some of the other memorable moments:
- Best of the vendor panel: Seth Grimes’s challenge to say something nice about a fellow vendor’s offerings. And the vendors’ response to an audience question about incorporating UIMA, which was uniformly that it wasn’t necessary or in demand.
- The Facebook presentation on trend-tracking through users’ “Wall” posts was brought back for an encore by popular demand. The crowd in my session was a little confrontational about the amount of analysis being done on the available information (never enough!), but as far as quick and dirty zeitgeist goes, it was unbeatable, and a lot of fun.
- The Clarabridge 1-hour deployment was good sport, with at least one customer’s testimony that once the system is learned, it can actually be configured with speed approaching that of CTO Justin Langseth. You have to hand it to Clarabridge: they make it look easy.
Some thoughts on the users’ takes:
- In presentations and in private chats, the frequently recurring themes among vendors were eDiscovery and social media – some of the drivers for the market. The user questions I heard were mostly about sentiment analysis, deployment time and ROI. Specifically, users wanted information on how to judge all of the offerings – is sentiment analysis accurate enough? What is the expected deployment time? What is the ROI?
- Precision and recall went back and forth again, but the hard truth is that the edge depends on the application. For patents or PubMed searches or eDiscovery, you need recall. For other applications, precision is paramount. Some users I spoke with mistook this as a lack of accuracy – it’s more of a sliding scale of usefulness.
- Accuracy was a recurring issue, both because text analytics is an emerging technology, and, of course, text is messy and imprecise. Partly it’s a matter of maturation. But the “fast, cheap or good – pick any two” truism about software development is equally true here. Even with built-in taxonomies and dictionaries or domain-specific knowledge, any text analytics software needs configuration to increase accuracy for its application and user, which takes time.
- “Win fast and win often” – great words from Tony Bodoh of Gaylord Hotels, on the user panel. Because of the financial investment, the fact that text analysis software can automate (obsolete) some employee work, the time it takes to configure, and general resistance to change, it is important to gain both executive and user buy-in early in the process. Chris Jones of Intuit echoed the sentiment, adding that it’s not advisable to go after your largest (and most time-consuming) problem first – come up with a number of smaller successes to prove the concept to users and higher-ups. Incidentally, both of these are Clarabridge users.
- Jones also noted that one of his “lessons learned” was to avoid over-configuring or too much tinkering with the analytics. He advised that, after a prudent amount of configuration, you treat it more or less like a black box: don’t worry about what is going on under the hood, just let it do its job and leave it to the professionals.
- Some more wisdom from the user panel: you can’t go into a text analytics deployment expecting quantifiable ROI. “You don’t know what you don’t know” - which is what the tool is there to solve. In many cases, the real potential isn’t obvious until you can see how it works with your business. At that point it’s possible to come up with applications that not even its creators could have thought up.
- Lastly (and this is not a new sentiment, but it meant more coming from school Superintendent Chris Bowman, who looked like he had my parents on speed-dial): the text analytics field is emerging, and will become integrated with larger applications. This will eventually render a conference like this obsolete, but it also means a great chance to get a leg up as an early adopter.
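Since precision and recall came up so often, here is a minimal sketch of how the two numbers trade off, using the standard definitions (precision is the share of retrieved documents that are relevant; recall is the share of relevant documents that were retrieved). The document ids and result sets are invented for illustration and do not come from any vendor or conference session.

```python
def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    """Compute (precision, recall) for one result set against known relevant docs."""
    true_positives = len(retrieved & relevant)
    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ground truth: four documents are actually relevant.
relevant = {"d1", "d2", "d3", "d4"}

# A narrow query returns only sure hits; a broad query casts a wide net.
narrow = {"d1", "d2"}
broad = {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}

if __name__ == "__main__":
    print("narrow:", precision_recall(narrow, relevant))
    print("broad: ", precision_recall(broad, relevant))
```

In this toy case the narrow query is all precision and half the recall, while the broad query finds everything at the cost of wading through as many misses as hits. For eDiscovery or a patent search you would lean toward the broad query; for other applications, the narrow one.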
Looking forward to next year!