The Data Day, Today: Mar 13 2012

Drawn to Scale raises funding. Cloudera launches HBaseCon. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Drawn to Scale Announces Funding for Real-Time Big Data

* Cloudera Announces HBaseCon 2012, the Industry’s First Apache HBase Community Conference

* Gazzang Launches Big Data Encryption and Key Management Platform

* Jaspersoft Closes Record Fiscal Year

* Schooner Information Technology Releases Membrain 4.0

* How Project Mercury is eBay’s Big Data play Up on the roof of EBay’s big data center.

* SAND Announces Universal Query

* Oracle has a cloud computing secret The potential impact of metered pricing.

* Why should I consider memcached plugin? …for MySQL.

* For 451 Research clients

# Microsoft launches SQL Server 2012, with an eye on ‘big data’ Impact report

# Global IDs hones governance, MDM focus; looks to the cloud and appliances for growth Impact report

# Clarabridge ups the ante in ‘voice of the customer’ with v5.0 as the CEM space heats up Impact report

# ScaleBase launches elastic load balancing for MySQL databases Market Development report

# Dassault’s Exalead searches for a ‘big data’ role Market Development report

And that’s the Data Day, today.

Search by another name: enterprise search starts to mature into ‘application era’

Customers of The 451 Group would have seen my report on the enterprise search market published September 15. If you are a client, you can view it here. I thought it would be useful to provide a condensed version of the report to a wider audience as I think the market is at an important point it in its development and it merits a broader discussion.

The enterprise search market is morphing before our eyes into something new. Portions of it are disappearing, and others are moving into adjacent markets, but a core part of it will remain intact. A few key factors have caused this, we think. Some are historical, by which we mean they had their largest effect in the past, but the ongoing effect is still being felt, whereas the contemporary factors are the ones that we think are having their largest impact now, and will continue to do so in the short-term future (12-18 months).

Historical factors

  • Over-promising and under-delivery of intranet search between the last two US recessions, roughly between 2002 and 2007, resulting in a lot of failed projects.
  • A lack of market awareness and understanding of the value and risk inherent in unstructured data.
  • The entrance of Google into the market in 2002.
  • The lack of vision by certain closely related players in enterprise content management (ECM) and business intelligence (BI).

Contemporary factors

  • The lack of a clear value proposition for enterprise search.
  • The rise of open source, in particular Apache Lucene/Solr.
  • The emergence of big data, or total data.
  • The social media explosion.
  • The rapid spread of SharePoint.
  • The acquisitive growth of Autonomy Corp.
  • Acquisition of fast-growing players by major software vendors, notably Dassault Systemes, Hewlett-Packard and Microsoft.

The result of all this has been a split into roughly four markets, which we refer to as low-end, midmarket, OEM and high-end search-based applications.

Entry-level search

The low-end, or entry-level, enterprise search market has become, if not commodified, then pretty close to it. It is dominated by Google and open source. Other commercial vendors that once played in it have mostly left the market.

The result is that potential entry-level enterprise search customers are left with a dichotomy of choices: Google’s yellow search appliances that have two-year-term licenses and somewhat limited configurability (but are truly plug-and-play options) on the one hand, and open source on the other. It is a closed versus a very open box, and they have different and equally enthusiastic customer bases. Google is a very popular department-level choice, often purchased by line-of-business knowledge workers frustrated at obsolete and over-engineered search engines. Open source is, of course, popular with those that want to configure their search engine themselves or have a service provider do it and, thus, have a lot of control over how the engine works, as well as the results it delivers. Apache Lucene is also part of many commercial, high-end enterprise search products, including those of IBM.

Midmarket search

Mid-market search is a somewhat vague area, where vendors are succeeding in deals of roughly $75,000-250,000 selling intranet search. This area has thinned out as some vendors have tried to move upmarket into the world of search-based applications, but there are still many vendors making a decent living here. However, SharePoint has had a major effect on this part of the market, and if enterprises already have SharePoint – and Microsoft reckons more than 70% have at least bought a license at some point already – then it can be tough to offer a viable alternative. However, if SharePoint isn’t the main focus, then there is still a decent business to be had offering effective enterprise search, often in specific verticals, albeit without a huge amount of vertical customization.


The OEM search business has become a lot more interesting recently, in part due to which vendors have left it, leaving space for others. Microsoft’s acquisition of FAST in early 2008 meant one of the two major vendors at the time had essentially left the market entirely, since its focus moved almost entirely to SharePoint, as we recently documented. The other major OEM vendor at the time was Autonomy, and while it would still consider itself to be so, we think much of its OEM business, in fact, comes from document filters, rather than the OEMing of the IDOL search engine. Autonomy would strongly dispute that, but it might be moot soon anyway – it now looks as if it will end up as part of Hewlett-Packard following the announcement of its acquisition at a huge valuation, on August 18.

Those exits have left room for the rise of other vendors in the space. Key markets here include archiving, data-loss prevention and e-discovery. Many tools in these areas have old or quite basic search and text analysis functionality embedded in them, and vendors are looking for more powerful alternatives.

Search-based applications

The high end of the enterprise search market has become, in effect, the market for search-based applications (SBA) – that is, applications that are built on top of a search engine, rather than solely a relational database (although they often work alongside a database). These were touted back in the early 2000s by FAST, but it was too early, and FAST was too complex a set of tools to give the notion widespread acceptance. But in the latter part of the last decade and this one, SBAs have emerged as an answer to the problem of generic intranet search engines getting short shrift from users dissatisfied that the search engines don’t deliver what they want, when they want it.

Until recently, SBAs have mainly been a case of the vendors and their implementation partners building one-off custom applications for customers. But they are now moving to the stage where out-of-the-box user interfaces are being supplied for common tasks. In other words, it’s maturing in a similar way to the application software industry 20 years ago, which was built on top of the explosion in the use of relational databases.

We’ve seen examples in manufacturing, banking and customer service, and one of the key characteristics of SBAs is their ability to combine structured and unstructured data together in a single interface. That was also the goal of earlier efforts to combine search with business-intelligence tools, which often simply took the form of adding a search engine to a BI tool. That was too simplistic, and the idea didn’t really take off, in part because search vendors hadn’t paid enough attention to structure data.

But SBAs, which put much more focus on the indexing process than earlier efforts, appear to be gaining traction. If we were to get to the situation where search indexes are considered a better way of manipulating disparate data types than relational databases, that would be a major shift (see big data). Another key element of successful SBAs is that they don’t look like traditional search engines, with a large amount of white space and a search bar in the middle of the screen. Rather, they make use of facets and other navigation techniques to guide users through information, or often simply to present the relevant information to them.

As I mentioned, there’s more in the full report, including more about specific vendors, total (or big) data and the impact of social media. If you’d like to know more about it, please get in touch with me.

Autonomy pops up to pronounce an RDBMS revolution is afoot

In one of those Autonomy announcements that seemingly appear out of nowhere, the company has declared its intention to “transform” the relational database market by applying its text analysis technology to content stored within database. The tool is called IDOL Structured Probabilistic Engine (SPE), as it uses the same Bayesian-based probabilistic inferencing technology that IDOL uses on unstructured information.

The quote from CEO Mike Lynch grandly proclaims this to be Autonomy’s “second fundamental technology” – IDOL itself being the first. That’s quite a claim and we’re endeavoring to find out more and will report back as to exactly how it works and what it can do.

Overall though this is part of a push by companies like Autonomy, but also Attivio, Endeca, Exalead and some others into the search-based application market. The underlying premise of that market is database offloading; the idea of using a search engine rather than a relational database to sort and query information. It holds great promise, partly because it is the bridge between enterprise search and business intelligence but also because of the prospect of cost savings for customers as they can either freeze their investments in relational database licenses, reduce them, or even eliminate them.

Of course if the enterprise search licenses then get so expensive as to nullify the cost benefit, then customers will reject the idea, which is something of which search vendors need to be wary.

Users can apply to joint the beta program at a very non-Autonomy looking website.

Attivio, Exalead and the new ‘search’ market

I had a very interesting conversation with Attivio recently. Most of it is confidential and thus not enough for me to wrote a full 451 report on it yet (but I will as soon as we’re able to), but I can say that the startup founded mostly by former employees of FAST Search & Transfer that has a couple of public customers has some much more interesting news coming down the pike in coming months. It just announced one of these customers, here.

We talked about what is triggering the customer wins it is getting and one thing stood out; the key value prop Attivio has been focused on from the start, which is the conjoining of structured and unstructured data. But it’s not just about having access to both types (that was possibly years ago and anyway there’s myriad types between those two simplistic and oft-confused descriptions, but bear with me). And it’s not just the ability be able to query them both; that too has been done, but crucially with different tools in the past.

It’s the ability to switch between both based on what the user is doing – not based on what type of data is being queried – to do so without the user knowing and having a single API for access to both types. And that, say the Attivio team, is hitting home with customers.

Attivio is not the only company to be thinking like that of course. Attivio’s management’s alma mater FAST was doing this for a while as a product called Adaptive Information Warehouse. We wrote a report on it in January 2007 under the headline: FAST: everything you thought you knew about BI is wrong. And there are other companies both positioning their products this way and winning customers doing it. See our recent report on Exalead for another example and we’re sure IBM could build its customers just such a system if they paid enough.

The challenge with this idea is of course that for the past 30 years or so relational databases have been where the ‘important stuff’ has been stored and the multi-billion BI market grew on top of that as a way to access it. Database administrators rule(d) the roost as far as information management goes. Meanwhile enterprise search got relegated to a side room where it was all about finding documents and getting pages and pages of results returned to you. What Attivio, Exalead and a few other companies are moving towards is a convergence of the two; call it database offloading, unified information access; unified information intelligence or something similar.

We’re not seeing these vendors being dragged kicking and screaming towards this single-API-no-matter-what-the-data-is nirvana by their customers, however; it takes a fair amount of market education on the parts of the vendors and their partner to make it a reality. But given what we see happening here, we expect to hear alot more from the vendors mentioned here and others throughout 2009 and beyond as enterprise search morphs into something new.

Our take on M&A in enterprise search

I’ve gathered all my current thinking on potential M&A in enterprise search in a SectorIQ that we published earlier this week to our customers. In it, I look at four main potential targets plus a few other small ones and look at a few of the likely acquirers. (This is the way we write all our Sector IQs, btw and they’re a great way of getting a quick grasp on what might be coming down the pike in any particular sector of the IT industry)

Fortunately those of you that are not our customers (yet!) are able to read it via our arrangement with the New York Times DealBook section. Click here to see the NY Times posting or go here to go straight to the report – and while you’re there, sign up for a trial of our M&A KnowledgeBase, where we’ve been collecting details of every IT, internet and telecoms deal since the start of 2002!

Finally, a quick word about the headline. We like to have some fun here at 451 with these things and while I appreciate that this one might have been pushing things a little in terms of clearly explaining what the report was about, when else would I be able to use it? 😉