Entries Tagged 'Text analysis'

Goodbye Newssift, we hardly knew you

Newssift, which was set up by the FT Search unit within the Financial Times and launched in March 2009, was shut down recently after just a few months in full operation. We looked at it from time to time and obviously looked at it closely at launch time, talked to the executive in charge of it (who also appears to have left the FT) and to most of the vendors that supplied the technology – namely Endeca, Lexalytics, Nstein and Reel Two.

But the fact that people like us only looked at it from time to time was indicative of the problems the site apparently had. According to a source at one of the technology suppliers, the site was sticky once people had stayed around on it for the first time, but those new users were hard to come by and those that didn’t persist that first time didn’t find a reason to come back.

But Newssift’s loss is a bit of a blow to those in the text analysis industry, as it was supposed to be a flagship application of the technology, brought to market by one of the pre-eminent publishers in the world. That combination apparently wasn’t enough to make it succeed.

The thing that reminded us to look at it was yesterday’s acquisition of Nstein Technologies by fellow Canadian content management player (and roll-up machine) Open Text for $35m, which seems to be primarily a case of Open Text consolidating its industry as long as it can get a good price, rather than a deal done for customers or technology.

We probably should have been using Newssift daily rather than relying on M&A to jog our memory as to its existence. But we weren’t, and now it’s gone.

Information management team at this year’s 451 client event

Most of the information management team are attending the 4th annual 451 client event, which takes place in Boston next week, November 2-3, so I thought I’d let you know what we’re up to.

Four of us are presenting; here are the dates/times (all ET) and themes:

  • Nov 3, 3.30-4.15: Matt Aslett – Open source to the rescue?

Can open source really help enterprises cut costs and ride out the economic storm? What has been the impact of current conditions on open source adoption? How is this being reflected in the business strategies of vendors – both open source specialists and traditional proprietary vendors?

  • Nov 4, 11.00-11.45: Nick Patience & Kathleen Reidy – E-Discovery to Information Governance: From Reactive, Unavoidable Cost to Proactive Cost-Avoidance.

E-discovery is a market without a lot of discretionary spending – legal events and investigations occur, and require that organizations produce relevant electronic information, no matter the difficulties or costs. This fact has driven lots of vendors from various sectors to the e-discovery (also known as e-disclosure) market: it is driving business in the archiving, enterprise content management and enterprise search markets, as organizations want to figure out how to better prepare for litigation before it occurs.

  • Nov 4, 11.45-12.30: Simon Robinson – Storage Technology Is Thriving in the Economic Downturn

The economy is shrinking, but data is growing. Almost universally, storage vendors claim they can help IT ‘do more with less’ by squeezing more value out of storage assets to meet rampant data growth and stiffer retention criteria. This presentation will examine how three key trends in storage innovation – optimization, unification and the cloud – are helping some storage vendors thrive in this uncertain climate. The session will conclude with a vendor panel discussion.

Henry Baltazar is also attending and we’re all available for 1:1s, though some of our days are getting pretty near to full. Contact your account rep about booking a slot.

If you are a client and you’re not attending then you’re missing out on one of the key benefits of being a client!

If you’re not a client and you wish to attend, you can do that too, only you’ll have to pay to get in. Either way, you can register here.

Beyond information management, all our other themes will be addressed, including cloud (a lot!), security, virtualization, eco-efficient IT and our popular M&A panel, which always comes right before cocktails on day 1.

See you there!

Enterprise search & text analysis market sizing report

I’m pleased to announce that the first market sizing report from our Information Management practice here at 451 has been published. It covers the enterprise search and text analysis markets, providing revenue figures from 2009-2013 and our growth expectations for those years.

We look at the reasons for that projected growth, identifying 10 drivers overall, one of which is the rise of search-based applications. At some point in the future we’d like to try and size that market, although it’s too nascent to put a number on it just yet.

You can download an executive summary or find out more about the report here.

Suffice to say I’m very excited about this new addition to our coverage, adding the quantitative element to our many years of analyzing the market on a qualitative basis.

This report will be updated every six months with new figures and every 12 months with new analysis and figures. We provide analysis of the industry throughout the year through our Market Insight Service in shorter, more regular form.

This is not only the first in a series of reports on the enterprise search business, but also the first in a series of market sizing reports within information management. The next will be on the data warehousing business, due in early 2010, written by Matt Aslett.

Autonomy pops up to pronounce an RDBMS revolution is afoot

In one of those Autonomy announcements that seemingly appear out of nowhere, the company has declared its intention to “transform” the relational database market by applying its text analysis technology to content stored within databases. The tool is called IDOL Structured Probabilistic Engine (SPE), as it uses the same Bayesian-based probabilistic inferencing technology that IDOL uses on unstructured information.

The quote from CEO Mike Lynch grandly proclaims this to be Autonomy’s “second fundamental technology” – IDOL itself being the first. That’s quite a claim and we’re endeavoring to find out more and will report back as to exactly how it works and what it can do.

Overall though this is part of a push by companies like Autonomy, but also Attivio, Endeca, Exalead and some others into the search-based application market. The underlying premise of that market is database offloading; the idea of using a search engine rather than a relational database to sort and query information. It holds great promise, partly because it is the bridge between enterprise search and business intelligence but also because of the prospect of cost savings for customers as they can either freeze their investments in relational database licenses, reduce them, or even eliminate them.

Of course if the enterprise search licenses then get so expensive as to nullify the cost benefit, then customers will reject the idea, which is something of which search vendors need to be wary.

Users can apply to join the beta program at a very non-Autonomy looking website.

Quick thoughts on IBM-SPSS

Quick thoughts on the deal. We will have a full report for clients tonight. This is mainly thoughts about the text analytics part and I haven’t had a chance to speak with either company at the time of writing, so bear that in mind.

  • This is long-predicted, by us and many others. I recall a chat with SAS founder and CEO Jim Goodnight a couple of years ago and he said to me – and I’m slightly paraphrasing – “why doesn’t IBM just buy them, I don’t understand why they haven’t already?” Well IBM finally has, or at least has made the initial move. And for $50 per share or almost $1.2bn.
  • Of course, like almost every IBM deal in recent years, the two are partners: IBM signed an OEM deal for SPSS’s PASW Statistics software in Q2 and has had other deals with it going back many years.
  • IBM has text analytics tools, of course, but they really are just that: tools. It is not a major player in text analytics applications at this juncture. The vast majority of its engagements tend to be very large, custom-based ones and are still few and far between, as far as we can gather, mostly in financial services and telecommunications.
  • SPSS, on the other hand, has tools, workbenches and applications and has found some hot spots in this area, including analyzing customer feedback surveys, in particular the open-ended questions that can provide some of the richest material in such surveys but are often ignored because they’re too labor-intensive to analyze by hand.
  • SAS Institute now has a much bigger analytics competitor. Goodnight didn’t rate SPSS much as a competitor, but IBM? That’s a bit different.
  • SAP-Business Objects must be thinking of making a move too.

More considered thoughts from myself and my fellow 451 analysts later on today.

Text Analytics Summit 2009

The 2009 Text Analytics Summit was a great time; congratulations to the organizers for once again putting on a terrific event. I heard from one of them that attendance was down 20% from last year, which sounds about right given the economic situation and travel budgets right now, but it didn’t put a damper on the festivities.

Voice of the customer was once again the application that got the most play, from vendors and speakers. However, reputation analysis/opinion mining/buzz monitoring – or what was sometimes called social media analysis – was a close second this year, with an eye to the lower-cost offerings springing up in this area to mine blogs and internet forums. Some related points:

  • Twitter came up several times (it’s everywhere this year of course), but prevailing opinion was that it’s not a great resource for text mining – too many misspellings, abbreviations, and just plain not enough text per tweet to be able to get a good read on the content.
  • Facebook’s Roddy Lindsay was back to offer an overview of some of the projects underway to mine popular topics on the site for insight on its users and how their age, gender and regional demographics affect their views. Unfortunately as data on Facebook is private to its users and their network of friends, this was kind of a tease for those of us who would love a bigger peek at it.
  • In non-social media, another sentiment analysis-focused site, the Financial Times’ recently launched meaning-based news search Newssift, also got some mentions (in part because two of the vendors present, Lexalytics and Endeca, were involved in the project along with Nstein and Reel Two).

End users were well-represented this year, and I was even fortunate enough to get to moderate the end user panel, featuring former school superintendent Chris Bowman, Mike House of Maritz Research, Bryan Jeppsen of JetBlue, John Lehto of Monster and Rick Lewis of AOL. The gentlemen weighed in on everything from technical problems (they overwhelmingly chose SaaS to avoid issues) to variations on the inevitable ROI question, and provided some much-needed perspective to what end users expect out of the vendors and their products. Response has been good, and for anyone wanting more, be aware that the ever-quotable Mr. Bowman is now on Twitter and may very well be watching your every move.

Text Analytics Summit 2009

With the 5th annual Text Analytics Summit now in the bag, here are my thoughts on the event.

My talk on which vendor options to choose on Sunday night was, I think at least, well received. Probably only about 30 people in the room but all bar about 5 of them were end users, which is good. The slides are available to anyone who drops me a note, and for those that were there on Sunday, I will get them to you very soon.

That end-user theme carried on to the main conference, where there was without a doubt a higher proportion of end users this year than last. The overall attendance was down slightly and when I saw the list on Monday morning I was concerned, but more than a third of the attendees were users, which was much better than last year when there was often a feeling of vendors pitching to other vendors, which doesn’t help anybody.

A fair few of the end users present were at a very early stage of their assessment, too. Many were merely aware that text analytics can do something for them, but hadn’t engaged properly with any of the vendors. I will be following up with those and the other users I met during the conference as we look to help them evaluate their vendor options.

The end-user panel, moderated well by our own Katey Wood, was interesting as ever. Jon Lehto of Monster.com had some rich insight and Bryan Jeppsen at JetBlue, now two years into its use of Attensity, explained how it had changed its customer surveys from 1 open-ended question in 40 (and 39 structured questions) to mostly open-ended, as it now has the power to analyze that text and get insight it would never have received had it had to work out in advance what sort of answers it wanted. Both AOL and JetBlue were able to bypass their IT departments and go with the SaaS versions of their vendors’ products.

The analyst panel, if I’m being honest, was probably a bit flat from the audience’s perspective as we were agreeing too much. I tried to disagree at one point but then didn’t quite clarify what I meant, so I did it in an earlier post. We had a question from the audience from someone at Whirlpool about ROI, which we all struggled with a bit. ROI on text analytics apps is tricky because:

  • quite often you’re doing something completely new that you’ve never been able to think of doing before, such as automatically parsing customers’ comments on blogs
  • many text analytics apps are quite small and thus don’t often require such an ROI measure
  • they’re often part of some sort of competitive or customer intelligence effort that’s much larger and thus the text analytics element itself isn’t subject to ROI.

But clearly for a company with the size of investment Whirlpool has made with text analytics, it’s a valid question and made us all ponder the ROI question a bit more deeply.

Things I thought I’d hear more about but didn’t: cloud and eDiscovery. There were SaaS-based representatives there in the shape of Clarabridge and Attensity for sure and Clarabridge in particular has some great reference customers willing to speak on its behalf, notably AOL and Intuit. But in terms of true cloud-based text analytics, it’s still too early, and may even be so next year.

I was more surprised not to hear much about eDiscovery. What little I did hear (apart from listening to the sound of my own voice, of course) was from Ernst & Young and its proactive fraud detection work, some of which has been parlayed from previous successful eDiscovery work with clients, which is exactly what we thought would be happening (always good to hear end user validations of predictions made in research).

Things I thought I’d hear about and did: sentiment analysis. Last year it was the undercurrent of the conference. This year it came very much to the surface. There wasn’t too much difference between a lot of the offerings and some of the presentations (but by no means all) were a bit too down in the weeds. But there are tons of interesting implementations out there now, although there is a fair amount of work still to be done.

Anyway overall it was well worth it and I recommend the conference next year to anyone interested in how to leverage text for insight into customers, competitors, risk exposure or all sorts of other business and organizational issues.

Text Analytics startups

I made a comment on the analyst panel at the end of day 1 about the emergence of startups in this space that I wanted to qualify, as it’s caused a bit of confusion here at the Text Analytics Summit. The other three panel members said they are seeing startups while I said I’m not and nor are VC customers asking about them in the way they did a few years back. I said that partly to shake up the panel a bit as we were agreeing on everything until then, which isn’t that interesting for the audience ;) , but I meant it in a specific way.

The main area where text analytics-based startups have emerged in the last few years is in sentiment analysis, in areas such as opinion mining, buzz, product/service reviews and advertising targeting. Many of these apps are being used by enterprises for sure.

But what I was referring to is that I’m not seeing companies offering text analytics tools (whether on-premise or on a SaaS or cloud basis) that can be used as the basis of text-aware or search-based applications. I am seeing a lot of demand and interest in those apps from enterprises (our main focus here at 451) but the tools to build them are not coming from startups.

Instead they’re coming mainly from more established search, content management and eDiscovery-focused companies (with one or two notable exceptions, such as Attivio and Digital Reef in the past two years). There is probably room for more startups in this space, that’s for sure.

More on what has been a great conference so far later.

Upcoming Enterprise Search & Text Analytics summits

We have two ‘summits’ coming up in the next few weeks on the east coast that we’ll be attending.

We’ll be at the Enterprise Search Summit in New York May 12-13 at the Hilton on 6th Avenue. We have a bunch of meetings already but still have room for more, so if you’re attending and would like to meet (end users in particular, but vendors too), please get in touch with myself or Katey.

And just a few weeks later we’ll be in Boston where I’ll be at the 5th annual Text Analytics Summit. I’m doing the Sunday night graveyard slot once again on May 31, laying out my assessment of vendors as I did last year (it’s called “Top Tips on Vendor Choices” in the agenda). I recall it was enjoyable and we ended up taking the conversation to the bar afterward; a tradition I intend to continue this year. I’m also on a panel at the end of Day 1 (June 1), right before cocktails (I’m seeing a trend here). Likewise, please get in touch if you want to meet up. I’m staying in Boston June 3 to meet clients, then back to London.

Brief thoughts on Attensity Group

451 clients will be getting our full report on this deal today, so I won’t be spilling all our thoughts on the deal (or all the details of the structure) here. But here are a few initial thoughts on the news:

  • The new entity will pose a bigger threat to the many text analysis competitors recently acquired by much larger companies, notably Teragram by SAS, ClearForest by Reuters and Inxight by Business Objects (and subsequently by SAP), as well as SPSS, which got into this business via acquisition back in 2003.
  • Such deals aren’t usually done from a position of dominance and it’s fair to say that Attensity wasn’t growing as fast as it used to be. They’re driven in part by investors who either want a payout now or see the potential for one by adding heft to a company and thus getting some economies of scale.
  • Attensity remains in the voice of the customer business, but adds a few more, including customer self-service.
  • CEO Ian Bonner and CTO Ian Hersey are back together almost two years since selling Inxight.
  • Hersey now has a major software integration job on his hands for the next couple of years.