Upcoming data events and travel plans

I’m gearing up for a busy few weeks of international travel with presentations in Europe and on both the east and west coasts of the US.

It all starts on March 28 when I’ll be heading to London for Cassandra Europe 2012 where I’m looking forward to attending a packed schedule of Apache Cassandra case studies. Later in the day I’ll be essentially improvising a presentation combining our view of the state of the NoSQL market with an overview of highlights from the case studies stream for those who have attended the workshop stream.

The following week is HCTS EU, 451 Research’s own event in London, which takes place on April 2-3 and is Europe’s go-to convergence event for CIOs, cloud decision makers, vendors and investors. On April 3 I’ll be presenting our ‘Big Data’ Survival Guide – explaining the importance of ‘big data’ – what it is, what it isn’t and why you should care, as well as 451 Research’s associated concept of Total Data, designed to enable the realisation of valuable business intelligence from ‘big data’.

After a quick trip to California for an analyst event I’ll be heading for Zurich for a couple of events where I’ll be explaining our perspective on the development and adoption of NoSQL and NewSQL databases, including some insights from our forthcoming long-format report on the competitive dynamic between MySQL, NoSQL and NewSQL. Specifically, I’ll be presenting at the ESE Conference on April 25, followed by the NoSQL Road Show on April 26.

Then I’m off to Washington DC to attend MarkLogic World, where I’ll be appearing on a panel with other analysts on May 2 to discuss the impact and implications of ‘big data’.

At some point during all this traveling I’ll be completing the forthcoming long format report on the competitive dynamic between MySQL, NoSQL and NewSQL, hopefully before I’m back in California for OSBC, where I’m scheduled to present our findings on May 21.

Look out also for details of a couple of webinars currently being scheduled between now and the end of May.

And then I’m going on holiday.


The Data Day, Today: Mar 13 2012

Drawn to Scale raises funding. Cloudera launches HBaseCon. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Drawn to Scale Announces Funding for Real-Time Big Data

* Cloudera Announces HBaseCon 2012, the Industry’s First Apache HBase Community Conference

* Gazzang Launches Big Data Encryption and Key Management Platform

* Jaspersoft Closes Record Fiscal Year

* Schooner Information Technology Releases Membrain 4.0

* How Project Mercury is eBay’s Big Data play. Up on the roof of eBay’s big data center.

* SAND Announces Universal Query

* Oracle has a cloud computing secret. The potential impact of metered pricing.

* Why should I consider memcached plugin? …for MySQL.

* For 451 Research clients

# Microsoft launches SQL Server 2012, with an eye on ‘big data’ Impact report

# Global IDs hones governance, MDM focus; looks to the cloud and appliances for growth Impact report

# Clarabridge ups the ante in ‘voice of the customer’ with v5.0 as the CEM space heats up Impact report

# ScaleBase launches elastic load balancing for MySQL databases Market Development report

# Dassault’s Exalead searches for a ‘big data’ role Market Development report

And that’s the Data Day, today.


What’s in a name? Analyzing ‘Dropbox for the enterprise’

We’ve been spending a good deal of time lately talking to vendors looking to deliver ‘Dropbox-for-the-enterprise’ alternatives. By this, providers generally mean that they enable users to sync and share their files across desktops and devices, but in a way that is palatable to corporate IT departments. I’d say we really started to see this activity in earnest about a year ago, when Box started getting serious about the enterprise market and I began to get a lot of briefing requests from the likes of Accellion, Egnyte and others about their enterprise file sharing and sync offerings. Things really started heating up later in 2011, as we saw VMware announce its Dropbox-for-the-enterprise in August, Citrix acquire ShareFile in October and open source play ownCloud set sail in December. We also recently initiated coverage on another startup, Germany-based TeamDrive.

These are only a few of the movements in this emerging market. Things will only become more active in 2012. Perhaps one of the more notable features is the broad background of players entering this space – we see vendors from the virtualization, security, storage, content management and mobility sectors all vying for attention. This is likely to cause an awful lot of noise and confusion.

Compounding the matter is that everyone in this market seems to be struggling with what exactly to call it. “Enterprise-grade Dropbox” neatly encapsulates it, but it’s not really a viable way to refer to a market segment. We put out a report on ‘cloud file sharing’ late in 2011, but that has a broader focus and doesn’t really capture what is important and different about this segment in particular. Dropbox is obviously a cloud service and many of the players that want to offer Dropbox-like services are as well. But while the cloud certainly *can* be an enabling technology, it doesn’t have to be. Indeed, a number of players, such as Accellion, Egnyte, GroupLogic, ownCloud, Oxygen Cloud and, presumably, VMware when it gets to it, are offering private-cloud or on-premises approaches for file sharing and sync.

So we’ve settled on Mobile File Sharing and Sync Platforms as the way that we are going to refer to this segment, at least for now.   The mobility part of this, as opposed to cloud, is what is really new and disruptive.  That is what drives the need for sync and native apps for specific device types.  We also think it is important to identify these emerging products, including Dropbox itself, as ‘platforms’ since we suspect there will be ample opportunity moving forward for customization and plug-ins to these tools.  We are already seeing some of these in the areas of security, content management and collaboration for Dropbox specifically.

Calling a set of Dropbox-like capabilities a platform is interesting, though we can also flip the conversation on its head and wonder whether sync is really a feature, as others are doing.  The answer may well be that it is both.  In the enterprise, it certainly makes sense as a feature of content management, collaboration and even storage offerings, since business content is generally part of broader business processes and often needs to be retained for compliance reasons.   IT also wants to get the most out of existing investments. We are already seeing sync as a feature from the likes of OpenText and Huddle, and this is arguably Box’s approach as well.  We also have partnerships between the likes of Oxygen Cloud and EMC, to layer a sync service on top of storage infrastructure.

We take a more extensive look at the market for Mobile File Sharing and Sync Platforms in a recent report (login required) for 451 clients.  This report looks at user and IT requirements and provides more detail on the enterprise players we’ve begun to track. How this market plays out exactly over time remains to be seen, but we think it has the potential to be extremely disruptive. For that reason it’s a space we’ll continue to watch closely, and from multiple vantage points.


The Data Day, Today: Mar 8 2012

Microsoft launches SQL Server 2012. MapR integrates with Informatica. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Microsoft Releases SQL Server 2012 to Help Customers Manage “Any Data, Any Size, Anywhere”

* SQL Server 2012 Released to Manufacturing

* SAS Access to Hadoop Links Leading Analytics, Big Data

* MapR And Informatica Announce Joint Support To Deliver High Performance Big Data Integration And Analysis

* Teradata Expands Integrated Analytics Portfolio

* New Teradata Platform Reshapes Business Intelligence Industry

* Microsoft’s Trinity: A graph database with web-scale potential

* KXEN Announces Availability of InfiniteInsight Version 6, a Predictive Analytics Solution with Unprecedented Agility, Productivity, and Ease of Use

* Software AG Announces its Strategy for the In-memory Management of Big Data

* Attunity and Hortonworks Announce Partnership to Simplify Big Data Integration with Apache Hadoop

* Schooner Information Technology and Ispirer Systems Partner to Deliver SQLWays for SchoonerSQL

* Big Data & Search-Based Applications

* Namenode HA Reaches a Major Milestone

* How Twitter is doing its part to democratize big data

* Dropping Prices Again – EC2, RDS, EMR and ElastiCache

* For 451 Research clients

# SAS outlines Hadoop strategy, previews Hadoop-based in-memory analytics Market Development report

# Pervasive rides the elephant into ‘big data’ predictive analytics Market Development report

# IBM makes desktop discovery and analysis play, shares business analytics priorities Market Development report

# Clustrix launches SDK to tap developer interest in new databases Market Development report

# Continuent and SkySQL team up for clustered MySQL support Analyst note

# MapR gets a boost from Cisco and Informatica Analyst note

And that’s the Data Day, today.


Cisco and Informatica deals provide a boost for MapR

We recently speculated that EMC Greenplum’s focus on the integration of its Greenplum HD Hadoop distribution with its Data Computing Appliance (DCA) and Isilon storage technology would mean an increasingly niche role for Greenplum MR, the Hadoop distribution based on MapR’s M5.

Two recent announcements indicate that niche might continue to be a lucrative one for MapR, however. First, Cisco released details of a reference architecture for deploying Greenplum MR on Cisco’s UCS servers. Then Informatica announced a partnership with MapR to jointly support its Data Integration Platform running on MapR’s distribution for Hadoop.

The Informatica relationship also covers bi-directional data integration with Informatica PowerCenter and Informatica PowerExchange, snapshot replication using Informatica FastClone, and data streaming into MapR’s distribution via NFS using Informatica Ultra Messaging. In addition, the free Informatica HParser Community Edition will be available for download as part of the MapR distribution.

While the partnership with Informatica is a direct one for MapR, the Cisco reference architecture announcement illustrates that the benefit MapR gains from its relationship with EMC Greenplum includes exploiting the company’s leverage with potential partners.


The Data Day, Today: Mar 2 2012

Hortonworks partners with Talend. Teradata and Greenplum updates. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Talend Empowers Apache Hadoop Community with Talend Open Studio for Big Data

* Hortonworks Announces Strategic Partnership With Talend to Bring World’s Most Popular Open Source Data Integration Platform to Apache Community. Talend Open Studio for Big Data will be bundled as part of Hortonworks Data Platform.

* Teradata Transforms Global Database Technology

* New EMC Greenplum Database Enhancements Boost Big Data Analytics

* Cisco’s servers now tuned for Hadoop

* Amplidata Closes $8M Funding Round with Big Bang Ventures, Endeavour Vision, Intel Capital and Swisscom

* Got Big Data? Jaspersoft CEO Brian Gentile outlines three approaches to connecting to ‘big data’ for business intelligence reporting and analysis.

* Cray’s YarcData Division Launches New Big Data Graph Appliance

* Introducing Spring Hadoop. Developing applications for Hadoop technologies based on Spring technologies.

* MarkLogic and Hortonworks Partner to Enhance Real-Time Big Data Applications with Apache Hadoop

* Continuent and SkySQL Join Forces to Better Serve the Global MySQL Community

* Data Entrepreneurship

* For 451 Research clients

# Anaplan bags $11.4m in VC, looks beyond budgeting and planning to business operations Impact Report

# XtremeData seeks to differentiate analytic database for extreme data workloads Impact Report

# Calpont adds parallel loading to columnar database for online analytics Market Development Report

# MarkLogic formalizes Hadoop support with Hortonworks partnership Analyst note

And that’s the Data Day, today.


The Data Day, Today: Feb 29 2012

Microsoft and Hortonworks expand Hadoop partnership. Oracle ships Exalytics. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Hortonworks to Bring Apache Hadoop to Millions of New Users. Hortonworks and Microsoft expanded their relationship around Apache Hadoop.

See also:
# Big Data for Everyone: Using Microsoft’s Familiar BI Tools with Hadoop
# Microsoft’s Hadoop roadmap reveals new big data deliverables
# Karmasphere Expands Big Data Analytics on Hadoop in the Enterprise
# Datameer to Bring Hadoop Analytics to Windows Azure
# HStreaming Brings Real-Time Analytics to Microsoft’s Hadoop-based Services for Windows Server and Windows Azure

* Oracle Announces Availability of Oracle Exalytics In-Memory Machine

* Fujitsu Releases “Interstage Big Data Parallel Processing Server V1.0” to Help Enterprises Utilize Big Data

* Pentaho and DataStax announce strategic partnership delivering the first complete Apache Cassandra-based big data analytics solution to the market

* Cloudant Names Andy Palmer to its Board of Directors

* R integrated throughout the enterprise analytics stack

* Jaspersoft Announces Big Data Index to Track Demand for Big Data Analytics

* 1010data Enables Companies to Rapidly Model and Predict Individual Consumer Behavior and Social Network Relationships

* Tableau Software Teams with Attivio to Tap Unstructured Content and Deliver Deeper Insight to Business Users

* Infochimps and the Future of Data Marketplaces “This is the clearest indication yet that data marketplaces may be the latest ‘Application Service Provider’ cycle, as in right idea, wrong time.”

* HStreaming and RainStor Partner to Lower the Cost of Big-Data Analytics on Hadoop

* JustOne Database Sets the Stage for Accelerated Growth in 2012 and Beyond

* Big Data investment map

* A group of Google Engineers released “vitess” – a project to help scale MySQL databases.

* For 451 Research clients

# Reassessing the M&A potential of NoSQL and NewSQL Sector IQ report

# Sears Holdings creates Hadoop managed service provider MetaScale Impact Report

# Datawatch turns the corner with focus on report analytics suite Impact Report

# arcplan details growth plan, as it expands into front end for SAP HANA and social BI Impact Report

# Objectivity adds reusable queries to InfiniteGraph NoSQL database Market Development Report

# Host Analytics illuminates cloud performance management growth strategy and roadmap Market Development Report

And that’s the Data Day, today.


A stupid question about in-memory analytics

During my first trip to Oracle OpenWorld as an analyst a few years ago I asked a room full of Oracle data-warehousing users whether any of them had explored the use-cases for other Oracle data management assets, such as the TimesTen in-memory database.

The question was met with complete silence before Ken Jacobs kindly suggested that perhaps this wasn’t the right crowd for that sort of question.

It was one of those moments that really haunts you. My first industry event as an analyst and I had embarrassed myself by asking an apparently stupid question in front of a room of more experienced colleagues and potential clients.

Doesn’t seem like such a stupid question now though, does it?

I have exorcised the demons! This house is clear.


The Data Day, Today: Feb 24 2012

Teradata partners with Hortonworks. New CEOs for Zettaset and VoltDB. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Teradata-Hortonworks Partnership to Accelerate Business Value from Big Data Technologies

* Skytree Unlocks the Advanced Analytics Power of Big Data with Unprecedented Performance, Scalability and Accuracy

* Big Data Innovator Zettaset Appoints Jim Vogt as New President and CEO

* Zettaset to Create Secure Hadoop with ‘SHadoop’ Initiative

* VoltDB Names Bruce Reading President and Chief Executive Officer

* Basho Unveils New Graphical Operations Dashboard, Diagnostics With Release of Riak 1.1

* Pervasive RushAnalyzer Launches ‘No Compromise’ Predictive Analytics for Hadoop and Big Data

* QlikTech Reveals Pricing for its QlikView Business Discovery Platform

* Kognitio Announces Completely Memory-Based Pricing

* Objectivity Adds New Plugin Framework, Integrated Visualizer And Support For Tinkerpop Blueprints To InfiniteGraph

* Announcing the Infochimps Platform for Big Data

* Big Data, Hadoop and StreamInsight

* Three New Cloud Providers join the MongoDB ecosystem

* Hadoop Has Promise but Also Problems

* Hortonworks: Reaffirming our Commitment to 100% Pure Open Source. Despite speculation to the contrary.

* WhySQL? Evernote explains why it continues to use SQL databases.

* More on database consistency. Anders Karlsson explains the different definitions of database consistency.

* Graphic proof of big demand for big data talent. Or just graphic proof of use of the phrase ‘big data’ in job ads?

* Will ‘big data’ transform your industry?

* For 451 Research clients

# CrowdFlower – it’s like Hadoop, but with people? Impact Report

# Teradata and Hortonworks strike Hadoop marketing and development deal Market Development report

# Hypertable reemerges with high-performance NoSQL database Market Development report

And that’s the Data Day, today.


Updated: sizing the big data problem: ‘big data’ is *still* the problem

In late 2010 I published a post discussing the problems associated with trying to size the ‘big data’ market based on a lack of clarity on the definition of the term and what technologies it applies to.

In that post we discussed a 2010 Bank of America Merrill Lynch report that estimated that ‘big data’ represented a total addressable market worth $64bn. This week Wikibon estimated that the big data market stands at just over $5bn in factory revenue growing to over $50bn by 2017, while Deloitte estimated that industry revenues will likely be in the range of $1-1.5bn this year.

To put that in perspective, Bank of America Merrill Lynch estimated that the total addressable market for ‘big data’ in 2010 was this

Wikibon estimates that the ‘big data’ market in 2012 is this

and Deloitte estimates that the ‘big data’ market in 2012 is this

UPDATE – IDC has become the first of the big analyst vendors to break out its big data abacuses (abaci?). IDC thinks the ‘big data’ market in 2010 was $3.2bn. That’s this

Not surprisingly they came to their numbers by different means. BoA added up market estimates for database software, storage and servers for databases, BI and analytics software, data integration, master data management, text analytics, database-related cloud revenue, complex event processing and NoSQL databases.

Wikibon came to its estimate by adding up revenue associated with a select group of technologies and a select group of vendors, while Deloitte added up revenue estimates for database, ERP and BI software, reduced the total by 90% to reflect the proportion of data warehouses with more than five terabytes of data, and reduced that total by 80-85% to reflect the low level of current adoption.
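To see how aggressive Deloitte's two successive reductions are, here is a quick back-of-the-envelope sketch. It uses only the percentages quoted above. The function name is our own and is purely illustrative, and we make no assumption about the starting revenue total, which is not given here.

```python
def retained_fraction(first_cut, second_cut):
    """Fraction of a starting total that survives two successive percentage cuts."""
    return (1 - first_cut) * (1 - second_cut)

# Deloitte's approach as described above: cut by 90%, then cut the remainder by 80-85%.
low = retained_fraction(0.90, 0.85)   # the harsher second cut
high = retained_fraction(0.90, 0.80)  # the gentler second cut

print(f"{low:.1%} to {high:.1%} of the starting total")
```

In other words, only about 1.5% to 2% of the combined database, ERP and BI software revenue ends up being counted as ‘big data’.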

IDC, meanwhile, went through a slightly tortuous route of defining the market based on the volume of data collected, OR deployments of ultra-high-speed messaging technology, OR rapidly growing data sets, AND the use of scale-out architecture, AND the use of two or more data types OR high-speed data sources.
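One way to read that chain of ORs and ANDs is as a simple yes/no test. The sketch below is only our interpretation: the grouping of the conditions is an assumption, the function and parameter names are hypothetical, and the thresholds behind each flag (how much data, how fast) are not covered here.

```python
def meets_idc_reading(large_volume, high_speed_messaging, fast_growing,
                      scale_out, multiple_data_types, high_speed_sources):
    """One assumed grouping of the criteria described above; not IDC's own wording."""
    return (
        (large_volume or high_speed_messaging or fast_growing)  # any one qualifies
        and scale_out                                           # required
        and (multiple_data_types or high_speed_sources)         # either one qualifies
    )

# A large, scale-out deployment with only one data type and no high-speed sources
# would not count under this reading.
print(meets_idc_reading(True, False, False, True, False, False))
```

Group the conditions differently and the same deployment can fall inside or outside the market, which is part of why the definition is so hard to pin down.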

There is something to be said for each of these definitions. But equally each can be easily dismissed. We previously described our issues with the all-inclusive nature of the BoA numbers, and while we find Wikibon’s process much more agreeable, some of the individual numbers they have come up with are highly questionable. Deloitte’s methodology is surreal, but defensible. IDC’s just illustrates the problem:

What this highlights is that the essential problem is the lack of definition for ‘big data’. As we stated in 2010: “The biggest problem with ‘big data’… is that the term has not been – and arguably cannot be – defined in any measurable way. How big is the ‘big data’ market? You may as well ask ‘how long is a piece of string?’”
