Updated database landscape graphic

One of the most popular pieces I have produced since joining 451 is not a research report or presentation but the database landscape graphic that accompanied our NoSQL, NewSQL and Beyond report.

We’ve seen it crop up in other presentations and websites – sometimes even with attribution 😉

We actually updated the image to accompany our more recent report MySQL vs. NoSQL and NewSQL: 2011-2015 but I realised that I haven’t made that newer version more generally available. So here it is:

We wouldn’t claim it to be perfect. There’s a whole new breed of data platform-as-a-service providers that have emerged in recent months that will need to be added, if we can find space for them.

Meanwhile there is a group of database vendors that have also emerged that don’t easily fit into the segments we’ve created: companies like Drawn to Scale, FoundationDB, Aerospike and Splice Machine.

But since the original graphic continues to be popular, I thought I’d share the latest iteration as well. Any feedback is always welcome.

Forthcoming webinar: Choosing a Next-Gen Database

On Tuesday, November 13 at 12pm ET I’ll be taking part in a webinar in association with ScaleBase on the subject of Choosing a Next-Gen Database: The new world order of NoSQL, NewSQL & MySQL.

With the database market becoming increasingly complex and changing on an almost daily basis, I’ll be providing an overview of this ever-changing market: discussing the benefits and drawbacks of NoSQL, NewSQL & MySQL databases and exploring real-life use cases for each.

Joining me will be Doron Levari and Paul Campaniello, both from ScaleBase, who will be discussing specific use cases of ScaleBase’s Data Traffic Manager, which is designed to enable next generation applications that require big data transactional processing, without changing the existing infrastructure.

For full details, and to register for the event, click here.

MySQL vs. NoSQL and NewSQL – survey results

Back in January we launched a survey of database users to explore the competitive dynamic between MySQL, NoSQL and NewSQL databases, and to discover if MySQL usage is really declining – as had been indicated by the results of a prior survey.

The publication of the associated report took longer than expected, mostly because we expanded its scope to include revenue and growth estimates for the MySQL ecosystem, NoSQL and NewSQL sectors respectively. With that report now published, I am pleased to fulfil our promise to share the survey results.

We seem to be having some random embedding issues so for now the results can be found on SlideShare, adapted from the presentation given at OSBC earlier this week. For greater context, we have also included an explanation of each slide, below:

Slide 2: Provides an overview of the associated report – MySQL vs. NoSQL and NewSQL: 2011-2015, which is available here.

Slide 3: Explains why we launched the report. We once described MySQL as the crown jewel of the open source database world, since its focus on Web-based applications, its lightweight architecture and fast-read capabilities, and its brand differentiated it from all of the established database vendors and made for a potentially complementary acquisition. Today, the competitive situation is very different.

Slide 4: Oracle’s MySQL business faces competition from the rest of the MySQL ecosystem, as illustrated in Slide 5, many of which have emerged following Oracle’s acquisition of Sun/MySQL.

Slide 6: The emergence of these alternatives was triggered, in part, by concern about the future of MySQL. A previous 451 survey, conducted in November 2009, showed that there was real concern about the acquisition, with only 17% of MySQL users believing Oracle should be allowed to acquire MySQL.

Slide 7: The 2009 survey also showed that while 82.1% of respondents were already using MySQL, that figure was expected to drop to 72.3% by 2014. That survey was conducted amid a climate of fear, uncertainty and doubt regarding the future of MySQL, and one of the drivers for our current report was to see if that predicted decline occurred.

Slide 8: To put this in context, we asked the current survey sample (which included 205 database users) about their reaction to the acquisition. While the vast majority of MySQL users reported that they continued to use MySQL where appropriate, 5% indicated that they were more inclined to use MySQL, and 26% said they were less inclined to use MySQL. Not surprisingly the proportion of users less inclined to use MySQL was much higher amongst those abandoning MySQL than those sticking with MySQL.

Slide 9: We also asked respondents to rate Oracle’s ownership of MySQL on a range of very good to very bad. Overall, the balance tipped in favour of a negative perception of Oracle’s track record, while there was naturally a more negative perception of Oracle amongst those abandoning MySQL compared to MySQL mainstays. However, the results showed that the percentage of respondents rating the company’s performance ‘very good’ and ‘very bad’ was actually quite similar for both abandoners and mainstays. While those abandoning MySQL are more likely to have a negative perception of Oracle, it is not necessarily safe to assume that Oracle’s actions and strategy are the cause of the abandonment. Clearly there are other competitive forces at work.

Slide 10: Not least the emergence of NoSQL, as illustrated in Slide 11, and NewSQL, as illustrated in Slide 12.

Slide 13: Based on some very high profile examples of projects migrating from MySQL to NoSQL, there is a common assumption that NoSQL and NewSQL pose a direct, immediate threat to MySQL. We believe the competitive dynamic is more complex.

Slide 14: While 49% of those survey respondents abandoning MySQL planned on retaining or adopting NoSQL databases, only 12.7% said they had actually deployed NoSQL databases as a *direct replacement* for MySQL.

Slide 15: In comparison, there is much greater overlap between NewSQL and MySQL, but of a complementary nature. 33% of respondents retaining MySQL had considered, tested or deployed NewSQL database technologies, while approximately 75% of the NewSQL revenue for 2011 is from vendors that we also consider part of the MySQL ecosystem.

Slide 16: The results of our 2012 survey show that MySQL is currently the most popular database amongst our survey sample, used by 80.5% of respondents today.

Slide 17: However, its popularity is again expected to decline to 2014 and 2017. This indicates an accelerated decline in the use of MySQL, compared with the findings of our 2009 survey. While that survey was conducted amid a climate of fear, uncertainty and doubt regarding the future of MySQL, we are not aware of any specific reason why the 2012 sample, which was self-selecting, should have a disproportionately negative attitude to MySQL or Oracle.

Slide 18: MySQL’s predicted decline of 26.4 percentage points between 2012 and 2017 compares to a predicted decline of just 9.3 percentage points for Microsoft SQL Server, and only 5.9 percentage points for Oracle Database. In comparison, MariaDB, Apache Cassandra and Apache CouchDB are expected to increase in usage by 3.0 percentage points or greater between 2011 and 2017.

Slide 19: Although alternative MySQL distributions including MariaDB, Drizzle and Percona Server are expected to see increased adoption over the next five years, they are not growing at the same rate that MySQL is declining.

Slide 20: So where are those abandoning MySQL going to? Looking specifically at the 55 MySQL users who expect to abandon it by 2017 (which is admittedly a small sample, and therefore not to be considered statistically relevant) we see that PostgreSQL is the most popular database being retained or adopted over the same period, followed by Microsoft SQL Server, Oracle, MongoDB, and MariaDB.

Slide 21: This only tells part of the story, however. Just because a company is retaining Oracle Database, for example, does not necessarily mean that Oracle Database is being used as a replacement for the abandoned MySQL. We therefore also specifically asked survey respondents which databases they had considered, tested or deployed as a direct replacement for MySQL. The response from the 55 respondents planning to abandon MySQL again saw PostgreSQL, MariaDB and MongoDB as the most popular answers, followed by Apache CouchDB and Apache HBase.

Slide 22: While NoSQL databases were well-represented in this list, we saw that anyone considering NoSQL considered multiple NoSQL databases. Per respondent, NoSQL databases were the least considered of all alternatives by existing MySQL users.

Slide 23: The survey results suggest that MongoDB is the most often considered, tested or deployed as a replacement or complement for MySQL, followed by Apache CouchDB, Apache HBase, Apache Cassandra/DataStax, and Redis.

Slide 24: NewSQL technologies that improve the scalability and performance of MySQL scored well, with eight of the top 10 most considered NewSQL technologies directly complementing MySQL. Of the other two, one (Drizzle) is a derivative of MySQL, and the other (Clustrix) can also be used in a complementary manner as part of a MySQL cluster, although in the long term it is positioned as a direct alternative.

Slide 25: MariaDB is the member of the MySQL ecosystem most often considered, tested or deployed as a replacement or complement for MySQL, followed by Continuent Tungsten, Percona Server, MySQL Cluster, and Amazon RDS.

Slide 26: More than half of all MySQL users had considered, tested or deployed another relational database as a direct replacement, while over 40% had considered, tested or deployed a caching technology to complement MySQL. The memcached caching technology was the most widely-deployed of all the technologies we asked about, followed closely by PostgreSQL, which supported anecdotal evidence that a number of MySQL users are migrating to the other major open source transactional database.

Slide 27: For the record, the survey had 205 respondents. Primary job roles among respondents included: director/manager of IT infrastructure (18.0%); architect/engineer (17.6%); developer/programmer (15.6%); database/systems administrator (14.6%); consultant (14.1%); VP level or above (13.7%); analyst (3.4%); and line-of-business manager (2.9%).

Further survey analysis and perspective on the competitive dynamic between MySQL, NoSQL and NewSQL is available in the MySQL vs NoSQL and NewSQL report, which also includes market sizing and growth predictions for the three segments.

451 Research delivers market sizing estimates for NoSQL, NewSQL and MySQL ecosystem

NoSQL and NewSQL database technologies pose a long-term competitive threat to MySQL’s position as the default database for Web applications, according to a new report published by 451 Research.

The report, MySQL vs. NoSQL and NewSQL: 2011-2015, examines the competitive dynamic between MySQL and the emerging NoSQL non-relational, and NewSQL relational database technologies.

It concludes that while the current impact of NoSQL and NewSQL database technologies on MySQL is minimal, they pose a long-term competitive threat due to their adoption for new development projects. The report includes market sizing and growth estimates, with the key findings as follows:

• NoSQL software vendors generated revenue* of $20m in 2011. NoSQL software revenue is expected to rapidly grow at a CAGR of 82% to reach $215m by 2015.

• NewSQL software vendors generated revenue* of $12m in 2011 (of which $9m is also considered MySQL ecosystem revenue). NewSQL revenue is also expected to grow rapidly at a CAGR of 75% to reach $112m by 2015 (including $56m in MySQL ecosystem revenue).

• The MySQL support ecosystem generated revenue* of $171m in 2011 (including $9m from NewSQL technologies). MySQL ecosystem revenue is expected to grow at a CAGR of 40% to reach $664m by 2015 (including $56m in NewSQL revenue).
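For readers who want to sanity-check the growth rates in the bullets above, here is a minimal sketch of the compound annual growth rate arithmetic. It assumes four annual compounding steps between 2011 and 2015 and uses only the rounded headline figures, so it should land within about a percentage point of the published rates rather than match them exactly; the function name `cagr` is our own label, not something from the report.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.40 means 40%)."""
    return (end / start) ** (1 / years) - 1


# (2011 revenue, 2015 estimate) in $m, copied from the bullets above.
segments = {
    "NoSQL": (20, 215),
    "NewSQL": (12, 112),
    "MySQL ecosystem": (171, 664),
}

for name, (rev_2011, rev_2015) in segments.items():
    # Assumption: 2011 -> 2015 is treated as four annual growth steps.
    rate = cagr(rev_2011, rev_2015, years=4)
    print(f"{name}: {rate:.1%} per year")
```

Any small gap between these results and the published CAGRs is most likely down to rounding in the headline revenue numbers.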

“The MySQL ecosystem is now arguably more healthy and vibrant than it has ever been, with a strong vendor committed to the core product, and a wealth of alternative and complementary products and services on offer to maintain competitive pressure on Oracle,” commented report author Matthew Aslett, research manager, data management and analytics, 451 Research.

“However, the options for MySQL users have never been greater, and there is a significant element of the MySQL user base that is ready and willing to look elsewhere for alternatives.”

As well as revenue and growth estimates, the report also includes a survey of over 200 database administrators, developers, engineers and managers. The survey findings include:

• While the majority of MySQL users continue to use MySQL where appropriate, the use of MySQL is expected to decline from 80.5% of survey respondents today to 62.4% by 2014 and just 54.1% by 2017.

• Despite the emergence of NoSQL and NewSQL database products, the most common direct replacement for MySQL among survey respondents today is PostgreSQL, which is also the focus of a recent burst of commercial activity.

• While 49% of those survey respondents abandoning MySQL planned on retaining or adopting NoSQL databases, only 12.7% of MySQL abandoners said they had actually deployed NoSQL databases as a direct replacement for MySQL.
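Since the usage figures in the first bullet are shares of respondents, it is worth being clear about the difference between a drop in percentage points and a relative drop. The sketch below uses only the three usage figures quoted above; the helper names are hypothetical, and "today" is taken to mean 2012, the year of the survey.

```python
# Share of survey respondents using MySQL, in percent (from the first bullet).
usage = {2012: 80.5, 2014: 62.4, 2017: 54.1}


def point_drop(before: float, after: float) -> float:
    """Decline in percentage points (a simple difference of two shares)."""
    return round(before - after, 1)


def relative_drop(before: float, after: float) -> float:
    """Decline as a fraction of the starting share."""
    return (before - after) / before


for year in (2014, 2017):
    pts = point_drop(usage[2012], usage[year])
    rel = relative_drop(usage[2012], usage[year])
    print(f"2012 -> {year}: {pts} points, {rel:.1%} of the 2012 share")
```

The point of the distinction: a fall from 80.5% to 54.1% is 26.4 percentage points, but it means roughly a third of the 2012 usage share is expected to go.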

“While there have been some high profile examples of users migrating from MySQL to NoSQL databases, the huge size of the MySQL installed base means that these projects are comparatively rare,” commented Aslett.

The report describes how NoSQL database technologies are largely being adopted for new projects that require additional scalability, performance, relaxed consistency and agility, while NewSQL database technologies are, at this stage, largely being adopted to improve the performance and scalability of existing databases, particularly MySQL.

“NoSQL and NewSQL have not made a significant impact on the MySQL installed base at this stage but MySQL is no longer the de facto standard for new application development projects,” said Aslett. “As a result, NoSQL and NewSQL pose a significant long-term competitive threat to MySQL’s dominance.”

MySQL vs. NoSQL and NewSQL: 2011-2015 is now available to existing 451 Research subscribers. Non-clients can apply for trial access to 451 Research’s content.

*451 Research’s analysis of MySQL, NoSQL and NewSQL revenue is based on a bottom-up analysis of each participating vendor’s current revenue and growth expectations, and includes software license and subscription support revenue only. Revenue line items not included in these figures include hardware associated with the delivery of these services, revenue related to applications deployed on these databases, traditional hosting services, or systems integration performed by the vendors or other third parties.

The revenue estimates do not take into account unpaid usage of open source licensed MySQL, NoSQL and NewSQL software, and therefore represent only a fraction of the total addressable market. Based on the above revenue figures and other analysis, 451 Research estimates that the total value of the MySQL ecosystem in terms of ‘displaced’ proprietary software might equate to $1.7bn in 2011, while the NoSQL market had a displaced value of $195.7m and the NewSQL sector a displaced value of $99.4m.

The Data Day, Today: Feb 29 2012

Microsoft and Hortonworks expand Hadoop partnership. Oracle ships Exalytics. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Hortonworks to Bring Apache Hadoop to Millions of New Users Hortonworks and Microsoft expanded their relationship around Apache Hadoop.

See also:
# Big Data for Everyone: Using Microsoft’s Familiar BI Tools with Hadoop
# Microsoft’s Hadoop roadmap reveals new big data deliverables
# Karmasphere Expands Big Data Analytics on Hadoop in the Enterprise
# Datameer to Bring Hadoop Analytics to Windows Azure
# HStreaming Brings Real-Time Analytics to Microsoft’s Hadoop-based Services for Windows Server and Windows Azure

* Oracle Announces Availability of Oracle Exalytics In-Memory Machine

* Fujitsu Releases “Interstage Big Data Parallel Processing Server V1.0” to Help Enterprises Utilize Big Data

* Pentaho and DataStax announce strategic partnership delivering the first complete Apache Cassandra-based big data analytics solution to the market

* Cloudant Names Andy Palmer to its Board of Directors

* R integrated throughout the enterprise analytics stack

* Jaspersoft Announces Big Data Index to Track Demand for Big Data Analytics

* 1010data Enables Companies to Rapidly Model and Predict Individual Consumer Behavior and Social Network Relationships

* Tableau Software Teams with Attivio to Tap Unstructured Content and Deliver Deeper Insight to Business Users

* Infochimps and the Future of Data Marketplaces “This is the clearest indication yet that data marketplaces may be the latest ‘Application Service Provider’ cycle, as in right idea, wrong time.”

* HStreaming and RainStor Partner to Lower the Cost of Big-Data Analytics on Hadoop

* JustOne Database Sets the Stage for Accelerated Growth in 2012 and Beyond

* Big Data investment map

* A group of Google Engineers released “vitess” – a project to help scale MySQL databases.

* For 451 Research clients

# Reassessing the M&A potential of NoSQL and NewSQL Sector IQ report

# Sears Holdings creates Hadoop managed service provider MetaScale Impact Report

# Datawatch turns the corner with focus on report analytics suite Impact Report

# arcplan details growth plan, as it expands into front end for SAP HANA and social BI Impact Report

# Objectivity adds reusable queries to InfiniteGraph NoSQL database Market Development Report

# Host Analytics illuminates cloud performance management growth strategy and roadmap Market Development Report

And that’s the Data Day, today.

Last chance to take part in our MySQL/NoSQL/NewSQL survey

Thanks to everyone who has already taken part in our survey exploring changing attitudes to MySQL following its acquisition by Oracle and examining the competitive dynamic between MySQL and other database technologies, including NoSQL and NewSQL.

The response has been great and even a quick look at the results makes for interesting reading, particularly in the light of our previous findings which indicated declining MySQL usage.

I am really looking forward to diving deep into the results and breaking out the figures to get a better understanding of the potential impact of alternative MySQL distribution and support providers, as well as NoSQL and NewSQL, on continued usage of MySQL.

The survey results will be made freely available on our blogs, as well as being included in a long format report containing our additional analysis and research related to the MySQL ecosystem and competitive dynamic.

Right now, however, is your last chance to contribute to the survey and get your voice heard. There are just 12 questions to answer, spread over four pages, and the entire survey should take no longer than five minutes to complete. All individual responses are of course confidential.

The survey will close in 24 hours.

Is MySQL usage really declining?

If you’re a MySQL user, tell us about your adoption plans by taking our current survey.

Back in late 2009, at the height of the concern about Oracle’s imminent acquisition of Sun Microsystems and MySQL, 451 Research conducted a survey of open source software users to assess their database usage and attitudes towards Oracle.

The results provided an interesting snapshot of the potential implications of the acquisition and the concerns of MySQL users and even, so I am told, became part of the European Commission’s hearing into the proposed acquisition (used by both sides, apparently, which says something about both our independence and the malleability of data).

One of the most interesting aspects concerned the apparently imminent decline in the usage of MySQL. Of the 285 MySQL users in our 2009 survey, only 90.2% still expected to be using it two years later, and only 81.8% in 2014.

Other non-MySQL users expected to adopt the open source database after 2009, but the overall prediction was decline. While 82.1% of our sample of 347 open source users were using MySQL in 2009, only 78.7% expected to be using it in 2011, declining to 72.3% in 2014.

This represented an interesting snapshot of sentiment towards MySQL, but the result also had to be taken with a pinch of salt given the significant level of concern regarding MySQL’s future at the time the survey was conducted.

The survey also showed that only 17% of MySQL users thought that Oracle should be allowed to keep MySQL, while 14% of MySQL users were less likely to use MySQL if Oracle completed the acquisition.

That is why we are asking similar questions again, in our recently launched MySQL/NoSQL/NewSQL survey.

More than two years later Oracle has demonstrated that it did not have nefarious plans for MySQL. While its stewardship has not been without controversial moments, Oracle has also invested in the MySQL development process and improved the performance of the core product significantly. There are undoubtedly users that have turned away from MySQL because of Oracle but we also hear of others that have adopted the open source database specifically because of Oracle’s backing.

That is why we are now asking MySQL users to again tell us about their database usage, as well as attitudes to MySQL following its acquisition by Oracle. Since the database landscape has changed considerably since late 2009, we are now also asking about NoSQL and NewSQL adoption plans.

Is MySQL usage really in decline, or was the dip suggested by our 2009 survey the result of a frenzy of uncertainty and doubt given the imminent acquisition? Will our current survey confirm or contradict that result? If you’re a MySQL user, tell us about your adoption plans by taking our current survey.

451 Research MySQL/NoSQL/NewSQL survey

I’ve just launched a new survey that should be of interest if you are currently using or actively considering MySQL or any of the NoSQL or NewSQL offerings.

The aim of the survey is threefold:

– identify trends in database usage over time
– explore changing attitudes to MySQL following its acquisition by Oracle
– examine the competitive dynamic between MySQL and other database technologies, including NoSQL and NewSQL

There are just 12 questions to answer, spread over four pages, and the entire survey should take no longer than five minutes to complete.

All individual responses are of course confidential. The results will be published as part of a major research report due at the end of Q1. Thanks in advance for your participation.

The survey can be found at: http://www.surveymonkey.com/s/MySQLNoSQLNewSQL

How to provide a strongly consistent distributed database and not break CAP Theorem

In the months since we coined the term NewSQL we have come to define it as referring to a new breed of relational database products designed to meet scalability requirements of distributed architectures, or improve performance so horizontal scalability is no longer a necessity, while maintaining support for SQL and ACID.

During the recent round of NoSQL Road Show events it has emerged that this description could be taken to suggest that NewSQL products are able to provide consistency, availability and partition tolerance and therefore contravene the common understanding of CAP Theorem that “a distributed system can satisfy any two of these guarantees at the same time, but not all three.”

How is it possible to provide strongly consistent distributed systems and not break CAP Theorem?

For a start, CAP Theorem is not that simple. As others have pointed out – Cloudera’s Henry Robinson for example – CAP Theorem isn’t simply a case of “consistency, availability, partition tolerance. Pick two.”

In fact the father of CAP Theorem, Dr Eric Brewer, has clarified that the “2 of 3” explanation is misleading: “First, because partitions are rare, there is little reason to forfeit C or A when the system is not partitioned. Second, the choice between C and A can occur many times within the same system at very fine granularity; not only can subsystems make different choices, but the choice can change according to the operation or even the specific data or user involved. Finally, all three properties are more continuous than binary. Availability is obviously continuous from 0 to 100 percent, but there are also many levels of consistency, and even partitions have nuances, including disagreement within the system about whether a partition exists.”

We know that CAP is not simply a case of “pick two”, since while Amazon’s Dynamo (and the many NoSQL databases it has inspired) sacrifices consistency for availability, it does so with eventual consistency, not the total absence of consistency.

Clearly it is possible to have systems that are partition tolerant, highly available and offer *a degree of consistency* (although as Fred Holahan points out, whether that degree is suitable for your particular workload is another matter).

Partition tolerance is not necessarily something that can be relaxed in the same manner – in fact the proof of CAP Theorem relies on an assumption of partition tolerance. As Yammer engineer Coda Hale explains: “Partition Tolerance is mandatory in distributed systems. You cannot not choose it.”

Daniel Abadi has previously explained how CAP is not really about choosing two of three states, but about answering the question “if there is a partition, does the system give up availability or consistency?”

Just as systems that sacrifice consistency retain a degree of consistency, Daniel also makes the point that systems that give up availability also do not do so in totality, noting that “availability is only sacrificed when there is a network partition.”
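Daniel’s question can be made concrete with a deliberately tiny model. The sketch below is a toy two-replica register, not any real database: the class names, the "CP"/"AP" mode labels and the `partitioned` flag are all assumptions made for illustration. Both modes behave identically until a partition occurs; only then does one refuse the write (giving up availability) while the other accepts it and lets the replicas diverge (giving up consistency).

```python
class Replica:
    def __init__(self):
        self.value = None


class TwoNodeStore:
    """Toy two-replica register; mode is 'CP' or 'AP' (hypothetical labels)."""

    def __init__(self, mode: str):
        self.mode = mode
        self.replicas = [Replica(), Replica()]
        self.partitioned = False  # True while the replicas cannot talk

    def write(self, node: int, value: str) -> bool:
        if not self.partitioned:
            # Normal operation: every replica gets the write, in either mode.
            for replica in self.replicas:
                replica.value = value
            return True
        if self.mode == "CP":
            # Refuse the write: availability is given up, but only now.
            return False
        # AP: accept the write locally; the other replica goes stale.
        self.replicas[node].value = value
        return True

    def read(self, node: int):
        return self.replicas[node].value


for store in (TwoNodeStore("CP"), TwoNodeStore("AP")):
    store.write(0, "v1")             # accepted by both modes
    store.partitioned = True         # the network splits
    accepted = store.write(0, "v2")  # this is where the modes differ
    print(store.mode, accepted, store.read(0), store.read(1))
```

In this model the CP store rejects the second write and both replicas still agree on "v1", while the AP store accepts it and the two replicas disagree until the partition heals.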

As such, Daniel makes the point that the roles of consistency and availability in CAP are asymmetric, and that latency is the forgotten factor that re-balances the equation.

Daniel has also returned to the issue of the tradeoff between latency and consistency in a more recent post, noting that, unlike availability vs consistency, “the latency vs. consistency tradeoff is present even during normal operations of the system.”

The Apache Cassandra wiki actually makes this point very well:

“The CAP theorem… states that you have to pick two of Consistency, Availability, Partition tolerance: You can’t have the three at the same time and get an acceptable latency. Cassandra values Availability and Partitioning tolerance (AP). Tradeoffs between consistency and latency are tunable in Cassandra. You can get strong consistency with Cassandra (with an increased latency).”

This suggests that you can, in fact, have consistency, partition tolerance and availability at the same time, but that latency will suffer. ScaleDB’s Mike Hogan made that argument earlier this year in describing the ‘CAP event horizon’ – “the point at which latency for a clustered system exceeds that which is acceptable and then you must decide what concessions you are willing to make”.
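The tunable tradeoff described in the Cassandra wiki quote can be sketched with the standard quorum argument: with N replicas, if a write waits for W acknowledgements and a read consults R replicas, every read overlaps the latest write whenever R + W > N. The code below is a toy model and not Cassandra’s API; the function names are our own and the per-replica latencies are invented purely to show how waiting for more replicas pushes latency up.

```python
from itertools import combinations


def stale_read_possible(n: int, w: int, r: int) -> bool:
    """True if some set of W written replicas and some set of R read replicas do not overlap."""
    replicas = range(n)
    for written in combinations(replicas, w):
        for read in combinations(replicas, r):
            if not set(written) & set(read):
                return True
    return False


def ack_latency(latencies_ms: list, k: int) -> int:
    """Time until the k-th fastest replica has answered."""
    return sorted(latencies_ms)[k - 1]


# Hypothetical response times for three replicas, in milliseconds.
latencies_ms = [2, 5, 40]

for w, r in [(1, 1), (2, 2), (3, 1)]:
    print(
        f"W={w} R={r}",
        f"stale read possible: {stale_read_possible(3, w, r)}",
        f"write waits {ack_latency(latencies_ms, w)} ms",
        f"read waits {ack_latency(latencies_ms, r)} ms",
    )
```

With these made-up numbers, the only setting that allows a stale read (W=1, R=1) is also the fastest, which is the latency versus consistency tradeoff in miniature.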

See also Brian Bulkowski’s explanation of how Citrusleaf can claim to deliver immediate consistency by relaxing availability in the event of partition failure: “During this period, Citrusleaf will seem less highly available – that is, latencies will be higher – until the reconfiguration completes. Transactions still flow during this period – they are queued and forwarded at different places in the client and in the servers – but the cluster has, in theoretical terms, lower availability.”

Like Citrusleaf’s ACID-compliant NoSQL database, NewSQL databases are not designed to avoid the CAP event horizon by being as available as eventually consistent systems – that *would* break CAP Theorem – but arguably they are designed to delay that CAP event horizon as much as possible by delivering systems that, in the event of a partition, are highly consistent and offer *a degree of availability*.

Whether that degree of availability is suitable for your application will depend on your tolerance – not for partitions but for latency.

Our big data/total data survey is now live

The 451 Group is conducting a survey into end user attitudes towards the potential benefits of ‘big data’ and new and emerging data management technologies.

Created in conjunction with TheInfoPro, a division of The 451 Group focused on real-world perspectives on the IT customer, the survey contains fewer than 20 questions and does not ask for details of specific projects. It does cover data volumes and complexity, as well as attitudes to emerging data management technologies – such as Hadoop and exploratory analytics, as well as NoSQL and NewSQL – for certain workloads.

In return for your participation, you will receive a copy of a forthcoming long-format report introducing Total Data, The 451 Group’s concept for explaining the changing data management landscape, which will include the results. Respondents will also have the opportunity to become members of TheInfoPro’s peer network.

The survey is expected to close in late October and we also plan to provide a snapshot of the results in our presentation, The Blind Men and The Elephant, at Hadoop World in early November.

Many thanks in advance for your participation in this survey. We look forward to sharing the results with you. The survey can be found at http://bit.ly/451data