The Data Day: August 30, 2016

What happened in data and analytics this week will astound you

For 451 Research clients: NewSQL databases: a definitive guide http://bit.ly/2c7wJPu

For 451 Research clients: With Atlas, MongoDB jumps into hosted NoSQL DBaaS waters http://bit.ly/2c7wXq4 By Jim Curtis

For 451 Research clients: DataStax rolls graph into DSE 5.0, highlights NoSQL multi-models http://bit.ly/2bOKF2I By Jim Curtis

For 451 Research clients: Looker illuminates its analytics business and platform strategy http://bit.ly/2c7z396 By Krishna Roy

For 451 Research clients: Arcadia looks to simplify security for Hadoop analysis, lands Rackspace as reseller http://bit.ly/2c7wwvZ By Krishna Roy

For 451 Research clients: With $30m in funding, Vena looks to expand its cloud service to an enterprise platform http://bit.ly/2c7x3Oq By Krishna Roy

For 451 Research clients: Stitch emerges from RJMetrics with ETL as a service following cloud BI sale http://bit.ly/2c7wW5m

For 451 Research clients: CMC Markets sees promising self-service results with new ‘Google-like’ BI tool http://bit.ly/2c7xPuK By Jason Stamper

Tableau appointed Adam Selipsky as new CEO http://tabsoft.co/2bONxN1

Splunk reported a net loss of $86.6m on Q2 revenue up 43% to $212.8m http://splk.it/2bON1P5

Salesforce signs agreement to acquire BeyondCore http://bit.ly/2bOOZPK

Magnitude Software acquired Simba Technologies http://mwne.ws/2bONqRI

SAP is reportedly acquiring Altiscale for over $125m http://bit.ly/2bON8dE

Syncsort acquires Cogito to enhance mainframe data access http://prn.to/2c7ynRb

Galactic Exchange closes seed financing round http://bit.ly/2bOMxZw

Teradata makes Aster Analytics available on Hadoop and Teradata Aster Analytics on Amazon Web Services http://prn.to/2c7y5Kh

JSON support is generally available in Azure SQL Database http://bit.ly/2c7xVT9

Red Hat launches Red Hat Virtualization 4 http://red.ht/2bOMJb0

SnapLogic launches Summer 2016 release of its SnapLogic Elastic Integration Platform http://bit.ly/2bOOmpq

AWS releases Amazon Kinesis Analytics http://bit.ly/2bOOwNd

AWS licenses SQLstream technology for Amazon Kinesis Analytics service http://bit.ly/2c7CZaa

Percona delivers open source in-memory storage engine for Percona Server for MongoDB http://bit.ly/2bOPIjH

Riversand launches MDMCenter v7.8 http://bit.ly/2c7zDUl

WANdisco announces the release of WANdisco Fusion 2.9 http://bit.ly/2bOOapU

And that’s the data day, today.

The Data Day, A few days: January 2-9 2015

GraphLab changes name to Dato, raises $18.5m. And more

And that’s the data day, today.

The Data Day, A few days: August 1-7 2013

MySQL, NoSQL, NewSQL, DBaaS market sizing. And more

Sizing the opportunities for MySQL, NoSQL, NewSQL and DBaaS

451 Research has recently published an update to our market sizing estimates for the MySQL ecosystem, NoSQL and NewSQL sectors, adding coverage of the database-as-a-service market.

The report, Next-Generation Operational Databases: 2012-2016, can be found here and provides estimates for the size of the aggregate market and each market sector, as well as competitive landscape maps. It also includes a growth forecast for each sector, and highlights the opportunities and threats facing participating vendors.


The key findings are also available in a short, free presentation (registration required), which can be found here, and provides details of how the MySQL, NoSQL and DBaaS sectors are each expected to grow to generate revenue in excess of $1bn by 2016.

The Data Day, A few days: July 24-31 2013

Next-Gen DB market sizing. Total Data Integration. And more.

And that’s the data day, today.

Forthcoming webinar: The New Path to Performance. No Sharding!

On Tuesday March 26th at 10am PT I’ll be taking part in a webinar with NuoDB on the subject of The New Path to Performance. No Sharding!

As part of the webinar I’ll be explaining the various strategies used by enterprises to attempt to achieve scalability of relational databases, why they fail to meet modern distributed processing requirements, and why companies are increasingly open to looking at alternatives to the traditional relational database.

Wiqar Chaudry from NuoDB will also be discussing how to eliminate technical acrobatics, including:

Sharding
Clustering
Performance tuning
Replication
And other kinds of 20th century database tricks.

To register, click http://go.nuodb.com/no-sharding-webinar-register-s.html

Forthcoming webinar: How to Take Advantage of NewSQL in the Cloud

On February 21, at 10:00am PST / 1:00pm EST, I’ll be taking part in a webinar – How to Take Advantage of NewSQL in the Cloud – in conjunction with Clustrix.

In this free webinar I, along with Mark Sarbiewski, Clustrix CMO, will discuss:

  • The current cloud database inflection point – and how that affects you and your company
  • How to migrate your SQL database to the cloud
  • How to get effortless scale from your database in public or private clouds
  • How to ensure database availability in the cloud for business critical applications

For full details and registration, click here.

Neither fish nor fowl: the rise of multi-model databases

One of the most complicated aspects of putting together our database landscape map was dealing with the growing number of (particularly NoSQL) databases that refuse to be pigeon-holed in any of the primary database categories.

I have begun to refer to these as “multi-model databases” in recognition of the fact that they are able to take on the characteristics of multiple databases. In truth though there are probably two different groups of products that could be considered “multi-model”:

True multi-model databases that have been designed specifically to serve multiple data models and use-cases

Examples include:
FoundationDB, which is being designed to support ACID and NoSQL, but more to the point in this instance, multiple layers including key-value, document, and object layers

Aerospike, which is planning to combine SQL, key-value, document and graph database technologies in a single database by bringing together its Citrusleaf NoSQL database with the acquired AlchemyDB NewSQL project

OrientDB, which is, at heart, a document database, but can also be used as a graph database; as an object database, making use of the Java persistence API; and as a hybrid database, taking advantage of multiple models to serve different application requirements

ArangoDB, which promises to deliver the benefits of key-value, document and graph stores in a single database

Other products that could be considered true multi-model databases are:
Couchbase Server 2.0, which can be used as both a document store and a key value store, as well as a distributed cache

Riak, which is a key-value store, although it can be used as a document store since the value can be a JSON document

NuoDB, which will provide compatibility with other databases by taking on multiple ‘personalities’ – an Oracle personality via PL/SQL compatibility is in the development roadmap, as is a document store personality via JSON support.
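
As a rough illustration of the key-value/document overlap described above, the following minimal Python sketch layers document-style access on top of a plain key-value store by holding JSON documents as the values. The class and method names (KeyValueStore, DocumentLayer, save, load, find) are hypothetical and invented for this example; this is not the API of Riak, Couchbase or any other product mentioned here.

```python
import json


class KeyValueStore:
    """Hypothetical in-memory key-value store: opaque string values by key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def keys(self):
        return list(self._data)


class DocumentLayer:
    """Document-style access layered on the same key-value store.

    Assumes every value in the store is a JSON-encoded document.
    """

    def __init__(self, store):
        self._store = store

    def save(self, doc_id, document):
        # The document is serialized to JSON and stored as an ordinary value.
        self._store.put(doc_id, json.dumps(document))

    def load(self, doc_id):
        raw = self._store.get(doc_id)
        return None if raw is None else json.loads(raw)

    def find(self, field, value):
        """Return sorted ids of documents whose top-level field equals value.

        This is a full scan; a real document store would use an index.
        """
        matches = []
        for key in self._store.keys():
            doc = self.load(key)
            if isinstance(doc, dict) and doc.get(field) == value:
                matches.append(key)
        return sorted(matches)


store = KeyValueStore()
docs = DocumentLayer(store)
docs.save("user:1", {"name": "Ada", "city": "London"})
docs.save("user:2", {"name": "Grace", "city": "New York"})
docs.save("user:3", {"name": "Alan", "city": "London"})

# Document-model query and raw key-value read against the same data.
london_ids = docs.find("city", "London")
raw_value = store.get("user:1")
```

The same stored value can be read either as an opaque string (the key-value view) or as a structured document (the document view), which is the sense in which a key-value store can double as a document store.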

General-purpose databases with multi-model options
What’s the difference between multi-model databases and existing general-purpose databases that have optional capabilities for serving multiple models? In my book it’s about being designed for purpose, but I’m sure that will be a debating point for the future. In the meantime, examples include:

Oracle MySQL 5.6, which can support both SQL-based access and key-value access via the Memcached API.

Oracle MySQL Cluster 7.2, which similarly supports concurrent NoSQL and SQL access to the database.

IBM DB2 10, which extends DB2’s hybrid relational and XML engine to enable the storage and management of graph triples, as well as support for the SPARQL 1.0 query language.

Akiban Server, which has the ability to treat groups of tables as objects and access them as JSON documents via SQL.

PostgreSQL h-store, which can be used for storing key-value pairs within a PostgreSQL data field, thereby enabling schema-less queries against data stored in PostgreSQL

We are also aware of other NewSQL databases that plan to adopt support for popular NoSQL data models, while IBM has also talked about plans to integrate key-value store NoSQL access capabilities with DB2 and Informix database software.

Other products that could be considered multi-model options include:
Oracle Spatial and Graph, an option for Oracle Database 11g.
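
To make the idea of SQL and key-value access to the same data more concrete, here is a small Python sketch that uses SQLite (from the standard library) purely as a stand-in for a relational engine. It is not MySQL's Memcached API or PostgreSQL's hstore; the table name kv and the helper functions kv_set and kv_get are assumptions made up for illustration.

```python
import sqlite3

# An in-memory SQLite database stands in for the relational back-end.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)")


def kv_set(key, value):
    """Key-value style write: insert or overwrite a single key."""
    conn.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))


def kv_get(key):
    """Key-value style read: return the value for a key, or None if absent."""
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return row[0] if row else None


# Writes arrive through the key-value interface...
kv_set("session:1", "alice")
kv_set("session:2", "bob")
kv_set("session:1", "carol")  # overwrites the earlier value for session:1

# ...while the same rows remain available to ordinary SQL queries.
session_count = conn.execute("SELECT COUNT(*) FROM kv").fetchone()[0]
values_sorted = [row[0] for row in conn.execute("SELECT v FROM kv ORDER BY v")]
```

The point of the sketch is only that one table can serve both access styles: simple get/set calls for applications that want key-value semantics, and SQL for anything that needs to query across keys.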

One of the drivers of NoSQL database adoption has been polyglot persistence – using multiple databases depending on the specific requirements of individual applications. Multi-model databases contradict this trend, to some extent, so it will be interesting to see whether they begin to gain traction.

While we see the wisdom of selecting the best database for the job, we also recognise that it could sometimes be a matter of choosing the best data model for the job, while relying on a single storage back-end.

Cloud databases, or database on the cloud?

As 2012 came to a close I tweeted

NuoDB has today kicked off that debate with the launch of its Cloud Data Management System and 12 rules for a 21st century cloud database.

NuoDB’s 12 rules appear pretty sound to me – in fact you could argue they are somewhat obvious. This is actually to NuoDB’s credit in my opinion, in that they haven’t simply listed 12 differentiating aspects of their product, but 12 broader requirements.

Either way, I believe that this is the right time to be debating what constitutes a “cloud database”. Databases on the cloud are nothing new, but these are existing relational database products configured to run on the cloud.

In other words, they are databases on the cloud, not databases of the cloud. There is a significant difference between spinning up a relational database in a VMI on the cloud versus deploying a database designed to take advantage of, enable, and be part of, the cloud.

To me, a true cloud database would be one designed to take advantage of and enable elastic, distributed architecture. NuoDB is one of those, but it won’t be the only one. Many NoSQL databases could also make a claim, albeit not for SQL and ACID workloads.

This isn’t a matter of SQL versus NoSQL, however. We’ve seen companies building their own next-generation database platforms deploying NoSQL and SQL technologies alongside each other for different workload and consistency requirements. Where the SQL layer falls down is the inability of existing relational databases to support elastic, geographically distributed cloud environments.

NuoDB believes it has a solution to that. So too do others including GenieDB, Translattice and VMware. Meanwhile Google’s F1 and Spanner projects have legitimized the concept of the globally-distributed SQL database.

Either way, the era of the relational cloud database – rather than the relational database on the cloud – has begun.

Our 2013 Database survey is now live

451 Research’s 2013 Database survey is now live at http://bit.ly/451db13 investigating the current use of database technologies, including MySQL, NoSQL and NewSQL, as well as traditional relational and non-relational databases.

The aim of this survey is to identify trends in database usage, as well as changing attitudes to MySQL following its acquisition by Oracle, and the competitive dynamic between MySQL and other databases, including NoSQL and NewSQL technologies.

There are just 15 questions to answer, spread over five pages, and the entire survey should take less than ten minutes to complete.

All individual responses are of course confidential. The results will be published as part of a major research report due during Q2.

The full report will be available to 451 Research clients, while the results of the survey will also be made freely available via a presentation at the Percona Live MySQL Conference and Expo in April.

Last year’s results have been viewed nearly 55,000 times on SlideShare so we are hoping for a good response to this year’s survey.

One of the most interesting aspects of the 2012 survey results was the extent to which MySQL users were testing and adopting PostgreSQL. Will that trend continue or accelerate in 2013? And what of the adoption of cloud-based database services such as Amazon RDS and Google Cloud SQL?

Are the new breed of NewSQL vendors having any impact on the relational database incumbents such as Oracle, Microsoft and IBM? And how is SAP HANA adoption driving interest in other in-memory databases such as VoltDB and MemSQL?

We will also be interested to see how well NoSQL databases fare in this year’s survey results. Last year MongoDB was the most popular, followed by Apache Cassandra/DataStax and Redis. Are these now making a bigger impact on the wider market, and what of Basho’s Riak, CouchDB, Neo4j, Couchbase et al?

Additionally, we have been tracking attitudes to Oracle’s ownership of MySQL since the deal to acquire Sun was announced. Have MySQL users’ attitudes towards Oracle improved or declined in the last 12 months, and what impact will the formation of the MariaDB Foundation have on MariaDB adoption?

We’re looking forward to analyzing the results and providing answers to these and other questions. Please help us to get the most representative result set by taking part in the survey at http://bit.ly/451db13