The Data Day, A few days: September 2-6 2013

Where database startups go to die. And more.

And that’s the data day, today.

The Data Day, A few days: April 29-May 3 2013

Teradata Q1 disappoints. Actian acquires ParAccel. And more.

And that’s the data day, today.

The Data Day, Two days: February 7/8 2013

Teradata results. Funding for DataXu. The chemistry of data. And more.

And that’s the data day, today.

The Data Day, Today: Feb 8 2012

SAP targets HANA at SMEs. WibiData raises $5m. Zimory acquires Sones devs. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* SAP to Arm Small and Midsize Enterprises With Real-Time Analytics Powered by SAP HANA

* Hadoop startup WibiData raises $5M to power web analytics

* Zimory Acquires Database Development Team from Sones

* Oracle Announces Availability of Oracle Advanced Analytics for Big Data

* Kalido Fuels Growth with New Customers, Market Leading Data Governance Capabilities in 2011

* Xeround Announces Free Version of Popular Cloud Database

* Hypertable Inc. Announces New Products and Services for Next Generation Hadoop NoSQL Database Deployments

* Cloudera Connector for Tableau Has Been Released

* Information Builders Launches WebFOCUS Hyperstage to Speed Performance and Delivery of Business Intelligence

* Actian Releases Vectorwise Workgroup Edition, Claims Best in Affordable Big Data Analytics to Mid-Market

* 10gen and Carahsoft Partner to Bring Leading NoSQL Solution to Government Sector

* MySQL progress in a year

* Endeca CEO: We wanted IPO, but Oracle acquisition gave peace of mind

* For 451 Research clients

# Armed with fresh funding, Tidemark looks to churn up performance management waters Impact Report

# Cloudant seizes opportunity for greater involvement with CouchDB Market Development report

# Xeround details cloud database pricing, launches free option Market Development report

And that’s the Data Day, today.

The Data Day, Today: Jan 10 2012

Oracle OEMs Cloudera. The future of Apache CouchDB. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Oracle announced the general availability of Big Data Appliance, and an OEM agreement with Cloudera for CDH and Cloudera Manager.

* The Future of Apache CouchDB Cloudant confirms intention to integrate the core capabilities of BigCouch into Apache CouchDB.

* Reinforcing Couchbase’s Commitment to Open Source and CouchDB Couchbase CEO Bob Wiederhold attempts to clear up any confusion.

* Hortonworks Appoints Shaun Connolly to Vice President of Corporate Strategy Former vice president of product strategy at VMware.

* Splunk even more data with 4.3 Introducing the latest Splunk release.

* Announcement of Percona XtraDB Cluster (alpha release) Based on Galera.

* Bringing Value of Big Data to Business: SAP’s Integrated Strategy Forbes interview with with Sanjay Poonen, President and corporate officer of SAP Global Solutions.

* New Release of Oracle Database Firewall Extends Support to MySQL and Enhances Reporting Capabilities Self-explanatory.

* Big data and the disruption curve “Many efforts are being funded by business units and not the IT department and money is increasingly being diverted from large enterprise vendors.”

* Get your SQL Server database ready for SQL Azure Microsoft “codename” SQL Azure Compatibility Assessment.

* An update on Apache Hadoop 1.0 Cloudera’s Charles Zedlewski helpfully explains Apache Hadoop branch numbering.

* Xeround and the CAP Theorem So where does Xeround fit in the CAP Theorem?

* Can Yahoo’s new CEO Thompson harness big data, analytics? Larry Dignan thinks Scott Thompson might just be the right guy for the job.

* US Companies Face Big Hurdles in ‘Big Data’ Use “21% of respondents were unsure how to best define Big Data”

* Schedule Your Agenda for 2012 NoSQL Events Alex Popescu updates his list of the year’s key NoSQL events.

* DataStax take Apache Cassandra Mainstream in 2011; Poised for Growth and Innovation in 2012 The usual momentum round-up from DataStax.

* Objectivity claimed significant growth in adoption of its graph database, InfiniteGraph and flagship object database, Objectivity/DB.

* Cloudera Connector for Teradata 1.0.0 Self-explanatory.

* For 451 Research clients

# SAS delivers in-memory analytics for Teradata and Greenplum Market Development report

# With $84m in funding, Opera sets out predictive-analytics plans Market Development report

* Google News Search outlier of the day: First Dagger Fencing Competition in the World Scheduled for January 14, 2012

And that’s the Data Day, today.

What we talk about when we talk about NewSQL

Yesterday The 451 Group published a report asking “How will the database incumbents respond to NoSQL and NewSQL?”

That prompted the pertinent question, “What do you mean by ‘NewSQL’?”

Since we are about to publish a report describing our view of the emerging database landscape, including NoSQL, NewSQL and beyond (now available), it probably is a good time to define what we mean by NewSQL (I haven’t mentioned the various NoSQL projects in this post, but they are covered extensively in the report. More on them another day).

“NewSQL” is our shorthand for the various new scalable/high performance SQL database vendors. We have previously referred to these products as ‘ScalableSQL’ to differentiate them from the incumbent relational database products. Since this implies horizontal scalability, which is not necessarily a feature of all the products, we adopted the term ‘NewSQL’ in the new report.

And to clarify, like NoSQL, NewSQL is not to be taken too literally: the new thing about the NewSQL vendors is the vendor, not the SQL.

So who would be consider to be the NewSQL vendors? Like NoSQL, NewSQL is used to describe a loosely-affiliated group of companies (ScaleBase has done a good job of identifying, some of the several NewSQL sub-types) but what they have in common is the development of new relational database products and services designed to bring the benefits of the relational model to distributed architectures, or to improve the performance of relational databases to the extent that horizontal scalability is no longer a necessity.

In the first group we would include (in no particular order) Clustrix, GenieDB, ScalArc, Schooner, VoltDB, RethinkDB, ScaleDB, Akiban, CodeFutures, ScaleBase, Translattice, and NimbusDB, as well as Drizzle, MySQL Cluster with NDB, and MySQL with HandlerSocket. The latter group includes Tokutek and JustOne DB. The associated “NewSQL-as-a-service” category includes Amazon Relational Database Service, Microsoft SQL Azure, Xeround, and FathomDB.

(Links provide access to 451 Group coverage for clients. Non-clients can also apply for trial access).

Clearly there is the potential for overlap with NoSQL. It remains to be seen whether RethinkDB will be delivered as a NoSQL key value store for memcached or a “NewSQL” storage engine for MySQL, for example. While at least one of the vendors listed above is planning to enable the use of its database as a schema-less store, we also expect to see support for SQL queries added to some NoSQL databases. We are also sure that Citrusleaf won’t be the last NoSQL vendor to claim support for ACID transactions.

NewSQL is not about attempting to re-define the database market using our own term, but it is useful to broadly categorize the various emerging database products at this particular point in time.

Another clarification: ReadWriteWeb has picked up on this post and reported on the “NewSQL Movement”. I don’t think there is a movement in that sense that we saw the various NoSQL projects/vendors come together under the NoSQL umbrella with a common purpose. Perhaps the NewSQL players will do so (VoltDB and NimbusDB have reacted positively to the term, and Tokutek has become the first that I am aware of to explicitly describe its technology as NewSQL). As Derek Stainer notes, however: ” In the end it’s just a name, a way to categorize a group of similar solutions.”

In the meantime, we have already noted the beginning for the end of NoSQL, and the lines are blurring to the point where we expect the terms NoSQL and NewSQL will become irrelevant as the focus turns to specific use cases.

The identification of specific adoption drivers and use cases is the focus of our forthcoming long-form report on NoSQL, NewSQL and beyond, from which the 451 Group reported cited above is excerpted.

The report contains an overview of the roots of NoSQL and profiles of the major NoSQL projects and vendors, as well as analysis of the drivers behind the development and adoption of NoSQL and NewSQL databases, the evolving role of data grid technologies, and associated use cases.

It will be available very soon from the Information Management and CAOS practices and we will also publish more details of the key drivers as we see them and our view of the current database landscape here.

Scalable SQL: more than the mullet of the database world?

In the first part of our coverage on emerging database products and vendors we examined the new NoSQL databases and suggested that the incumbent database vendors would likely respond to the growing threat with a mix of in-memory and distributed caching technologies.

That is yet to happen, although it has only been a few months and the NoSQL databases have generated more noise than revenue at this stage, but in the meantime a new set of database vendors and products have emerged that could pose a more direct threat to the database incumbents while thwarting the potential of the NoSQL upstarts.

For want of a better phrase we have taken to referring to these products collectively as scalable SQL databases, and have just published a new spotlight report pulling together our various reports on the runners and riders.

Some of the vendors promise to deliver the scalability and flexibility promised by NoSQL while retaining the support for SQL queries and/or ACID (atomicity, consistency, isolation, durability). That is not an insignificant boast and it will be tough to offer the best of both worlds.

“SQL For Business, NoSQL For Partay!” is the explanation offered by MulletDB, a project that promises scalability and SQL queries. The danger is the scalable SQL ends up being the database equivalent of the celebrated mullet hairstyle or its business attire equivalent: the jacket and jeans.

One of the companies trying to avoid that problem is GenieDB (coverage) The London-based company’s GenieDB Engine is a fully replicated distributed database that combines a key-value store database with a ‘sharded’ memcached layer. Another example is Clustrix, which was founded in December 2006 to develop a new database appliance that would offer both scalability and durability in a single product.

Meanwhile VoltDB emerged earlier this summer with a transactional database management system that is designed to scale across clusters of industry-standard servers while retaining transactional integrity.

Additionally Xeround has recently confirmed its intention to reposition its Intelligent Data Grid (IDG) technology as Xeround Data Service, a scalable SQL database with support for ACID-compliant transactional capabilities for cloud computing environments, while New Technology/enterprise’s CloudTran, is designed to bring enterprise-level transaction management to GigaSpaces’ XAP in-memory data grid for on-premises deployment, and eventually any PaaS offering.

Meanwhile we are intrigued by VMware’s acquisiton of distributed data management vendor GemStone and its positioning of GemFire as a next-generation data management layer for cloud applications, as well as the forthcoming introduction of SQL querying in GigaSpaces’ eXtreme Application Platform (XAP), which will enable in-memory management of relational data and initiatives.

It is very early stages for all these vendors, and they have yet to prove that they have truly solved the problem of consistency and partition tolerance. In the meantime there are plenty of other contenders waiting in line.

Akiban is promising that it has the secret to SQL scalability with an approach that pre-groups data in order to overcome latency, caching and data distribution issues. Another company currently in stealth mode is JustOne Database which is working on perfecting a new storage model in order to deliver the performance and scalability required to support transactions and analytics on the same data simultaneously.

That is also the goal of Tokutek, which offers the TokuDB MySQL storage engine is based on Fractal Tree indexing technology designed to reduce data-insertion times and improve the performance of MySQL for both read and write applications.

JustOne and Tokutek are part of a slightly different set of vendors we are viewing under the scalable SQL umbrella: those that promise to improve performance for appropriate workloads to the extent that the advanced scale-out capabilities promised by some NoSQL databases become irrelevant.

While we’re on the subject of existing database vendors that could be considered part of the scalable SQL set, it is also worth mentioning MarkLogic. The company has recently been| associating itself with NoSQL and while the fact that it does not support SQL makes it a better literal fit with NoSQL the company’s support for ACID means that we would see it as an option for customers looking to improve performance without losing consistency, especially for unstructured or semi-structured data.*

As we previously noted; to some degree, the rise of NoSQL has resulted from the inability of the MySQL database to scale consistently. It is no surprise to see many of the scalable SQL vendors promising to improve the performance and scalability of MySQL, therefore, while others promote a clean-slate approach to address new big data management problems.

We have more details on each of the products and projects, mentioned above (as well as some not mentioned) their potential use cases, how they relate to MySQL, and what potential impact they may have on the adoption of NoSQL technologies, in the full report.

This is very much the start of our coverage of these vendors however. Expect more coverage in the near future, as well as a wider perspective on the potential for alternatives to the incumbent database suppliers, into 2011.

*Additionally, since the absence of SQL is only really tangential to many of the projects and products referred to as NoSQL it seems to me to be appropriate to have a database that does not support SQL in the scalable SQL category.