451 Research Hadoop survey is now live

If you’re using or considering using Hadoop, please help shape our understanding of global Hadoop usage by taking our 2013 Hadoop survey, which can be found at http://www.surveymonkey.com/s/451Hadoop

The aim of this survey is to identify trends in Hadoop usage, as well as attitudes to Hadoop as it relates to data warehousing.

There are a minimum of 15 questions to answer, and a maximum of 24 (including three optional questions) depending on your organisation’s level of adoption, and the entire survey should take no longer than fifteen minutes to complete.

Some of the specific aspects covered by the survey are:

  • Current and planned Hadoop usage
  • Responsibility for managing Hadoop clusters
  • Preferred infrastructure for Hadoop deployments
  • Hadoop and the data warehouse
  • Potential Hadoop improvements
  • Hadoop-as-a-Service
  • Hadoop hardware
  • Alternative file systems
  • SQL-on/in-Hadoop

All individual responses are of course confidential. The results will be published as part of a major research report due during Q4 which will include market sizing estimates for the analytic database sector, as well as Hadoop. The full report will be available to 451 Research clients, while the results of the survey will also be made freely available.

Thank you in advance for your participation.


CAOS Theory Podcast 2012.08.17

Topics for this podcast:

*Red Hat puts enterprise cred and bet on OpenStack
*LexisNexis touts open source benefits of Hadoop alternative
*Who doesn’t love Hadoop?
*Proprietary vendors siding with open source
*PostgreSQL and its cloud, commercial opportunity
*Our Hosting and Cloud Transformation Summit NA event

iTunes or direct download (32:24, 5.8MB)

Who doesn’t love Hadoop?

I tweeted recently that I had received a query from a journalist about whether Hadoop needs to go closed source to be fit for the enterprise.

Now that the resulting report has been published we can see who was behind that suggestion, with Brian Christian, Zettaset chief technology officer, arguing that “The community serves its needs, not the needs of the enterprise.”

The report also includes some, although naturally not all, of the response I provided to this suggestion, and since the report leaves a few misconceptions unanswered I thought I’d publish my more detailed response.

Hadoop is ‘free like a puppy’
Hadoop currently requires a degree of expertise to configure, manage and operate, but that statement is true for any serious data management technology. Apache Hadoop is relatively immature compared to some other established data management technologies, particularly in areas such as high availability, security and manageability. However, the development community is well aware of its shortcomings and advances in all areas are currently in early access and should be ready for production deployment later this year.

Hadoop does require a degree of expertise to operate, and that expertise is currently at a premium and comes at a cost. However, all the major Hadoop supporters are working to train up a larger pool of Hadoop developers and administrators. Cloudera alone has trained more than 12,000 people to use Hadoop.

Apache Hadoop is a complex combination of data management technologies and is not without its challenges, which have arguably led to some enterprises taking longer to move from development and testing to deployment than they might have initially expected. However, the Hadoop development community is clearly committed to making Hadoop more suitable for enterprise adoption.

Hadoop is ‘driven by enthusiasts’
The idea that the open source community is populated by individual developers with no concern for enterprise requirements is completely bogus. The Apache Software Foundation has a proven history of developing enterprise-grade software projects through a collaborative development process that combines vendors, users and other interested parties.

The biggest contributors to Apache Hadoop include vendors such as Hortonworks, Cloudera, MapR and IBM, all of which have a vested interest in driving greater enterprise adoption, as well as users such as Yahoo, Facebook and eBay, all of which stand to gain from its improved capabilities.

On a broader note, open source development in general has a proven track record of producing enterprise-grade software. You only have to look at the success of Linux to see how rapidly open source software can be adopted by enterprises once it reaches a suitable level of maturity and has the support of commercial vendors. Hadoop is no exception, and is likely to follow in the footsteps of Linux as it matures.

Additionally, we see the open source nature of Hadoop as one of the adoption drivers – as users know that they can avoid vendor lock-in and have a choice of providers for their Hadoop training, support and services.

Hadoop may need to be ‘taken out of open source’
There is no reason to believe that a closed source Hadoop would deliver any functionality that could not be developed by the Apache Hadoop community. While a number of vendors offer closed source alternatives for individual components in the Hadoop stack, anyone offering a fully closed source alternative would suffer by not being able to compete with the collaborative development process and competitive commercial ecosystem that the open source development process enables.

In addition, it is worth noting that Hadoop, along with other distributed data management projects including many of the NoSQL databases, was initiated by organizations like Google, Amazon and Yahoo in response to the inability of the established data management vendors to fulfil their data management requirements.

The established closed source data management vendors have had plenty of time to develop a ‘better’ Hadoop than Hadoop, and do not lack development resources, but have chosen to collaborate with Hadoop distributors and contribute to Hadoop instead.

A prime example is Microsoft, which in late 2011 abandoned its own Dryad distributed computing project in favour of contributing to Apache Hadoop. This is a sign that Hadoop has already won enough attention to make it difficult for any competing product to gain traction.

While we see vendors offering closed source alternatives for individual components in the Hadoop stack, we do not believe that a full closed source alternative would be viable, or desirable from a customer’s perspective. There is no reason to believe that enterprise-grade improvements to Hadoop cannot be delivered by the Apache Hadoop community and the open source development process.

CAOS Theory Podcast 2012.06.22

Topics for this podcast:

*Sauce Labs grows with fast Selenium application testing
*MySQL, NoSQL, NewSQL survey results and analysis
*Microsoft’s Linux love leaves out Red Hat
*Hadoop roundup with Cloudera, Hortonworks and VMware
*2012 Future of Open Source Survey highlights

iTunes or direct download (28:28, 5.1MB)

Open APIs are the new open source

We’ve seen the rise of open source software in the enterprise and also beyond the IT industry, but the real keys to openness and its advantages in today’s technology world — where efficient use of cloud computing and supporting services are paramount — exist in open application programming interfaces, or APIs.

Open source software continues to be a critical part of software development, systems administration, IT operations and more, but much of the action in leveraging modern cloud computing and services-based infrastructures centers on APIs. Open APIs are the new open source.

Read the full story at LinuxInsider.

451 CAOS Links 2011.12.02

Talend delivers v5. Zentyal raises series A. The TCO of OSS. And more.

# Talend announced version 5 of its data integration suite, adding business process management capabilities via an OEM relationship with BonitaSoft. Yves De Montcheuil explained the name changes in version 5.

# Zentyal closed a series A venture capital funding round of over $1m from Open Ocean Capital.

# The London School of Economics released a report on the total cost of ownership of open source software.

# Couchbase announced the availability of the Couchbase Hadoop Connector, developed in conjunction with Cloudera.

# Rackspace announced the private beta of Rackspace MySQL Cloud Database.

# The debate over the role of open source foundations in the Git era continued, including a follow-up by the instigator, Mikeal Rogers, a rallying cry for autonomy from Ceki Gülcü, and Simon Phipps warning about throwing the baby out with the bathwater.

# Marco Abis is stepping down as CEO of Sourcesense.

# NGINX usage has grown almost 300% over the last year, according to Netcraft figures discussed by Royal Pingdom.

# The Wireless Innovation Forum announced the formation of the Open Source Framework for Commercial Baseband Software project.

451 CAOS Links 2011.11.29

Software foundations in the Git era. New funding for Puppet Labs. And more.

# Mikeal Rogers’ post on the Apache Software Foundation’s slow response to the Git era prompted significant discussion, from Mike Milinkovich, Bradley M. Kuhn, Stephen Walli, Stephen O’Grady, Simon Phipps, and the ASF’s Jim Jagielski. Alternatively, you could just read this tweet.

# Puppet Labs raised $8.5m in series C funding from Cisco, Google Ventures, and VMware as well as Kleiner Perkins, True Ventures, and Radar Partners.

# YaCy, a free distributed search engine, was launched.

# Alex Pinchev, Red Hat’s Executive Vice President of Sales, Services & Field Marketing, will be stepping down in January to become the chief executive officer of a data protection software company.

# Tasktop Technologies announced Tasktop Sync 2.0.

# Interesting statistics on Apache Hadoop adoption based on LinkedIn data, from NC State University’s Institute for Advanced Analytics.

451 CAOS Links 2011.11.18

Rapid7 secures new funding. Microsoft drops Dryad. And more.

# Rapid7 secured $50m in series C funding.

# Microsoft confirmed that it is ditching its Dryad project in favour of Apache Hadoop.

# Arun Murthy provided more details of Apache Hadoop 0.23.

# The Google Plugin for Eclipse and GWT Designer projects are now fully open source.

# openSUSE released version 12.1.

# Amazon released the source code of the Kindle Fire.

# Black Duck Software joined the GENIVI Alliance.

# dotCloud announced the availability of the top three databases MySQL, MongoDB and Redis on its PaaS.

451 CAOS Links 2011.11.15

Funding for Vyatta and Hortonworks. Ice Cream Sandwich source code. And more.

# Vyatta raised $12m in new funding from HighBAR Partners and existing investors JPMorgan, Arrowpath Venture Partners and Citrix Systems.

# Index Ventures announced that it has invested in Hortonworks, reportedly as part of a substantial B round.

# Google released the source code to Ice Cream Sandwich.

# SugarCRM announced billings growth of 69% in Q3.

# Apache Hadoop 0.23 has been released.

# Revolution Analytics announced the general availability of Revolution R Enterprise 5.0.

# Adobe and the Spoon Foundation are working together to donate the Flex SDK to an established open source foundation.

# Glyn Moody explained why Barnes & Noble is an open source hero.

# Red Hat added support for Jenkins, Maven and integration with JBoss Tools to its OpenShift Platform-as-a-Service.

# Zend Technologies announced the general availability of Zend Studio 9.0.

# WSO2 updated both the WSO2 Carbon enterprise middleware platform and WSO2 Stratos cloud middleware platform.

# Mozilla published Mozilla Public License Version 2.0, Release Candidate 2.

# DigitalPersona open sourced its new FingerJetFX fingerprint feature extraction technology.

# AquaFold launched AquaClusters.com, a new social collaboration tool for software developers that is free for open source developers.

# Xyratex joined Open Scalable File Systems (OpenSFS) as a formal member.

VC funding for Hadoop and NoSQL tops $350m

451 Research has today published a report looking at the funding being invested in Apache Hadoop- and NoSQL database-related vendors. The full report is available to clients, but non-clients can find a snapshot of the report, along with a graphic representation of the recent up-tick in funding, over at our Too Much Information blog.

CAOS Theory Podcast 2011.11.11

Topics for this podcast:

*Continuent extends MySQL replication to Oracle Database
*CFEngine updates server automation software
*Devops moving mainstream
*Neo Technology integrates with Spring
*451 CAOS report from Hadoop World

iTunes or direct download (26:56, 4.6MB)

451 CAOS Links 2011.11.11

B&N asks DoJ to investigate Microsoft patent tactics. Fedora 16. And more.

# Barnes & Noble asked the U.S. Department of Justice to investigate Microsoft’s patent-licensing tactics.

# The team behind Strobe is moving to Facebook. Sproutcore will continue as an independent project.

# The UK government’s Cabinet Office dispelled concerns about the security of open source software.

# The Fedora Project announced the availability of Fedora 16.

# Google offered support to Android firms in lawsuits.

# HStreaming updated its scalable continuous data analytics platform built on Hadoop.

# Dell is releasing its Apache Hadoop Crowbar barclamps as open source software.

# ActiveState added new management and monitoring features to ActiveState Stackato.

# Talend provided information on all contributions made by Talend to open source community projects.

# StackIQ announced the availability of Rocks+ 6.

451 CAOS Links 2011.10.21

Google unwraps Ice Cream Sandwich. Source code to follow. And more.

# Google and Samsung unveiled Galaxy Nexus, the first phone designed for Android 4.0, also known as Ice Cream Sandwich.

# Meanwhile Google indicated that it plans to publish the Ice Cream Sandwich source code soon after it is available on devices.

# BonitaSoft announced that it has surpassed one million downloads and now has more than 250 customers.

# Gemini Technologies joined the OpenStack community, bringing its Amazon S3 compatibility, provisioning and billing APIs to OpenStack.

# Canonical re-aligned its corporate and professional services.

# The Document Foundation announced the preliminary results of its board election.

# Cloudera released CDH3 update 2, adding Apache Mahout to its Cloudera Distribution Including Apache Hadoop.

# Cloudera also announced the new Cloudera University brand for its training and certification programs.

# Zend Technologies announced phpcloud.com and a partnership with 10gen including the integration of the MongoDB PHP driver with Zend Server.

# Hadapt reportedly closed an $8m series A financing round – or is that $9.5m?

# Bacula Systems announced the availability of its Linux bare metal restore feature.

# Virtustream added support for Red Hat Enterprise Virtualization to its xStream cloud platform.

# The Outercurve Foundation announced the acceptance of the .Net Bio project into the Research Accelerators Gallery.

# ForgeRock announced a partnership with Radiant Logic to join RadiantOne’s Virtual Directory Server and OpenAM.

# OStatic published an introduction to Amdatu, an open cloud platform powered by Apache.

# Talend announced an expanded OEM partner program.

451 CAOS Links 2011.10.18

DOCOMO adopts, invests in Couchbase. Apache Cassandra reaches 1.0. And more.

# DOCOMO Innovations adopted Couchbase as DOCOMO Capital invested in the NoSQL database vendor.

# The Apache Software Foundation announced Apache Cassandra v1.0.

# Nuxeo announced the availability of Nuxeo Cloud.

# SGI formed a distribution relationship with Cloudera and announced a record-breaking performance benchmark.

# Rapid7 announced the launch of Metasploit Community Edition.

# VoltDB announced the general availability of VoltOne.

# Juniper Networks licensed OpenNMS to add fault and performance management capabilities to the Junos Space software platform.

# The Free Software Foundation warned against Microsoft’s “Secure Boot” system.

CAOS Theory Podcast 2011.10.14

Topics for this podcast:

*Our latest special report on ‘The Changing Linux Landscape’
*Oracle’s Hadoop-based appliance and big-data strategy
*Rackspace’s plan for the OpenStack Foundation
*2011 Q3 funding for open source companies
*Red Hat buys open source storage player Gluster

iTunes or direct download (27:38, 4.7MB)

451 CAOS Links 2011.10.14

Dennis Ritchie RIP. Microsoft adopts Hadoop. And more.

# Dennis Ritchie, creator of C and co-creator of Unix, died aged 70. This article from Joe Brockmeier puts his influence into perspective.

# Microsoft announced plans to team up with Hortonworks and the Apache Hadoop community to create a distribution of Hadoop for Windows Server and Windows Azure.

# Hortonworks explained why it decided to work with Microsoft to support its Hadoop plans.

# Black Duck Software closed a $12m round of financing led by new investor Split Rock Partners.

# OpenOffice.org e.V. pleaded for financial support for the OpenOffice.org project, prompting a statement of clarification from the Apache Software Foundation.

# Microsoft noted that The Advanced Message Queuing Protocol (AMQP) Working Group confirmed the availability of the AMQP 1.0 specification. Red Hat confirmed its support.

# Red Hat updated its JBoss Enterprise SOA Platform, JBoss Enterprise Data Services Platform and JBoss Enterprise Business Rules Management System product lines.

# Cloudera announced an integration partnership with MicroStrategy.

# Monsanto is creating its data integration and visualization platform based on the Cloudant suite.

# Samba can now accept code from corporations.

# VMware Micro Cloud Factory now includes PostgreSQL and RabbitMQ.

# Univa announced StackIQ will market, sell and support Univa Grid Engine to its customer and reseller channels.

# Openwave Systems is going to integrate Open-Xchange’s email technology into the Openwave Rich Mail product.

# X.commerce, a new business at eBay combining PayPal and Magento, joined the OpenStack community.

451 CAOS Links 2011.10.07

OpenStack Foundation. New Pentaho CEO. And more.

# Rackspace announced its intention to form an independent OpenStack Foundation.

# HP has chosen Ubuntu as the lead host and guest operating system for its Public Cloud.

# Pentaho appointed Quentin Gallivan as its new CEO.

# Hortonworks continued the discussion about contributions to Apache Hadoop.

# Bob Bickel explained why CloudBees is not, itself, open source.

# Google announced the limited preview release of Google Cloud SQL.

# Eucalyptus Systems, Nebula and Virtual Bridges joined the Linux Foundation.

# Dave Neary discussed the different types of community in relation to the Tizen project.

# Akamai joined the OpenStack community.

# Daniel Abadi provided his perspective on Oracle’s NoSQL Database.

# One more thing…
Apple’s relationship with open source may be somewhat tenuous – Paul Rooney provides some background – but given the impact Steve Jobs has made on the industry as a whole it seems wrong not to mark his passing in some way. We’ll leave the words to the company he created.

451 CAOS Links 2011.09.23

Red Hat revenue up 28% in Q2. Funding for NoSQL vendors. And more.

# Red Hat reported net income of $40m in the second quarter on revenue up 28% to $281.3m.

# 10gen raised $20m in funding, and DataStax closed an $11m series B round while also releasing its DataStax Enterprise and Community products. Additionally, Neo Technology raised $10.6m in series A funding.

# Oracle announced the addition of new extended capabilities in MySQL Enterprise Edition. The move confirmed the adoption of the open core licensing strategy, and was both welcomed and derided.

# BonitaSoft announced an $11m series B funding round.

# Platfora raised $5.7m in series A funding to accelerate development of its BI and analytics platform for data stored in Hadoop.

# EMC launched its EMC Greenplum Modular Data Computing Appliance, which includes both the Greenplum Database and Greenplum HD (Hadoop), and introduced the Greenplum Analytics Workbench, a test bed cluster for integration testing Apache Hadoop.

# Oracle acquired GoAhead Software, which offers a commercial distribution of OpenSAF.

# Ingres changed its name to Actian and launched its Action Apps and Cloud Action Platform.

# Richard Stallman asked ‘Is Android really free software?’. Predictably enough the answer is ‘no’. Carlo Daffara called FUD.

# LexisNexis Risk Solutions’ HPCC Systems released the source code for its HPCC Systems platform, and introduced a covenant to keep contributed code open source for three years.

# OpenStack released Diablo, the fourth version of its open source cloud software.

# The PostgreSQL Global Development Group announced the release of PostgreSQL 9.1.

# VoltDB announced the general availability of VoltDB version 2.0.

# Samsung is reportedly planning to release its Bada mobile operating system under an open source license.

# Karmasphere updated its Karmasphere Analyst Big Data analytics product with new workflow capabilities for Apache Hadoop.

# The Open Virtualization Alliance now has more than 200 members.

# The Outercurve Foundation announced the acceptance of the GADS open source project into its Data, Language and System Interoperability Gallery.

# Openbravo announced that customer deployments of its ERP product on Amazon have increased over 187% in the last 12 months.

# The Apache Software Foundation confirmed Apache Whirr as a top-level project.

# Qt gained more independence from Nokia.

# SUSE Linux Enterprise Server has been selected for use with SAP HANA.

# Red Hat Enterprise Linux 6 was certified by SAP to run SAP business applications, with support also extended to SAP running on Red Hat Enterprise Linux on Amazon EC2.

# 10gen’s MongoDB was chosen by SAP as a core component of SAP’s platform-as-a-service (PaaS) offering.

# Puppet Labs announced Puppet Enterprise 2.0.

# Microsoft added Casio to its list of Linux-related patent agreement signees.

# Dries Buytaert explained why Acquia acquired Cyrve and GVS and addressed concern that Acquia is sucking up all the Drupal talent.

# Medsphere Systems announced the general availability of the enhanced OpenVista electronic health record (EHR) platform.

# Stormy Peters asked whether open source is excluding high context cultures.

# OpenIndiana’s fork of OpenSolaris added support for the Illumos kernel.

# Cenatic released the results of its research into public administration involvement in open source communities.

# Spring Roo is shifting to be 100% Apache licensed.

# VLC developers are looking for anyone who has contributed to libVLC so that they can approve the change in licence from GPLv2 to LGPLv2.

# Virtual Bridges joined OpenStack.

# GitHub now has over one million users.

# Splunk open sourced the code for docs.splunk.com.

451 CAOS Links 2011.09.02

Strobe launches App Delivery Network. Red Hat launches Aeolus. And more.

# Strobe launched a new platform that helps developers build HTML5-based Web applications for desktops, smartphones and tablets.

# Red Hat launched Aeolus, an open source project for managing virtual machines across private and public clouds.

# Jedox introduced Community Edition 3.2 of its Palo open source business intelligence software.

# Eric Baldeschwieler blogged about best practices for selecting Apache Hadoop hardware.

# Dave Neary discussed the cost of going it alone in modifying and maintaining free software.

# The H published a conversation with Jeremy Allison of Samba.

451 CAOS Links 2011.08.23

Engine Yard acquires Orchestra. Red Hat considers NoSQL move. And more.

# Engine Yard announced a definitive agreement to acquire Orchestra, bringing PHP expertise to the Engine Yard platform.

# Red Hat’s CEO indicated the company is interested in a NoSQL or Hadoop acquisition.

# Gluster announced Apache Hadoop compatibility in the next GlusterFS release.

# Microsoft signed an agreement with China Standard Software Co (CS2C) to support CS2C NeoKylin Linux Server running on Microsoft’s Hyper-V.

# Mitchell Baker kicked off a discussion on Mozilla’s future.

# Hortonworks announced that the next generation of Apache Hadoop MapReduce has been merged to the Apache Hadoop mainline.

# Rapid7 offered a $100,000 investment fund to support the development of up to seven promising open source projects in the security industry.

# ReadWrite Enterprise examined the changing Linux landscape, covering Jay Lyman’s recent LinuxCon session.

# CloudBees announced a partnership with MongoHQ, a provider of cloud-based MongoDB data hosting and services.

# Twitter announced Bootstrap, a front-end toolkit for rapidly developing web applications.

# The UK Cabinet Office began building its open source strategy on proprietary software.