7 Hadoop questions. Q3: Hadoop responsibility

Who is responsible for managing Hadoop clusters within your organisation? That’s one of the primary questions being asked in the 451 Research 2013 Hadoop survey.

While many organisations are adopting or evaluating Hadoop, our research to date indicates that much of that adoption is tactical rather than strategic at this stage.

As such, much of the adoption is led by functional business units (particularly the marketing department) rather than the central IT organisation.

However, as adoption of Hadoop grows, we increasingly see central IT departments looking to standardise their approach to Hadoop with reference configurations and strategic relationships with key suppliers.

The early responses to our Hadoop survey are therefore interesting: so far a higher proportion of respondents indicate that the central IT department is (or will be in the case of those in the process of adoption) responsible for managing Hadoop clusters within their organisation.

This is quite surprising to us. We’ll have to dig into the results to evaluate them properly once the survey closes. Interestingly, though, the early results show that the proportion indicating that the central IT department is, or will be, responsible for managing Hadoop is significantly higher among respondents that are still evaluating and deploying Hadoop than among those that already have deployments up and running.

This would suggest that the next generation of Hadoop adopters are taking a more strategic view of the data processing platform.

To give your view on this and other questions related to the adoption of Hadoop, please take our 451 Research 2013 Hadoop survey.

7 Hadoop questions. Q2: Hadoop infrastructure choices

What is your preferred infrastructure for Hadoop deployments? That’s one of the primary questions being asked in the 451 Research 2013 Hadoop survey. The answer will have significant implications for the future direction of Hadoop.

While one of the primary benefits of Hadoop – low cost data storage – means that for many organisations the primary infrastructure for Hadoop has been commodity hardware, many systems and storage vendors now offer their own dedicated appliances and/or reference architecture for Hadoop.

We expect to see more of these dedicated Hadoop configurations as the incumbent infrastructure vendors look to cash in on Hadoop adoption and try to add greater value.

We also see some companies exploring the potential for Hadoop in the cloud, as well as hosted deployments and virtual infrastructure, although those are arguably in the early stages of technical maturity and adoption.

Which infrastructure configurations are most popular? That’s one of the things our survey is designed to find out. The early results perhaps unsurprisingly indicate a greater preference for Hadoop being deployed on commodity hardware. However, cloud and virtual deployments have also scored well.

Interestingly, the early results show the preference for Hadoop on cloud infrastructure is significantly higher among respondents that are still in the development and test stage with Hadoop, which supports our anecdotal evidence about the use-cases for Hadoop in the cloud.

In order to get a little more detail on deployment preferences, the survey also asks about the level of consideration, testing and adoption for dedicated Hadoop hardware and Hadoop-as-a-service offerings respectively.

Among the choices in the dedicated hardware category are offerings from DataDirect Networks, Dell, HP, Oracle, IBM, Pivotal, Teradata, Cisco and NetApp.

The choices in the Hadoop-as-a-service category include Altiscale, Amazon EMR (including MapR), MapR on Google Compute Engine, Microsoft Windows Azure HDInsight Service, Mortar Data, Qubole, Rackspace Big Data, SunGard Unified Analytics Services and Treasure Data.

To give your view on this and other questions related to the adoption of Hadoop, please take our 451 Research 2013 Hadoop survey.

The Data Day, A few days: September 7-13 2013

Google confirms move to MariaDB. SAP acquires KXEN. And more.

And that’s the data day, today.

7 Hadoop questions. Q1: Hadoop and the data warehouse

What is the relationship between Hadoop and the data warehouse? That’s one of the primary questions being asked in the 451 Research 2013 Hadoop survey. Through our conversations with Hadoop users to date we’ve seen that the answer to that question differs from company to company, depending on how far advanced they are in terms of their adoption.

For the most part we see that Hadoop is being used for workloads that were not previously on the data warehouse, as part of a strategy of storing, processing and analyzing data that was previously ignored due to being unsuitable – either in terms of cost or data format – for analysis using a relational data warehouse.

However, we also see some companies taking advantage of the cost advantages of storing data in Hadoop to offload workloads from the data warehouse, either temporarily or permanently.

And at the other end of the spectrum we also see companies in which Hadoop is being used, or at least considered at this stage, as a replacement for the data warehouse.

Which use-cases are most popular? That’s one of the things our survey is designed to find out. The early results indicate a greater preference for Hadoop being used for workloads that were not previously on the data warehouse and also Hadoop being used to permanently migrate some workloads from the data warehouse, but it is still early stages.

While that accounts for the way in which Hadoop is being used today, it doesn’t get to the heart of the long-term potential for Hadoop in relation to the data warehouse. Therefore, the survey also asks about the long-term potential to replace the data warehouse.

Again we see a spectrum of strategies in action, from some companies planning for Hadoop to eventually completely replace the data warehouse, through some moving the majority of workloads to Hadoop, through others moving a minority of workloads to Hadoop, to those that believe Hadoop will never replace the data warehouse.

Again the early survey results are interesting, with ‘a minority of workloads will move to Hadoop’ and ‘Hadoop will never replace the data warehouse’ the most popular answers at this early stage.

To give your view on this and other questions related to the adoption of Hadoop, please take our 451 Research 2013 Hadoop survey.

The Data Day, A few days: September 2-6 2013

Where database startups go to die. And more.

And that’s the data day, today.

451 Research Hadoop survey is now live

If you’re using or considering using Hadoop, please help shape our understanding of global Hadoop usage by taking our 2013 Hadoop survey, which can be found at http://www.surveymonkey.com/s/451Hadoop

The aim of this survey is to identify trends in Hadoop usage, as well as attitudes to Hadoop as it relates to data warehousing.

There are a minimum of 15 questions to answer, and a maximum of 24 (including three optional questions) depending on your organisation’s level of adoption, and the entire survey should take no longer than fifteen minutes to complete.

Some of the specific aspects covered by the survey are:

  • Current and planned Hadoop usage
  • Responsibility for managing Hadoop clusters
  • Preferred infrastructure for Hadoop deployments
  • Hadoop and the data warehouse
  • Potential Hadoop improvements
  • Hadoop-as-a-Service
  • Hadoop hardware
  • Alternative file systems
  • SQL-on/in-Hadoop

All individual responses are of course confidential. The results will be published as part of a major research report due during Q4 which will include market sizing estimates for the analytic database sector, as well as Hadoop. The full report will be available to 451 Research clients, while the results of the survey will also be made freely available.

Thank you in advance for your participation.

http://www.surveymonkey.com/s/451Hadoop

The Data Day, A few days: August 23-30 2013

Couchbase raises $25m. 10gen becomes MongoDB. And more.

And that’s the data day, today.

The Data Day, A few days: August 1-7 2013

MySQL, NoSQL, NewSQL, DBaaS market sizing. And more.

The Data Day, A few days: July 24-31 2013

Next-Gen DB market sizing. Total Data Integration. And more.

And that’s the data day, today.

Forthcoming Webinar: Get Down to Serious Business with Hadoop

On Wednesday, July 17, at 11:00am ET / 8:00am PT, I’ll be taking part in a webinar in association with MarkLogic on the subject of Hadoop.

As we’ve stated a few times, we believe that the flexibility of Apache Hadoop is one of its biggest assets – enabling organizations to generate value from data that was previously considered too expensive to be stored and processed in traditional databases – but it also results in “Hadoop” meaning different things to different people.

The result is that organizations still struggle over which Hadoop ecosystem components to adopt in order to obtain the greatest value, which application workloads might be suitable for deployment on Hadoop, and how to deploy Hadoop in conjunction with existing relational and non-relational databases.

On the webinar I’ll be providing an overview of the current state of the Hadoop ecosystem, geographic adoption and use cases, while MarkLogic’s Director of Product Management, Justin Makeig, will provide an introduction to complementary technology from MarkLogic that can help your organization achieve real-time analysis, transactional data updates, integrity, granular security, and full-text search.

For full details, and to register, click here.