The Data Day, A few days: March 25-28 2013

Google pledges patent support for OSS. Basho open sources Riak CS. And more

And that’s the data day, today.

The Data Day, Two days: August 9/10 2012

HP’s Autonomy problem. Excel 2013. And more.

And that’s the Data Day, today.

The distribution of Hadoop and MapReduce skills, according to LinkedIn

Back in December we published an assessment on the distribution of Hadoop skills, based on LinkedIn search results.

Thanks to a temporary lull in GB’s Olympic medal-winning exploits I ran the search again the other day. The results are pretty interesting for a number of reasons.

First the headline stats:

  • There are over 22,000 people with Hadoop in their LinkedIn profiles, up from just over 9,000 in December 2011, an increase of over 144% in eight months.
  • Hadoop skills are becoming more evenly distributed: 60.4% of LinkedIn members with Hadoop in the profiles are based in the US, compared to 64.5% in December 2011.
  • Growth areas include India (11.9% from 9.7%), China (4.4% from 3.6%, and the UK (3.4% from 3.0%).
  • However, the Bay Area remains the best place to find Hadoop enthusiasts, with 24.9% (albeit down from 28.2% eight months ago)

This time I also ran a similar search for MapReduce skills. The headline results:

  • MapReduce is mentioned in 6,424 LinkedIn profiles.
  • MapReduce skills are more evenly distributed: 61.9% of LinkedIn members with MapReduce in their profiles are based in the US, 38.1% in the rest of the world.
  • That said, over a quarter (25.9%) of LinkedIn members with MapReduce in their profiles are based in the Bay area.
  • Other geographic hotspots are India (8.7%), Seattle area (5.8%), and NYC area (4.8%).

This time I also looked at which vendors are listed as the current employers of LinkedIn members citing Hadoop and MapReduce. One really surprising result stood out:

  • Microsoft is the second largest employer of both Hadoop and MapReduce skills, according to LinkedIn member profiles
  • Redmond employs 1.7% of all LinkedIn members with Hadoop in their profiles, and 3.0% of members with MapReduce in their profiles
  • Yahoo is the largest employer of Hadoop skills (2.9%), with other ‘non-vendors’ also well represented: Google (1.3%), eBay (also 1.3%), Amazon (1.2%), LinkedIn (1.1%).
  • Google is by far the largest employer of MapReduce skills, as you might expect, with 7.1%. Also well represented are Yahoo (2.7%), Amazon (1.8%) and LinkedIn (1.4%).


The Data Day, Two days: August 2/3 2012

StormDB looks to define database cloud. YARN becomes Hadoop sub-project. And more.

And that’s the Data Day, today.

The Data Day, Today: July 20 2012

StreamBase LiveView. ADVIZOR acquirers. The Accumulo Challenge. And more.

And that’s the Data Day, today.

Top Issues IT faces with Hadoop MapReduce: a Webinar with Platform Computing

Next Tuesday, August 3, at 8.30 AM PDT I’ll be taking part in a Webinar with Platform Computing to discuss the the benefits and challenges of Hadoop and MapReduce. Here’s the details:

With the explosion of data in the enterprise, especially unstructured data which constitutes about 80% of the total data in the enterprise, new tools and techniques are needed for business intelligence and big data processing. Apache Hadoop MapReduce is fast becoming the preferred solution for the analysis and processing of this data.

The speakers will address the issues facing enterprises deploying open source solutions. They will provide an overview of the solutions available for Big Data, discuss best practices, lessons learned, case studies and actionable plans to move your project forward.

To register for the event please visit the registration page.

Who is hiring Hadoop and MapReduce skills?

Continuing my recent exploration of Indeed.com’s job posting trends and data I have recently been taking a look at which organizations (excluding recruitment firms) are hiring Hadoop and MapReduce skills. The results are pretty interesting.

When it comes to who is hiring Hadoop skills, the answer, put simply, is Amazon, or more generally new media:


Source: Indeed.com Correct as of August 2, 2011

This is indicative of the early stage of adoption, and perhaps reflects the fact that many new media Hadoop adopters have chosen to self-support rather than turn to the Hadoop support providers/distributors.

It is no surprise to see those vendors also listed as they look to staff up to meet the expected levels of enterprise adoption (and it is worth noting that Amazon could also be included in the vendors category, given its Elastic MapReduce service).

Fascinating to see that of the vendors, VMware currently has the most job postings on Indeed.com referencing Hadoop, while Microsoft also makes an appearance.

Meanwhile the appearance of Northrop Grumman and Sears Holdings on this list indicates the potential for adoption in more traditional data management adopters, such as government and retail.

It is interesting to compare the results for Hadoop job postings with those mentioning Teradata, which shows a much more varied selection of retail, health, telecoms, and financial services providers, as well as systems integrators, government contractors, new media and vendors.

It is also interesting to compare Hadoop-related bog postings with those specifying MapReduce skills. There are a lot less of them, for a start, and while new media companies are well-represented, there is much greater interest from government contractors.


Source: Indeed.com Correct as of August 2, 2011

Google Trends: Hadoop versus Big Data versus MapReduce

I was just looking through some slides handed to me by Cloudera’s John Kreisa and Mike Olson during our meeting last week and one of them jumped out at me.

It contains a graphic showing a Google Trends result for searches for Hadoop and “Big Data”. See for yourself why the graphic stood out:

In case you hadn’t guessed, Hadoop is in blue and “Big Data” is in red.

Even taken with a pinch of salt it’s a huge validation of the level of interest in Hadoop. Removing the quotes to search for Big Data (red) doesn’t change the overall picture.

See also Hadoop (blue) versus MapReduce (red).

UPDATE: eBay’s Oliver Ratzesberger puts the comparisons above in perspective somewhat by comparing Joomla vx Hadoop.