Jaspersoft has released the results of its latest Big Data Survey and was good enough to share a few additional details with us. It makes for interesting reading.
The first thing to take into account is the sample bias. The survey was conducted with over 600 Jaspersoft community members. 63% of respondents are application developers, and 37% are in the software and Internet industry.
This already speaks volumes about the sectors with interest in big data, and it is interesting to compare the state of big data adoption with the recent results of 451 Research’s TheInfoPro storage study, which is conducted with storage professionals.
According to that study, 24% of storage respondents had already implemented solutions for big data, while 56% had no plans. As you might expect, Jaspersoft’s sample was keener, with 36% having a big data project already deployed or in development, and 38% having no plans.
That’s still a sizeable proportion of respondents with no plans to adopt a big data analytics project, however. The biggest reasons given for not adopting were a reliance on structured data (37%) and no clear understanding of what ‘big data’ is (35%).
Sceptics might suggest that the respondents to Jaspersoft’s survey who do have plans for big data are also somewhat confused about what constitutes a big data project.
Certainly they are using some fairly traditional technologies and approaches. Looking at the most popular answers to a range of questions, we find that those with big data plans are:
- creating reports (76%)
- to analyze customer experience (48%)
- based on data from enterprise applications (79%)
- stored on relational databases (60%)
- processed using ETL (59%)
- running on-premises (60%)
So far, so what? The characteristics above could be used to describe many existing business intelligence projects.
It’s not even as if respondents are looking at huge volumes of data, with 38% expecting total data volume to be in the gigabytes, 40% expecting terabytes, and just 10% expecting petabytes and above.
So what makes these big data projects? It’s not until you look at the source of the data that you get any sense that the respondents with ongoing big data projects are doing anything different from those without: 68% are using machine-generated content (web logs, sensor data) as a source for their big data projects, and 46% are using human-generated text (social media, blogs).
The results do suggest that some non-traditional analytics and data processing approaches are gaining ground, with 64% citing the importance of data visualization, 54% statistical/predictive analytics, 50% search, and 45% text analytics. However, just 18% are using Hadoop HDFS at this point (behind MongoDB with 19%).