The Top 5 Hadoop Distributions
A new report by Forrester Research’s big data analysts says that adopting Hadoop is “mandatory” for any organization that wishes to do advanced analytics and get actionable insights on their data.
Forrester estimates that between 60% and 73% of data that enterprises have access to goes unused for business intelligence and analytics. “That’s unacceptable in an age where deeper, actionable insights, especially about customers, are a competitive necessity,” analysts Mike Gualtieri and Noel Yuhanna write in their Wave report on Hadoop distributions that’s out this week. Application development and delivery professionals are adopting Hadoop “en masse,” they say, and the analysts predict that 100% of large enterprises will eventually adopt Hadoop.
There is no single clear winner in the market. Instead, there’s a cluster of vendors that are each competent but have different strengths. There’s also a second tier of vendors that are compelling in their own right.
The top tier includes Cloudera, Hortonworks and MapR. IBM and Pivotal round out Forrester’s picks as the top five vendors for distributions of Hadoop software. All of these vendors focus their software on key enterprise features such as security, scale, integration, governance and performance, Forrester says. Their distributions can be deployed on customers’ premises, in a private cloud or in a public cloud, but in each case customers manage the software themselves. Forrester’s Wave report did not evaluate cloud-based Hadoop distributions like Amazon Web Services’ Elastic MapReduce or Microsoft Azure’s HDInsight, because those are public-cloud-only products that customers cannot run on their own hardware.
- Cloudera, founded in 2008, is listed as the leader in the report, getting the highest score for its current offering and market presence, based on an evaluation of 30 criteria Forrester used to compare the vendors. Cloudera takes the open source Hadoop software and makes some proprietary changes to it to improve the security, high availability, governance and administration of the software, the analysts say.
- Hortonworks is perhaps Cloudera’s largest competitor and is second in market presence. Hortonworks is committed to a 100% open source distribution of Hadoop. Everything in the Hortonworks distribution is open source, which gives customers ultimate flexibility when using the software in case they want to migrate away from it, but comes at the expense of some functionality. Whereas Cloudera scored a 4.53 (on a scale of one to five) on current offering, Hortonworks got a 3.82.
- MapR is another leader, and it scored the second highest for its current offering, at 4.34. Gualtieri and Yuhanna say MapR is committed to the best balance between high performance and scalability, while maximizing ease of use.
- IBM is a strong competitor, especially for existing IBM Data customers who are looking to extend their existing analytics to include Hadoop.
- Pivotal is another vendor worth considering, but the company’s current offering and market presence scored the lowest among the five vendors. Pivotal’s Hadoop distribution integrates well with the company’s other data management and application development products and services, such as Cloud Foundry PaaS and Greenplum data management software, which makes it a natural fit for customers who already use them.
The market is not yet saturated with Hadoop users, though. Last year, Gartner analyst Merv Adrian attempted to cool the hype about Hadoop, noting that in a survey conducted by the research firm, up to 54% of respondents said they didn’t have plans to adopt Hadoop in the coming year.