16 Top Big Data Analytics Platforms
Revolutionary. That pretty much describes the data analysis time in which we live. Businesses grapple with huge quantities and varieties of data on one hand, and ever-faster expectations for analysis on the other. The vendor community is responding by providing highly distributed architectures and new levels of memory and processing power. Upstarts also exploit the open-source licensing model, which is not new, but is increasingly accepted and even sought out by data-management professionals.
Apache Hadoop, a nine-year-old open-source data-processing platform first used by Internet giants including Yahoo and Facebook, leads the big-data revolution. Cloudera introduced commercial support for enterprises in 2008, and MapR and Hortonworks piled on in 2009 and 2011, respectively. Among data-management incumbents, IBM and EMC-spinout Pivotal have each introduced their own Hadoop distribution. Microsoft and Teradata offer complementary software and first-line support for Hortonworks’ platform. Oracle resells and supports Cloudera, while HP, SAP, and others act more like Switzerland, working with multiple Hadoop software providers.
In-memory analysis gains steam as Moore’s Law brings us faster, more affordable, and more-memory-rich processors. SAP has been the biggest champion of the in-memory approach with its Hana platform, but Microsoft and Oracle are now poised to introduce in-memory options for their flagship databases. Focused analytical database vendors including Actian, HP Vertica, Kognitio, and Teradata have introduced options for high-RAM-to-disk ratios, along with tools to place specific data into memory for ultra-fast analysis.
Advances in bandwidth, memory, and processing power also have improved real-time stream-processing and stream-analysis capabilities, but this technology has yet to see broad adoption. Several vendors here offer complex event processing, but outside of the financial trading, national intelligence, and security communities, deployments have been rare. Watch this space and, particularly, new open source options as breakthrough applications in ad delivery, content personalization, logistics, and other areas push broader adoption.
Our slideshow includes broad-based data-management vendors — IBM, Microsoft, Oracle, SAP — that offer everything from data-integration software and database-management systems (DBMSs) to business intelligence and analytics software, to in-memory, stream-processing, and Hadoop options. Teradata is a blue chip focused more narrowly on data management, and like Pivotal, it has close ties with analytics market leader SAS.
Plenty of vendors covered here offer cloud options, but 1010data and Amazon Web Services (AWS) have staked their entire businesses on the cloud model. Amazon has the broader selection of products of the two, and it’s an obvious choice for those running big workloads and storing lots of data on the AWS platform. 1010data has a highly scalable database service and supporting information-management, BI, and analytics capabilities that are served up private-cloud style.
The jury is still out on whether Hadoop will become as indispensable as database management systems. Where volume and variety are extreme, Hadoop has proven its utility and cost advantages. Cloudera, Hortonworks, and MapR are doing everything they can to move Hadoop beyond high-scale storage and MapReduce processing into the world of analytics.