Apache Mahout is moving on from MapReduce

Apache Mahout, a machine learning library for Hadoop since 2009, is joining the exodus away from MapReduce. The project’s community has decided to rework Mahout to support the increasingly popular Apache Spark in-memory data-processing framework, as well as the H2O engine for ...

28 March 2014 Analytics, Big Data, Cloudera, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Predictive Analytics

Apache Falcon: Data Governance for Hadoop

Apache Falcon is a data governance engine that defines, schedules, and monitors data management policies. Falcon allows Hadoop administrators to centrally define their data pipelines, and then Falcon uses those definitions to auto-generate workflows in Apache Oozie. InMobi is one of ...

27 March 2014 Analytics, Big Data, Cassandra, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics

Google Launches BigQuery Streaming For Real-Time, Big-Data Analytics

BigQuery, Google’s cloud-based tool for quickly analyzing very large datasets, is getting a massive price cut today (up to 85 percent). But Google is also adding an important new feature that will make it more competitive with the big data service ...

26 March 2014 Analytics, Big Data, Cloud Computing, Google, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Predictive Analytics

Pivotal juices Hadoop with in-memory database and SQL querying

Pivotal, an EMC/VMware spin-off that has big plans to deliver big data analytics through platform as a service, has whisked the drapes off Pivotal HD 2.0, its commercially supported enterprise-grade distribution of Hadoop. But Pivotal's ambitions for HD don't simply involve ...

25 March 2014 Analytics, Big Data

How to Contribute to HBase and Hadoop2

By Nick Dimiduk. In case you haven’t heard, Hadoop2 is on the way! There are loads more new features than I can begin to enumerate, including lots of interesting enhancements to HDFS for online applications like HBase. One of the most ...

24 March 2014 Analytics, Big Data, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics

HBase BlockCache Showdown

The HBase BlockCache is an important structure for enabling low-latency reads. As of HBase 0.96.0, there are no fewer than three different BlockCache implementations to choose from. But how do you know when to use one over the other? There’s ...

22 March 2014 Analytics, Big Data, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics, SAS

What Can GPFS on Hadoop Do For You?

The Hadoop Distributed File System (HDFS) is considered a core component of Hadoop, but it’s not an essential one. Lately, IBM has been talking up the benefits of hooking Hadoop up to the General Parallel File System (GPFS). IBM has ...

20 March 2014 Big Data, Cassandra, Cloud Computing, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, NoSQL News, Pig, Predictive Analytics, SAS

Hadoop Alternative Hydra Re-Spawns as Open Source

It may not have the name recognition or momentum of Hadoop. But Hydra, the distributed task processing system first developed six years ago by the social bookmarking service maker AddThis, is now available under an open source Apache license, just ...

17 March 2014 Analytics, Big Data, Cassandra, Cloud Computing, Cloudera, Google, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News

Hadoop and NoSQL Now Data Warehouse-Worthy: Gartner

Not long ago, the rules for what constituted a data warehouse were fairly well defined. The schema was fixed, you could say, and was based primarily on relational database technology designed to process structured data. My, how times have changed. ...

14 March 2014 Analytics, Big Data, Cloud Computing, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics, SAS, Splunk

Big Workflow: The Future of Big Data Computing

How can organizations embrace — instead of brace for — the rapidly intensifying collision of public and private clouds, HPC environments and Big Data? The current go-to solution for many organizations is to run these technology assets in siloed, specialized ...