A guide to NoSQL offerings

Amazon Web Services: DynamoDB is a NoSQL database service that makes it simple and cost-effective to store and retrieve any amount of data and serve any level of request traffic. Users simply tell the service how many requests need to ...

29 March 2014 Analytics, Big Data, Couchbase, Google, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig

Apache Mahout is moving on from MapReduce

Apache Mahout, a machine learning library for Hadoop since 2009, is joining the exodus away from MapReduce. The project’s community has decided to rework Mahout to support the increasingly popular Apache Spark in-memory data-processing framework, as well as the H2O engine for ...

28 March 2014 Analytics, Big Data, Cloudera, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Predictive Analytics

Apache Falcon: Data Governance for Hadoop

Apache Falcon is a data governance engine that defines, schedules, and monitors data management policies. Falcon allows Hadoop administrators to centrally define their data pipelines, and then Falcon uses those definitions to auto-generate workflows in Apache Oozie. InMobi is one of ...

27 March 2014 Analytics, Big Data, Cassandra, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics

Google Launches BigQuery Streaming For Real-Time, Big-Data Analytics

BigQuery, Google’s cloud-based tool for quickly analyzing very large datasets, is getting a massive price cut today (up to 85 percent). But Google is also adding an important new feature that will make it more competitive with the big data service ...

26 March 2014 Analytics, Big Data, Cloud Computing, Google, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Predictive Analytics

Pivotal juices Hadoop with in-memory database and SQL querying

Pivotal, an EMC/VMware spin-off that has big plans to deliver big data analytics through platform as a service, has whisked the drapes off Pivotal HD 2.0, its commercially supported enterprise-grade distribution of Hadoop. But Pivotal's ambitions for HD don't simply involve ...

25 March 2014 Analytics, Big Data

How to Contribute to HBase and Hadoop2

By Nick Dimiduk. In case you haven’t heard, Hadoop2 is on the way! There are loads more new features than I can begin to enumerate, including lots of interesting enhancements to HDFS for online applications like HBase. One of the most ...

24 March 2014 Analytics, Big Data, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics

HBase BlockCache Showdown

The HBase BlockCache is an important structure for enabling low-latency reads. As of HBase 0.96.0, there are no fewer than three different BlockCache implementations to choose from. But how do you know when to use one over another? There’s ...

22 March 2014 Analytics, Big Data, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics, SAS

How to Implement Role-based Security in Impala using Apache Sentry

Apache Sentry (incubating) is the Apache Hadoop ecosystem tool for role-based access control (RBAC). In this how-to, I will demonstrate how to implement Sentry for RBAC in Impala. I feel this introduction is best motivated by a use case. Data warehouse ...

21 March 2014 Analytics, Big Data, Cassandra, Cloudera, Couchbase, Google, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics, SAS

What Can GPFS on Hadoop Do For You?

The Hadoop Distributed File System (HDFS) is considered a core component of Hadoop, but it’s not an essential one. Lately, IBM has been talking up the benefits of hooking Hadoop up to the General Parallel File System (GPFS). IBM has ...

20 March 2014 Big Data, Cassandra, Cloud Computing, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, NoSQL News, Pig, Predictive Analytics, SAS

Using Oozie 4.0.0 with Hadoop 2.2

The current version of Oozie (4.0.0) doesn’t build correctly when you try to target Hadoop 2.2. The Oozie team has a fix going into release 4.0.1 (see OOZIE-1551), but until then you can hack the Maven files to get it ...