Best resources to learn and understand Hadoop

Here are some of the best resources to learn and understand Hadoop. Tutorials Free videos - MapR Academia Udacity course Hortonworks Sandbox Hadoop Ecosystem Running Hadoop Map-Reduce Hadoop Screencasts Reza Shiftehfar's blog I Reza Shiftehfar's blog II Reza Shiftehfar's blog III Reza Shiftehfar's blog IV Reza Shiftehfar's blog V Reza Shiftehfar's blog VI Reza Shiftehfar's blog ...

15 May 2014 Analytics, Big Data, Cassandra, Cloudera, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Predictive Analytics

7 Facts About Hadoop That You Should Know

Where there is Big Data, there is Hadoop, and vice versa. With Big Data analytics having grown as much as it has, Hadoop has become a mainstay in the technology industry. Here are a few facts that you should keep in mind when ...

10 May 2014 Analytics, Big Data, Cloud Computing, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics

5 technologies that will help big data cross the chasm

We’re on the cusp of a real turning point for big data. Its applications are becoming clearer, its tools are getting easier and its architectures are maturing in a hurry. It’s no longer just about log files, clickstreams and tweets. ...

07 May 2014 Analytics, Big Data, Cloud Computing, Cloudera, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Predictive Analytics

A New Python Client for Impala

The new Python client for Impala will bring smiles to Pythonistas! As a data scientist, I love using the Python data stack. I also love using Impala to work with very large data sets. But things that take me out of ...

02 May 2014 Analytics, Big Data, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics, SAS

Apache Ambari 1.5.1 is Released!

The Apache Ambari community proudly released version 1.5.1. This is the result of constant, concerted collaboration among the Ambari project’s many members. This release represents the work of over 30 individuals over 5 months and, combined with the Ambari 1.5.0 release, ...

26 April 2014 Analytics, Big Data, Cloudera, Couchbase, Google, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Predictive Analytics

Apache Hive Updated with SQL-on-Hadoop Features

The Apache Hive community has voted on and released version 0.13. This is a significant release that represents a major effort from over 70 members who worked diligently to close out over 1080 JIRA tickets. Hive 0.13 also delivers the third and ...

23 April 2014 Analytics, Big Data, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics

How to Run a Simple Apache Spark App in CDH 5

Getting started with Spark (now shipping inside CDH 5) is easy using this simple example. Apache Spark is a general-purpose, cluster computing framework that, like MapReduce in Apache Hadoop, offers powerful abstractions for processing large datasets. For various reasons pertaining to ...

22 April 2014 Analytics, Big Data, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Predictive Analytics, Splunk

How Accurate is Mahout for Summing Numbers?

A question was recently posted on the Mahout mailing list suggesting that the Mahout math library was "unwashed" because it didn't use Kahan summation. My feeling is that this complaint is not founded and Mahout is considerably more washed than ...
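For readers unfamiliar with the technique named in the question, here is a minimal Python sketch of Kahan (compensated) summation next to a plain running sum. It is an illustration only and is not taken from Mahout; the function names `naive_sum` and `kahan_sum` are hypothetical.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point sum, with no error compensation."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Kahan summation: carry the rounding error of each addition forward."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for x in values:
        y = x - compensation            # apply the correction to the next term
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost
        total = t
    return total


if __name__ == "__main__":
    data = [0.1] * 100_000
    reference = math.fsum(data)  # correctly rounded sum from the standard library
    print("naive error:", abs(naive_sum(data) - reference))
    print("kahan error:", abs(kahan_sum(data) - reference))
```

The compensated version keeps the accumulated error close to one rounding step regardless of input length, whereas the error of the plain loop can grow with the number of terms.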

19 April 2014 Analytics, Big Data, Cloud Computing, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Predictive Analytics

Google BigQuery and Datastore Connectors for Hadoop

Users of Google’s cloud platform should find it easier to run Hadoop jobs directly against data in Google BigQuery and Google Cloud Datastore from now on. We are making it easier for you to run Hadoop jobs directly against your data ...

18 April 2014 Analytics, Big Data, Cloud Computing, Google, Hadoop News, Hadoop Tutorials, HBase, Hive, MapReduce News, MongoDB News, NoSQL News, Predictive Analytics, SAS

Using Scala To Work With Hadoop

Cloudera has a great toolkit to work with Hadoop. Specifically, it is focused on building distributed systems and services on top of the Hadoop Ecosystem. http://cloudera.github.io/cdk/docs/0.2.0/cdk-data/guide.html And the examples are in Scala!!!! Here is how you work with generic stuff on the ...