Snappy compression with Pig and native MapReduce

This assumes you have already installed Hadoop on your cluster; if not, please follow http://code.google.com/p/hadoop-snappy/. Below is the machine configuration of my cluster nodes, though the steps that follow should work with your own installation and machine configuration:

pkommireddi@pkommireddi-wsl:/tools/hadoop/pig-0.9.1/lib$ uname -a
Linux pkommireddi-wsl 2.6.32-37-generic #81-Ubuntu SMP Fri ...

03 May 2014 Analytics, Big Data, Cloudera, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics

Configure Eclipse for MapReduce

1. Download Eclipse Europa or Indigo.
2. Download the Hadoop Eclipse plugin, e.g. hadoop-eclipse-plugin-1.0.3.jar.
3. Copy the jar into the Eclipse plugins folder.
4. Open Eclipse.
5. Add a Map/Reduce server.
6. Add a new DFS location:
   Location name: localhost
   Map/Reduce Master port: 9001
   DFS Master port: 9000
   Click Finish.
7. New -> Other -> Map/Reduce Project -> ...

31 March 2014 Analytics, Big Data, Hadoop News, Hadoop Tutorials, MapReduce News

How MapR’s M7 Platform Improves NoSQL and Hadoop

The M7 Edition. Sounds like a high-performance sports car, doesn’t it? In reality, M7 is MapR’s enterprise-grade platform that brings its own brand of high performance, dependability and ease of use to both NoSQL and Hadoop applications. M7 removes ...

27 February 2014 Analytics, Big Data, Cassandra, Cloudera, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News

Anatomy of a MapReduce Job

Hadoop Ecosystem and MapReduce: There is an extensive list of products and projects that either extend Hadoop’s functionality or expose some existing capability in new ways. For example, executing SQL-like queries on top of Hadoop has spawned several products. Facebook started ...

13 February 2014 Big Data, Hadoop News, Hadoop Tutorials, MapReduce News, NoSQL News, Predictive Analytics

HBase Architecture

HBase – The Basics: HBase is an open-source, distributed, non-relational (NoSQL), versioned, multi-dimensional, column-oriented store that is modeled after Google’s Bigtable and runs on top of HDFS. “NoSQL” is a broad term meaning that the database isn’t an RDBMS which ...

24 January 2014 Big Data, Cassandra, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig

Introduction to Impala

Impala is significant within the Hadoop ecosystem because of its scalability, flexibility and efficiency. What’s Impala? Impala is… Interactive SQL: Impala is typically 5 to 65 times faster than Hive, as it cuts response time to just seconds, not minutes. Nearly ANSI-92 standard and compatible with ...

22 January 2014 Big Data, Cassandra, Cloudera, Couchbase, Google, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, SAS

Hadoop Interview Questions – MapReduce

Looking for Hadoop interview questions that are frequently asked by employers? What is MapReduce? It is a framework or a programming model that is used for processing large data sets over clusters of computers using distributed programming. What are 'maps' and 'reduces'? 'Maps' ...

03 January 2014 Big Data, Cassandra, Couchbase, Hadoop News, Hadoop Tutorials, MapReduce News, MongoDB News, NoSQL News
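
To make the "maps" and "reduces" mentioned in the interview-questions excerpt more concrete, here is a minimal single-process sketch of the programming model in Python, using word count as the example. It is only an illustration, not Hadoop code: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and do not come from any Hadoop API, and a real job would distribute each phase across the cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Calling word_count(["big data", "Big cluster"]) counts "big" twice and the other two words once each. The point of the split is that each map call sees only one line and each reduce call sees only one key, which is what lets a framework run them in parallel on different machines.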