Assuming you have installed Hadoop on your cluster, if not please follow http://code.google.com/p/hadoop-snappy/
This is the machine config of my cluster nodes, though the steps that follow could be followed with your installation/machine configs
pkommireddi@pkommireddi-wsl:/tools/hadoop/pig-0.9.1/lib$ uname -a
Linux pkommireddi-wsl 2.6.32-37-generic #81-Ubuntu SMP Fri ...
The M7 Edition. Sounds like a high performance sports car, doesn’t it? In reality, M7 is MapR’s enterprise-grade platform that provides its own unique brand of high-performance, dependability and ease of use to both NoSQL and Hadoop applications. M7 removes ...
Hadoop Ecosystem and MapReduce
There is an extensive list of products and projects that either extend Hadoop’s functionality or expose some existing capability in new ways. For example, executing SQL-like queries on top of Hadoop has spwaned several products. Facebook started ...
HBase – The Basics:
HBase is an open-source, NoSQL, distributed, non-relational, versioned, multi-dimensional, column-oriented store which has been modeled after Google BigTable that runs on top of HDFS. ‘’NoSQL” is a broad term meaning that the database isn’t an RDBMS which ...
Impala in terms of Hadoop has got the significance because of its,
Scalability
Flexibility
Efficiency
What’s Impala?
Impala is…
Interactive SQL–Impala is typically 5 to 65 times faster than Hive as it minimized the response time to just seconds, not minutes.
Nearly ANSI-92 standard and compatible with ...
Looking out for Hadoop Interview Questions that are frequently asked by employers?
What is MapReduce?
It is a framework or a programming model that is used for processing large data sets over clusters of computers using distributed programming.
What are 'maps' and 'reduces'?
'Maps' ...