12 Very Important Tools For Hadoop Users
When it comes to Big Data analysis, Hadoop is at the forefront of things. Almost every company worth mentioning is looking for people specialised in Hadoop. These tools manage various aspects of big data analysis using Hadoop.
1. Ambari
The Apache Ambari ...
7 Facts About Hadoop That You Should Know
Where there is Big Data, there is Hadoop and vice versa. With Big Data analytics becoming as big as they have, Hadoop has become a mainstay in the technology industry.
Here are a few facts that you should keep in mind when ...
Snappy compression with Pig and native MapReduce
This assumes you have installed Hadoop on your cluster; if not, please follow http://code.google.com/p/hadoop-snappy/
This is the machine config of my cluster nodes, though the steps that follow could be applied to your own installation/machine configs
pkommireddi@pkommireddi-wsl:/tools/hadoop/pig-0.9.1/lib$ uname -a
Linux pkommireddi-wsl 2.6.32-37-generic #81-Ubuntu SMP Fri ...
Apache Tez 0.3 Released
The Apache Tez community has voted to release 0.3 of the software.
Apache™ Tez is a replacement for MapReduce that provides a powerful framework for executing a complex topology of tasks. Tez 0.3.0 is an important release towards making the software ...
Avoiding Split Brainedness in HA Hadoop Clusters
The US Patent Office recently granted Zettaset a patent for the underlying technology in its Hadoop high availability that prevents a "split-brain" situation where multiple master nodes think they're in control of the Hadoop cluster. It's a feather in the ...
Introduction to Apache Hive and Pig
Apache Hive is a framework that sits on top of Hadoop for running ad-hoc queries on data in Hadoop. Hive supports HiveQL, which is similar to SQL but doesn't support the full set of SQL constructs.
Hive converts the HiveQL query into ...
When to use Pig Latin versus Hive SQL?
Once your big data is loaded into Hadoop, what’s the best way to use that data? You’ll need some way to filter and aggregate the data, and then apply the results for something useful. Collecting terabytes and petabytes of web ...
Free Cloudera Impala Book
Get the Cloudera Impala book in PDF format, for free, from the Cloudera website, in association with the Strata Conference and Hadoop World. See the link below for the book info from the publisher as well as the link to download ...
In-demand big data skills: a mix of old and new
In the fall of 2012, MIT's Sloan Management School issued a report discussing the differences between big and regular data, and the differences in the skills that each demands.
Organizations that utilize big data differ from those with traditional data practices in ...
Use big data to fight cybercrime
While organizations don't always need to understand how an attack works from an in-depth technical perspective, they do need to understand how the attacks get past their defenses. A successful CISO will arm himself with analytics and learn from others' ...






