Exploring The Hadoop Network Topology
Hadoop is designed to run on large clusters of commodity servers – in many cases spanning many physical racks of servers. A physical rack is in many cases a single point of failure (for example, having typically a single switch ...
Hadoop admin interview questions
Which operating system(s) are supported for production Hadoop deployment?
The main supported operating system is Linux. However, with some additional software Hadoop can be deployed on Windows.
What is the role of the namenode?
The namenode is the "brain" of the Hadoop cluster ...
How to learn Hadoop like a boss
Big data is reshaping the landscape of business IT. Thanks to cheap storage, the massive processing power of the latest technology, and tools like Hadoop, organisations are now able to mine terabytes of information and derive useful business intelligence from ...
Hadoop Cluster Interview Questions
Which are the three modes in which Hadoop can be run?
The three modes in which Hadoop can be run are:
1. standalone (local) mode
2. Pseudo-distributed mode
3. Fully distributed mode
What are the features of Stand alone (local) mode?
In stand-alone mode there are ...
Hadoop Resources
History of Hadoop
Spotlight on the early history of Hadoop
The history of Hadoop: From 4 nodes to the future of data
Big Ideas: Demystifying Hadoop
What is MapReduce?
"Cluster Computing and MapReduce Lecture" series in YouTube
http://www.youtube.com/watch?v=yjPBkvYh-ss
http://www.youtube.com/watch?v=-vD6PUdf3Js
http://www.youtube.com/watch?v=5Eib_H_zCEY
http://www.youtube.com/watch?v=1ZDybXl212Q
http://www.youtube.com/watch?v=BT-piFBP4fE
http://labs.google.com/papers/mapreduce.html
http://code.google.com/edu/parallel/mapreduce-tutorial.htmlÂ
What is Hadoop?
http://radar.oreilly.com/2012/02/what-is-apache-hadoop.html
http://gigaom.com/cloud/what-it-really-means-when-someone-says-hadoop
http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/
What is HDFS?
The paper covers most of ...
Top 10 Big Data Trends in 2014
In January 2014, IDG published their latest big data enterprise survey and predictions for 2014 finding that on average, enterprises will spend $8M on big data –related initiatives in 2014. The study also found that 70% of enterprise organizations have ...
Introduction to Impala
Impala in terms of Hadoop has got the significance because of its,
Scalability
Flexibility
Efficiency
What’s Impala?
Impala is…
Interactive SQL–Impala is typically 5 to 65 times faster than Hive as it minimized the response time to just seconds, not minutes.
Nearly ANSI-92 standard and compatible with ...
Hadoop FS Shell Commands
Hadoop file system (fs) shell commands are used to perform various file operations like copying file, changing permissions, viewing the contents of the file, changing ownership of files, creating directories etc.
The syntax of fs shell command is
hadoop fs <args>
All the ...
Apache Spark for Big Analytics
by Thomas Dinsmore, Director of Product Management at Revolution Analytics
The emergence of Apache Spark is a key development for Big Analytics in 2013.  Spark, an Apache incubator project, is an open source distributed computing framework for advanced analytics in Hadoop. ...
The 3 most common ways data junkies are using Hadoop
Just a few weeks ago, Apache Hadoop 2.0 was declared generally available–a huge milestone for the Hadoop market as it unlocks the vision of interacting with stored data in unprecedented ways. Hadoop remains the typical underpinning technology of “Big Data,” ...






