Hadoop Cluster Interview Questions
Which are the three modes in which Hadoop can be run?
The three modes in which Hadoop can be run are:
1. Standalone (local) mode
2. Pseudo-distributed mode
3. Fully distributed mode
What are the features of standalone (local) mode?
In standalone mode there are ...
Bloom Filters in HBase and Chrome
Bloom filters make it possible to check efficiently whether a particular element/record is in a set/table. They have very little impact on insert operations. The only caveat is that a lookup might return a false positive; a Bloom filter might ...
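The membership check described above can be illustrated with a minimal Python sketch. This is a toy version for intuition only, not the implementation used by HBase or Chrome; the class name `BloomFilter`, the bit-array size, the number of hashes, and the salted SHA-256 hashing scheme are all assumptions chosen for the example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit array plus k salted hash functions (illustrative only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive k positions by salting a SHA-256 hash with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        # Inserting only sets k bits, so the cost per insert is small.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
for key in ["row-001", "row-002", "row-003"]:
    bf.add(key)

print(bf.might_contain("row-002"))  # an added key is always reported as present
print(bf.might_contain("row-999"))  # usually absent, but a false positive is possible
```

A "no" answer from `might_contain` is always correct, so a store can skip reading a file for that key; a "yes" answer still has to be confirmed against the real data.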
16 Top Big Data Analytics Platforms
Revolutionary. That pretty much describes the data analysis time in which we live. Businesses grapple with huge quantities and varieties of data on one hand, and ever-faster expectations for analysis on the other. The vendor community is responding by providing ...
Hadoop Resources
History of Hadoop
Spotlight on the early history of Hadoop
The history of Hadoop: From 4 nodes to the future of data
Big Ideas: Demystifying Hadoop
What is MapReduce?
"Cluster Computing and MapReduce Lecture" series on YouTube
http://www.youtube.com/watch?v=yjPBkvYh-ss
http://www.youtube.com/watch?v=-vD6PUdf3Js
http://www.youtube.com/watch?v=5Eib_H_zCEY
http://www.youtube.com/watch?v=1ZDybXl212Q
http://www.youtube.com/watch?v=BT-piFBP4fE
http://labs.google.com/papers/mapreduce.html
http://code.google.com/edu/parallel/mapreduce-tutorial.html
What is Hadoop?
http://radar.oreilly.com/2012/02/what-is-apache-hadoop.html
http://gigaom.com/cloud/what-it-really-means-when-someone-says-hadoop
http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/
What is HDFS?
The paper covers most of ...
Introduction to Apache Hive and Pig
Apache Hive is a framework that sits on top of Hadoop for running ad-hoc queries on data in Hadoop. Hive supports HiveQL, which is similar to SQL but doesn't support the full set of SQL constructs.
Hive converts the HiveQL query into ...
How to Get the Best Out of Big Data Solutions
Data has become the new raw material for businesses. And that's how it should be in order to meet the dynamic needs of the current age. Thus, access to considerably large amounts of data and information always helps an organisation ...
Top 10 Big Data Trends in 2014
In January 2014, IDG published its latest big data enterprise survey and predictions for 2014, finding that, on average, enterprises will spend $8M on big data-related initiatives in 2014. The study also found that 70% of enterprise organizations have ...
5 Keys to Big Data Success
As Big Data becomes more pervasive throughout farming, it's important to understand what farmers need to make data collection and analysis successful, says John Fulton, Extension specialist at Auburn University.
"For many, there is no clear incentive or objective about the ...
When to use Pig Latin versus Hive SQL?
Once your big data is loaded into Hadoop, what's the best way to use that data? You'll need some way to filter and aggregate the data, and then apply the results to something useful. Collecting terabytes and petabytes of web ...
HBase Architecture
HBase - The Basics:
HBase is an open-source, distributed, non-relational (NoSQL), versioned, multi-dimensional, column-oriented store that is modeled after Google's Bigtable and runs on top of HDFS. "NoSQL" is a broad term meaning that the database isn't an RDBMS which ...






