Integrating Hadoop into Business Intelligence and Data Warehousing
Information from SAS and TDWI Research
The purpose of this report is to accelerate users’ understanding of the many new products and practices based on Hadoop technologies that have emerged in recent years. While Hadoop usage is a minority practice today, ...
Things to Learn in R
By Joshua Burkhow
Up for learning something new? Or maybe, like me, you want to “do” statistics? If you are looking to do data analysis, then R is a great tool/language to learn.
What is R?
R is a system for statistical ...
Vital Hadoop tools for crunching Big Data
Today, the most popular term in the IT world is ‘Hadoop’. Within a short span of time, Hadoop has grown massively and has proved useful for a large collection of diverse projects. The Hadoop community is fast evolving and ...
7 Tips to Succeed with Big Data in 2014
Information from Tableau Software
What a year 2013 was for big data. From the White House to your house, it’s hard to find an organisation or consumer who has less data today than a year ago. Database options proliferate, and business ...
Neo4j, A Graph Database For Building Recommendation Engines, Gets A Visual Overhaul
Part of the problem with any powerful technology is how it is perceived. It might be something that is too early for its time or it may just need those years of development and use for the market to catch ...
Hadoop Cluster Interview Questions
Which are the three modes in which Hadoop can be run?
The three modes in which Hadoop can be run are:
1. Standalone (local) mode
2. Pseudo-distributed mode
3. Fully distributed mode
What are the features of standalone (local) mode?
In standalone mode there are ...
Bloom Filters in HBase and Chrome
Bloom filters let you efficiently check whether a particular element/record is in a set/table. They have minimal impact on insert operations. The only caveat is that a lookup might return a false positive; a Bloom filter might ...
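To make the membership check and the false-positive caveat concrete, here is a minimal Bloom filter sketch in Python. It is an illustration only, not how HBase or Chrome implement their filters: the class name `BloomFilter`, the parameters `num_bits` and `num_hashes`, the method names, and the use of salted SHA-256 as the hash family are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter (hypothetical class, not the HBase or Chrome code)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; a real filter would pack bits.
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with a seed (an assumption;
        # production filters usually use faster non-cryptographic hashes).
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        # Inserting only sets a few bits, so the cost per insert is small.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely not present"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
bf.add("row-42")
print(bf.might_contain("row-42"))  # an added item is never reported as absent
print(bf.might_contain("row-7"))   # usually False, but a false positive is possible
```

A `False` answer is definitive, so a lookup can skip the expensive read entirely; a `True` answer still has to be confirmed against the real data, which is where the false-positive cost shows up.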
16 Top Big Data Analytics Platforms
Revolutionary. That pretty much describes the data analysis time in which we live. Businesses grapple with huge quantities and varieties of data on one hand, and ever-faster expectations for analysis on the other. The vendor community is responding by providing ...
Hadoop Resources
History of Hadoop
Spotlight on the early history of Hadoop
The history of Hadoop: From 4 nodes to the future of data
Big Ideas: Demystifying Hadoop
What is MapReduce?
"Cluster Computing and MapReduce Lecture" series on YouTube
http://www.youtube.com/watch?v=yjPBkvYh-ss
http://www.youtube.com/watch?v=-vD6PUdf3Js
http://www.youtube.com/watch?v=5Eib_H_zCEY
http://www.youtube.com/watch?v=1ZDybXl212Q
http://www.youtube.com/watch?v=BT-piFBP4fE
http://labs.google.com/papers/mapreduce.html
http://code.google.com/edu/parallel/mapreduce-tutorial.html
What is Hadoop?
http://radar.oreilly.com/2012/02/what-is-apache-hadoop.html
http://gigaom.com/cloud/what-it-really-means-when-someone-says-hadoop
http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/
What is HDFS?
The paper covers most of ...
Introduction to Apache Hive and Pig
Apache Hive is a framework that sits on top of Hadoop for running ad-hoc queries on data in Hadoop. Hive supports HiveQL, which is similar to SQL but doesn't support the full set of SQL constructs.
Hive converts the HiveQL query into ...






