Hadoop and NoSQL Now Data Warehouse-Worthy: Gartner
Not long ago, the rules for what constituted a data warehouse were fairly well defined. The schema was fixed, you could say, and was based primarily on relational database technology designed to process structured data. My, how times have changed. ...
Why Apache Spark is a Crossover Hit for Data Scientists
Spark is a compelling multi-purpose platform for use cases that span investigative, as well as operational, analytics.
Data science is a broad church. I am a data scientist — or so I’ve been told — but what I do is actually ...
Data transfer between MySQL and Cassandra using Sqoop
Sqoop is a tool designed to transfer data between Hadoop and relational databases. You can use Sqoop to import data from a relational database management system (RDBMS) such as MySQL or Oracle into the Hadoop Distributed File System (HDFS), transform ...
Apache Hadoop 2.3.0 was released
Hadoop-2.3.0 is the first release for the year 2014, and brings a number of enhancements to the core platform, in particular to HDFS. There are a lot of bug fixes and small changes in this one - you can read ...
Exploring The Hadoop Network Topology
Hadoop is designed to run on large clusters of commodity servers, often spanning many physical racks. A physical rack is frequently a single point of failure (for example, having typically a single switch ...
Anatomy of a MapReduce Job
Hadoop Ecosystem and MapReduce
There is an extensive list of products and projects that either extend Hadoop’s functionality or expose some existing capability in new ways. For example, executing SQL-like queries on top of Hadoop has spawned several products. Facebook started ...
Hadoop Resources
History of Hadoop
Spotlight on the early history of Hadoop
The history of Hadoop: From 4 nodes to the future of data
Big Ideas: Demystifying Hadoop
What is MapReduce?
"Cluster Computing and MapReduce Lecture" series on YouTube
http://www.youtube.com/watch?v=yjPBkvYh-ss
http://www.youtube.com/watch?v=-vD6PUdf3Js
http://www.youtube.com/watch?v=5Eib_H_zCEY
http://www.youtube.com/watch?v=1ZDybXl212Q
http://www.youtube.com/watch?v=BT-piFBP4fE
http://labs.google.com/papers/mapreduce.html
http://code.google.com/edu/parallel/mapreduce-tutorial.html
What is Hadoop?
http://radar.oreilly.com/2012/02/what-is-apache-hadoop.html
http://gigaom.com/cloud/what-it-really-means-when-someone-says-hadoop
http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/
What is HDFS?
The paper covers most of ...






