Introduction to Apache Hive and Pig
Apache Hive is a framework that sits on top of Hadoop for running ad-hoc queries on data stored in Hadoop. Hive supports HiveQL, which is similar to SQL but does not support the full set of SQL constructs.
Hive converts the HiveQL query into ...
Top 10 Big Data Trends in 2014
In January 2014, IDG published its latest big data enterprise survey and predictions for 2014, finding that on average, enterprises will spend $8M on big data-related initiatives in 2014. The study also found that 70% of enterprise organizations have ...
HBase Architecture
HBase - The Basics:
HBase is an open-source, NoSQL, distributed, non-relational, versioned, multi-dimensional, column-oriented store that is modeled after Google BigTable and runs on top of HDFS. "NoSQL" is a broad term meaning that the database isn't an RDBMS which ...
Use Cases Of MongoDB
MongoDB is a relatively new contender in the data storage space compared to giants like Oracle and IBM DB2, but it has gained huge popularity with its distributed key-value store, MapReduce calculation capability and document-oriented NoSQL features.
MongoDB has ...
Introduction to Impala
Within the Hadoop ecosystem, Impala is significant because of its:
Scalability
Flexibility
Efficiency
What's Impala?
Impala is...
Interactive SQL: Impala is typically 5 to 65 times faster than Hive, as it cuts response time to seconds rather than minutes.
Nearly ANSI-92 standard and compatible with ...
Free Cloudera Impala Book
Get the free Cloudera Impala book, in PDF format, from the Cloudera website, in association with the Strata Conference and Hadoop World. See the link below for the book info from the publisher as well as the link to download ...
Hadoop Cluster Commissioning and Decommissioning Nodes
To add new nodes to the cluster:
1. Add the network addresses of the new nodes to the include file.
hdfs-site.xml
<property>
  <name>dfs.hosts</name>
  <value>/<hadoop-home>/conf/includes</value>
  <final>true</final>
</property>
mapred-site.xml
<property>
  <name>mapred.hosts</name>
  <value>/<hadoop-home>/conf/includes</value>
  <final>true</final>
</property>
Datanodes that are permitted to connect to the namenode are specified in a
file whose name is specified by the dfs.hosts property.
Includes file ...
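Step 1 above (adding the new nodes' network addresses to the include file) could be scripted roughly as follows. This is only a minimal sketch: it assumes the include file is plain text with one host per line, and the function name `add_nodes_to_include_file` and the example hostnames are hypothetical, not part of Hadoop.

```python
from pathlib import Path


def add_nodes_to_include_file(include_path, new_hosts):
    """Append hosts that are not yet listed; return the hosts actually added.

    Assumption: the include file holds one network address per line.
    """
    path = Path(include_path)

    # Read the hosts already permitted, ignoring blank lines.
    existing = []
    if path.exists():
        existing = [
            line.strip()
            for line in path.read_text().splitlines()
            if line.strip()
        ]

    # Keep only hosts that are new, preserving the order they were given in.
    known = set(existing)
    added = []
    for host in new_hosts:
        host = host.strip()
        if host and host not in known:
            known.add(host)
            added.append(host)

    # Rewrite the file only when there is something to add.
    if added:
        path.write_text("\n".join(existing + added) + "\n")
    return added


if __name__ == "__main__":
    # Hypothetical path and hostnames, for illustration only.
    print(add_nodes_to_include_file("includes", ["node5.example.com"]))
```

Skipping hosts that are already listed keeps the file free of duplicates when the script is run more than once.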
The 3 most common ways data junkies are using Hadoop
Just a few weeks ago, Apache Hadoop 2.0 was declared generally available, a huge milestone for the Hadoop market as it unlocks the vision of interacting with stored data in unprecedented ways. Hadoop remains the typical underpinning technology of "Big Data," ...
The secrets of designing and building big data apps
Software applications have traditionally been perceived as a unit of computation designed and used to solve a problem. Whether an application is a CRM tool that helps manage customer information or a complex supply-chain management system, the problem it solves ...
7 Ways Big Data Could Revolutionize Our Lives
We know that big data is changing the way we live and work. It is changing societies and changing businesses. On this platform we have included many best practices that show concrete examples of how this is happening. To illustrate ...






