5 technologies that will help big data cross the chasm

We’re on the cusp of a real turning point for big data. Its applications are becoming clearer, its tools are getting easier to use and its architectures are maturing in a hurry. It’s no longer just about log files, clickstreams and tweets. ...

07 May 2014 Analytics, Big Data, Cloud Computing, Cloudera, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Predictive Analytics

5 tips to get started with big data

Everyone seems to be talking about "big data" these days. Do you wonder what you’re missing out on? Let’s take a look at how you can get started with big data. Learn what it is, and what it is not. While ...

06 May 2014 Analytics, Big Data, Cloud Computing, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics

6 big data trends in 2014

Data are being generated by every device imaginable. Big data are arriving from multiple sources at an alarming velocity, volume, variety and veracity. It is estimated that 2.5 quintillion bytes of data are created each day—so much that 90 percent of ...

05 May 2014 Analytics, Big Data, Cloud Computing, Cloudera, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics

A New Python Client for Impala

The new Python client for Impala will bring smiles to Pythonistas! As a data scientist, I love using the Python data stack. I also love using Impala to work with very large data sets. But things that take me out of ...

02 May 2014 Analytics, Big Data, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics, SAS

Cloudera, MongoDB partner to mash up NoSQL, Hadoop

Hadoop specialist Cloudera announced a strategic partnership with MongoDB this week that will allow Cloudera customers to store Hadoop data in their NoSQL MongoDB databases. The move is a huge win for MongoDB, which is quickly emerging as one of ...

01 May 2014 Analytics, Big Data, Cloudera, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics, Splunk

Bringing the Best of Apache Hive 0.13 to CDH Users

More than 300 bug fixes and stable features in Apache Hive 0.13 have already been backported into CDH 5.0.0. Last week, the Hive community voted to release Hive 0.13. We’re excited about the continued efforts and progress in the project and ...

29 April 2014 Analytics, Big Data, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Predictive Analytics

Apache Ambari 1.5.1 is Released!

The Apache Ambari community has proudly released version 1.5.1. This is the result of constant, concerted collaboration among the Ambari project’s many members. This release represents the work of over 30 individuals over 5 months and, combined with the Ambari 1.5.0 release, ...

26 April 2014 Analytics, Big Data, Cloudera, Couchbase, Google, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Predictive Analytics

10 Hadoop Hardware Leaders

Hadoop software is designed to orchestrate massively parallel processing on relatively low-cost servers that pack plenty of storage close to the processing power. All the power, reliability, redundancy, and fault tolerance are built into the software, which distributes the data ...

25 April 2014 Analytics, Big Data, Cassandra, Cloud Computing, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics

Using Apache Hadoop and Impala together with MySQL for data analysis

Apache Hadoop is commonly used for data analysis. It is fast for data loads and scalable. In a previous post I showed how to integrate MySQL with Hadoop. In this post I will show how to export a table from ...

24 April 2014 Analytics, Big Data, Cloud Computing, Cloudera, Couchbase, Hadoop News, Hadoop Tutorials, HBase, Hive, Impala, MapReduce News, MongoDB News, NoSQL News, Pig, Predictive Analytics, SAS, Splunk

How to Run a Simple Apache Spark App in CDH 5

Getting started with Spark (now shipping inside CDH 5) is easy using this simple example. Apache Spark is a general-purpose cluster computing framework that, like MapReduce in Apache Hadoop, offers powerful abstractions for processing large datasets. For various reasons pertaining to ...