10 Hadoop Hardware Leaders
Hadoop software is designed to orchestrate massively parallel processing on relatively low-cost servers that pack plenty of storage close to ...
Using Apache Hadoop and Impala together with MySQL for data analysis
Apache Hadoop is commonly used for data analysis. It is fast for data loads and scalable. In a previous post ...
Apache Hive Updated with SQL-on-Hadoop Features
The Apache Hive community has voted on and released version 0.13. This is a significant release that represents a major ...
How to Run a Simple Apache Spark App in CDH 5
Getting started with Spark (now shipping inside CDH 5) is easy using this simple example.
Apache Spark is a general-purpose, cluster ...
Hadoop or Warehousing, or Both?
One of the thornier questions facing enterprise executives in these days of broad infrastructural change is how to deal with ...
How Accurate is Mahout for Summing Numbers?
A question was recently posted on the Mahout mailing list suggesting that the Mahout math library was "unwashed" because it ...
Google BigQuery and Datastore Connectors for Hadoop
Users of Google’s cloud platform should find it easier to run Hadoop jobs directly against data in Google BigQuery and ...
10 Hot Hadoop Startups to Watch in 2025
It's no secret that data volumes are growing exponentially. What's a bit more mysterious is figuring out how to unlock ...
Top 30 Big Data Companies to watch in 2025
The Big Data space is heating up – to the point that many pundits already see it as the ...
Cassandra-Database Solution for modern day applications?
Cassandra is a one-stop choice for data-driven organizations dealing with real-time Big Data operations for their core functionalities. ...






