Apache Hive Updated with SQL-on-Hadoop Features
The Apache Hive community has voted on and released version 0.13. This is a significant release that represents a major effort from over 70 members who worked diligently to close out over 1080 JIRA tickets.
Hive 0.13 also delivers the third and ...
How to Run a Simple Apache Spark App in CDH 5
Getting started with Spark (now shipping inside CDH 5) is easy using this simple example.
Apache Spark is a general-purpose, cluster computing framework that, like MapReduce in Apache Hadoop, offers powerful abstractions for processing large datasets. For various reasons pertaining to ...
Hadoop or Warehousing, or Both?
One of the thornier questions facing enterprise executives in these days of broad infrastructural change is how to deal with Big Data. On the surface, it may seem like a no-brainer: No matter how big the data load becomes, there ...
How Accurate is Mahout for Summing Numbers?
A question was recently posted on the Mahout mailing list suggesting that the Mahout math library was "unwashed" because it didn't use Kahan summation. My feeling is that this complaint is not founded and Mahout is considerably more washed than ...
Google BigQuery and Datastore Connectors for Hadoop
Users of Google’s cloud platform should find it easier to run Hadoop jobs directly against data in Google BigQuery and Google Cloud Datastore from now on.
we are making it easier for you to run Hadoop jobs directly against your data ...
Top 30 Big Data Companies to watch in 2025
The Big Data space is heating up – to the point that many pundits already see it as the over-hyped heir to "cloud." The hype may be a bit much, but Big Data is already living up to its ...
Cassandra-Database Solution for modern day applications?
Cassandra is a one-stop choice for data-driven organizations that rely on real-time Big Data operations for their core functionality. What makes it so popular with developers and organizations dealing with huge databases is a set of features that ...
Top 7 Tips to Succeed with Big Data
Today, businesses are focusing on and investing in big data analytics to offer reliable services and increase profits. Big data plays a vital role in making better business decisions by enabling data scientists and other users to ...
10 Big Data Predictions for 2014
Big data was seen as one of the biggest buzzwords of 2013, and companies are spending a lot on big data analytics. The storage and analysis of large and/or complex data sets using a series of techniques including, but not ...
Apache Spark is now part of MapR’s Hadoop distribution
Hadoop vendor MapR is getting in early on the Apache Spark action, too, announcing on Thursday that it’s adding the Spark stack to its Hadoop distribution as part of a partnership with Spark startup Databricks (Ion Stoica, the co-founder and CEO of ...






