How to Run a Simple Apache Spark App in CDH 5
Getting started with Spark (now shipping inside CDH 5) is easy using this simple example.
Apache Spark is a general-purpose, cluster computing framework that, like MapReduce in Apache Hadoop, offers powerful abstractions for processing large datasets. For various reasons pertaining to ...
Hadoop or Warehousing, or Both?
One of the thornier questions facing enterprise executives in these days of broad infrastructural change is how to deal with Big Data. On the surface, it may seem like a no-brainer: No matter how big the data load becomes, there ...
Google BigQuery and Datastore Connectors for Hadoop
Users of Google’s cloud platform should find it easier to run Hadoop jobs directly against data in Google BigQuery and Google Cloud Datastore from now on.
We are making it easier for you to run Hadoop jobs directly against your data ...
10 Hot Hadoop Startups to Watch in 2025
It's no secret that data volumes are growing exponentially. What's a bit more mysterious is figuring out how to unlock the value of all of that data. A big part of the problem is that traditional databases weren't designed for ...
Top 30 Big Data Companies to watch in 2025
The Big Data space is heating up – to the point that many pundits already see it as the over-hyped heir to "cloud." The hype may be a bit much, but Big Data is already living up to its ...
Top 7 Tips to Succeed with Big Data
Today, businesses are focusing on and investing in big data analytics to offer reliable services and earn profits. Big data plays a vital role in making better business decisions by enabling data scientists and other users to ...
10 Big Data Predictions for 2014
Big data was seen as one of the biggest buzzwords of 2013, and companies are spending a lot on big data analytics. The storage and analysis of large and/or complex data sets using a series of techniques including, but not ...
Using Scala To Work With Hadoop
Cloudera has a great toolkit to work with Hadoop. Specifically it is focused on building distributed systems and services on top of the Hadoop Ecosystem.
http://cloudera.github.io/cdk/docs/0.2.0/cdk-data/guide.html
And the examples are in Scala!!!!
Here is how you work with generic stuff on the ...
MongoDB 2.6 Released
In the five years since the initial release of MongoDB, and after hundreds of thousands of deployments, we have learned a lot. The time has come to take everything we have learned and create a basis for continued innovation over ...
How To Choose The Best Tool For Your Big Data Project
Trying to choose the right tool for a big data project? This chart (and three simple rules) can help guide you through the options. This chart is based on one shown by Microsoft Research senior research program manager Wenming Ye ...






