How to Run a Simple Apache Spark App in CDH 5

Getting started with Spark (now shipping inside CDH 5) is easy using this simple example. Apache Spark is a general-purpose, cluster computing framework that, like MapReduce in Apache Hadoop, offers powerful abstractions for processing large datasets. For various reasons pertaining to ...

Cassandra-Database Solution for modern day applications?

Cassandra is a one stop choice for data driven organizations dealing with real-time Big Data operations for their core functionalities. Now what makes it so dear to the developers and organizations dealing huge databases is a bunch of features that ...

Apache Spark is now part of MapR’s Hadoop distribution

Hadoop vendor MapR is getting in early on the Apache Spark action, too, announcing on Thursday that it’s adding the Spark stack to its Hadoop distribution as part of a partnership with Spark startup Databricks (Ion Stoica, the co-founder and CEO of ...

Using Scala To Work With Hadoop

Cloudera has a great toolkit to work with Hadoop.  Specifically it is focused on building distributed systems and services on top of the Hadoop Ecosystem. http://cloudera.github.io/cdk/docs/0.2.0/cdk-data/guide.html And the examples are in Scala!!!! Here is how you you work with generic stuff on the ...
1 24 25 26 27 28 30