The Hadoop Ecosystem: HDFS, YARN, Hive, Pig, HBase and growing

Hadoop is the leading open-source software framework developed for scalable, reliable and distributed computing. With the world producing data in the zettabyte range, there is a growing need for cheap, scalable, reliable and fast computing to process and make sense ...

5 Questions Enterprises Should Ask When Selecting a NoSQL Database

By Barry Perkins. With the need for more flexibility when it comes to defining and handling large amounts of data, NoSQL has emerged as a feasible alternative to relational databases. NoSQL databases enable better application development productivity, greater ability to scale dynamically ...

Data Lake Showdown: Object Store or HDFS?

The explosion of data is causing people to rethink their long-term storage strategies. Most agree that distributed systems, one way or another, will be involved. But when it comes down to picking the distributed system, be it a file-based system like ...

MongoDB, Cassandra, and HBase: the three NoSQL databases to watch

Hadoop gets much of the big data credit, but the reality is that NoSQL databases are far more broadly deployed -- and far more broadly developed. In fact, while shopping for a Hadoop vendor is relatively straightforward, picking a NoSQL ...

15 important case studies on Big Data

Are you looking for some good case studies that highlight how large companies leverage Big Data to drive productivity? Check out these 15 important case studies on Big Data. 23andMe: 23andMe is a privately held personal genomics and biotechnology company. The ...

Big Data TechCon Welcomes LinkedIn to Technical Program

Announces industry keynote in addition to more than 55 how-to big data classes and tutorials. MELVILLE, N.Y., August 25, 2014 - BZ Media LLC today announced its opening keynote at Big Data TechCon, the how-to technical conference for IT professionals implementing ...

Can Super-Fast Apache Spark Light Up Hadoop?

it the Hadoop Swiss Army knife of cluster computing frameworks. The Apache Software Foundation just rolled out Apache Spark v1.0, which it's calling a "super-fast, open-source, large-scale data processing and advanced analytics engine." That's a ...