Apache Parquet paves the way for better Hadoop data storage

Apache Parquet, which provides columnar storage in Hadoop, is now a top-level Apache Software Foundation (ASF)-sponsored project, paving the way for its more advanced use in the Hadoop ecosystem. Already adopted by Netflix and Twitter, Parquet began in 2013 as a ...

3 useful tools for big data log analysis

When looking around the data center, it's difficult to ignore the potential in all of the big data available from infrastructure systems. There are server and application logs, data from network and storage taps, and metadata from databases and applications. ...

Using Hunk with Hadoop and Elastic MapReduce

Hunk is a relatively new product from Splunk for exploring and visualizing Hadoop and other NoSQL data stores. New in this release is support for Amazon’s Elastic MapReduce. Hunk with Hadoop: Hadoop consists of two components, the first being a storage component called HDFS. HDFS can ...

15 important case studies on Big Data

Are you looking for some good case studies that highlight how large companies leverage Big Data to drive productivity? Check out these 15 important case studies on Big Data. 23andMe: 23andMe is a privately held personal genomics and biotechnology company. The ...

5 Big Data Apps with Effective Use Cases

Even when an organization is compelled to become more data-driven, many don’t know how to move from a use-your-gut mentality to a data-first one. The easiest way? Take shortcuts by refusing to reinvent the wheel and following the ...

VMware Updates Big Data Extensions with Hadoop 2 Support

VMware Inc. updated its Big Data Extensions (BDE) for its vSphere virtualization platform, including support for Hadoop 2. BDE's set of integrated management tools -- built into vSphere -- helps organizations deploy, run and manage Hadoop. With BDE, vSphere users can ...

Can Super-Fast Apache Spark Light Up Hadoop?

... it the Hadoop Swiss Army knife of cluster computing frameworks. The Apache Software Foundation just rolled out Apache Spark v1.0, which it's calling a "super-fast, open-source, large-scale data processing and advanced analytics engine." That's a ...