What Can GPFS on Hadoop Do For You?
The Hadoop Distributed File System (HDFS) is considered a core component of Hadoop, but it’s not an essential one. Lately, IBM has been talking up the benefits of hooking Hadoop up to the General Parallel File System (GPFS). IBM has ...
Using Oozie 4.4.0 with Hadoop 2.2
The current version of Oozie (4.0.0) doesn’t build correctly when you try and target Hadoop 2.2. The Oozie team have a fix going into release 4.0.1 (see OOZIE-1551), but until then you can hack the Maven files to get it ...
Pivotal Brings In-Memory Analysis To Hadoop
Pivotal, the EMC spin-off company pursuing modern application development in the context of cloud computing and big-data analysis, on Monday released Pivotal HD 2.0, an update of its Hadoop distribution incorporating an in-memory database and a battery of new analysis ...
Hadoop Alternative Hydra Re-Spawns as Open Source
It may not have the name recognition or momentum of Hadoop. But Hydra, the distributed task processing system first developed six years ago by the social bookmarking service maker AddThis, is now available under an open source Apache license, just ...
Hadoop and NoSQL Now Data Warehouse-Worthy-Gartner
Not long ago, the rules for what constituted a data warehouse were fairly well defined. The schema was fixed, you could say, and was based primarily on relational database technology designed to process structured data. My, how times have changed. ...
Apache Tez 0.3 Released
The Apache Tez community has voted to release 0.3 of the software.
Apacheâ„¢ Tez is a replacement of MapReduce that provides a powerful framework for executing a complex topology of tasks. Tez 0.3.0 is an important release towards making the software ...
Apache Spark-3 Real-World Use Cases
By Alex Woodie
The Hadoop processing engine Spark has risen to become one of the hottest big data technologies in a short amount of time. And while Spark has been a Top-Level Project at the Apache Software Foundation for barely a ...
Avoiding Split Brainedness in HA Hadoop Clusters
The US Patent Office recently granted Zettaset a patent for the underlying technology in its Hadoop high availability that prevents a "split-brain" situation where multiple master nodes think they're in control of the Hadoop cluster. It's a feather in the ...
Big Workflow-The Future of Big Data Computing
How can organizations embrace — instead of brace for — the rapidly intensifying collision of public and private clouds, HPC environments and Big Data? The current go-to solution for many organizations is to run these technology assets in siloed, specialized ...
Hadoop YARN adds more application threads for big data users
Even Hadoop's most enthusiastic proponents might admit that its marriage to MapReduce has limited what the open source technology can do. But with the advent of Hadoop 2 and its key component, the Hadoop YARN resource manager, the distributed processing ...






