Cascading 3.0 Future-Proofs Data-Centric Application Development on Hadoop

Concurrent, Inc., the company behind Cascading, an open source application development framework for building data applications on Hadoop, has announced Cascading 3.0, which CEO Gary Nakamura says will give enterprises the flexibility to build their data-oriented applications on Hadoop once, and ...

10 Hadoop Hardware Leaders

Hadoop software is designed to orchestrate massively parallel processing on relatively low-cost servers that pack plenty of storage close to the processing power. All the power, reliability, redundancy, and fault tolerance are built into the software, which distributes the data ...

Apache Mahout is moving on from MapReduce

Apache Mahout, a machine learning library for Hadoop since 2009, is joining the exodus away from MapReduce. The project’s community has decided to rework Mahout to support the increasingly popular Apache Spark in-memory data-processing framework, as well as the H2O engine for ...

Apache Falcon: Data Governance for Hadoop

Apache Falcon is a data governance engine that defines, schedules, and monitors data management policies. Falcon allows Hadoop administrators to centrally define their data pipelines, and then Falcon uses those definitions to auto-generate workflows in Apache Oozie. InMobi is one of ...

Avoiding Split Brainedness in HA Hadoop Clusters

The US Patent Office recently granted Zettaset a patent for the underlying technology in its Hadoop high availability that prevents a "split-brain" situation where multiple master nodes think they're in control of the Hadoop cluster. It's a feather in the ...