Apache Spark-3 Real-World Use Cases
By Alex Woodie
The Hadoop processing engine Spark has risen to become one of the hottest big data technologies in a short amount of time. And while Spark has been a Top-Level Project at the Apache Software Foundation for barely a ...
Avoiding Split Brainedness in HA Hadoop Clusters
The US Patent Office recently granted Zettaset a patent for the underlying technology in its Hadoop high availability that prevents a "split-brain" situation where multiple master nodes think they're in control of the Hadoop cluster. It's a feather in the ...
Big Workflow-The Future of Big Data Computing
How can organizations embrace β instead of brace for β the rapidly intensifying collision of public and private clouds, HPC environments and Big Data? The current go-to solution for many organizations is to run these technology assets in siloed, specialized ...
Hadoop YARN adds more application threads for big data users
Even Hadoop's most enthusiastic proponents might admit that its marriage to MapReduce has limited what the open source technology can do. But with the advent of Hadoop 2 and its key component, the Hadoop YARN resource manager, the distributed processing ...
Top 12 funded Big Data Startup companies
Big data companies are attracting big investments from venture capitalists. But which startups are garnering the most funding? Venture capitalists made more big data investments than ever before in 2012, and a few more deals have already closed in 2013. ...
Apache Hadoop 2.3.0 was released
Hadoop-2.3.0 is the first release for the year 2014, and brings a number of enhancements to the core platform, in particular to HDFS. There are a lot of bug fixes and small changes in this one - you can read ...
How YARN Opens Doors to Easier Programming Tools for Hadoop 2.0 Users
The emergence of YARN for the Hadoop 2.0 platform has opened the door to new tools and applications that promise to allow more companies to reap the benefits of big data in ways never before possible with outcomes possibly never ...
Exploring The Hadoop Network Topology
Hadoop is designed to run on large clusters of commodity servers β in many cases spanning many physical racks of servers. A physical rack is in many cases a single point of failure (for example, having typically a single switch ...
Why NoSQL became MORE SQL
One of the key points I raised was about how many folks were just slapping on Big Data badges to the same old same old, another was that Map Reduce really doesn't work they way traditional IT estates behave which ...
Hadoop admin interview questions
Which operating system(s) are supported for production Hadoop deployment?
The main supported operating system is Linux. However, with some additional software Hadoop can be deployed on Windows.
What is the role of the namenode?
The namenode is the "brain" of the Hadoop cluster ...






