Avoiding Split Brainedness in HA Hadoop Clusters
The US Patent Office recently granted Zettaset a patent for the technology underlying its Hadoop high availability feature, which prevents a “split-brain” situation in which multiple master nodes think they’re in control of the Hadoop cluster. It’s a feather in the cap for Zettaset, which has also innovated in the area of Hadoop security.
The NameNode’s status as a single point of failure, and the overall lack of native high availability (HA) features in Hadoop, should come as no surprise. It’s been a well-documented issue for years, and one that the Apache Hadoop community has worked to fix by bolstering Hadoop’s availability and resiliency.
Much progress was made with the launch of Hadoop version 2 last year, which brought automated mechanisms for handling a NameNode failover and maintaining continuous access to HDFS services. Whereas first-gen Hadoop deployments were vulnerable to losing data, Hadoop version 2 promises full protection for the entire Hadoop stack, including MapReduce, Hive, Pig, HBase, and Oozie, according to Hortonworks.
While the Hadoop community has made definite progress, there is still room in the market for vendors like Zettaset to innovate. Zettaset’s flagship offering, called Orchestrator, delivers a management layer over Hadoop with the aim of bolstering not only high availability, but security and monitoring too. In the HA realm, Zettaset aims to provide automated protection against downtime without requiring lots of manual intervention.
The Mountain View, California company says Orchestrator implements an HA failover mechanism that protects not only the NameNode, but the JobTracker, Oozie, Kerberos, Hive, and the metadata store layers as well, which it argues are not well protected by plain vanilla Hadoop distributions. Upon detecting a failure, the software automatically fails over to the backup, which is kept up to date via data synchronization. More than one backup can be designated in a “1-to-n” cascading failover setup for each protected service.
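To make the “1-to-n” idea concrete, here is a minimal Python sketch of cascading failover in general. It is not Zettaset’s implementation: the function name `pick_active`, the node names, and the `is_healthy` callback are all hypothetical, and a real system would also have to synchronize state and fence the failed node.

```python
from typing import Callable, Optional, Sequence


def pick_active(nodes: Sequence[str],
                is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first healthy node in priority order, or None.

    `nodes` lists the primary first, then its backups in cascade order.
    All names here are illustrative assumptions, not a real Hadoop API.
    """
    for node in nodes:
        if is_healthy(node):
            return node
    return None  # every node in the cascade is down


# One primary with two designated backups (hypothetical names).
cascade = ["namenode-primary", "namenode-backup-1", "namenode-backup-2"]
down = {"namenode-primary", "namenode-backup-1"}
active = pick_active(cascade, lambda n: n not in down)
```

With the primary and the first backup marked as down, the service cascades to the second backup.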
Creating fault-tolerance in computer clusters is nothing new. But in U.S. patent number 8,595,546, Zettaset explains how it went about enabling a failover mechanism for Hadoop that avoids “split-brain” syndrome by leveraging what it calls “quorum-based majority voting strategies with time-limited leases.”
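For readers unfamiliar with the general technique, the Python sketch below shows how majority voting combined with time-limited leases can keep two nodes from both believing they are master. It illustrates the textbook idea only and is not taken from the patent. The `Voter` class, the `try_become_master` function, and the explicit `now` timestamps are assumptions made for the example, and it presumes the participants’ clocks agree closely enough.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Voter:
    """One quorum member; it backs at most one unexpired lease at a time."""
    holder: Optional[str] = None
    expires_at: float = 0.0

    def request_lease(self, candidate: str, now: float, duration: float) -> bool:
        # Grant if the previous lease has expired, or the holder is renewing.
        if now >= self.expires_at or self.holder == candidate:
            self.holder = candidate
            self.expires_at = now + duration
            return True
        return False


def try_become_master(candidate: str, reachable: Sequence[Voter],
                      quorum_size: int, now: float, duration: float) -> bool:
    """True only if a strict majority of the FULL quorum grants the lease.

    `reachable` may be a subset of the quorum during a network partition.
    A master that fails to renew must stop acting as master once its
    lease runs out.
    """
    grants = sum(v.request_lease(candidate, now, duration) for v in reachable)
    return grants > quorum_size // 2


voters = [Voter() for _ in range(3)]
a_elected = try_become_master("A", voters, 3, now=0.0, duration=10.0)
b_blocked = not try_become_master("B", voters, 3, now=5.0, duration=10.0)
# Partition: after A's lease expires, B reaches two voters, A only one.
b_elected = try_become_master("B", voters[1:], 3, now=11.0, duration=10.0)
a_renewed = try_become_master("A", voters[:1], 3, now=11.0, duration=10.0)
```

Because each voter backs only one unexpired lease and a master needs more than half of the full quorum, two candidates cannot both hold a majority at once. In the partition case, node A sees only one of three voters, so it cannot renew and has to step down when its lease expires.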