Hadoop YARN adds more application threads for big data users
Even Hadoop’s most enthusiastic proponents might admit that its marriage to MapReduce has limited what the open source technology can do. But with the advent of Hadoop 2 and its key component, the Hadoop YARN resource manager, the distributed processing framework has become a kind of launch pad for new applications incorporating a variety of related tools.
For example, Hadoop 2 is making real-time processing and analysis of streaming data possible for Synapse Wireless Inc., a Huntsville, Ala., maker of intelligent control and monitoring systems connected by a wireless mesh network. In present parlance, the company creates a “network of things” that uses the Internet to collect operational data from sensors and devices at customer sites. The uses it supports include monitoring healthcare operations, large-scale commercial and residential lighting systems, and solar panel fields.
Now, Synapse Wireless Inc. is looking to combine Hadoop 2 and Storm, an open source streaming data engine, to provide real-time business intelligence and analytics capabilities to its customers.
“Our systems can capture high-velocity data streams coming off all these remote devices,” said Bryan Stone, a cloud architect and lead platform developer at the company. With the pairing of Hadoop 2 and Storm, he added, “we don’t just capture the data. We’re also able to act on it. We can present it in a meaningful way so it can affect our customers’ business decisions.”
Using data integration tools from software vendor Pentaho Corp., Stone and his colleagues at Synapse Wireless have created a pilot healthcare monitoring application that puts Storm on top of YARN in a Hadoop 2 cluster. The application is intended to ensure good hand-washing hygiene in hospitals, as an example of what can happen when big data meets cloud computing and the Internet of Things.
As part of the application, tags on the badges that nurses wear can track their movements around a hospital. Other tags collect data on the use of hand-cleanser dispensers. When a nurse enters a patient’s room, a timer starts on the use of the dispenser there. If the application doesn’t register that the dispenser has been used, Stone said, “We can send an alert down to the badge that the nurse is wearing as a reminder that she needs to wash her hands.”
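The alerting rule Stone describes can be sketched in a few lines of Python. This is only an illustration of the logic in the paragraph above, not Synapse Wireless's implementation and not Storm or YARN code: the class name, method names, badge and room identifiers, and the 30-second grace period are all hypothetical, since the article gives no such details.

```python
from dataclasses import dataclass, field


@dataclass
class HandHygieneMonitor:
    # Hypothetical grace period; the article does not state a real threshold.
    grace_seconds: float = 30.0
    # Maps (badge_id, room_id) -> time the nurse entered the room.
    pending: dict = field(default_factory=dict)

    def room_entered(self, badge_id, room_id, t):
        """Start the timer when a badge tag is seen entering a room."""
        self.pending[(badge_id, room_id)] = t

    def dispenser_used(self, room_id, t):
        """Clear timers in that room that started at or before time t.

        Simplifying assumption: a dispenser event does not identify who
        used it, so every pending badge in the room is cleared.
        """
        cleared = [k for k, start in self.pending.items()
                   if k[1] == room_id and start <= t]
        for key in cleared:
            del self.pending[key]

    def due_alerts(self, now):
        """Return badge IDs whose timer has expired and stop tracking them."""
        expired = [k for k, start in self.pending.items()
                   if now - start >= self.grace_seconds]
        for key in expired:
            del self.pending[key]
        return sorted(badge for badge, _ in expired)


# Example: two nurses enter rooms; only one dispenser is used.
monitor = HandHygieneMonitor(grace_seconds=30)
monitor.room_entered("badge-1", "room-101", t=0)
monitor.room_entered("badge-2", "room-102", t=5)
monitor.dispenser_used("room-101", t=10)
alerts = monitor.due_alerts(now=40)  # badges that should get a reminder
```

In a streaming deployment, the three methods would presumably be driven by incoming sensor events rather than called by hand; how that wiring is done at Synapse Wireless is not described in the article.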
Hadoop YARN gives batch jobs some company
While the original MapReduce-dependent version of Hadoop allowed Synapse Wireless to gather and analyze hand-washing data, the company couldn’t act upon it immediately. Stone still sees value in MapReduce-based batch processing and analytics. But YARN “makes Hadoop more of [a platform] that you can build applications on top of,” he said. “You can still use MapReduce in batch ways. But now you can roll out other applications, too.”