The Hadoop Ecosystem: HDFS, YARN, Hive, Pig, HBase and growing

Hadoop is the leading open-source software framework developed for scalable, reliable and distributed computing. With the world producing data in the zettabyte range, there is a growing need for cheap, scalable, reliable and fast computing to process and make sense ...

How to become a Data Scientist for Free

By Nir Goldstein, ReSkill. Statistical analysis and data mining were the top skills that got people hired in 2014, based on LinkedIn analysis of 330 million LinkedIn member profiles. We live in an increasingly data-driven world, and businesses are aggressively ...

Top 5 Trends in Big Data Analytics

While most people understand that actionable insights empower companies and help drive sales, loyalty and exceptional customer experiences, the thought of making sense of enormous quantities of information and unifying it is daunting. But ...

Data Lake Showdown: Object Store or HDFS?

The explosion of data is causing people to rethink their long-term storage strategies. Most agree that distributed systems, one way or another, will be involved. But when it comes down to picking the distributed system–be it a file-based system like ...

4 Considerations When Choosing a Hadoop Distribution

Choosing the right Hadoop distribution can be a tricky process. Many businesses looking to adopt Hadoop in their data infrastructure have a hard time figuring out what really differentiates one distribution from another. With so many options available, it’s easy ...

Apache Drill 1.0 is Now Generally Available

Today, we are extremely excited and proud to announce the general availability (GA) of Apache Drill 1.0, as part of the MapR Distribution. Congratulations to the Drill community on this significant milestone and achievement! Incubated in September 2012 as an Apache ...