Yahoo SAMOA, Open Source Platform for Mining Big Data Streams
Yahoo SAMOA (Scalable Advanced Massive Online Analysis) is a framework for mining big data streams and applying distributed machine learning algorithms. You can think of SAMOA as Mahout for streaming.
Essential big data tools for government
Big data. It’s massive. It comes in all types of formats. It’s dynamic, changing.
Government managers are looking to derive value from the mountains of data collected by their agencies to tackle a host of issues, including cybersecurity, fraud detection, crime ...
Managing Multiple Resources in Hadoop 2 with YARN
An overview of Cloudera’s contributions to YARN that help support management of multiple resources, from multi-resource scheduling to node-level enforcement
As Apache Hadoop becomes ubiquitous, it is becoming more common for users to run diverse sets of workloads on Hadoop, and ...
The 4 Key Pillars of Hadoop Performance and Scalability
The era of Big Data has arrived. Organizations that once dismissed it as a buzzword are now recognizing the benefits of capturing and analyzing mountains of information to gain actionable insights that foster innovation and competitive advantage. To that end, open-source Hadoop ...
Should DBAs learn Hadoop?
I’m a DBA with a working knowledge of Oracle, SQL Server, and MySQL. I have been reading more and more about big data. Should I learn Hadoop?
First, thank you for emailing your question. The mere nature of your question tells ...
Hadoop 2 Pushes Big Data Further Into The Mainstream
The Big Data world is on a much firmer footing, as the Apache Software Foundation’s Hadoop version 2 makes the platform more usable for business, and vendors are exploring possibilities to deliver easier-to-use software, according to the founder of the Hadoop movement.
Version ...
3 Ways to Move SQL into Hadoop Faster
Despite the increasing interest in unstructured data, much of the world's information still lives in some form of relational database. Firing up your first Hadoop cluster often means moving data from existing SQL tables, maybe hosted in SQL Server, MySQL ...
Government Networks Unprepared for Cloud, Big Data Transitions
As federal agencies gear up for a series of major IT transitions, a majority of government network managers say that their government software systems lack the capacity that will be required to meet the additional load that cloud computing, big ...
Google taps big data for universal translator
Google Translate is currently best known for being a quick and dirty way to render Web pages or short text snippets in another language. But according to Der Spiegel, the next step for the core technology behind that service is ...
7 Tips for Improving MapReduce Performance
One service that Cloudera provides for our customers is help with tuning and optimizing MapReduce jobs. Since MapReduce and HDFS are complex distributed systems that run arbitrary user code, there’s no hard and fast set of rules to achieve optimal ...






