How-to Implement Role-based Security in Impala using Apache Sentry
Apache Sentry (incubating) is the Apache Hadoop ecosystem tool for role-based access control (RBAC). In this how-to, I will demonstrate how to implement Sentry for RBAC in Impala. I feel this introduction is best motivated by a use case.
Data warehouse ...
What Can GPFS on Hadoop Do For You?
The Hadoop Distributed File System (HDFS) is considered a core component of Hadoop, but it’s not an essential one. Lately, IBM has been talking up the benefits of hooking Hadoop up to the General Parallel File System (GPFS). IBM has ...
Using Oozie 4.0.0 with Hadoop 2.2
The current version of Oozie (4.0.0) doesn’t build correctly when you try to target Hadoop 2.2. The Oozie team have a fix going into release 4.0.1 (see OOZIE-1551), but until then you can hack the Maven files to get it ...
Pivotal Brings In-Memory Analysis To Hadoop
Pivotal, the EMC spin-off company pursuing modern application development in the context of cloud computing and big-data analysis, on Monday released Pivotal HD 2.0, an update of its Hadoop distribution incorporating an in-memory database and a battery of new analysis ...
Hadoop and NoSQL Now Data Warehouse-Worthy: Gartner
Not long ago, the rules for what constituted a data warehouse were fairly well defined. The schema was fixed, you could say, and was based primarily on relational database technology designed to process structured data. My, how times have changed. ...
Avoiding Split Brainedness in HA Hadoop Clusters
The US Patent Office recently granted Zettaset a patent for the underlying technology in its Hadoop high availability that prevents a "split-brain" situation where multiple master nodes think they're in control of the Hadoop cluster. It's a feather in the ...
Why Apache Spark is a Crossover Hit for Data Scientists
Spark is a compelling multi-purpose platform for use cases that span investigative, as well as operational, analytics.
Data science is a broad church. I am a data scientist — or so I’ve been told — but what I do is actually ...
Integrating Hadoop into Business Intelligence and Data Warehousing
Information from SAS and TDWI Research
The purpose of this report is to accelerate users’ understanding of the many new products and practices based on Hadoop technologies that have emerged in recent years. While Hadoop usage is a minority practice today, ...
Introduction to Impala
Impala is significant within the Hadoop ecosystem because of its:
Scalability
Flexibility
Efficiency
What’s Impala?
Impala is…
Interactive SQL: Impala is typically 5 to 65 times faster than Hive, cutting response times to seconds rather than minutes.
Nearly ANSI-92 standard and compatible with ...
SAP to buy KXEN to widen appeal of big data analytics
Enterprise software giant SAP says it's buying predictive-analytics firm KXEN to increase use of its own big-data analysis tools among ordinary business staff by adding more automation.
The planned acquisition for an undisclosed sum will allow SAP to absorb KXEN's "powerful ...






