Hadoop RDBMS Announced
Splice Machine says its database provides the scale-out technology of Hadoop, the distributed real-time computing power of the key-value store HBase, and the full feature set of an RDBMS, including ANSI SQL and ACID transactions.
The Splice Machine database is built on two technology stacks: Apache Derby, a Java-based, ANSI SQL database, and HBase/Hadoop. The company has replaced the storage engine in Apache Derby with HBase, retained the Apache Derby parser, and redesigned the planner, optimizer, and executor to make use of the distributed HBase computation engine. HBase co-processors are used to embed Splice Machine in each distributed HBase region, or data shard.
This means computational tasks can be pushed down to the distributed HBase data shards, thereby achieving “massive parallelization”, according to the Splice Machine product description.
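The scatter-gather idea behind pushdown can be sketched in a few lines. This is a toy illustration of the general technique, not Splice Machine's code: the shards stand in for HBase regions, `shard_scan` stands in for work a co-processor would run inside each region, and only small partial results travel back to the coordinator.

```python
def shard_scan(shard, predicate):
    """Runs inside a shard: filter rows locally and return a partial count."""
    return sum(1 for row in shard if predicate(row))

def pushdown_count(shards, predicate):
    """Coordinator: scatter the task to every shard, then merge partial counts."""
    partials = [shard_scan(shard, predicate) for shard in shards]
    return sum(partials)

# Hypothetical table of (order_id, amount) rows, hash-split into three shards.
shards = [
    [(1, 120), (2, 35)],
    [(3, 80), (4, 310)],
    [(5, 45)],
]
print(pushdown_count(shards, lambda row: row[1] > 50))  # → 3
```

In the real system each `shard_scan` would execute in parallel on the node holding that region; here the point is only that the filter moves to the data rather than the data moving to the query engine.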
Another advantage of the product is that because Splice Machine does not modify HBase, it can be used with any standard Hadoop distribution that has HBase. Supported Hadoop distributions include Cloudera, MapR and Hortonworks.
On the relational side, Apache Derby uses the IBM DB2 SQL dialect and supports JDBC and SQL for programming. Splice Machine itself is SQL-99 compliant. For its transactional support, it uses Multi-Version Concurrency Control (MVCC) with snapshot isolation to provide high transactional throughput without record locking.
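MVCC with snapshot isolation is worth a small sketch. The following is a minimal model of the general technique, not Splice Machine's implementation: every write appends a new timestamped version, and a reader sees only versions committed before its own start timestamp, so readers never block writers and no record locks are needed.

```python
import itertools

class MVCCStore:
    """Toy multi-version store with snapshot-isolated reads."""

    def __init__(self):
        self._clock = itertools.count(1)   # monotonically increasing timestamps
        self._versions = {}                # key -> list of (commit_ts, value)

    def begin(self):
        """Start a transaction: its snapshot is the current timestamp."""
        return next(self._clock)

    def write(self, key, value):
        """Commit a new version at a fresh timestamp; no locks are taken."""
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))

    def read(self, key, snapshot_ts):
        """Return the newest version committed before the snapshot started."""
        visible = [v for ts, v in self._versions.get(key, []) if ts < snapshot_ts]
        return visible[-1] if visible else None

store = MVCCStore()
store.write("balance", 100)
txn = store.begin()                          # snapshot taken here
store.write("balance", 250)                  # a concurrent writer commits later
print(store.read("balance", txn))            # → 100: the snapshot view
print(store.read("balance", store.begin()))  # → 250: a newer snapshot sees it
```

The first read returns the old value even though a newer version exists, which is exactly the repeatable, lock-free view snapshot isolation promises.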
Writing on the Splice Machine blog, the company’s founders say that the standalone version of Splice Machine can be used on MacOS, Windows or Linux (Ubuntu or CentOS), and provides:
“an excellent way to experiment with Splice Machine with a reasonably small amount of data”.
The clustered version can be used on a cluster of Linux (Ubuntu or CentOS) machines and allows Splice Machine to access large amounts of data spread across the cluster. There’s a trial download of the standalone version that allows you to perform functional testing of Splice Machine.