NoSQL, NewSQL, or RDBMS: How to Choose
Today’s databases are expected not only to be flexible enough to handle a variety of data formats, but also to deliver extreme performance and to scale to very large data volumes. Database architects have responded with NoSQL and NewSQL alternatives to relational database management systems (RDBMS), but how do you know when to choose which option?
To answer this question, start with a fundamental understanding of all three technologies. A traditional RDBMS can typically deliver performance on the order of thousands of transactions per second. But the new face of online transaction processing (OLTP), in scenarios such as real-time advertising, fraud detection, multiplayer games, and risk analysis, involves close to a million transactions per second, a pace that a traditional RDBMS typically can’t handle.
RDBMS have always been distinguished by the ACID principle set (atomicity, consistency, isolation, and durability), which ensures that data integrity is preserved even when transactions fail or run concurrently. SQL became the de facto standard of data processing because it combines data definition, data manipulation, and data querying under one umbrella.
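To make the atomicity part of ACID concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for a relational engine. The `accounts` table, the `transfer` helper, and the sample balances are assumptions made up for illustration; they do not come from any particular product.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst`; both updates commit, or neither does."""
    try:
        # `with conn` commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the earlier credit was rolled back.
        return False

# Hypothetical schema: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )

ok = transfer(conn, "alice", "bob", 30)       # fits within alice's balance
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

In the second call the credit to `bob` is applied first and then undone when the debit violates the constraint, so `balances` reflects only the first transfer. This all-or-nothing behavior is what many NoSQL systems relax in exchange for flexibility and scale.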
NoSQL database management systems store data in a variety of formats, chief among them document stores, graph stores, and key-value stores. Most NoSQL products jettison ACID guarantees to achieve data storage flexibility. They remove hard constraints, such as tabular row storage and strict data definitions, and they provision for scale with distributed architectures that support high throughput.
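The difference between strict data definitions and a schemaless store can be sketched in a few lines. The `documents` dictionary below is only a hypothetical in-memory stand-in for a document store, and the `users` table and field names are assumptions chosen for illustration; real NoSQL products expose their own APIs.

```python
import json
import sqlite3

# Hypothetical stand-in for a document store: each key maps to a JSON document.
documents = {}

def put(key, doc):
    documents[key] = json.dumps(doc)

def get(key):
    return json.loads(documents[key])

# Two records with different shapes coexist without any schema change.
put("user:1", {"name": "Ada", "email": "ada@example.com"})
put("user:2", {"name": "Lin", "devices": [{"os": "android"}], "score": 9.5})

# A relational table fixes its columns up front.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT)"
)
try:
    conn.execute(
        "INSERT INTO users (id, name, score) VALUES (?, ?, ?)", (2, "Lin", 9.5)
    )
    rejected = False
except sqlite3.OperationalError:
    # The table has no `score` column, so the insert fails until the schema is altered.
    rejected = True
```

The relational insert fails until someone runs an `ALTER TABLE`, while the document-style store accepts the new shape immediately. The trade-off is that the store does nothing to validate or relate the fields it holds.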
The newest entrants in the database arena, NewSQL systems, retain both SQL and ACID. They overcome the performance overhead that RDBMS incur from features such as latching of shared data structures, buffer pooling, record-level locking, and write-ahead logging, primarily by embracing distributed computing architectures.
How do you choose?
To address the choice of database types, start with the following questions:
- To what extent do you rely on data in terms of storage, processing, and analysis? The degree of dependency in each area can strongly shape the choice of a database. Application development, for example, is not heavily data-centric, but data analysis is. Certain businesses revolve around data, while others use data to supplement their core focus areas.
- How important are the scale, flexibility, and performance aspects of a DBMS?
- What is your level of investment in incumbent technologies? If you’re already invested in a DBMS, are you prepared to incur the cost of migrating to a newer technology (and possibly face feature incompatibilities or administrative and programming skill gaps among your staff)?
Table 1 below sheds light on the comparative capabilities and strengths of RDBMS, NoSQL, and NewSQL databases.
|Capability|RDBMS|NoSQL|NewSQL|
|---|---|---|---|
|ACID compliance (data and transaction integrity)|Yes|No|Yes|
|Data analysis (aggregate, transform, etc.)|Yes|No|Yes|
|Schema rigidity (strict mapping of model)|Yes|No|Maybe|
|Data format flexibility|No|Yes|Maybe|
|Scale up (vertical)/scale out (horizontal)|Yes|Yes|Yes|
|Performance with growing data|Fast|Fast|Very fast|
|Popularity/community support|Huge|Growing|Slowly growing|
The nature of your data ultimately dictates the choice of database technology. For instance, transactional data that requires strict data integrity and consistency favors RDBMS and NewSQL over NoSQL.
Volatile data, on the other hand, is characterized by changing object models and data structure formats that demand flexibility and make NoSQL the top choice followed by NewSQL to a lesser extent. RDBMS, with their rigidity of schema design, can prove very costly when dealing with such data.
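The guidance above can be turned into a rough ranking helper. The sketch below copies two rows of Table 1 (ACID compliance and data format flexibility) into a dictionary and scores each database type against the stated needs. The numeric weights and the `rank_database_types` function are assumptions made for illustration, not an established selection method.

```python
# Two rows of Table 1, keyed by database type.
TABLE_1 = {
    "RDBMS": {"acid": "Yes", "format_flexibility": "No"},
    "NoSQL": {"acid": "No", "format_flexibility": "Yes"},
    "NewSQL": {"acid": "Yes", "format_flexibility": "Maybe"},
}

# Assumed weights: "Maybe" counts as half of "Yes".
SCORE = {"Yes": 2, "Maybe": 1, "No": 0}

def rank_database_types(needs_acid, volatile_schema):
    """Return the database types ordered from best to worst fit."""
    def fit(name):
        row = TABLE_1[name]
        total = 0
        if needs_acid:
            total += SCORE[row["acid"]]
        if volatile_schema:
            total += SCORE[row["format_flexibility"]]
        return total

    # sorted() is stable, so ties keep the order in which TABLE_1 lists them.
    return sorted(TABLE_1, key=fit, reverse=True)

transactional = rank_database_types(needs_acid=True, volatile_schema=False)
volatile = rank_database_types(needs_acid=False, volatile_schema=True)
```

With these weights, `transactional` places RDBMS and NewSQL ahead of NoSQL, and `volatile` places NoSQL first, then NewSQL, then RDBMS, which matches the recommendations in the text. A real evaluation would also weigh scale, performance, migration cost, and community support.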