In-memory NoSQL datastores such as open source Redis and Memcached are becoming the de facto standard for every web or mobile application that cares about its users' experience. Still, large enterprises have struggled to adopt these databases in recent years due to challenges with performance, scalability and availability.

Fortunately, the communities around modern application languages (Ruby, Node.js, Python, etc.) and platforms (Rails, Sinatra, Django, etc.) have already built tools and libraries that take advantage of the performance and the variety of command types in in-memory datastores (especially Redis) to implement a set of very popular use cases.

These use cases, implemented as open source software projects, include job management, forums, real-time analytics, Twitter clones, geo search and advanced caching.

However, for each of these applications (and for many use cases we may not have even imagined yet), the availability, scalability and performance of the datastore are critical to success.

This article will outline how to make an in-memory NoSQL datastore enterprise-ready, with tips and recommendations on how to overcome the top seven challenges associated with managing these databases in the cloud.

1. Availability

No matter what you do, your dataset should always be available to your application. This is extremely important for in-memory datastores because, without the right mechanisms in place, you will lose part or all of your dataset in any of the following events:

  1. node failure (frequently happens in the cloud);
  2. process restart (you may need this from time to time); or
  3. scaling event (let’s hope you will need this).

There are two main mechanisms that must be implemented in order to support cases #1 and #2 above.

Replication: You should make sure you have at least one copy of your dataset hosted on another cloud instance, preferably in a different data center if you want to be protected against data center failures like those that hit Amazon Web Services (AWS) at least four times during 2012. Is that easy? Unfortunately, it’s not. Here is just one scenario that makes replication challenging:

Once your application’s write throughput increases, you may find that your application servers are writing faster than your replica can receive the changes, especially if there is network congestion between your primary and replica nodes. Once this starts happening, if your dataset is large enough, there is a good chance that your replica will never be synced again.
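To make the scenario above concrete, here is a minimal back-of-the-envelope sketch in Python. It is a simplified model, not how any particular datastore implements replication: it assumes the primary holds unreplicated writes in a fixed-size buffer, and that a replica that overflows the buffer must copy the entire dataset again while new writes keep arriving. The function names and all of the numbers are hypothetical and chosen only for illustration.

```python
from typing import Optional


def seconds_until_overflow(write_mb_s: float,
                           replicate_mb_s: float,
                           buffer_mb: float) -> Optional[float]:
    """Time until the primary's pending-write buffer fills up.

    Returns None when the replica keeps pace, because the buffer
    never grows in that case.
    """
    surplus = write_mb_s - replicate_mb_s  # MB of backlog added per second
    if surplus <= 0:
        return None
    return buffer_mb / surplus


def full_resync_can_finish(dataset_mb: float,
                           transfer_mb_s: float,
                           write_mb_s: float,
                           buffer_mb: float) -> bool:
    """Whether a full copy of the dataset can complete.

    Assumption: every write that arrives during the copy must fit in
    the buffer, otherwise the replica falls behind again and the
    resync has to start over.
    """
    transfer_seconds = dataset_mb / transfer_mb_s
    writes_during_transfer_mb = write_mb_s * transfer_seconds
    return writes_during_transfer_mb <= buffer_mb


# Hypothetical workload: 50 MB/s of writes, a congested link that
# replicates only 40 MB/s, and a 256 MB buffer on the primary.
overflow_after = seconds_until_overflow(50, 40, 256)

# Hypothetical recovery: a 20 GB dataset copied at 100 MB/s while the
# same 50 MB/s of writes continues.
recoverable = full_resync_can_finish(20_000, 100, 50, 256)

print(overflow_after, recoverable)
```

With these made-up numbers the buffer fills in under half a minute, and the full copy takes 200 seconds, during which far more data is written than the buffer can hold. Under the model's assumptions, that is the loop in which a replica never catches up: the larger the dataset, the longer each copy takes, and the more writes pile up in the meantime.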