During last week’s PASS Summit day 2 keynote, Rimma Nehme discussed some off the architecture behind Cosmos DB, the globally-distributed flexible database system rolled out earlier this year. Dr. Nehme’s presentations are almost always buzz-worthy, with a great mix of technical depth and big-picture vision. Take 90 minutes to watch the recording; you won’t be disappointed.
Among the many notable points in this presentation was a journey through the concept of database consistency. Database consistency, also referred to as transactional consistency, is the attribute of database systems describing the coupling of different parts of a single transaction.
The simplest example of this is the data tasks that go into a banking transfer. In the database, money doesn’t “move” from one account to another; rather, the amount of money in one account is decreased by adding a credit line item, and the same amount is added to the other account by inserting a debit line item. Although these are two separate operations, they should be treated as one, with the whole operation succeeding or failing as one transaction. In a transactionally consistent system, the two parts of the transaction would not be visible unless both of them were complete (so you wouldn’t be able to query the accounts at the moment of transfer to see that money had been removed from one but not yet added to the other).
During Dr. Nehme’s keynote, she pointed out that most of us look at database consistency as a binary choice: either you have full transactional consistency or you don’t. However, data workloads aren’t always that simple. In Cosmos DB, there are actually five different consistency models to choose from.
Flexible Consistency
For those of us who have spent most of our careers working with ACID-compliant operations, having this kind of flexibility in database consistency requires a shift in thinking. There are many business cases where such transactional consistency is not critical, and the rise of NoSQL-type systems has demonstrated that, for some workloads, transactional consistency is not worth the latency it requires.
Some of us old-school data folks (this guy included) have in the past poked fun at other consistency models, using the term “eventually consistent” as a punch line rather than a real consideration. However, it’s no longer safe to assume that every database system must have absolute transactional consistency. As workloads become larger and more widely distributed, the use of less-restrictive consistency settings will be essential for both performance and availability.
This certainly doesn’t mean that transactionally consistent systems are going away. It just means that the classic RDBMS way of maintaining consistency is no longer the only way to do things.