Microsoftโs new database provides three cutting-edge approaches designed for applications that span geographic regions
Cloud computing isnโt like working on-premises. Instead of limiting code to one or maybe two datacenters, weโre designing systems that span not just continents but the entire world.
And thatโs where we start to get issues. Even using fiber connections, the latency of crossing the Atlantic Ocean is around 60ms, though in practice delays are around 75ms. The Pacific is wider, so latency through trans-Pacific fiber is around 100ms.
Delays add up, and they make it hard to ensure that distributed databases are in sync. That makes it harder still to be sure that a query in the U.K. will return the same result as one in the U.S. Yes, most replication strategies mean that eventually the two will have the same content, but thereโs a big question over just when that will happen. If the connections are busy, or there a lot of database writes, data can easily get delayed.
Microsoftโs recently launched Cosmos DB aims to simplify distributed application development, with more ways to reduce latency and five consistency models. Youโll find much of it familiar if you used the Document DB service, because Cosmos DB takes its capabilities and adds more data models and more query options. Cosmos DB also adds API compatibility with common on-premises and in-cloud NoSQL services, with more on the way.
APIs define how Cosmos DB handles data and how it exposes its contents. Although you choose a default API during set up, you query your data using any supported API. So, you can start with the familiar MongoDB APIs, then switch to Gremlin for graph queries on your data.
How cloud services typically ensure data consistency
Most distributed cloud services take one of two approaches to ensuring data consistency:
One approach, strong consistency, doesnโt allow reads until all associated writes are complete. Any read always returns the latest version of an item. If you use strong consistency for your data, your application is only as fast as the latency among all the regions where you store data. Thatโs why Cosmos DB limits you to using only a single Azure region. Itโs a belt-and-braces way of working with data, and it works well where writes matterโand where your app doesnโt need instant access to data. But that approach makes strong consistency impractical if thereโs any requirement for near-real-time data.
The other approach, eventual consistency, is a lazier approach to working with data. Itโs more focused on applications that need to read data as soon as itโs written. Data is read at any time, but thereโs a risk that it could be changed by another write. That leaves you with a level of indeterminacy in your data: You know at some point itโs going to be accurate, just not when that will be. Data may arrive in any order, so donโt use eventual consistency for series data. Thereโs even the chance that if you read the same item twice, the second time could return older data than the first.
How Cosmos DB handles data consistency in the real world
But only a small percentage of Cosmos DB users will use either of these approaches. Instead, most will take advantage of three alternative consistency models, based on the work of Turing Award winner Leslie Lamport. That foundation let Microsoft create a database to handle more realistic situations, and deliver distributed applications without the penalties of traditional consistency models.
The first alternative consistency model, bounded staleness, gives you a point at which reads and writes are in sync. Before it, thereโs no guarantee, after it, youโre always accessing the latest version (at least until the next time that item updates). You define the boundary as either a number of versions or a time interval. Outside the boundary, everything is consistent, inside it, thereโs no guarantee of a read returning the latest data. The result is a store that has an element of strong consistency, while still giving you low latency and the option of global distribution and high reliability. You use this model if you want to be sure that all reads are consistent, wherever they are, and that all writes are fast. You also get data thatโs correct if you read it in the region where itโs written.
The second alternative consistency model, session consistency, works well when you drive reads and writes from a client app. The client gets to read its own writes, while data replicates across the rest of the network. This way, you have low latency access to the data you need, along with knowing that youโll fail over in the event of any downtimeโand that your application will run in any Azure region.
Microsoft has added a third alternative consistency model, consistent prefix, in Cosmos DB. Consistent prefix adds predictability to the speed of eventual consistency. You might not see the latest write when you read the data, but your reads will never be out of order. Thatโs a useful feature, because itโs both fast and predictable. Write A, then B, and then C, and your client will see A, or A and B, but never just A and C. Eventually all the Cosmos DB regions will converge on A, B, and C, giving you speed and reliability.
Cosmos DB is a very different beast from much of its competition. Many NoSQL services offer some limited form of distributed access, but theyโre aimed only at offering redundancy and disaster recovery. Others, like Googleโs Spanner, offer some similar features, but only across datacenters in a single region. That might be fine if youโre working with a U.S.- or E.U.-only audience, but more and more cloud services have a global reach. Spannerโs low latency with strong consistency is a nice option to have, but itโs less valuable when cross-regional data replication becomes a major bottleneck.
The type of consistency you choose for Cosmos DB depends on your application. Are you focusing on writing data or reading it? How is the data used? Each consistency model has its pros and its cons, and you need to consider them carefully before making your choice. Session consistency is a good place to start for most app-centric data, but itโs worth experimenting with other options, especially if you donโt need instant access to data from all over the world.


