Serdar Yegulalp
Senior Writer

Google’s Cloud Spanner melds transactional consistency, NoSQL scale

news analysis
May 4, 2017 · 4 mins

The research behind the horizontally scalable, SQL-compatible database has spawned imitators, but Google's private network is the real secret sauce

Earlier this year, Google offered a peek at Cloud Spanner, an automanaged database service that melds features from both conventional relational systems and NoSQL technologies.

Today, Google announced Cloud Spanner will be available to the general public later this month. It will compete not only with rival cloud databases, but also with up-and-coming open source projects that address scale and reliability issues by using Google’s own ideas.

The best of both worlds

Google presents Cloud Spanner as a happy medium between two common database needs that often prove incompatible. A database can be highly scalable and distributed (the NoSQL approach), or it can be transactionally consistent (the conventional database approach). Cloud Spanner aims to be both.

As laid out in a 2012 research paper, one key to accomplishing this is a time synchronization mechanism for actions that need to be kept consistent between nodes, such as globally consistent read operations, which people expect from a transactional database.

This sync mechanism takes into account the potential differences between timestamps provided by different machines in the cluster and can “wait out” the differences if they are too large. But the system also tries to keep uncertainty to a minimum by drawing on multiple time sources to increase clock accuracy. As a result, it’s easier to get operations spread across multiple nodes (for example, MapReduce) to agree on when something was achieved and to deliver consistent results.
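To make the “wait out” idea concrete, here is a minimal sketch in Python. It is not Spanner’s implementation: `UncertainClock`, `TimeInterval`, and `commit` are hypothetical names, and the clock’s error bound is assumed to be a fixed number of seconds. The point is only that a clock which reports a range rather than an instant lets a writer pick a timestamp and then pause until that timestamp is guaranteed to be in the past on every machine.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeInterval:
    """Bounds on the true time: it lies somewhere in [earliest, latest]."""
    earliest: float
    latest: float


class UncertainClock:
    """Hypothetical clock that reports an interval instead of a single instant."""

    def __init__(self, uncertainty_s, source=time.monotonic):
        self.uncertainty_s = uncertainty_s  # assumed fixed error bound, in seconds
        self._source = source

    def now(self):
        t = self._source()
        return TimeInterval(t - self.uncertainty_s, t + self.uncertainty_s)


def commit(clock, apply_write, sleep=time.sleep):
    """Pick a commit timestamp, then wait until it is definitely in the past."""
    # Use the latest possible current time so no machine can see an earlier "now".
    commit_ts = clock.now().latest
    apply_write(commit_ts)
    # Wait out the uncertainty: block until even the earliest possible
    # current time has moved past the commit timestamp.
    while clock.now().earliest <= commit_ts:
        sleep(clock.uncertainty_s / 4)
    return commit_ts
```

In this sketch the pause is roughly twice the clock’s uncertainty, which is why keeping that uncertainty small matters: the tighter the bound, the shorter every write has to wait.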

In a white paper published earlier this year, Google talked about another key element: how Cloud Spanner leverages Google’s own network. Of the three characteristics that are most desired from a distributed system (consistency, availability, and tolerance for splits between nodes), Cloud Spanner tries to deliver all three by making slight but often undetectable sacrifices to availability, aided by the fact that the service runs on Google’s own highly redundant network.

A little more scale, a little less SQL

The actual database Google has created from this technology strongly resembles other cloud-hosted transactional databases, but with some potentially irksome differences.

First, Cloud Spanner is advertised as having support for ANSI 2011 SQL queries. The documentation shows this is true for SELECT queries; they support all the familiar SQL syntax, including JOIN and GROUP BY. But INSERT and UPDATE commands are not available; according to a blog post at Quizlet, which used Cloud Spanner in beta, you need to use “RPCs for mutating rows given their primary key” instead. Some of this is made easier through Cloud Spanner’s language and interface support, as it provides libraries for Go, Java/JDBC, Node.js, and Python, as well as support for REST calls.
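For readers used to SQL INSERT and UPDATE, a toy model may help show what mutating rows by primary key looks like in practice. The sketch below is plain in-memory Python, not the Cloud Spanner client library; the `Table` class, its methods, and the `SingerId` sample data are all hypothetical. What it illustrates is the shape of the call: the caller names the columns and supplies values, and each row is located by its key rather than by a WHERE clause.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy in-memory table keyed by primary key (hypothetical, not the Spanner API)."""
    key_column: str
    rows: dict = field(default_factory=dict)

    def insert(self, columns, values):
        """Add new rows; fail if a primary key already exists."""
        for value_row in values:
            row = dict(zip(columns, value_row))
            key = row[self.key_column]
            if key in self.rows:
                raise KeyError(f"row {key!r} already exists")
            self.rows[key] = row

    def update(self, columns, values):
        """Overwrite the named columns of existing rows, located by primary key."""
        for value_row in values:
            changes = dict(zip(columns, value_row))
            key = changes[self.key_column]
            if key not in self.rows:
                raise KeyError(f"row {key!r} not found")
            self.rows[key].update(changes)


singers = Table(key_column="SingerId")
singers.insert(
    columns=("SingerId", "FirstName", "LastName"),
    values=[(1, "Marc", "Richards"), (2, "Catalina", "Smith")],
)
# No WHERE clause: the primary key in each value row selects the target.
singers.update(columns=("SingerId", "LastName"), values=[(2, "Jones")])
```

The trade-off is that a bulk change such as “update every row matching a condition” becomes a read followed by per-key writes, which is the kind of difference a team moving from a conventional SQL database would need to plan for.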

Cloud Spanner’s other touted advantage is scale and availability. The database autoscales based on demand, with pricing based on the number of nodes in use, storage needed on those nodes, and outbound bandwidth consumed. Right now the size of a database influences the number of nodes required to deploy it; every 2TB of database storage requires at least one node to support it.
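The storage rule translates into simple arithmetic. The snippet below is only a back-of-the-envelope sketch: the function name is made up, and the one-node minimum for a very small database is an assumption rather than something stated above. It gives the floor that storage alone puts on node count; actual load may call for more.

```python
import math

TB_PER_NODE = 2  # every 2TB of database storage requires at least one node


def min_nodes_for_storage(storage_tb):
    """Lower bound on node count implied by storage size alone."""
    if storage_tb < 0:
        raise ValueError("storage_tb must be non-negative")
    # Assumption: even a tiny database runs on at least one node.
    return max(1, math.ceil(storage_tb / TB_PER_NODE))
```

For example, a 5TB database would need at least three nodes under this rule, since two nodes cover only 4TB.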

Imitation and flattery

Cloud Spanner’s promises are echoes of features in other database products, although Google is clearly hoping to compete broadly by offering a better amalgamation of features in one place.

Take autoscaling, for instance. Ex-Microsoftie Bob Muglia served up Snowflake as a cloud data-warehouse system that didn’t need to be tweaked or tuned. There, Google can almost certainly compete on pricing, as it has its own infrastructure, where Snowflake is implemented on Amazon.

Speaking of Amazon, it has a few products that could be competition. Aurora, for instance, is Amazon’s hosted version of MySQL, and it beats Google’s MySQL offering for high-end work. It also has the advantage of being familiar and widely supported; there’s barely a database developer who hasn’t touched MySQL at some point. But again, Google’s hope is that Cloud Spanner will compete by offering better scale across the board, including for write operations and not only reads.

Then there’s CockroachDB, which is approaching its first full 1.0 version. This open source database project is an implementation of the ideas in Google’s Spanner paper, in much the same way Google’s paper on MapReduce inspired Hadoop.

Where Google wants to stand out, though, is in the execution. That explains the white paper professing how it isn’t only the time-synchronization functions that make Cloud Spanner special, but also Google’s tight control over the networking between nodes. It might be possible for another cloud to implement that through a CockroachDB-based service, but Google’s counting on first-mover advantage, and all the major back-end resources it can work with, to make an impression.

Serdar Yegulalp is a senior writer at InfoWorld. A veteran technology journalist, Serdar has been writing about computers, operating systems, databases, programming, and other information technology topics for 30 years. Before joining InfoWorld in 2013, Serdar wrote for Windows Magazine, InformationWeek, Byte, and a slew of other publications. At InfoWorld, Serdar has covered software development, devops, containerization, machine learning, and artificial intelligence, winning several B2B journalism awards including a 2024 Neal Award and a 2025 Azbee Award for best instructional content and best how-to article, respectively. He currently focuses on software development tools and technologies and major programming languages including Python, Rust, Go, Zig, and Wasm. Tune into his weekly Dev with Serdar videos for programming tips and techniques and close looks at programming libraries and tools.
