Martin Heller
Contributing Writer

FaunaDB review: Fast NoSQL database for global scale

Dec 12, 2019

Low latency, strong consistency, and high scalability make FaunaDB an excellent choice for greenfield web or mobile apps


Distributed databases have become interesting and attractive in the last decade, as companies with worldwide operations require transactional databases with horizontal scalability and global reach. There's an essential tension between geographic distribution and low transaction latency, however: The speed of light limits the transmission time between distant nodes.
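To put a rough number on that limit, the sketch below computes the lowest possible round-trip time between two nodes from distance alone. The 16,000 km distance and the rule of thumb that light in optical fiber travels at about two-thirds of its vacuum speed are illustrative assumptions, not figures from Fauna; real networks add routing and queuing delays on top.

```javascript
// Lower bound on network round-trip time imposed by the speed of light.
const SPEED_OF_LIGHT_KM_PER_S = 299792.458; // in a vacuum
// Assumption: light in optical fiber travels at roughly 2/3 of that speed.
const FIBER_SPEED_KM_PER_S = SPEED_OF_LIGHT_KM_PER_S * (2 / 3);

// Returns the minimum round-trip time in milliseconds for a given distance.
function minRoundTripMs(distanceKm, speedKmPerS) {
  return (2 * distanceKm * 1000) / speedKmPerS;
}

const distanceKm = 16000; // hypothetical distance between two far-apart regions
console.log(minRoundTripMs(distanceKm, SPEED_OF_LIGHT_KM_PER_S).toFixed(1), "ms in a vacuum");
console.log(minRoundTripMs(distanceKm, FIBER_SPEED_KM_PER_S).toFixed(1), "ms in fiber");
```

Under these assumptions the fiber round trip comes to roughly 160 ms, before any processing at either end, which is why every extra cross-region message round in a commit protocol matters.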


To allow for high throughput on write transactions, many NoSQL databases have weakened their transaction support, either by prohibiting cross-partition transactions or by downgrading their consistency guarantees from strong (synchronous transactions) to eventual (asynchronous transactions). Most databases use a two-phase commit scheme for transactions, which drives up transaction latency when nodes are geographically distributed. However, many recent distributed databases use a Paxos or Raft scheme for quorum-based transaction consensus, which lowers the transaction latency.
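A toy latency model makes the contrast concrete. It assumes that two-phase commit needs two message rounds and must hear from every participant, while a quorum protocol with a stable leader needs one round and only waits for a majority. The round-trip times are made-up numbers, and the function names are hypothetical; real systems add batching, retries, and leader placement effects that this sketch ignores.

```javascript
// Toy latency model; all numbers are illustrative assumptions, not measurements.

// Two-phase commit: the coordinator waits for every participant
// in both the prepare round and the commit round.
function twoPhaseCommitLatency(participantRttsMs) {
  return 2 * Math.max(...participantRttsMs);
}

// Quorum consensus with a stable leader: one round, and the leader only
// waits until a majority of the group (itself included) has acknowledged.
function quorumCommitLatency(followerRttsMs) {
  const groupSize = followerRttsMs.length + 1; // followers plus the leader
  const majority = Math.floor(groupSize / 2) + 1;
  const acksNeeded = majority - 1; // the leader counts toward the majority
  if (acksNeeded === 0) return 0;
  const sorted = [...followerRttsMs].sort((a, b) => a - b);
  return sorted[acksNeeded - 1]; // wait for the slowest of the fastest acks
}

// Hypothetical round-trip times (ms) from the coordinator or leader to four other nodes.
const rttsMs = [10, 80, 150, 200];
console.log("two-phase commit:", twoPhaseCommitLatency(rttsMs), "ms"); // 400 ms
console.log("quorum commit:", quorumCommitLatency(rttsMs), "ms"); // 80 ms
```

In this model the slowest node dictates two-phase commit latency twice over, whereas the quorum protocol is bounded by the second-fastest follower in a five-node group.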

FaunaDB is a distributed, strongly consistent OLTP NoSQL database that is ACID compliant and offers a multi-model interface. It has an active-active architecture and can span clouds as well as continents. FaunaDB supports document, relational, graph, and temporal data sets from a single query. In addition to its own FQL query language, the product supports GraphQL, with SQL planned for the future.

FaunaDB is the first database to use the Calvin cross-shard transactional protocol, which allows for single-phase commits without reliance on clocks and without loss of consistency. FaunaDB also uses the Raft consensus system for individual shards. We'll explain these in more detail when we discuss the FaunaDB architecture.

Competition for FaunaDB in the area of globally distributed NoSQL databases includes Azure Cosmos DB, Amazon DocumentDB, Amazon DynamoDB, and YugaByte DB. Google Cloud Spanner and CockroachDB are its globally distributed relational database competitors.

FaunaDB architecture

FaunaDB claims architectural innovations at every layer. The biggest innovation is probably the use of Calvin as a distributed transaction protocol instead of the Google Spanner or older Google Percolator protocols.

Calvin was originally described in a 2012 paper by Abadi et al. of Yale:

Calvin is a practical transaction scheduling and data replication layer that uses a deterministic ordering guarantee to significantly reduce the normally prohibitive contention costs associated with distributed transactions. Unlike previous deterministic database system prototypes, Calvin supports disk-based storage, scales near-linearly on a cluster of commodity machines, and has no single point of failure. By replicating transaction inputs rather than effects, Calvin is also able to support multiple consistency levels, including Paxos-based strong consistency across geographically distant replicas, at no cost to transactional throughput.

In other words, using Calvin for distributed transactions gives FaunaDB single-phase commit and the option to guarantee strict serializability, even in globally distributed clusters without clock synchronization. Along with that, FaunaDB can boast low write latency (under 200 ms on average) and 99.99% uptime. According to Fauna, "The use of Calvin also allows FaunaDB to implement a master-less architecture. With replicas in a cluster, geographically distributed across many locations, FaunaDB provides active-active transactions that allow applications to scale horizontally across the globe without a single line of code."
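The core idea in the Calvin abstract, replicating transaction inputs and applying them in one agreed order, can be illustrated with a toy model. This is not FaunaDB's or Calvin's implementation: the log, the account names, and the applyTransaction and replay helpers are all hypothetical, and the real protocol also has to handle sequencing, sharding, and failures.

```javascript
// Toy illustration of deterministic replication: every replica receives the
// same ordered log of transaction INPUTS and applies it with the same logic.

// Hypothetical globally ordered log, as agreed on before execution.
const orderedLog = [
  { op: "deposit", account: "alice", amount: 100 },
  { op: "deposit", account: "bob", amount: 50 },
  { op: "transfer", from: "alice", to: "bob", amount: 30 },
  { op: "transfer", from: "bob", to: "alice", amount: 500 }, // aborts identically everywhere
];

// Deterministic transaction logic: same state plus same input gives same result.
function applyTransaction(state, txn) {
  const balance = (name) => state[name] || 0;
  if (txn.op === "deposit") {
    state[txn.account] = balance(txn.account) + txn.amount;
  } else if (txn.op === "transfer") {
    if (balance(txn.from) < txn.amount) return; // insufficient funds: deterministic abort
    state[txn.from] = balance(txn.from) - txn.amount;
    state[txn.to] = balance(txn.to) + txn.amount;
  }
}

// A replica rebuilds its state by replaying the log in order.
function replay(log) {
  const state = {};
  for (const txn of log) applyTransaction(state, txn);
  return state;
}

const replicaA = replay(orderedLog);
const replicaB = replay(orderedLog);
console.log(replicaA); // { alice: 70, bob: 80 }
console.log(JSON.stringify(replicaA) === JSON.stringify(replicaB)); // true
```

Because the replicas never need to exchange effects or vote on outcomes after the order is fixed, no second commit phase is required in this model.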

FaunaDB implements a semi-structured, schema-free, object-relational data model that is a superset of the relational, document, object-oriented, and graph paradigms. The data model allows enforcing constraints, creating indexes, and joining across multiple document entities. It also offers polyglot APIs mediated by drivers for a number of different programming languages. In short, the FaunaDB data model allows you to do whatever you need with your database in a unified way.

By contrast, Azure Cosmos DB implements separate relational, document, and graph layers, each with its own query language and API. Similarly, YugaByte DB implements separate relational, wide-column, and key-value plug-ins.

FaunaDB provides both administrative and application-level identity and security using tokens. You can access the database securely through API servers, or directly from mobile, browser, and embedded applications.

FaunaDB has undergone extensive Jepsen testing and passed with flying colors after fixing about 19 issues that came up during testing. Jepsen "is an effort to improve the safety of distributed databases, queues, consensus systems, etc." One of the statements to come out of the Jepsen report on FaunaDB summarizes the database's transactional architecture:

FaunaDB is based on peer-reviewed research into transactional systems, combining Calvin's cross-shard transactional protocol with Raft's consensus system for individual shards. We believe FaunaDB's approach is fundamentally sound... Calvin-based systems like FaunaDB could play an important future role in the distributed database landscape.


FaunaDB public cloud status for one day. There are three Amazon regions (east coast, west coast, and Europe) and one Google region (midwest). This reflects all activity on the cloud cluster, not just my activity.

FaunaDB query languages and drivers

FaunaDB currently supports two query languages, its own FQL and the open-source GraphQL. FQL is more capable, but GraphQL has more traction thanks to its use at Facebook, GitHub, and other prominent tech companies.

The easiest ways to test queries against FaunaDB are to use the FaunaDB Shell or the FaunaDB web console. You'll see both of them in action in the Quick Start sections below.

FQL (Fauna Query Language) is an expression-oriented language with some characteristics of a functional programming language. FQL operates primarily on the schema types provided by FaunaDB, which include documents, collections, indexes, sets, and databases. If you compare FQL concepts to SQL concepts, FaunaDB documents correspond to relational rows, collections to tables, databases to schemas, and FaunaDB indexes to both SQL indexes and materialized views. FaunaDB sets are sorted groups of tuples.

The following is an example of an FQL query that creates multiple blog posts in the collection "posts" using the Map function, which applies a Lambda function serially to each member of the array.

Map(
  [
    "My cat and other marvels",
    "Pondering during a commute",
    "Deep meanings in a latte"
  ],
  Lambda("post_title",
    Create(
      Collection("posts"), { data: { title: Var("post_title") } }
    )
  )
)

GraphQL is an open source data query and manipulation language that provides declarative schema definitions and a composable query syntax. The following is an example of a GraphQL query against a database about Star Wars movies.


At left is a sample GraphQL query, and at right is the beginning of the data returned. Note that the data has the same shape as the query.

FQL is available through drivers for nine programming languages. Each driver is distributed through its language's standard package mechanism. For example, the JavaScript driver is available as an NPM package and is imported with var fdb = require('faunadb').
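As a rough sketch of what driver usage looks like, the following runs the Map/Lambda/Create query shown earlier through the JavaScript driver. It assumes the faunadb NPM package is installed and that a database secret is supplied in an environment variable; the FAUNADB_SECRET name and the createPosts helper are hypothetical choices for this example, and the "posts" collection must already exist.

```javascript
// Sketch: create several posts in one query with the FaunaDB JavaScript driver.
// Assumes `npm install faunadb`; createPosts and FAUNADB_SECRET are hypothetical names.
async function createPosts(secret, titles) {
  // Loaded inside the function so this file can be parsed without the package present.
  const faunadb = require("faunadb");
  const q = faunadb.query; // FQL functions are exposed as JavaScript functions
  const client = new faunadb.Client({ secret });

  // Same shape as the FQL example: map a Lambda over an array of titles.
  return client.query(
    q.Map(
      titles,
      q.Lambda(
        "post_title",
        q.Create(q.Collection("posts"), { data: { title: q.Var("post_title") } })
      )
    )
  );
}

// Only talk to the database when a secret has actually been provided.
if (process.env.FAUNADB_SECRET) {
  createPosts(process.env.FAUNADB_SECRET, ["My cat and other marvels"])
    .then((docs) => console.log(docs))
    .catch((err) => console.error(err));
}
```

The driver's query builder mirrors FQL closely, so queries prototyped in the shell can usually be carried over by prefixing each function with the query namespace.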

All of the language drivers are open source. The Android, Scala, and Java bindings share a common JVM driver.


FaunaDB currently has nine supported language-specific drivers: Android, C#, Go, Java, JavaScript, Python, Ruby, Scala, and Swift.

FaunaDB use cases

Fauna has created white papers for real-time consumer apps, financial services, game development, and retail and e-commerce. In a 2018 technical white paper, Fauna describes successful FaunaDB application patterns based on customer usage: as a distributed ledger; as a distributed app back-end; for SaaS with multi-tenancy and QoS; to integrate legacy silos; to consolidate applications; to globally distribute data; to unify on-premise and cloud data; and to manage cross-workload access to shared data.

Another common use case for FaunaDB is as the storage layer for JAMstack apps. JAMstack is a modern architecture that avoids web servers in favor of JavaScript, APIs, and markup. JAMstack apps often use Netlify (an all-in-one platform for automating modern web projects), React (a JavaScript library for building user interfaces), Gatsby (a site generator that emits React.js), Jekyll (a Ruby-based site generator that starts with Markdown documents), Hugo (a fast Go-based site generator), or Nuxt (a site generator that emits Vue.js).

FQL quick start

The FQL Quick Start can be run on a local Fauna command line shell or in the web shell found in the FaunaDB console.


The FaunaDB web shell has essentially the same functionality as the downloadable Fauna command-line shell. You can find the shell within the console once you have selected a database.

I did my FQL quick start in the Terminal of a Mac. I added a few exploratory commands not shown in the tutorial for clarity. I also worked around one or two obvious small omissions in the documentation, for example using the actual post IDs from my session rather than the IDs in the documentation.

martinheller@Martins-Retina-MacBook ~ % fauna help
faunadb shell

VERSION
  fauna-shell/0.9.8 darwin-x64 node-v12.6.0

USAGE
  $ fauna [COMMAND]

COMMANDS
  add-endpoint
  autocomplete      display autocomplete installation instructions
  cloud-login
  create-database
  create-key
  default-endpoint
  delete-database
  delete-endpoint
  delete-key
  eval
  help              display help for fauna
  list-databases
  list-endpoints
  list-keys
  run-queries
  shell

martinheller@Martins-Retina-MacBook ~ % fauna list-databases
listing databases
my_app
main_ledger
martinheller@Martins-Retina-MacBook ~ % fauna create-database my_db
creating database my_db

  created database 'my_db'

  To start a shell with your new database, run:

  fauna shell 'my_db'

  Or, to create an application key for your database, run:

  fauna create-key 'my_db'

martinheller@Martins-Retina-MacBook ~ % fauna shell 'my_db'
Starting shell for database my_db
Connected to https://db.fauna.com
Type Ctrl+D or .exit to exit the shell
my_db> CreateCollection({ name: "posts" })
{
  ref: Collection("posts"),
  ts: 1573056452245000,
  history_days: 30,
  name: 'posts'
}
my_db> CreateIndex({
...   name: "posts_by_title",
...   source: Collection("posts"),
...   terms: [{ field: ["data", "title"] }]
... })
{
  ref: Index("posts_by_title"),
  ts: 1573056468580000,
  active: true,
  serialized: true,
  name: 'posts_by_title',
  source: Collection("posts"),
  terms: [ { field: [ 'data', 'title' ] } ],
  partitions: 1
}
my_db> Create(
...   Collection("posts"),
...   { data: { title: "What I had for breakfast .." } }
... )
{
  ref: Ref(Collection("posts"), "248300322187903506"),
  ts: 1573056490060000,
  data: { title: 'What I had for breakfast ..' }
}
my_db> Map(
...   [
...     "My cat and other marvels",
...     "Pondering during a commute",
...     "Deep meanings in a latte"
...   ],
...   Lambda("post_title",
.....     Create(
.......       Collection("posts"), { data: { title: Var("post_title") } }
.......     )
.....   )
... )
[
  {
    ref: Ref(Collection("posts"), "248300337888232978"),
    ts: 1573056505030000,
    data: { title: 'My cat and other marvels' }
  },
  {
    ref: Ref(Collection("posts"), "248300337888234002"),
    ts: 1573056505030000,
    data: { title: 'Pondering during a commute' }
  },
  {
    ref: Ref(Collection("posts"), "248300337888231954"),
    ts: 1573056505030000,
    data: { title: 'Deep meanings in a latte' }
  }
]
my_db> Get( Ref(Collection("posts"), "248300322187903506"))
{
  ref: Ref(Collection("posts"), "248300322187903506"),
  ts: 1573056490060000,
  data: { title: 'What I had for breakfast ..' }
}
my_db> Get( Ref(Collection("posts"), "248300337888231954"))
{
  ref: Ref(Collection("posts"), "248300337888231954"),
  ts: 1573056505030000,
  data: { title: 'Deep meanings in a latte' }
}
my_db> Get(
...   Match(
.....     Index("posts_by_title"),
.....     "My cat and other marvels"
.....   )
... )
{
  ref: Ref(Collection("posts"), "248300337888232978"),
  ts: 1573056505030000,
  data: { title: 'My cat and other marvels' }
}
my_db>

GraphQL quick start

I did the GraphQL quick start online in the console. I don't think that Fauna has added GraphQL capabilities to its command-line client, and I couldn't see any reason to use a third-party GraphQL client app.

The GraphQL application for this quick start is a simple to-do list. The schema is as follows. You need to create or download this file to your computer for the next steps.

type Todo {
   title: String!
   completed: Boolean
}

type Query {
   allTodos: [Todo!]
   todosByCompletedFlag(completed: Boolean!): [Todo!]
}

You start the tutorial in the FaunaDB console, at the top-level dashboard.


The FaunaDB console shows you your databases and your usage, as well as offering a link to create a new database.

From there, you create a database.


Creating a database is just a matter of setting the database name. The priority did not matter for my use cases.

To import the schema, you upload it from your computer to the GraphQL Playground.


The GraphQL playground, like the FQL shell, is accessible once you select a database.

Once you upload the GraphQL schema, FaunaDB creates the necessary database objects. Then you can add and query documents in the GraphQL Playground.


Here we are creating and displaying a to-do item using the GraphQL playground. The result appears at the right after pressing the arrow to execute the mutation (DML) query. The createTodo function was automatically generated from the schema.
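Outside the Playground, a mutation like this can be sent as an ordinary HTTP POST. The sketch below only builds the request and does not send it. The endpoint URL, the bearer-token header, and the data argument shape of createTodo are assumptions for illustration and should be checked against Fauna's documentation; buildGraphQLRequest is a hypothetical helper.

```javascript
// Assumed endpoint for Fauna's hosted GraphQL API; verify before use.
const FAUNA_GRAPHQL_URL = "https://graphql.fauna.com/graphql";

// Mutation against the to-do schema; the `data` input shape is an assumption.
const CREATE_TODO = `
  mutation CreateTodo($title: String!) {
    createTodo(data: { title: $title, completed: false }) {
      title
      completed
    }
  }
`;

// Hypothetical helper: packages a GraphQL operation as a fetch-style request.
function buildGraphQLRequest(secret, query, variables) {
  return {
    url: FAUNA_GRAPHQL_URL,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${secret}`, // assumed token-based authentication
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ query, variables }),
    },
  };
}

const request = buildGraphQLRequest("YOUR_SECRET", CREATE_TODO, {
  title: "Build an awesome app!",
});
console.log(request.options.body);
```

The resulting url and options can be passed to fetch or any other HTTP client; as in the Playground, the response data would follow the shape of the selection set.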


We ran a query to find all to-do documents in a second tab in the GraphQL Playground. This time we included the document ID in the output.

You can find other FaunaDB tutorials and references within the online documentation.

FaunaDB and competitors

FaunaDB is a good choice of database for greenfield web or mobile apps that need to be available globally with low latency and serializable consistency. With drivers for nine popular programming languages, FaunaDB isn't likely to present an impedance mismatch to your code. On the other hand, FaunaDB doesn't yet support SQL, so your database code will either be in FQL (powerful but proprietary) or GraphQL (less powerful, but open source and fairly easy to learn).

FaunaDB is especially appropriate for use with JAMstack apps. Whether FaunaDB will be cost-effective for scaling a brownfield app depends very much on the existing database, schema type, and query language used by the app. Don't forget to factor in the costs of migrating your data if you have an existing database.

If you absolutely require SQL compatibility, FaunaDB is not yet a good option for you. You'll do better with Google Cloud Spanner, CockroachDB, YugaByte DB, or Azure Cosmos DB. If you don't need SQL, you want good scalability, and you are willing to give up global strong consistency, then there are several more alternatives to FaunaDB, such as Couchbase, DataStax, and MongoDB.

---

Cost: Always free plan: Free, limited to 5 GB storage, 100K read ops per day, 50K write ops per day, 50 MB per day transfer out; for overage charges see the Utility plan. Utility plan: $0.18 per GB per month storage, $0.05 per 100K reads, $0.20 per 100K writes, $0.10 per GB per day transfer out. Pro plan: $99/month, 200 GB reserved storage, 1.5M per day reserved reads, 750K per day reserved writes, 500 MB per day reserved transfer out. Enterprise plan: custom, contact Fauna.
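For a back-of-the-envelope estimate, the Utility plan rates quoted above can be turned into a small calculator. This is only a sketch: it ignores the free-plan allowances, treats transfer out as a flat charge per GB transferred (an assumption, since the listing's wording is ambiguous), and uses a hypothetical workload. The utilityPlanMonthlyCost name is also hypothetical.

```javascript
// Utility plan rates as quoted in this review (December 2019).
const UTILITY_RATES = {
  storagePerGbMonth: 0.18,
  per100kReads: 0.05,
  per100kWrites: 0.2,
  transferOutPerGb: 0.1, // assumption: charged per GB transferred out
};

// Estimates one month of metered charges, ignoring any free allowance.
function utilityPlanMonthlyCost({ storageGb, reads, writes, transferOutGb }) {
  return (
    storageGb * UTILITY_RATES.storagePerGbMonth +
    (reads / 100000) * UTILITY_RATES.per100kReads +
    (writes / 100000) * UTILITY_RATES.per100kWrites +
    transferOutGb * UTILITY_RATES.transferOutPerGb
  );
}

// Hypothetical monthly workload.
const estimate = utilityPlanMonthlyCost({
  storageGb: 10,
  reads: 3000000,
  writes: 1000000,
  transferOutGb: 5,
});
console.log(`Estimated monthly cost: $${estimate.toFixed(2)}`); // $5.80
```

Note that writes cost four times as much as reads at these rates, so write-heavy workloads dominate the bill quickly.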

Platform: Amazon Web Services, Microsoft Azure (in development), and Google Cloud Platform clouds; available serverless (shared public cloud cluster) or as a VPS. Docker, Zip, RPM, or Deb installers for local offline development; requires 16 or more GB RAM, Java 8 or later, and NTP on all nodes.

Martin Heller

Martin Heller is a contributing writer at InfoWorld. Formerly a web and Windows programming consultant, he developed databases, software, and websites from his office in Andover, Massachusetts, from 1986 to 2010. From 2010 to August of 2012, Martin was vice president of technology and education at Alpha Software. From March 2013 to January 2014, he was chairman of Tubifi, maker of a cloud-based video editor, having previously served as CEO.

Martin is the author or co-author of nearly a dozen PC software packages and half a dozen Web applications. He is also the author of several books on Windows programming. As a consultant, Martin has worked with companies of all sizes to design, develop, improve, and/or debug Windows, web, and database applications, and has performed strategic business consulting for high-tech corporations ranging from tiny to Fortune 100 and from local to multinational.

Martin's specialties include programming languages C++, Python, C#, JavaScript, and SQL, and databases PostgreSQL, MySQL, Microsoft SQL Server, Oracle Database, Google Cloud Spanner, CockroachDB, MongoDB, Cassandra, and Couchbase. He writes about software development, data management, analytics, AI, and machine learning, contributing technology analyses, explainers, how-to articles, and hands-on reviews of software development tools, data platforms, AI models, machine learning libraries, and much more.
