FaunaDB review: Fast NoSQL database for global scale

Low latency, strong consistency, and high scalability make FaunaDB an excellent choice for greenfield web or mobile apps

Distributed databases have become interesting and attractive in the last decade, as companies with world-wide operations require transactional databases with horizontal scalability and global reach. There’s an essential tension between geographic distribution and low transaction latency, however: The speed of light limits the transmission time between distant nodes.
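To put a rough number on that speed-of-light limit, here is a back-of-the-envelope sketch. The 10,000 km path and the assumption that light in optical fiber travels at about two-thirds of its vacuum speed are mine, not Fauna's. Real networks add routing and processing delays on top of this floor.

```python
# Rough lower bound on propagation delay between two distant nodes.
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in vacuum
FIBER_FACTOR = 2 / 3                    # assumption: light in fiber travels at ~2/3 c


def min_one_way_ms(distance_km: float) -> float:
    """Minimum one-way propagation delay over fiber, in milliseconds."""
    return distance_km / (SPEED_OF_LIGHT_KM_PER_S * FIBER_FACTOR) * 1000


def min_round_trip_ms(distance_km: float) -> float:
    """Minimum round-trip propagation delay over fiber, in milliseconds."""
    return 2 * min_one_way_ms(distance_km)


if __name__ == "__main__":
    distance_km = 10_000  # hypothetical path between two far-apart data centers
    print(f"one way: {min_one_way_ms(distance_km):.1f} ms")
    print(f"round trip: {min_round_trip_ms(distance_km):.1f} ms")
```

Under these assumptions a single round trip costs about 100 ms before any work is done, which is why the number of cross-region round trips per transaction matters so much.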

InfoWorld

To allow for high throughput on write transactions, many NoSQL databases have weakened their transaction support, either by prohibiting cross-partition transactions, or by downgrading their consistency guarantees from strong (synchronous transactions) to eventual (asynchronous transactions). Most databases use a two-phase commit scheme for transactions, which drives up the transaction latency when there is geographic distribution of nodes. However, many recent distributed databases use either a Paxos or Raft scheme for quorum-based transaction consensus, which lowers the transaction latency.
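The latency difference between the two schemes can be seen in a deliberately simplified model. This is a sketch under my own assumptions, not a description of any particular database: it ignores processing time and failures, treats two-phase commit as two rounds that each wait for the slowest participant, and treats a quorum commit as one round that waits only for a majority. The round-trip times are hypothetical.

```python
def two_phase_commit_ms(participant_rtts_ms):
    """Two rounds (prepare, then commit); each waits for the slowest participant."""
    return 2 * max(participant_rtts_ms)


def quorum_commit_ms(follower_rtts_ms):
    """One round; the leader votes for itself and waits for a cluster majority."""
    cluster_size = len(follower_rtts_ms) + 1  # followers plus the leader
    majority = cluster_size // 2 + 1
    acks_needed = majority - 1                # the leader's own vote is free
    if acks_needed == 0:
        return 0.0
    # The commit completes when the acks_needed-th fastest follower replies.
    return sorted(follower_rtts_ms)[acks_needed - 1]


if __name__ == "__main__":
    rtts = [10, 80, 150, 200]  # hypothetical round-trip times in ms
    print("two-phase commit:", two_phase_commit_ms(rtts), "ms")
    print("quorum commit:   ", quorum_commit_ms(rtts), "ms")
```

In this model the slowest, most distant node sets the pace for two-phase commit, while a quorum commit only has to hear from the nearest majority.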

FaunaDB is a distributed, strongly consistent OLTP NoSQL database that is ACID compliant and offers a multi-model interface. It has an active-active architecture and can span clouds as well as continents. FaunaDB supports document, relational, graph, and temporal data sets from a single query. In addition to its own FQL query language, the product supports GraphQL, with SQL planned for the future.

FaunaDB is the first database to use the Calvin cross-shard transactional protocol, which allows for single-phase commits without reliance on clocks and without loss of consistency. FaunaDB also uses the Raft consensus system for individual shards. We’ll explain these in more detail when we discuss the FaunaDB architecture.

Competition for FaunaDB in the area of globally distributed NoSQL databases includes Azure Cosmos DB, Amazon DocumentDB, Amazon DynamoDB, and YugaByte DB. Google Cloud Spanner and CockroachDB are its globally distributed relational database competitors.

FaunaDB architecture

FaunaDB claims architectural innovations at every layer. The biggest innovation is probably the use of Calvin as a distributed transaction protocol instead of the Google Spanner or older Google Percolator protocols.

Calvin was originally described in a 2012 paper by Abadi et al. of Yale:

Calvin is a practical transaction scheduling and data replication layer that uses a deterministic ordering guarantee to significantly reduce the normally prohibitive contention costs associated with distributed transactions. Unlike previous deterministic database system prototypes, Calvin supports disk-based storage, scales near-linearly on a cluster of commodity machines, and has no single point of failure. By replicating transaction inputs rather than effects, Calvin is also able to support multiple consistency levels—including Paxos-based strong consistency across geographically distant replicas—at no cost to transactional throughput.

In other words, using Calvin for distributed transactions gives FaunaDB single-phase commit and the option to guarantee strict serializability, even in globally distributed clusters without clock synchronization. Along with that, FaunaDB can boast low write latency (under 200 ms on average) and 99.99% uptime. According to Fauna, “The use of Calvin also allows FaunaDB to implement a master-less architecture. With replicas in a cluster, geographically distributed across many locations, FaunaDB provides active-active transactions that allow applications to scale horizontally across the globe without a single line of code.”
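The idea of "replicating transaction inputs rather than effects" can be illustrated with a toy model. The sketch below is my own simplification and not Fauna's or Calvin's actual code: every replica receives the same ordered log of transaction requests and runs the same deterministic logic, so the replicas converge without exchanging results. The account names and the transfer procedure are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]
Txn = Tuple[str, tuple]  # (procedure name, arguments): the input, not the effect


def transfer(state: State, src: str, dst: str, amount: int) -> None:
    """Deterministic transaction: move funds only if the source can cover them."""
    if state.get(src, 0) >= amount:
        state[src] = state.get(src, 0) - amount
        state[dst] = state.get(dst, 0) + amount


PROCEDURES: Dict[str, Callable[..., None]] = {"transfer": transfer}


def apply_log(initial: State, log: List[Txn]) -> State:
    """Replay the agreed-upon log, in order, against a private copy of the state."""
    state = dict(initial)
    for name, args in log:
        PROCEDURES[name](state, *args)
    return state


initial = {"alice": 100, "bob": 0, "carol": 0}
log = [
    ("transfer", ("alice", "bob", 70)),
    ("transfer", ("alice", "carol", 50)),  # no effect: alice has only 30 left
    ("transfer", ("bob", "carol", 20)),
]

# Two replicas replay the same log independently and end in the same state.
replica_us = apply_log(initial, log)
replica_eu = apply_log(initial, log)
print(replica_us == replica_eu, replica_us)
```

Because the order is fixed before execution, the only thing the replicas need to agree on is the log itself, which is where a consensus protocol comes in.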

FaunaDB implements a semi-structured, schema-free, object-relational data model that is a superset of the relational, document, object-oriented, and graph paradigms. The data model allows enforcing constraints, creating indexes, and joining across multiple document entities. It also offers polyglot APIs mediated by drivers for a number of different programming languages. In short, the FaunaDB data model allows you to do whatever you need with your database in a unified way.

By contrast, Azure Cosmos DB implements separate relational, document, and graph layers, each with its own query language and API. Similarly, YugaByte DB implements separate relational, wide-column, and key-value plug-ins.

FaunaDB provides both administrative and application-level identity and security using tokens. You can access the database securely through API servers, or directly from mobile, browser, and embedded applications.

FaunaDB has undergone extensive Jepsen testing and passed with flying colors after fixing about 19 issues that came up during testing. Jepsen “is an effort to improve the safety of distributed databases, queues, consensus systems, etc.” One of the statements to come out of the Jepsen report on FaunaDB summarizes the database’s transactional architecture:

FaunaDB is based on peer-reviewed research into transactional systems, combining Calvin’s cross-shard transactional protocol with Raft’s consensus system for individual shards. We believe FaunaDB’s approach is fundamentally sound…Calvin-based systems like FaunaDB could play an important future role in the distributed database landscape.

Figure 1: FaunaDB public cloud status for one day. There are three Amazon regions (east coast, west coast, and Europe) and one Google region (midwest). This reflects *all* activity on the cloud cluster, not just my activity.

FaunaDB query languages and drivers

FaunaDB currently supports two query languages, its own FQL and the open-source GraphQL. FQL is more capable, but GraphQL has more traction thanks to its use at Facebook, GitHub, and other prominent tech companies.

The easiest ways to test queries against FaunaDB are to use the FaunaDB Shell or the FaunaDB web console. You’ll see both of them in action in the Quick Start sections below.

FQL (Fauna Query Language) is an expression-oriented language with some characteristics of a functional programming language. FQL operates primarily on the schema types provided by FaunaDB, which include documents, collections, indexes, sets, and databases. If you compare FQL concepts to SQL concepts, FaunaDB documents correspond to relational rows, collections to tables, databases to schemas, and FaunaDB indexes to both SQL indexes and materialized views. FaunaDB sets are sorted groups of tuples.

The following is an example of an FQL query that creates multiple blog posts in the collection “posts” using the Map function, which applies a Lambda function serially to each member of the array.

Map(
  [
    "My cat and other marvels",
    "Pondering during a commute",
    "Deep meanings in a latte"
  ],
  Lambda("post_title",
    Create(
      Collection("posts"), { data: { title: Var("post_title") } }
    )
  )
)

GraphQL is an open source data query and manipulation language that provides declarative schema definitions and a composable query syntax. The following is an example of a GraphQL query against a database about Star Wars movies.

Figure 2: At left is a sample GraphQL query, and at right is the beginning of the data returned. Note that the data has the same shape as the query.

FQL is available through drivers for nine programming languages. Each driver is distributed through its language’s standard package mechanism and imported in the usual way. For example, the JavaScript driver is available as an NPM package and is imported with var fdb = require('faunadb').

All of the language drivers are open source. The Android, Scala, and Java bindings share a common JVM driver.

Figure 3: FaunaDB currently has nine supported programming language-specific drivers. They are for Android, C#, Go, Java, JavaScript, Python, Ruby, Scala, and Swift.

FaunaDB use cases

Fauna has created white papers for real-time consumer apps, financial services, game development, and retail and e-commerce. In a 2018 technical white paper, Fauna describes successful FaunaDB application patterns based on customer usage: as a distributed ledger; as a distributed app back-end; for SaaS with multi-tenancy and QoS; to integrate legacy silos; to consolidate applications; to globally distribute data; to unify on-premise and cloud data; and to manage cross-workload access to shared data.

Another common use case for FaunaDB is as the storage layer for JAMstack apps. JAMstack is a modern architecture that avoids web servers in favor of JavaScript, APIs, and markup. JAMstack apps often use Netlify (an all-in-one platform for automating modern web projects), React (a JavaScript library for building user interfaces), Gatsby (a site generator that emits React.js), Jekyll (a Ruby-based site generator that starts with Markdown documents), Hugo (a fast Go-based site generator), or Nuxt (a site generator that emits Vue.js).

FQL quick start

The FQL Quick Start can be run on a local Fauna command line shell or in the web shell found in the FaunaDB console.

Figure 4: The FaunaDB web shell has essentially the same functionality as the downloadable Fauna command-line shell. You can find the shell within the console once you have selected a database.

I did my FQL quick start in the Terminal of a Mac. For clarity, I added a few exploratory commands that are not shown in the tutorial. I also worked around one or two obvious small omissions in the documentation, for example by using the actual post IDs from my session rather than the IDs in the documentation.

martinheller@Martins-Retina-MacBook ~ % fauna help
faunadb shell

VERSION
  fauna-shell/0.9.8 darwin-x64 node-v12.6.0

USAGE
  $ fauna [COMMAND]

COMMANDS
  add-endpoint
  autocomplete      display autocomplete installation instructions
  cloud-login
  create-database
  create-key
  default-endpoint
  delete-database
  delete-endpoint
  delete-key
  eval
  help              display help for fauna
  list-databases
  list-endpoints
  list-keys
  run-queries
  shell

martinheller@Martins-Retina-MacBook ~ % fauna list-databases
listing databases
my_app
main_ledger
martinheller@Martins-Retina-MacBook ~ % fauna create-database my_db
creating database my_db

  created database 'my_db'

  To start a shell with your new database, run:

  fauna shell 'my_db'

  Or, to create an application key for your database, run:

  fauna create-key 'my_db'

martinheller@Martins-Retina-MacBook ~ % fauna shell 'my_db'
Starting shell for database my_db
Connected to https://db.fauna.com
Type Ctrl+D or .exit to exit the shell
my_db> CreateCollection({ name: "posts" })
{
  ref: Collection("posts"),
  ts: 1573056452245000,
  history_days: 30,
  name: 'posts'
}
my_db> CreateIndex({
...   name: "posts_by_title",
...   source: Collection("posts"),
...   terms: [{ field: ["data", "title"] }]
... })
{
  ref: Index("posts_by_title"),
  ts: 1573056468580000,
  active: true,
  serialized: true,
  name: 'posts_by_title',
  source: Collection("posts"),
  terms: [ { field: [ 'data', 'title' ] } ],
  partitions: 1
}
my_db> Create(
...   Collection("posts"),
...   { data: { title: "What I had for breakfast .." } }
... )
{
  ref: Ref(Collection("posts"), "248300322187903506"),
  ts: 1573056490060000,
  data: { title: 'What I had for breakfast ..' }
}
my_db> Map(
...   [
...     "My cat and other marvels",
...     "Pondering during a commute",
...     "Deep meanings in a latte"
...   ],
...   Lambda("post_title",
.....     Create(
.......       Collection("posts"), { data: { title: Var("post_title") } }
.......     )
.....   )
... )
[
  {
    ref: Ref(Collection("posts"), "248300337888232978"),
    ts: 1573056505030000,
    data: { title: 'My cat and other marvels' }
  },
  {
    ref: Ref(Collection("posts"), "248300337888234002"),
    ts: 1573056505030000,
    data: { title: 'Pondering during a commute' }
  },
  {
    ref: Ref(Collection("posts"), "248300337888231954"),
    ts: 1573056505030000,
    data: { title: 'Deep meanings in a latte' }
  }
]
my_db> Get( Ref(Collection("posts"), "248300322187903506"))
{
  ref: Ref(Collection("posts"), "248300322187903506"),
  ts: 1573056490060000,
  data: { title: 'What I had for breakfast ..' }
}
my_db> Get( Ref(Collection("posts"), "248300337888231954"))
{
  ref: Ref(Collection("posts"), "248300337888231954"),
  ts: 1573056505030000,
  data: { title: 'Deep meanings in a latte' }
}
my_db> Get(
...   Match(
.....     Index("posts_by_title"),
.....     "My cat and other marvels"
.....   )
... )
{
  ref: Ref(Collection("posts"), "248300337888232978"),
  ts: 1573056505030000,
  data: { title: 'My cat and other marvels' }
}
my_db>
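The last query in that session finds a post by title through the posts_by_title index rather than by ID. As a rough mental model only, here is a toy in-memory stand-in for a collection with one term index. The class and method names are hypothetical and have nothing to do with the real Fauna drivers. The simple counter IDs are also my invention.

```python
import itertools
from collections import defaultdict


class Collection:
    """Toy stand-in for a collection with a single term index on data.title."""

    def __init__(self, name):
        self.name = name
        self._ids = itertools.count(1)
        self.documents = {}                # ref -> document
        self.by_title = defaultdict(list)  # title -> list of refs (the "index")

    def create(self, data):
        """Store a document and update the index as part of the same write."""
        ref = str(next(self._ids))
        self.documents[ref] = {"ref": ref, "data": dict(data)}
        self.by_title[data["title"]].append(ref)
        return self.documents[ref]

    def get(self, ref):
        """Fetch a document directly by its ref."""
        return self.documents[ref]

    def match_title(self, title):
        """Look up documents through the index instead of scanning them all."""
        return [self.documents[r] for r in self.by_title.get(title, [])]


posts = Collection("posts")
breakfast = posts.create({"title": "What I had for breakfast .."})

# Analogue of Map over a Lambda: apply the same create step to each title.
titles = [
    "My cat and other marvels",
    "Pondering during a commute",
    "Deep meanings in a latte",
]
created = [posts.create({"title": t}) for t in titles]

found = posts.match_title("My cat and other marvels")
print(found[0])
```

The point of the model is that the index is maintained at write time, so a lookup by title is a direct dictionary access rather than a scan of every post.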

GraphQL quick start

I did the GraphQL quick start online in the console. I don’t think that Fauna has added GraphQL capabilities to its command-line client, and I couldn’t see any reason to use a third-party GraphQL client app.

The GraphQL application for this quick start is a simple to-do list. The schema is as follows. You need to create or download this file to your computer for the next steps.

type Todo {
   title: String!
   completed: Boolean
}

type Query {
   allTodos: [Todo!]
   todosByCompletedFlag(completed: Boolean!): [Todo!]
}
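If GraphQL schema syntax is new to you, it may help to read the schema as plain data types and functions. The Python sketch below is only an analogy of my own. It is not what FaunaDB generates from the schema, and the function names are hypothetical translations of the two query fields.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Todo:
    title: str                        # String!  (required)
    completed: Optional[bool] = None  # Boolean  (nullable)


def all_todos(todos: List[Todo]) -> List[Todo]:
    """Analogue of allTodos: return every to-do item."""
    return list(todos)


def todos_by_completed_flag(todos: List[Todo], completed: bool) -> List[Todo]:
    """Analogue of todosByCompletedFlag: filter on the completed field."""
    return [t for t in todos if t.completed is completed]


todos = [
    Todo("Build an awesome app!", completed=False),
    Todo("Write the review", completed=True),
    Todo("Take a walk"),  # completed left unset (null)
]
print([t.title for t in todos_by_completed_flag(todos, True)])
```

The exclamation marks in the schema correspond to the non-optional pieces here: a Todo must have a title, and the filter must be given a flag.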

You start the tutorial in the FaunaDB console, at the top-level dashboard.

Figure 5: The FaunaDB console shows you your databases and your usage, as well as offering a link to create a new database.

From there, you create a database.

Figure 6: Creating a database is just a matter of setting the database name. The priority did not matter for my use cases.

To import the schema, you upload it from your computer to the GraphQL Playground.

Figure 7: The GraphQL Playground, like the FQL shell, is accessible once you select a database.

Once you upload the GraphQL schema, FaunaDB creates the necessary database objects. Then you can add and query documents in the GraphQL Playground.

Figure 8: Here we are creating and displaying a to-do item using the GraphQL Playground. The result appears at the right after pressing the arrow to execute the mutation (DML) query. The `createTodo` function was automatically generated from the schema.

Figure 9: We ran a query to find all to-do documents in a second tab in the GraphQL Playground. This time we included the document ID in the output.

You can find other FaunaDB tutorials and references within the online documentation.

FaunaDB and competitors

FaunaDB is a good choice of database for greenfield web or mobile apps that need to be available globally with low latency and serializable consistency. With drivers for nine popular programming languages, FaunaDB isn’t likely to present an impedance mismatch to your code. On the other hand, FaunaDB doesn’t yet support SQL, so your database code will either be in FQL (powerful but proprietary) or GraphQL (less powerful, but open source and fairly easy to learn).

FaunaDB is especially appropriate for use with JAMstack apps. Whether FaunaDB will be cost-effective for scaling a brownfield app depends very much on the existing database, schema type, and query language used by the app. Don’t forget to factor in the costs of migrating your data if you have an existing database.

If you absolutely require SQL compatibility, FaunaDB is not yet a good option for you. You’ll do better with Google Cloud Spanner, CockroachDB, YugaByte DB, or Azure Cosmos DB. If you don’t need SQL, you want good scalability, and you are willing to give up global strong consistency, then there are several more alternatives to FaunaDB, such as Couchbase, DataStax, and MongoDB.

—

Cost: Always free plan: Free, limited to 5 GB storage, 100K read ops per day, 50K write ops per day, 50 MB per day transfer out; for overage charges see the Utility plan. Utility plan: $0.18 per GB per month storage, $0.05 per 100K reads, $0.20 per 100K writes, $0.10 per GB per day transfer out. Pro plan: $99/month, 200 GB reserved storage, 1.5M per day reserved reads, 750K per day reserved writes, 500 MB per day reserved transfer out. Enterprise plan: custom, contact Fauna.
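To get a feel for what the Utility rates mean in practice, here is a small estimate. The workload figures are hypothetical, and the sketch makes several simplifying assumptions of my own: a 30-day month, transfer out billed at a flat $0.10 per GB, and no credit for the free plan's allowances. Check Fauna's current pricing before relying on any of it.

```python
from decimal import Decimal

# Utility plan rates as quoted in this review.
STORAGE_PER_GB_MONTH = Decimal("0.18")
READS_PER_100K = Decimal("0.05")
WRITES_PER_100K = Decimal("0.20")
TRANSFER_PER_GB = Decimal("0.10")  # assumption: billed per GB transferred out


def utility_monthly_cost(storage_gb, reads_per_day, writes_per_day,
                         transfer_gb_per_day, days=30):
    """Estimate a month of Utility plan charges from whole-number inputs."""
    storage = Decimal(storage_gb) * STORAGE_PER_GB_MONTH
    reads = Decimal(reads_per_day * days) / 100_000 * READS_PER_100K
    writes = Decimal(writes_per_day * days) / 100_000 * WRITES_PER_100K
    transfer = Decimal(transfer_gb_per_day * days) * TRANSFER_PER_GB
    return storage + reads + writes + transfer


if __name__ == "__main__":
    # Hypothetical app: 20 GB stored, 1M reads and 200K writes a day, 1 GB out a day.
    cost = utility_monthly_cost(20, 1_000_000, 200_000, 1)
    print(f"estimated Utility cost: ${cost:.2f} per month")
```

For this made-up workload the estimate comes to $33.60 a month, well under the $99 Pro plan, although that plan also reserves far more storage than this workload uses.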

Platform: Amazon Web Services, Microsoft Azure (in development), and Google Cloud Platform clouds; available serverless (shared public cloud cluster) or as a VPS. Docker, Zip, RPM, or Deb installers for local offline development; requires 16 or more GB RAM, Java 8 or later, and NTP on all nodes.
