An Introduction to Conflict-Free Replicated Data Types | Hacker News

Dear reader! If you’re reading this, that’s most likely because you’ve pointed your browser to my website and/or followed a link to this page. Maybe you’re even reading this from a mobile device1 Perfect conditions for motivating what all this is about.

Contents

  1. Preliminaries (this page)
  2. Algebras & contracts
  3. Registers and Deletion
  4. Outlook (to be written)

The web is a truly distributed application platform

That’s right. When you’re building a web application, you absolutely, positively have to care about the distributed aspect of the web. (Unless your application is stateless, like my website.)

What does this mean? You may have a bunch of users. These users may be manipulating their data from a variety of devices. Some devices may have a slow Internet connection. Devices may go offline at any point in time.

Sometimes, application developers punt on this issue: the mobile app displays “You’re offline” and won’t let you see your data (best case), or silently discard information (worst case).

One particular piece in the puzzle of building distributed applications is to figure out the storage. Ideally, this storage should be resilient towards users that may become unavailable, concurrent edits, and so on.

Enter Conflict-​free Replicated Data Types. A glorious example of Computer Science naming that actually Makes Sense™, they attempt to provide a flexible solution to the storage problem. The fundamental idea is this: You have data. This data is stored on multiple replicas. CRDTs describe how to coordinate these replicas to always arrive at a consistent state.

Note that there are two different categories of CRDTs: state-​based and op-​based. Both serve the same purpose, but work in different ways and come with their own design trade-​offs. In this series, I’m mostly going to focus on state-​based CRDTs.

About CRDTs

That’s it! You now understand the idea behind CRDTs.

Of course, that’s only half the story. There are at least two sides to understanding CRDTs deeply.

  1. Knowing all the varieties (counters, maps, sets, …) and how they can be embedded in application software.