The AWS Data Ecosystem Grand Tour - Key/Value and "NoSQL" Stores
Written by Alex Rasmussen on January 13, 2020
This article is part of a series. Here are the rest of the articles in that series:
- Where Your AWS Data Lives
- Block Storage
- Object Storage
- Relational Databases
- Data Warehouses
- Data Lakes
- Key/Value and "NoSQL" Stores
- Graph Databases
- Time-Series Databases
- Ledger Databases
- SQL on S3 and Federated Queries
- Streaming Data
- File Systems
- Data Ingestion
- Data Interfaces
- Training Data for Machine Learning
- Data Security
- Business Intelligence
In the last couple articles in this series, we talked about two major kinds of analytical data systems. Let's switch gears now and go back to operational data systems - systems that are on the critical path of an organization's operation and are designed to handle a lot of relatively simple read and write operations quickly.
For a long time, the dominant operational data system was the relational database. OLTP relational databases like MySQL and PostgreSQL dominated the lowest tier of the three-tier web service architecture through much of the early days of the web. These databases are extremely powerful, but they can be complicated to operate. That complexity increases dramatically when a service's demands exceed the capacity of a single database server and the database has to scale horizontally across multiple instances.
How Relational Databases Scale
If a database has to handle a lot of reads, we've seen when looking at RDS that we can create a number of read replicas of the database and serve reads from those replicas. This takes load off the primary database instance but it also reduces the consistency of your reads. How this inconsistency arises - and whether and how much that matters to you - is a big topic that we won't cover here, but that reduced consistency can make the application that's built on top of the database harder to reason about. This approach also has its limits, since a database server can only handle so many read replicas before the primary can't keep them all synchronized anymore.
Another way of handling lots of reads is handling reads from a cache in front of the database. After all, the easiest way to reduce read load on the database is to not read from the database at all! Of course, this cache must be kept up-to-date as the data that it's caching changes, and maintaining a coherent cache is a notoriously hard problem.
Yet another way to handle more reads is to change the database's logical structure. Most of the expense of a read operation in a relational database comes from joins, so a common approach to reducing the query's expense is denormalization. When a database is denormalized, data that used to be split across multiple tables and combined together with joins is instead pre-joined as it's written. Denormalization is another topic that warrants its own article, but at a high level this approach makes reads a lot faster at the cost of introducing data duplication and making writes slower.
If a database has to handle a lot of writes, the traditional solution is to do what's called sharding the database - partitioning it among separate database servers called shards. Sharding spreads write load across a number of instances, but it introduces a whole host of its own problems. Querying the database become substantially more complicated, since queries must be aware of how the data is sharded. Transactional updates that would be trivial to do on a single database server become much more complex when the updates span multiple shards, and are generally avoided as a result. Load across shards is often uneven, leading to so-called hot and cold shards that decrease the shards' overall utilization. Overall, sharded databases aren't much fun to operate or to program against.
Dealing with the added complexity of sharding, denormalization, and read replication to scale up relational databases has vexed web service operators for a long time. Starting around the mid-2000s, many of the larger web companies started searching for less complex ways to handle their operational workloads. The new systems that arose from this search typically took one of two forms: increasingly scalable and capable caches, and alternatives to the relational database model that were easier to scale.
Bigger and Better Caches
Two popular systems that fall into the "more scalable and capable cache" category are Memcached and Redis. Memcached is an in-memory cache for binary objects. It supports a pretty simple get/put interface, with options for slightly more advanced operations like compare-and-set. Redis supports in-memory binary object caching like Memcached does, but also supports many other in-memory data structures including lists, sets, sorted sets, and geospatial indexes. Redis also has the option to persist these data structures to disk and bills itself as a database, but its most popular usage by far is as a more sophisticated cache.
AWS supports both Redis and Memcached through a single service, Amazon ElastiCache. ElastiCache provides a fully managed installation of either Redis or Memcached clusters, complete with automated backups, patching, and monitoring. There's not much more to say than that; if you need a Memcached or Redis cluster, AWS has you covered.
The "NoSQL" Alternatives
When they first became popular, data systems that abandoned or significantly modified the relational model to achieve high scalability that the web demanded were called "NoSQL databases". This name has aged poorly, both because many of these systems have SQL interfaces now and because there are several different kinds of systems that were all grouped together under the "NoSQL" umbrella. A detailed survey of these different kind of systems could fill a book; in fact, it fills a significant portion of Martin Kleppmann's Designing Data Intensive Applications. Rather than dive deep, we'll take a cursory look at a few of those system types to put AWS's services in their proper context.
The conceptually simplest relational database alternative is the key/value store. Key/value stores manage a collection of objects, and a user can read and write an object if they know the object's key. Key/value stores are often persistent, but they don't have to be; in fact, you could consider both Redis and Memcached to be key/value stores. From an interface perspective, S3 looks vaguely similar to a key/value store, but its relatively high latency makes it poorly suited for the kinds of use cases for which key/value stores are typically deployed.
If you know exactly what you want and just want to read and write it quickly, key/value stores work quite well. For more complex queries, you'll need more than just a key/value store. That's where the other system types we'll be discussing, both of which are cousins of the key/value store, come in.
One cousin of the key/value store is the document store. Document stores allow for the storage and serving of large collections of objects called documents that are grouped together into collections. Documents are polymorphic, meaning that different document structures can coexist together in the same collection. Document stores allow for much more complex queries than key/value stores do, although the complexity and efficiency of those queries varies depending on which document store you're using.
AWS has two document stores. The first is Amazon DynamoDB. DynamoDB is a serverless document store that's purpose-built for storing a large collection of documents (a table of items in DynamoDB parlance) where documents are either read/written individually or the entire collection is scanned front to back. DynamoDB items are addressed by a compound key consisting of a partition key that determines where a document is physically stored and an optional sort key that clusters documents with the same partition key together.
DynamoDB's simple interface is enough for a surprisingly large number of use cases, but some queries you might want to do on a DynamoDB table would be really inefficient if retrieval by key and table scanning were your only options. To allow for more complex queries, DynamoDB allows the user to provision global secondary indexes (GSIs). These GSIs allow the user to do some basic projection and aggregation and to define a new partition or sort key on an existing table. GSIs behave somewhat like materialized views in relational databases, and are updated asynchronously every time an item is added or changed. DynamoDB also exposes changes to items as a stream. This allows users to perform change data capture on a table, taking actions every time the table is modified.
AWS's second document store is Amazon DocumentDB (with MongoDB compatibility) - and yes, the parenthetical is part of the service's official name. DocumentDB is protocol compatible with MongoDB, a popular open source document store. There's some speculation that DocumentDB is backed by the same storage engine that powers Aurora MySQL and Aurora PostgreSQL. Like Aurora, DocumentDB is fully managed and advertises high availability and high throughput at scale. Unlike Aurora, at time of writing there's no serverless option. DocumentDB is targeted at applications that were built for MongoDB or that need more querying flexibility than DynamoDB can provide.
Distributed Row Stores
Another popular relational database alternative is what we'll call non-relational distributed row stores. They're also called wide column stores, but that term is a little misleading since they don't store data in the column-oriented way that a columnar relational database does. Non-relational distributed row stores have elements in common with key/value and document stores, but they don't quite behave like either one. One of the most popular open-source systems of this type is Apache Cassandra, and AWS recently introduced a serverless Cassandra service that they creatively called Amazon Managed Apache Cassandra Service. We'll focus on Cassandra as an illustrative example of what these kinds of stores look like.
Cassandra stores tables that have schemas and consist of collections of rows that are uniquely identified by a primary key. A row's primary key consists of a partition key and a clustering key, and rows are distributed around a Cassandra cluster in groups called partitions. At this point, you might wonder what differentiates Cassandra from DynamoDB. There are a lot of architectural details that differentiate the two systems, but we'll focus on one key semantic difference: the amount of structure that the store imposes on its data. Two items in the same DynamoDB table can have wildly different structures as long as both items have the same partition and sort keys, a fact that's used heavily when modeling a DynamoDB table. By contrast, all rows in a Cassandra table must conform to that table's schema, although all columns that aren't part of the table's key are nullable. This schema enforcement, however flexible, allows Cassandra to present a familiar SQL-like interface called CQL to clients.
ElastiCache and DocumentDB are priced in a way that we've become familiar with looking at systems like RDS; you pay for a cluster of instances, and more powerful instance types cost more. ElastiCache allows for both reserved and on-demand pricing, but DocumentDB only allows for on-demand at time of writing.
Both DynamoDB and Managed Apache Cassandra Service bill in read request units and write request units. Different operations use different numbers of these units, and operations with stronger consistency guarantees require higher numbers of units to execute. Both services offer (or plan to offer) both on-demand or provisioned modes. In on-demand mode, you don't specify a table's capacity ahead of time and you're billed for the read and write request units that you use. In provisioned mode, you specify how many read and write request units you expect a table to use per second. The system then uses a token bucket similar to what we saw when we looked at EBS volumes to rate limit requests so that they only consume a certain number of request units.
As you'd expect if you've read other articles in this series, you also pay for network transfer in the standard way, storage is charged by the GB-month, and backups are free unless you're backing up more than 100% of your production data's total size.
Next: Connecting the Dots
In this article, we looked at some alternatives to relational databases on the OLTP side of things. Next, we'll take a look at graph databases, which shift the system's focus from the entities being stored to the relationships between those entities.
If you'd like to get notified when new articles in this series get written, please subscribe to the newsletter by entering your e-mail address in the form below. You can also subscribe to the blog's RSS feed. If you'd like to talk more about any of the topics covered in this series, please contact me.
Special thanks to Scott Andreas, Mikhail Panchenko, and Jordan West for help with some of the Cassandra-related questions I had while writing this article. Any mistakes in this article are mine, not theirs.