Redis and Memcached are Dead: The Case for DB-Backed Hierarchical Caching

David Strauss

Despite years of working with (and training people to use) Drupal's support for Redis and Memcached, I think it's time to phase them out. We optimize several parts of the Drupal stack with these tools, especially caching, locking, and queues. I'll show how we can handle all three better, faster, and with fewer moving parts.

The database has gotten a lot better:

  • Internal lock optimization in MySQL and MariaDB
  • Better HA options: MySQL MHA and MariaDB with Galera
  • The near-universality of InnoDB
  • Cheap, prevalent solid-state storage (SSD)
  • Support for certificate authentication and TLS
  • MySQL/MariaDB support for row-based replication that removes the replication-breaking race condition commonly encountered when using statement-based replication and DB-based locking

Redis and Memcached have not improved much:

  • No coherency with Memcached failover
  • The new Redis replication (Sentinel) isn't even as good as MySQL's in 2012
  • Security based on iptables or a password (not great for cloud)
  • No durability for queues or locks
  • Still no native option for encrypted communication

There are better need-specific options:

  • Hierarchical caching with APCu and coherency through the database
  • Modern distributed lock/queue daemons for non-caching purposes
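As a rough illustration of the first option, here is a minimal sketch in Python rather than Drupal code. It assumes a per-process dict standing in for APCu, an in-memory SQLite database standing in for the shared MySQL/MariaDB server, and a hypothetical cache_log table that records writes so each web node can drop stale local entries. The class, table, and method names are invented for this example.

```python
import sqlite3


class HierarchicalCache:
    """L1: per-process dict (stand-in for APCu). L2: shared database table."""

    def __init__(self, db):
        self.db = db
        self.local = {}      # fast local tier, private to this "node"
        self.last_seen = 0   # highest write-log id this node has processed
        db.execute("CREATE TABLE IF NOT EXISTS cache (k TEXT PRIMARY KEY, v TEXT)")
        db.execute(
            "CREATE TABLE IF NOT EXISTS cache_log "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT, k TEXT)"
        )

    def sync(self):
        """Drop local entries for keys written by any node since the last sync."""
        rows = self.db.execute(
            "SELECT id, k FROM cache_log WHERE id > ? ORDER BY id",
            (self.last_seen,),
        ).fetchall()
        for event_id, key in rows:
            self.local.pop(key, None)
            self.last_seen = event_id

    def get(self, key):
        if key in self.local:
            return self.local[key]          # local hit, no network round trip
        row = self.db.execute("SELECT v FROM cache WHERE k = ?", (key,)).fetchone()
        if row is None:
            return None                     # miss in both tiers
        self.local[key] = row[0]            # warm the local tier
        return row[0]

    def set(self, key, value):
        with self.db:                       # one transaction: value plus log entry
            self.db.execute(
                "INSERT OR REPLACE INTO cache (k, v) VALUES (?, ?)", (key, value)
            )
            self.db.execute("INSERT INTO cache_log (k) VALUES (?)", (key,))
        self.local[key] = value


# Two "web nodes" sharing one database.
shared_db = sqlite3.connect(":memory:")
node_a = HierarchicalCache(shared_db)
node_b = HierarchicalCache(shared_db)
```

In this sketch a node would call sync() once at the start of each request, so coherency costs one small indexed query instead of a network hit per cache read. Between syncs a node may serve a value another node has already replaced; how long that window may be is a design choice.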

In short, we should stop using Redis and Memcached:

  • Small sites should continue using the database backends for everything
  • Big sites should deploy lock/queue daemons and use hierarchical caching

Why this matters for core:

  • Certain cache items do not need cluster-wide consistency, but a consistent API is the only one we offer right now.
  • Core itself makes some of the heaviest use of the cache, so optimizing its use is important.
  • Some cache items in core are already (partially) content-addressed and can use lighter-weight consistency mechanisms. We could content-address more of them or otherwise adopt key naming schemes that auto-invalidate (for example, caching the rendered content from a field based on the field name, the entity ID, and the entity revision).
  • For some cache items core uses, it may be better to hit a stale item (in the local cache) than to miss or escalate to "L2" or "L3" over a network. This can support softer failures under load that aren't possible to deliver via contrib.
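The last two points can be sketched together. This is an illustrative Python sketch, not core's API: field_cache_key, LocalCache, and the fetch_remote callback are hypothetical names, and the policy shown (try the remote tier once the local copy expires, fall back to the stale copy if that fails) is only one possible reading of "prefer stale over escalating".

```python
import time


def field_cache_key(field_name, entity_id, revision_id):
    """Key that auto-invalidates: a new revision yields a new key."""
    return f"field:{field_name}:{entity_id}:{revision_id}"


class LocalCache:
    """Local tier that serves a stale item when the remote tier fails."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.items = {}  # key -> (value, stored_at)

    def set(self, key, value):
        self.items[key] = (value, self.clock())

    def get(self, key, fetch_remote):
        entry = self.items.get(key)
        if entry is not None and self.clock() - entry[1] < self.ttl:
            return entry[0]                 # fresh local hit
        try:
            value = fetch_remote(key)       # escalate to "L2"/"L3"
        except Exception:
            if entry is not None:
                return entry[0]             # stale, but softer than failing
            raise                           # nothing local to fall back on
        self.set(key, value)
        return value
```

Because the revision is part of the key, rendered field output never needs explicit invalidation: saving a new revision simply stops anyone from asking for the old key, and old entries age out on their own.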

Session Track

DevOps

Experience Level

Advanced

Drupal Version