Intelligent memcached and APC interaction across a cluster

Anyone experienced with high-performance, scalable PHP development is familiar with APC and memcached. But used alone, they each have serious limitations:

APC

  • Advantages
    • Low latency
    • No need to serialize/unserialize items
    • Scales perfectly with more web servers
  • Disadvantages
    • No enforced consistency across multiple web servers
    • Cache is not shared; each web server must generate each item

memcached

  • Advantages
    • Consistent across multiple web servers
    • Cache is shared across all web servers; items only need to be generated once
  • Disadvantages
    • High latency
    • Requires serializing/unserializing items
    • Easily shards data across multiple web servers, but is still a big, shared cache

Combining the two

Traditionally, application developers simply think about consistency needs. If consistency is unnecessary (or the scope of the application is one web server), APC is great. Otherwise, memcached is the choice. There is, however, a third, hybrid option: use memcached as a coordination system for invalidation with APC as the main item cache. This functions as a loose L1/L2 cache structure. To borrow terminology from multimaster replication systems, memcached stores “tombstone” records.

The “extremely fresh” check for the APC item (see below) allows throttling hits to memcached. Even a one-second tolerance for cache incoherency massively limits the amount of traffic to the shared memcached pool.

Reading

The algorithm below may not be perfect, but I’ll revise it as I continue work on an implementation.

  1. Attempt to load the item from APC:
    1. On an APC hit, check if the item is extremely fresh or recently verified as fresh against memcached. (For perfect cache coherency, the answer is always “not fresh.”)
      1. If fresh, return the item.
      2. If not fresh, check if there is a tombstone record in memcached:
        1. If there is no tombstone (or the tombstone post-dates the local item):
          1. Update the freshness timestamp on the local item.
          2. Return the local item.
        2. Otherwise, treat as an APC miss.
    2. On an APC miss, attempt to load the item from memcached:
      1. On a memcache hit:
        1. Store the item into APC.
        2. Return the item.
      2. On a soft memcache miss (the item is available but due for replacement), attempt to take out a semaphore in APC:
        1. If the APC semaphore was successful, attempt to take out a semaphore in memcached:
          1. If the memcached semaphore was successful:
            1. Write the semaphore to APC.
            2. Rebuild the cache item and write it (see below).
            3. Release the semaphore in memcached. (The semaphore in APC should clear itself very quickly.)
          2. If the memcached semaphore was unsuccessful:
            1. Copy the memcached rebuild semaphore to APC. Store this very briefly (a second or so); it is only to prevent hammering memcached for semaphore checks.
            2. Return the slightly stale item from memcache.
        2. If the APC semaphore was unsuccessful:
          1. Return the slightly stale item.
      3. On a hard memcache miss (no item available at all):
        1. Is a stampede to generate the item acceptable?
          1. If yes:
            1. Generate the item real-time.
            2. Store to the cache.
          2. If no:
            1. Use the APC/memcache semaphore system (see above) to lock regeneration of the item.
            2. If the current request cannot grab the semaphore, fail as elegantly as possible.

Writing/invalidating

  1. Write to/delete from memcached.
  2. Write to/delete from APC.
  3. Set the tombstone record in memcached. This record should persist long enough for all web servers to notice that their local cache needs to be updated.

Commenting on this Blog post is closed.

Comments

Sounds like the penultimate caching system. In theory CacheRouter could be made to do this correct?

It would require an expansion to the API to integrate the semaphore support. Also, it would be wise to expand the API to support both perfectly consistent and very slightly inconsistent caching styles.

Ohhhh, this is exciting. I tried to get APC and Memcache playing together on a Project Mercury instance before realizing it just wasn’t going to happen.

Simply awesome. I indeed as well tried to play with both APC and memcached and was wondering if these two could be combined. Enlightening, as usual; thanks.

Since posting this, I’ve been thinking about ways to manage clock mismatches between servers. The second-level invalidation in the architecture I’ve described currently only performs well with well-synchronized clocks.

Some options for handling clock mismatches:

  • Instead of a validation timestamp, store a checksum for the cache item in both memcached and APC. This allows basically instant invalidation even with clock matches. If a client’s local cache checksum doesn’t match the one in memcached, it could download the item out of memcached. Since items would have to be serialized to send them to memcached in the first place, there isn’t a high cost to generating a checksum.
  • Store a clock in memcached. This avoids the checksum complexity at the cost of loading an additional item out of memcached for many requests. A server would have to update this clock, too, introducing a potential point of failure.

Why not require synchronised clocks instead of trying to work around the problems that sync issues can cause? You are in a world of hurt if you try to run a cluster of servers with clocks that are not synchronised. NTP is mandatory in those cases.

This sounds like a really interesting idea! I was just explaining the advantages/disadvantages of APC and memcache to people on my team the other day. Have you made any progress on this? Are you thinking of creating a product around it or open sourcing it?

The experimental menu-ng implementation on a branch off Pressflow uses this method. The big-picture plan is to create what I call “Storageflow,” an APC + memcached + distributed, persistent DB system (this currently looks like it will be Cassandra) that provides an exceedingly optimized and tunable key/value store for both cached and persistent items.