Open Source

AnsibleWorks launches

Our favorite configuration management and deployment tool just received a major boost today, in the form of a new commercial company: AnsibleWorks. Started by Michael DeHaan, the Ansible project founder and lead, this new company will be providing support, products and services within the Ansible ecosystem.

If you’re doing anything with servers - anything at all - you would probably benefit from using Ansible. It’s so simple and flexible that you’ll find it makes an ideal replacement for Chef, Puppet, Fabric, Func, Make, shell scripting, and build tools. Anything that involves automating a process on servers can probably be done more easily with Ansible. At Four Kitchens, we use it in our Jenkins jobs to perform ad-hoc tasks, to build Vagrant virtual machines, and to configure our development environments. One tool, one language, no daemons, everything over SSH, simple and human-readable.

We love it so much that we’ve been helping to make it better, too. We wrote and contributed the MySQL and MongoDB modules, which are part of the Ansible core, as well as some general improvements, bug fixes, and documentation. The strength and support of the community are definitely among the ecosystem’s major features, and we really hope our contributions will help to build its reputation and adoption.

Just how big is it? Well, one measure is the number of stars and forks of the project’s Github repository. Compared to Puppet’s 1197 stars, Ansible has 1211 - despite being much younger. Puppet has more forks - I’ll leave it up to you to draw your own conclusions about that.

To get started using Ansible, check out the introductory blog post I wrote last year, then move to the community documentation, and some of the playbooks we’ve contributed that perform common LAMP setup tasks.

ScriptCraft: using JavaScript within Minecraft

This flew across my inbox/twitter/something a couple days back, and I spent all last evening having fun with it. It’s called ScriptCraft, a JavaScript API for building within Minecraft.

If you’re not familiar with Minecraft, it’s a world made of 1-meter cubes. You just dig around for minerals and build cities to protect your supplies. Then you get bored and start modding Minecraft.

How it works

ScriptCraft supplies an object called a Drone. Think of it like a turtle that can move in 3D space. Just like jQuery, the Drone always returns this, allowing you to chain commands together on the one-line console available in Minecraft. You can change the Drone’s position, create blocks of any material, create primitives, complex objects like trees, “bookmark” coordinates, and even copy/paste huge volumes of blocks. Built upon the Drone are other JS plugins, and the author has thrown in some good ones to get you started using the API, among them a couple of buildings, block letters, and other goodies.

Say you want to avoid griefers and would like a truly impenetrable fortress - an underground lair made of bedrock, for example. This is basically impossible in vanilla Minecraft, but using ScriptCraft you can do it in one command:

/js down(11).box(7,20,1,20).up().box0(7,20,9,20).up(9).box(7,20,1,20).up().box(2,20,1,20)

…it’s not pretty, but this generates a hollow shell which you can use as scaffolding for your new fortress.

Creating a JS plugin in ScriptCraft

That command looks crazy all on one line, so here I’ve broken it out into fortress.js:

load(__folder + "drone.js");
/**
 * Creates a fortress of bedrock and one layer of dirt on top.
 */
Drone.extend('fortress', function(material, sides, height) {
  // Drop the Drone down to the bottom of the structure
  this.down(height-1);
  // Lay down a solid layer for the floor
  this.box(material, sides, 1, sides);
  // Move up one level
  this.up();
  // Create all four walls. box0() creates hollow boxes
  this.box0(material, sides, (height-3), sides);
  // Move up above the walls
  this.up(height-3);
  // Another solid layer for the ceiling
  this.box(material, sides, 1, sides);
  // Move up one more time
  this.up();
  // Finally, a layer of dirt to remain inconspicuous ;)
  this.box(2, sides, 1, sides);
 
  // return this to allow further chaining
  return this;
});

You can chain the commands here too, but I chose not to in order to document each command clearly. When you’re happy with your creation, you’ll want to put the JS file inside ./js-plugins on your Minecraft server. Once you’ve uploaded it, you have to load the file within the game before you can use the new command:

/js load('js-plugins/path/to/fortress.js')

You’re ready to create your fortress. Keep the material ID list handy while you assemble your commands. Material IDs that contain colons need to be passed as strings.

// made of bedrock, 20 each side, 12 blocks tall
/js fortress(7,20,12)
 
// made of diamond blocks, 25 each side, 8 blocks tall
/js fortress(57, 25, 8)

Use Bukkit in JS

If all of that weren’t enough, the entire Bukkit API is available via JavaScript once ScriptCraft is loaded. That means anyone with JavaScript experience can now write whole Minecraft mods using ScriptCraft.

We haven’t installed it quite yet, but ScriptCraft will be up on Drupalcrafters.org very soon. Stop by and give it a shot!

Credits

ScriptCraft was created by Walter Higgins. He’s active on his Github project.

A node.js extension for systemd

A few weeks ago Pantheon sponsored a code sprint at their offices in San Francisco, the purpose of which was to extend support for the systemd journal. In attendance were the two maintainers of systemd: Lennart Poettering and Kay Sievers. We worked on creating native language bindings, and I headed up the JavaScript work - specifically creating a library that would allow asynchronous logging directly to the journal from node.js.

You can now do this in your application code:

var journald = require('journald').Log;
journald.log({
    MESSAGE: 'Hello world',
    ARG1: '<useful string here>',
    ARG2: '<another useful string here>',
    ARG3: '<yet another useful string here>'
});

Systemd is the new default system manager in many major distributions. It handles starting and stopping services, provides socket activation capabilities, control over processes using cgroups, and much more. One of the new features it provides is the journal - a replacement for syslog that brings logging into the 21st century. There are many improvements over syslog, but one of the most obvious is the use of structured log entries, providing key/value fields instead of a string blob that often needs to be parsed again using regexes.

To add an entry to the journal from JavaScript, we need to break out of the interpreted world and call a C function that’s provided by the systemd API: sd_journal_sendv(). To do this, we create a V8 extension in C++, and implement the required functions to expose our new calls to JavaScript.

V8 provides neatly wrapped implementations of all JavaScript language components, so there’s a C++ version of the Function object, and a Number object, and every other piece you may require. You can easily build your JavaScript code from within C++, and return it back to JavaScript land for execution - an experience that feels something like being a backstage director at an elaborate costumed production.

The second part of the node.js API is the ability to hand off tasks to libuv, so that the main event loop doesn’t become blocked waiting for I/O. To do this, the uv_queue_work() function must be called with data about the job, a function to call to do the work, and a function to call once it’s completed. Since the V8 API isn’t available in the libuv worker threads, this part requires pure C++/C code.

The journald package is now available on npm, and it comes with a Winston transport plugin, allowing you to use Winston’s logging API to send to multiple places at once. The source is available on Github.

Drop that cron; use Hudson instead

For years, I used cron (sometimes anacron) without asking questions. The method was simple enough, and every project requiring cron-related capabilities documented the setup.

There is a much better way, and it involves Hudson (now known as Jenkins). I introduced “Hudson for cron” as a sidebar at the Drupal Scalability and Performance Workshop a few weeks ago. To my surprise, several of the attendees remarked on their feedback questionnaires that it was one of the most valuable things they picked up that day. So, I’ve decided to write this up for everyone.

Why not cron?

First, I have the burden of explaining why you should drop most use of the tried-and-true cron. To be honest, I don’t think cron is even a “good enough” solution for most of today’s systems:

  • You either get email from every run’s output, a dumb log to disk, or no reporting at all. When you do log to disk, you have to worry about segmenting and rotating logs.
  • Jobs have to be manually staggered to avoid massive slowdowns at whatever interval they share (every hour, at midnight, every ten minutes).
  • Even if the previous job hasn’t finished, cron happily starts up a new one on top of it.
  • It doesn’t integrate with any other job kickoff or monitoring system.
  • There’s no built-in ability to run remote jobs, let alone move a remote job from one machine to another.
  • Most of the web-based tools for configuring cron aren’t very nice.
  • There’s no built-in logging of job execution time, even though a cron job taking excessive time is one of the most common failure cases.

Why Hudson is better

Here’s how using Hudson with periodic “builds” beats cron:

  • Among the myriad ways Hudson can measure success of a “build,” it can verify a zero return status from each “execute shell” build step. If a job simply returns anything but zero, Hudson considers the build a failure and can notify you however you like. It can email you (on first failure only or every time), you can subscribe to build feeds via RSS, or you can simply use the Hudson interface as a dashboard that shows failures in a convenient, summarized way.
  • Hudson logs the output of “execute shell” build steps. Success or failure, Hudson archives the build output without filling your inbox or local disk. If the console output isn’t enough, Hudson can archive per-run “build artifacts,” which are files on disk matching a defined pattern. There’s also no-hassle “log rotation” by specifying a cap on the number of builds or a set number of days to keep results; this is configurable per-job. If a particular run had output (say, for troubleshooting) you want to keep around, you can tell Hudson to “keep this build” indefinitely.
  • Hudson runs each build on “build executors,” which are effectively process slots. Any system can have any number, but it puts a cap on how much Hudson tries to do, systemwide. This means 50 jobs can get scheduled to run every hour with four “build executors,” and Hudson will queue them all every hour and run four at once until they’ve all finished.
  • If a job is still running when the “periodic build” time comes around, Hudson can either run the job immediately (like cron) or queue the job to run when the one in progress finishes.
  • Hudson isn’t limited to time-based scheduling. Sometimes, it’s useful to take a job that used to run periodically (say, a database refresh) and make it only available for manual kickoff. Of course, as a CI tool, Hudson can kick off jobs based on polling a version-control system.
  • For remote jobs, Hudson can sign onto systems with SSH, copy over its own runtime, and run whatever you’d like on the remote system. This means that, no matter how many servers in a cluster need scheduled jobs, Hudson can schedule, run, and log them from one server. Hudson can distribute the jobs dynamically based on which machines are already busy, or it can bind jobs to specific boxes.
  • Hudson has a solid web interface that can integrate with your Unix shadow file, LDAP, or other authentication methods. For people who prefer operating from the command line, Hudson has a CLI.
  • Every job’s running time is logged. Hudson even provides estimates for how long it will take the system to get to any particular job when there’s a queue.

Hudson isn’t perfect

To be fair, there are still a couple of reasons to continue using cron:

  • As a Java-based web application, Hudson is heavyweight. A low-memory or embedded system is better off with cron. Even Hudson’s remote job invocation installs and starts a Java-based runtime. You can, however, use the SSH plugin for those boxes as long as at least one machine can run the main Hudson instance.
  • Cron’s scheduling is more precise if things have to happen exactly at certain time intervals. Hudson’s assumption is that your periodic builds aren’t dependent on when they start within a minute or two.

Adding a cron-style job to Hudson

Moving jobs from cron to Hudson is easy:

  1. Install Hudson. From the front page of Hudson’s site, there are repositories for Red Hat Enterprise Linux, CentOS, Ubuntu, Debian, and a few others.
  2. Open Hudson in a browser (on port 8080 by default).
  3. Add a new “New Job” of type “Build a free-style software project.”
  4. Check “Build periodically” and put in a cron-like schedule.
  5. Click “Add build step” and “Execute shell.” The Hudson wiki has a page explaining this.
  6. Configure access control from within Hudson.

Drupal and Pressflow best practices

It’s easy to drop in the standard call to wget or cURL to run your Drupal/Pressflow cron from Hudson, but the best way is with Drush.

Why Hudson with Drush?

  • You can configure PHP’s CLI mode to be liberal in error reporting, giving you far more data on failure than a WSOD from wget or cURL. Hudson will also fail the “build” if PHP runs into a fatal error.
  • You can block access to cron.php entirely. (This advantage isn’t unique to Hudson integration.)

Because Drush requires local shell execution, there’s a bit more overhead to having one Hudson box run Drupal’s cron on remote servers in a cluster. It’s not that hard, though. Just configure a Hudson “slave” on each box that needs to run Drush and configure each job to run on the “build executor” that hosts the site. If using a Hudson slave is overkill, use the SSH plugin.

There are even better reasons to use Drush with Hudson for things like database schema updates, but that’s outside the scope of this blog entry.

In the wild

Four Kitchens is widely using Hudson for cron automation on client sites. We’ve also deployed Hudson to Drupal.org infrastructure for multiple non-testing purposes, including deploying updates to Drupal.org (to be discussed in a future blog post).

The CAP theorem is like physics to airplanes: every database must design around it

Back in 2000, Eric Brewer introduced the CAP theorem, an explanation of inherent tradeoffs in distributed database design. In short: you can’t have it all. (Okay, so there’s some debate about that, but alternative theories generally introduce other caveats.)

On Twitter, I recently critiqued a presentation by Bryan Fink on the Riak system for claiming that Riak is “influenced” by CAP. This sparked a short conversation with Justin Sheehy, also from the project. 140 characters isn’t enough to explain my objection in depth, so I’m taking it here.

While I give Riak credit for having a great architecture and pushing innovation in the NoSQL (non-relational database) space, it can no more claim to be “influenced” by CAP than an airplane design can claim influence from physics. Like physics to an airplane, CAP lays out the rules for distributed databases. With that reality in mind, a distributed database designed without regard for CAP is like an airplane designed without regard for physics. So, claiming unique influence from CAP is tantamount to claiming competing systems have a dangerous disconnect with reality. Or, to carry on the analogy, it’s like Boeing making a claim that their plane designs are uniquely influenced by physics.

But we all know Airbus designs their planes with physics in mind, too, even if they pick different tradeoffs compared to Boeing. And traditional databases were influenced by CAP and its ancestors, like BASE and Bayou from Xerox PARC. CAP says “pick two.” And they did: generally C and P. This traditional — and inflexible — design of picking only one point on the CAP triangle for a database system doesn’t indicate lack of influence.

What Riak actually does is quite novel: it allows operation at more than one point on the triangle of CAP tradeoffs. This is valuable because applications value different parts of CAP for different types of data or operations on data.

For example, a banking application may value availability for viewing bank balances. Lots of transactions happen asynchronously in the real world, so a slightly outdated balance is probably better than refusing any access if there’s a net split between data centers.

In contrast, transferring from one account to another of the same person at the same bank (say, checking to savings) generally happens synchronously. A bank would rather enforce consistency above availability. If there’s a net split, they’d rather disable transfers than have one go awry or, worse, invite fraud.

A system like Riak allows making these compromises within a single system. Something like MySQL NDB, which always enforces consistency, would either unnecessarily take down balance viewing during a net split or require use of a second storage system to provide the desired account-viewing functionality.

Making Drupal and Pressflow more mundane

Drupal and Pressflow have too much magic in them, and not the good kind. On the recent Facebook webcast introducing HipHop PHP, their PHP-to-C++ converter, they broke down PHP language features into two categories: magic and mundane. The distinction is how well each capability of PHP, a dynamic language, translates to a static language like C++. “Mundane” features translate well to C++ and get a big performance boost in HipHop PHP. “Magic” features are either unsupported, like eval(), or run about as fast as today’s PHP+APC, like call_user_func_array().

Mundane

  • If/else control blocks
  • Normal function calls
  • Array operations
  • …and most other common operations

Magic

  • eval()
  • call_user_func_array()
  • Code causing side-effects that depends on conditions like function existence
  • Includes within function bodies
  • Other PHP-isms that make Java and C++ developers cringe

How Drupal and Pressflow can run better (or at all) on HipHop PHP

Prelinking

Currently, we invoke hooks using “magic” (though still HipHop-supported) calls to call_user_func_array(). We don’t have to do that; we could be “prelinking” hook invocations by generating the right PHP for the set of enabled modules. If we generate the right PHP here, HipHop can link the function calls during compilation.
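
As a rough sketch of what that generated PHP might look like (the file name, the dispatcher naming convention, and the example contrib module are all hypothetical, not an existing Drupal API):

<?php
// hooks.prelinked.inc: hypothetical generated file, rebuilt whenever the set
// of enabled modules changes.

/**
 * Prelinked dispatcher for hook_menu(): one direct call per implementing
 * module, with no call_user_func_array() in sight.
 */
function _prelinked_invoke_all_menu() {
  $items = array();
  // One line per module returned by module_implements('menu') at build time.
  $items = array_merge($items, system_menu());
  $items = array_merge($items, node_menu());
  $items = array_merge($items, example_menu());  // Hypothetical contrib module.
  return $items;
}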

This sort of “prelinking” also cleans up profiling results, making it easier to trace function calls through hooks in tools like KCacheGrind.

Compatibility break? Nope, it should be possible to replace the guts of module_invoke_all() with appropriate branching and calls to the generated PHP.

Including files statically

Drupal 6 introduced an optimization to dynamically load files based on which menu path a user is visiting. This won’t fly in HipHop; it’s simply not supported. Fortunately, this is easy to work around: we can either drop the feature (shared hosters without APC are already booing me) or, as in the prelinking example, generate a big, static includes file (which is itself included on HipHop-based systems) that pulls in all possible page callback handlers based on the hook_menu() entries. Sites that include the static includes file would skip the dynamic includes at runtime.
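
A sketch of such a generated file, with hypothetical names and the list derived from the include files modules register in hook_menu() (DRUPAL_ROOT here assumes Drupal 7):

<?php
// includes.static.inc: hypothetical generated file that pulls in every
// include a page callback could ask for, so HipHop sees them at compile time.
require_once DRUPAL_ROOT . '/modules/node/node.pages.inc';
require_once DRUPAL_ROOT . '/modules/user/user.pages.inc';
require_once DRUPAL_ROOT . '/modules/system/system.admin.inc';
// ...one require_once line per include file registered via hook_menu().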

Compatibility break? None, assuming we take the approach I describe above.

Death to eval()

Like dynamic includes, eval() is unsupported on HipHop. Drupal has already relegated core use of eval() to an isolated module, which is great for security. eval() is pretty bad in general: PHP+APC doesn’t support opcode caching for it, so serious code can’t run in eval() sanely. Unfortunately, using the PHP module to allow controlling block display remains quite popular.

We have a few options here:

  • Drop the feature (ouch!)
  • Provide a richer interface for controlling block display, including support for modules to hook in and provide their own extended options
  • Pump out the PHP to functions in a real file, include that, and call those functions to control block display (see the sketch after this list)
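
A minimal sketch of that third option; the writer function, the generated file path, and the snippet format are all hypothetical:

<?php
// Hypothetical: turn each block's PHP visibility snippet into a named
// function in a real file, so it can be compiled by HipHop (and opcode-cached
// by APC) instead of passing through eval().
function block_visibility_write_functions(array $snippets) {
  $code = "<?php\n";
  foreach ($snippets as $bid => $php) {
    // $php is the snippet an admin typed into the block form,
    // e.g. "return arg(0) == 'node';".
    $code .= "function block_visibility_{$bid}() {\n{$php}\n}\n\n";
  }
  // Path chosen for illustration only.
  file_put_contents('sites/default/files/block_visibility.inc', $code);
}

// At render time, call the generated function instead of eval().
function block_visibility_check($bid) {
  include_once 'sites/default/files/block_visibility.inc';
  $function = "block_visibility_{$bid}";
  return function_exists($function) ? $function() : TRUE;
}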

Compatibility break? Yes, on all but the third option (writing out a PHP file).

Migrate performance-intensive code to C++

I’m looking at you, drupal_render().

This opportunity is exciting. Without the cruft of Zend’s extension framework, we can migrate performance-critical code paths in core to C++ and make use of STL and Boost, two of the most respected libraries in terms of predictable memory usage and algorithm running time.

Compatibility break? There’s no reason to have one, but keeping C++ and PHP behaviors consistent will be a serious challenge.

The takeaway

  • Use real, file-based PHP, avoiding dynamic language features.
  • Profile the system to find the biggest wins versus development cost for migrating core functionality to C++.

I’ll be presenting the “Ultimate PHP Stack” for large-scale applications at PHP TEK-X. Zend PHP, Quercus, and HipHop PHP (source code release pending) will all be contenders.

Anticipage: scalable pagination, especially for ACLs

Pagination is one of the hardest problems for web applications supporting access-control lists (ACLs). Drupal and Pressflow support ACLs through the node access system.

Problems with traditional pagination

  • Because pagination uses row offsets into the results, browsing listings where newly published items get added to the beginning of the results creates “page drift.” Page drift is where a user already browsing through paginated results sees, for example, items E, D, and C on page one, waits awhile, clicks to the next page, and sees items C, B, and A. Going back to page one again shows F (newly published), E, and D. Item C “drifted” to page two while the user was reading page one. If new items are published frequently enough, pagination can become unusable due to this drifting effect.
  • Even if content and ordering are fully indexed, jumping n rows into the results remains inefficient; it scales linearly with depth into pagination.
  • Paginating sets where the content and ordering are not fully indexed is even worse, often to the point of being unusable.
  • The design is optimized around visiting arbitrary page offsets, which does not reflect user needs. Users only need to make relative jumps in pagination of up to 10 pages (or so) in either direction or to start from the end of the results. (If users are navigating results by hopping to arbitrary pages to drill down to what they need, there are other flaws in the system.)

“Anticipage”

With a combination of paginating by inequality and, optionally, optimistic permission review, a site can paginate content with the following benefits:

  • No page drift
  • Stable pagination URLs that will generally include the same items, regardless of how much new content has been published to the beginning or end of the content listing
  • If the ordering is indexed, logarithmic time to finding the first item on a page, regardless of how many pages the user is into browsing
  • Minimal computation of JOINs, an especially big benefit for sites using JOINs for ACLs

The general strategy is to amortize the cost of pagination as the user browses through pages.

Paginating by inequality

The path to achieving fast pagination first involves a fresh strategy for sorting and slicing content. A “pagination key” must be selected for the intended set of content that:

  • Includes the column(s) desired for sorting. For a Drupal site, this might be the “created” column on the “node” table.
  • Is a superkey (unique for all rows in the table but not necessarily minimal). Sorting by the columns in a superkey is inherently deterministic. And because a superkey is also unique, it allows us to use WHERE criteria on the deterministically sorted set to deterministically define pages. An existing set of sort columns for a listing can always be converted to a superkey by appending a primary key to the end.

For a Drupal site, a qualifying pagination key could be (created, nid) on the “node” table. This key allows us to deterministically sort the rows in the node table and slice the results into pages. Really, everyone should use such pagination keys regardless of pagination strategy in order to have a deterministic sort order.

Having selected (created, nid) as the key, the base query providing our entire listing would look something like this:

SELECT * FROM node ORDER BY created DESC, nid DESC;

Traditionally, a site would then paginate the second page of 10 items in MySQL using a query like this:

SELECT * FROM node ORDER BY created DESC, nid DESC LIMIT 10, 10;

But because we’re ordering by a pagination key (as defined above), we can simply run the base query for the first page and note the attributes of the final item on the page. In this example, the final node on the first page has a creation timestamp of “1230768000” and a node ID of “987.” We can then embed this data in the GET criteria of the link to the second page, resulting in running a query like this for rendering the second page:

SELECT * FROM node WHERE created <= 1230768000 AND (created <> 1230768000 OR nid < 987) ORDER BY created DESC, nid DESC LIMIT 10;

We’re asking for the same sorting order but adding a WHERE condition carefully constructed to start our results right after the content on the first page. (Note: this query could also be dissected into a UNION if the database does not properly optimize the use of the index.) This strategy allows the database to fully employ indexes on the data to find, in logarithmic time, the first item on any page. Note how page drift becomes impossible when pagination happens using keys instead of offsets.
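
In application code, this amounts to carrying the last row’s (created, nid) pair in the next-page link and feeding it back into the query. Here’s a rough sketch assuming Drupal 7’s db_query_range(); the function name and parameter handling are made up for illustration:

<?php
// Fetch one page of the listing, starting just after the ($created, $nid)
// boundary taken from the last item on the previous page (NULL for page one).
function anticipage_fetch_page($created = NULL, $nid = NULL, $per_page = 10) {
  if ($created === NULL) {
    // First page: no boundary yet.
    return db_query_range(
      'SELECT nid, created FROM {node}
       ORDER BY created DESC, nid DESC',
      0, $per_page)->fetchAll();
  }
  // Later pages: start strictly after the previous page's last row. This is
  // the same condition as the query above, written as an OR; two placeholders
  // carry the same timestamp value.
  return db_query_range(
    'SELECT nid, created FROM {node}
     WHERE created < :created OR (created = :created2 AND nid < :nid)
     ORDER BY created DESC, nid DESC',
    0, $per_page,
    array(':created' => $created, ':created2' => $created, ':nid' => $nid))
    ->fetchAll();
}

The “next page” link then carries these two values instead of a page offset, which is what keeps the URLs stable as new content is published.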

Should a system choose to support moving more than one page in either direction, it would either have to:

  • Read a sufficient depth into the results in nearby pages to obtain the necessary WHERE attributes. This is a bit inefficient but consistent with the rest of the approach.
  • Adopt a hybrid strategy by using a traditional-style query (a LIMIT that skips records) with WHERE conditions beginning the set on the adjacent page. For example, if a user were currently on page 9, the direct link to page 11 would load a page that runs the query for page 10 but starts its listing 10 items later (“LIMIT 10, 10”). Naturally, this becomes less efficient as we allow users to hop greater distances, but the running time, at worst, converges on how the traditional pagination approach works.

This inequality pagination strategy is already a huge win for pagination queries using expensive joins. If everything can be calculated in the database, this is about as good as it gets without denormalization or alternatives to relational databases. Unless, of course, we have a site where an optimistic permissions strategy works well:

An iterative, optimistic permissions strategy

One feature of ACLs is that they’re hard to generically and flexibly define in fixed schemas. Sometimes, it’s easiest to allow callback functions in the application that don’t have to fit into rigid ACL architectures. And for listings where a very large proportion of items are displayable to a very large proportion of users, it can be non-optimal to use a pessimistic permissions strategy where the database vets every item before sending it to the application.

Inequality-based pagination fits well with an optimistic, iterative pagination strategy (a rough code sketch follows the list):

  1. Fetch an initial batch of rows for a page without regard to permissions. The initial batch of rows need not be equivalent to the number intended for display on a page; the system could be optimized to expect approximately 20% of records it fetches to be non-displayable to most users.
  2. Test whether each item is displayable to the current user.
  3. Render and output the displayable items.
  4. Fetch more items if the quota intended for display on the page (say, 10 items) isn’t met. Each subsequent batch from the database may increase in size as the algorithm realizes that it’s finding a low proportion of displayable content.
  5. Repeat until the quota for the page is filled.
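
A minimal sketch of that loop, assuming Drupal’s node_access() as the per-item check and the hypothetical anticipage_fetch_page() helper from the previous section:

<?php
// Optimistically fill a page with $quota displayable nodes, fetching rows in
// growing batches and filtering them with node_access() in PHP.
function anticipage_optimistic_page($created, $nid, $quota = 10) {
  $displayable = array();
  // Fetch a little more than needed; assume roughly 20% will be hidden.
  $batch_size = (int) ceil($quota * 1.2);

  while (count($displayable) < $quota) {
    $rows = anticipage_fetch_page($created, $nid, $batch_size);
    if (empty($rows)) {
      break;  // Ran out of content before the page filled up.
    }
    foreach ($rows as $row) {
      $node = node_load($row->nid);
      if (node_access('view', $node)) {
        $displayable[] = $node;
        if (count($displayable) == $quota) {
          break;
        }
      }
    }
    // Continue scanning just past the last row examined, and grow the batch
    // in case displayable content turns out to be sparse.
    $last = end($rows);
    $created = $last->created;
    $nid = $last->nid;
    $batch_size *= 2;
  }
  return $displayable;
}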

This strategy works well when a low percentage of items evenly distributed through result sets are locked away from general displayability. Fortunately, that case is quite common for large, public sites with:

  • Publishing workflows that exclude small quantities of content during the editorial process
  • Small quantities of content that need to be hidden, as Wikipedia does with legally troublesome revisions
  • Small numbers of internal documents, like documentation intended for editors

Intelligent memcached and APC interaction across a cluster

Anyone experienced with high-performance, scalable PHP development is familiar with APC and memcached. But used alone, they each have serious limitations:

APC

  • Advantages
    • Low latency
    • No need to serialize/unserialize items
    • Scales perfectly with more web servers
  • Disadvantages
    • No enforced consistency across multiple web servers
    • Cache is not shared; each web server must generate each item

memcached

  • Advantages
    • Consistent across multiple web servers
    • Cache is shared across all web servers; items only need to be generated once
  • Disadvantages
    • High latency
    • Requires serializing/unserializing items
    • Easily shards data across multiple web servers, but is still a big, shared cache

Combining the two

Traditionally, application developers simply think about consistency needs. If consistency is unnecessary (or the scope of the application is one web server), APC is great. Otherwise, memcached is the choice. There is, however, a third, hybrid option: use memcached as a coordination system for invalidation with APC as the main item cache. This functions as a loose L1/L2 cache structure. To borrow terminology from multimaster replication systems, memcached stores “tombstone” records.

The “extremely fresh” check for the APC item (see below) allows throttling hits to memcached. Even a one-second tolerance for cache incoherency massively limits the amount of traffic to the shared memcached pool.

Reading

The algorithm below may not be perfect, but I’ll revise it as I continue work on an implementation. Simplified PHP sketches follow this outline and the write/invalidate outline below.

  1. Attempt to load the item from APC:
    1. On an APC hit, check if the item is extremely fresh or recently verified as fresh against memcached. (For perfect cache coherency, the answer is always “not fresh.”)
      1. If fresh, return the item.
      2. If not fresh, check if there is a tombstone record in memcached:
        1. If there is no tombstone (or the tombstone post-dates the local item):
          1. Update the freshness timestamp on the local item.
          2. Return the local item.
        2. Otherwise, treat as an APC miss.
    2. On an APC miss, attempt to load the item from memcached:
      1. On a memcache hit:
        1. Store the item into APC.
        2. Return the item.
      2. On a soft memcache miss (the item is available but due for replacement), attempt to take out a semaphore in APC:
        1. If the APC semaphore was successful, attempt to take out a semaphore in memcached:
          1. If the memcached semaphore was successful:
            1. Write the semaphore to APC.
            2. Rebuild the cache item and write it (see below).
            3. Release the semaphore in memcached. (The semaphore in APC should clear itself very quickly.)
          2. If the memcached semaphore was unsuccessful:
            1. Copy the memcached rebuild semaphore to APC. Store this very briefly (a second or so); it is only to prevent hammering memcached for semaphore checks.
            2. Return the slightly stale item from memcache.
        2. If the APC semaphore was unsuccessful:
          1. Return the slightly stale item.
      3. On a hard memcache miss (no item available at all):
        1. Is a stampede to generate the item acceptable?
          1. If yes:
            1. Generate the item real-time.
            2. Store to the cache.
          2. If no:
            1. Use the APC/memcache semaphore system (see above) to lock regeneration of the item.
            2. If the current request cannot grab the semaphore, fail as elegantly as possible.
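
Here is a simplified PHP sketch of the read path, leaving out the semaphore branches. It assumes the standard apc_fetch()/apc_store() functions and the Memcached extension; $mc is an already-connected Memcached instance, the item wrapper format and key names are invented, and the one-second freshness tolerance is arbitrary:

<?php
// Simplified read path: APC as L1, memcached as the shared L2 plus the
// "tombstone" invalidation records.
function cache_get_hybrid($key, Memcached $mc, $tolerance = 1) {
  $local = apc_fetch($key, $hit);
  if ($hit) {
    // Extremely fresh or recently re-verified: skip memcached entirely.
    if ($local['verified'] >= time() - $tolerance) {
      return $local['data'];
    }
    // Otherwise, look for a tombstone newer than our local copy.
    $tombstone = $mc->get('tombstone:' . $key);
    if ($tombstone === FALSE || $tombstone <= $local['created']) {
      // No newer tombstone: refresh the verification stamp and use the item.
      $local['verified'] = time();
      apc_store($key, $local);
      return $local['data'];
    }
    // A newer tombstone exists: treat this as an APC miss.
  }
  // APC miss: fall back to memcached and repopulate APC on a hit.
  $data = $mc->get($key);
  if ($data !== FALSE) {
    apc_store($key, array(
      'data' => $data,
      'created' => time(),
      'verified' => time(),
    ));
    return $data;
  }
  // Hard miss: the caller regenerates the item, ideally behind the
  // semaphores described in the outline above.
  return FALSE;
}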

Writing/invalidating

  1. Write to/delete from memcached.
  2. Write to/delete from APC.
  3. Set the tombstone record in memcached. This record should persist long enough for all web servers to notice that their local cache needs to be updated.
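
And a matching sketch of the write/invalidate path, in the same hypothetical wrapper style; the 60-second tombstone lifetime is a guess at “long enough for every web server to notice”:

<?php
// Write path: shared copy first, local copy second, then the tombstone that
// tells every other web server its APC copy is stale.
function cache_set_hybrid($key, $data, Memcached $mc) {
  $now = time();
  $mc->set($key, $data);
  apc_store($key, array(
    'data' => $data,
    'created' => $now,
    'verified' => $now,
  ));
  // Any APC copy created before this timestamp must be refreshed.
  $mc->set('tombstone:' . $key, $now, 60);
}

function cache_delete_hybrid($key, Memcached $mc) {
  $mc->delete($key);
  apc_delete($key);
  $mc->set('tombstone:' . $key, time(), 60);
}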

Update from the Drupal 7 Contributed Modules Sprint

[Image: The Vancouver Planetarium. Photo by Qole Pejorian.]

chx and I gathered last week in Vancouver’s West End for a two-person performance sprint during the final code slush days, allowing us to finish several key improvements to Drupal’s database layer. Right afterward, many more people joined us for another sprint to port key modules to Drupal 7. People worked in-person, voluntarily over IRC, and involuntarily over IRC (lost passport).

I can say — without reservation — that our work was successful. We kicked off the weekend with Drupal 6 versions of Coder and Views. (Though there had been a touch of prior work on the Views port to Drupal 7’s new database layer.)

We ended the weekend with usable releases of both modules. The Coder release is already posted to its Drupal.org project page. Views work is ongoing, and I’m posting fresh tarballs throughout the day on the Four Kitchens server as work continues.

Help with testing is great! Please post bugs for Coder back to its Drupal.org project and bugs for the Views port to the #D7CX sprint’s “Views for Drupal 7” Launchpad project.

We owe thanks to NowPublic for hosting the sprint in their downtown Vancouver offices. We collaborated over Bazaar version-control branches on Canonical’s free (as in beer and speech) Launchpad service.

Need to scale Drupal on EC2? Check out Chapter Three's Mercury project

Josh Koenig from Chapter Three has made pre-release EC2 AMIs (pre-packaged virtual machine images) for Mercury, a project to combine Four Kitchens’ Drupal-derived, high-performance Pressflow with Varnish, Cache Router, and memcached. Initial results show it easily saturating an EC2 instance’s network pipe. Mercury instances directly update their Pressflow releases from the Four Kitchens Bazaar server.

Mercury is an exciting project for anyone who needs to run a high-traffic, Drupal-based site without having to configure a bunch of caching systems.
