Four Kitchens blog

Drop that cron; use Hudson instead

Hudson: The butler for your cron jobs, too
For years, I used cron (sometimes anacron) without asking questions. The method was simple enough, and every project requiring cron-related capabilities documented the setup.

There is a much better way, and it involves Hudson. I introduced “Hudson for cron” as a sidebar at the Drupal Scalability and Performance Workshop a few weeks ago. To my surprise, several of the attendees remarked on their feedback questionnaires that it was one of the most valuable things they picked up that day. So, I’ve decided to write this up for everyone.

Open your Golden Gate

The masses have spoken! Four of the Four Kitchens’ web chefs will be presenting (you guessed it) four sessions at DrupalCon San Francisco 2010! Here’s the lowdown in case you want to catch any of these talks during the conference.

Day one

On Monday at 5:30pm, Todd and Aaron will present From Photoshop to Drupal Theme. The takeaway from this talk will be stressing the importance of “designing for a system” rather than just focusing on how individual pages look in Drupal.

Day two

On Tuesday we’ll have two sessions. Don’t sleep in after the first night of partying, because at 9:45am you’ll want to catch Todd’s Accelerated grid theming using NineSixty, an in-depth discussion about grid-based design in Drupal.

Then at 4:15pm, David will be talking about Performance testing The Economist Online using The Grinder, which is an open source Java-based load testing framework.

Day three

And on Wednesday at 3pm, Diana will be presenting PHP for NonProgrammers, a great introduction for those web designers who have only ever touched HTML and CSS.

The CAP theorem is like physics to airplanes: every database must design around it

Back in 2000, Eric Brewer introduced the CAP theorem, an explanation of inherent tradeoffs in distributed database design. In short: you can’t have it all. (Okay, so there’s some debate about that, but alternative theories generally introduce other caveats.)

Feast your eyes (and votes) on our web chefs' tasty DrupalCon San Francisco session proposals

Vote!

We are still 62 days away from DrupalCon San Francisco 2010, but in order to get truly excited about the event, you’ll want to start thinking about the amazing sessions you’ll attend. Beginning today, attendees are allowed to vote on those sessions they most want to see on the final schedule in April.

This year, Four Kitchens is both sponsoring DrupalCon San Francisco and offering to share our experience and knowledge with the community. If you’d like to see any of the sessions listed below, please vote! (And tell your co-workers, friends, and pets* to vote, too.)

Making Drupal and Pressflow more mundane

Drupal and Pressflow have too much magic in them, and not the good kind. On the recent Facebook webcast introducing HipHop PHP, their PHP-to-C++ converter, they broke down PHP language features into two categories: magic and mundane. The distinction is how well each capability of PHP, a dynamic language, translates to a static language like C++. “Mundane” features translate well to C++ and get a big performance boost in HipHop PHP. “Magic” features are either unsupported, like eval(), or run about as fast as today’s PHP+APC, like call_user_func_array().

Anticipage: scalable pagination, especially for ACLs

Pagination is one of the hardest problems for web applications supporting access-control lists (ACLs). Drupal and Pressflow support ACLs through the node access system.

Problems with traditional pagination

  • Because pagination uses row offsets into the results, browsing listings where newly published items get added to the beginning of the results creates “page drift.” Page drift is where a user already browsing through paginated results sees, for example, items E, D, and C on page one, waits awhile, clicks to the next page, and sees items C, B, and A. Going back to page one again shows F (newly published), E, and D. Item C “drifted” to page two while the user was reading page one. If new items are published frequently enough, pagination can become unusable due to this drifting effect.
  • Even if content and ordering are fully indexed, jumping n rows into the results remains inefficient; it scales linearly with depth into pagination.
  • Paginating sets where the content and ordering are not fully indexed is even worse, often to the point of being unusable.
  • The design is optimized around visiting arbitrary page offsets, which does not reflect user needs. Users only need to make relative jumps in pagination of up to 10 pages (or so) in either direction or to start from the end of the results. (If users are navigating results by hopping to arbitrary pages to drill down to what they need, there are other flaws in the system.)
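To make the page-drift problem concrete, here is a minimal sketch in Python with an in-memory SQLite table. It contrasts offset pagination with a cursor ("keyset") query that asks for rows older than the last one the user saw. The `node` table, the use of `id` as the sort key, and the function names are assumptions made for illustration; this is one common way to get relative jumps without drift, not necessarily the design the post goes on to describe.

```python
import sqlite3


def make_db():
    """Create a hypothetical node table holding items A..E (ids 1..5)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO node (id, title) VALUES (?, ?)",
        list(enumerate("ABCDE", start=1)),
    )
    return conn


def offset_page(conn, page, size=3):
    """Traditional pagination: skip page * size rows, newest first."""
    rows = conn.execute(
        "SELECT title FROM node ORDER BY id DESC LIMIT ? OFFSET ?",
        (size, page * size),
    )
    return [title for (title,) in rows]


def keyset_page(conn, before_id=None, size=3):
    """Cursor pagination: fetch rows strictly older than the last one seen."""
    if before_id is None:
        rows = conn.execute(
            "SELECT id, title FROM node ORDER BY id DESC LIMIT ?", (size,)
        ).fetchall()
    else:
        rows = conn.execute(
            "SELECT id, title FROM node WHERE id < ? ORDER BY id DESC LIMIT ?",
            (before_id, size),
        ).fetchall()
    titles = [title for _, title in rows]
    cursor = rows[-1][0] if rows else None  # id of the last row on this page
    return titles, cursor


if __name__ == "__main__":
    conn = make_db()
    print(offset_page(conn, 0))           # ['E', 'D', 'C']
    first, cursor = keyset_page(conn)     # same first page, plus a cursor
    conn.execute("INSERT INTO node (id, title) VALUES (6, 'F')")  # F is published
    print(offset_page(conn, 1))           # ['C', 'B', 'A']: item C drifted
    print(keyset_page(conn, cursor)[0])   # ['B', 'A']: no drift
```

Because the cursor query filters on an indexed key instead of counting rows to skip, its cost does not grow with depth into the results. The tradeoff is that it only supports relative movement, which matches the user needs described in the last bullet.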

Intelligent memcached and APC interaction across a cluster

Anyone experienced with high-performance, scalable PHP development is familiar with APC and memcached. But used alone, they each have serious limitations:

APC

  • Advantages
    • Low latency
    • No need to serialize/unserialize items
    • Scales perfectly with more web servers
  • Disadvantages
    • No enforced consistency across multiple web servers
    • Cache is not shared; each web server must generate each item

memcached

  • Advantages
    • Consistent across multiple web servers
    • Cache is shared across all web servers; items only need to be generated once
  • Disadvantages
    • High latency
    • Requires serializing/unserializing items
    • Easily shards data across multiple web servers, but is still a big, shared cache

Combining the two

Traditionally, application developers simply think about consistency needs. If consistency is unnecessary (or the scope of the application is one web server), APC is great. Otherwise, memcached is the choice. There is, however, a third, hybrid option: use memcached as a coordination system for invalidation with APC as the main item cache. This functions as a loose L1/L2 cache structure. To borrow terminology from multimaster replication systems, memcached stores “tombstone” records.
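As a rough illustration of the hybrid idea, the sketch below uses plain Python dictionaries as stand-ins for memcached (one shared store) and APC (one local cache per web server). The class names, the `build` callback, and the choice of a per-key generation counter as the "tombstone" are all assumptions made for this example; a real implementation would call the actual memcached and APC APIs and might represent tombstones differently.

```python
class SharedStore:
    """Stand-in for memcached: a single instance shared by every web server.

    It holds no cached items, only a small invalidation counter per key.
    """

    def __init__(self):
        self._tombstones = {}

    def tombstone(self, key):
        """Return the current invalidation generation for key (0 if never invalidated)."""
        return self._tombstones.get(key, 0)

    def invalidate(self, key):
        """Record that every locally cached copy of key is now stale."""
        self._tombstones[key] = self._tombstones.get(key, 0) + 1


class WebServerCache:
    """Stand-in for one web server's APC cache, coordinated through SharedStore."""

    def __init__(self, shared):
        self.shared = shared
        self.local = {}  # key -> (generation seen when cached, value)

    def get(self, key, build):
        # Read the generation before building, so a concurrent invalidation
        # leaves this entry looking stale on the next read.
        generation = self.shared.tombstone(key)
        entry = self.local.get(key)
        if entry is not None and entry[0] == generation:
            return entry[1]  # local hit, still valid
        value = build()  # regenerate on this server
        self.local[key] = (generation, value)
        return value
```

In this sketch each read still makes one round trip to the shared store, but only to fetch a tiny counter; the item itself stays in the local cache and never needs to be serialized. Each web server still generates its own copy of an item, as with APC alone, but an invalidation on one server is now seen by all of them.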

Update from the Drupal 7 Contributed Modules Sprint

The Vancouver Planetarium. Photo by Qole Pejorian.

chx and I gathered last week in Vancouver’s West End for a two-person performance sprint during the final code slush days, allowing us to finish several key improvements to Drupal’s database layer. Right afterward, many more people joined us for another sprint to port key modules to Drupal 7. People worked in-person, voluntarily over IRC, and involuntarily over IRC (lost passport).

I can say — without reservation — that our work was successful. We kicked off the weekend with Drupal 6 versions of Coder and Views. (Though there had been a touch of prior work on the Views port to Drupal 7’s new database layer.)

Why Drupal.org lacks good themes (and why CVS has nothing to do with it)

What CVS does to (some) designers

There’s been a lot of talk lately about how Drupal designers shouldn’t have to learn CVS. Nothing new to see here, really — just the same tired, self-fulfilling arguments about how much CVS sucks, how developers also hate using it, and how designers shouldn’t be expected to learn something so… technical.

Drupal does Dallas: DrupalCamp Dallas gets it right

In early August, David, Todd, and I took a road trip to Dallas, TX to attend the first ever DrupalCamp Dallas. It was a two-day event organized by several Dallas-based Drupal shops: LevelTen Interactive, Koine Media, and Tarakan Design. We had a lot of fun meeting fellow Texas Drupalers and saw some excellent presentations as well.
