Performance

Magic: Frontend Performance for all themes

Howdy perfers!

This week’s Webperf Wednesday is short and sweet, just like your page loads when you install this new module that enhances any Drupal theme. Magic is a set of frontend performance and development workflow tools for themers. Previously many themes had their own advanced settings — many of which did the same things as other themes, but they all did it a little differently — no more with Magic.

Built by Web Chef Ian Carrico and Sam Richard (of Aurora) with contributions from Sebastian Siemssen (of Omega), Magic grew out of the desire to work together to make all themes better, instead of siloing improvements within specific themes.

What’s inside?

Performance features:

  • Enhancements to CSS Aggregation
  • Exclude CSS files from being included
  • Option to move JavaScript to the footer
  • Backport of Drupal 8 JavaScript handling
  • Exclude JS files from being included

Development goodies

  • Rebuild Theme Registry on Page Reload
  • Display a Viewport Width indicator
  • Display an indicator with classes applied to the HTML. Useful when used in conjunction with Modernizr
  • Export theme settings

That last one is super important, as it makes Drupal themes a little more DRY. With Magic, you can take your settings from one theme to another — or to another site completely — because they’re fully exportable. Have two different projects, and want similar asset output despite one being Omega and one being Zen? No problem, just export!

Note: the full import process has yet to land, but it’s coming very soon.

If you have an awesome trick that you always rely on during theming, open an issue and propose it to Magic. They’d love to hear from you.

Give it a shot today! Go to drupal.org/project/magic

One less JPG

I’d like to demo a simple how-to. There are many, many techniques to make pages load faster, but this post attempts to demonstrate large gains from very small code changes.

People often build beautiful sites with multiple easy-to-use JavaScript libraries. Then, when it comes to addressing frontend performance, suddenly those libraries are an enormous download that the users are forced to bear.

Just one image

Before you go worrying about how to minify every last library or shave tests out of Modernizr, try and see if you can remove just one photo from your design. It will make a bigger difference.

Coined by Adam Sontag, the “one less JPG” idea — nay, MOVEMENT — is summed up perfectly here:

Real example

Last year we re-launched Pressflow.org. We have some mobile traffic, but it’s likely people just browsing for info, since no one has a good reason to download Pressflow onto a phone or tablet. Let’s keep their attention and make the experience fast.

We have this huge, beautiful mountain on the homepage. It’s great. But it’s also 160K. I tried making it smaller, or splitting the photo off of the background pattern, but it decreased the quality of the photo too much when I lowered the file size. We made a wonderfully small SVG logo, but that’s not an option for a photograph with this kind of detail.

How much impact does it have?

A mountain is a big thing — just like the amount of traffic Pressflow can handle — and the image we chose was meant to convey that vastness. Since it doesn’t really pack the same punch on smaller screens, why include it at all? I decided to use Modernizr and conditionally load the stylesheet that references the mountain. That way it never gets loaded by tiny screens that don’t need it.

Using the Modernizr Drupal module, I added a conditional load into the .info file of my theme:

; Load CSS with Modernizr
modernizr[Modernizr.mq('screen and (min-width: 42em)')][yep][] = css/big.css

This tells Modernizr to output a Modernizr.load() statement with the test I specified. In this case, Modernizr will only load big.css if the test is true. My test checks the width of the window using a media query — .mq() — and returns true if the screen is at least 42em, causing the CSS to be fetched. Here’s the JavaScript output:

Modernizr.load({
  test: Modernizr.mq('screen and (min-width: 42em)'),
  yep : 'http://pressflow.org/sites/all/themes/pfo/css/big.css',
});
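For readers curious what that test boils down to, here is a minimal sketch of the same idea without Modernizr.load(). It assumes a browser-like environment that provides matchMedia and a document. The helper name loadCssIf and the injected env parameter are hypothetical, introduced only so the snippet can also be exercised outside a browser.

```javascript
// Hypothetical helper (not part of Modernizr): append a stylesheet
// only when the given media query matches.
function loadCssIf(query, href, env) {
  // env is an assumption for illustration; in a browser, pass window.
  if (!env.matchMedia(query).matches) {
    return false; // small screen: the stylesheet is never requested
  }
  var link = env.document.createElement('link');
  link.rel = 'stylesheet';
  link.href = href;
  env.document.head.appendChild(link);
  return true;
}
```

In a browser, calling loadCssIf('screen and (min-width: 42em)', 'css/big.css', window) would roughly mirror the generated Modernizr.load() statement above.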

So that’s it, instant savings!

..oh what’s that? Always test your work? Thanks for keeping me honest.

Here’s some data.

I’ve got two network waterfalls here for comparison. They show a pretty stark difference following this one-line change to my code. If a screen isn’t big enough for the mountain, it’ll only take 20 HTTP requests and 193KB total. If the screen is big enough, it takes 24 HTTP requests (for the CSS and then the images inside it), totalling 384KB. That’s a savings of 191KB (almost exactly 50%) from a single change to my code. You’d have to remove 19 copies of jQuery 2.0 to achieve this kind of bandwidth savings.

(by the way, didja hear that jQuery 2.0 has small QSA-only custom builds?)

Small screens

Waterfall: Conditional load small

Big screens

Waterfall: Conditional load big

You can see in the second waterfall that the Initiator of big.css is modernizr.min.js, meaning that JavaScript loaded the file after running the test.

ThoughtContentLoaded

I hope this shows how easy it can be to reduce your page weight. There’s no need to worry about shaving bytes off JavaScript that supplies valuable functionality, as long as you know how to use it right.

If you want to know more about the conditional loading API within Modernizr, head over to yepnope.js documentation and start reading. For more Drupal-specific examples check out the official documentation for conditional loading using the Modernizr module.

Webperf Wednesday: Video Roundup

Hey, speeders! There have been some great presentation videos put up on the web recently, so I’ve got links to a couple videos this week.

Breaking the 1000ms time-to-glass barrier

Last week we checked out slides from Ilya Grigorik, so if you enjoyed those you have to check out his 45-min presentation recorded during the last SFHTML5 meetup. For mobile web developers this is required watching. He starts with the constraint that he wants to load a mobile web page in less than 1000ms, and proceeds backwards, walking through the realities of phone chips/towers, mobile network latency, and battery usage, in addition to more familiar concepts like image compression, the browser rendering stack, and reflow/repaint gotchas. Warning: you might need a cigarette after watching this video!

Steve Souders HTML5 DevConf keynote

Steve Souders delivered a great keynote at the HTML5 Developer Conference, and he makes it very clear that users want speed. One of the original powerhouses in the frontend performance world, Steve has created more resources and tools than is possible to list here, so I’ll just encourage you to check out his site if you’re not familiar with his work. His latest blog post also contains some errata from this video so if you like it please head over there to pick up the corrections.

Webperf Wednesday: Inaugural edition

I’ve been kicking around this idea for quite some time and finally sat down to get it published. Four Kitchens has always taken performance very seriously, but traditionally most people only focus on backend server performance when thinking about these types of issues. I’m not a hardcore backend dev, but I have plenty of experience making things fast and useful within a web browser.

To that end, I’d like to spread the word about frontend performance by publishing a periodic blog post series highlighting techniques, articles, and other webperf stuff that we’re talking about day-to-day in the office. So without further ado, here are this week’s links:

Building Faster Mobile Websites (.pdf)

Ilya Grigorik presents an extremely compelling case for putting speed at the top of your priority list, especially on responsive and otherwise mobile websites. He explores the realities of network latency and suggests methods that could be used to break the 1000ms “time-to-glass” barrier. It’s a low-level, deep dive into the effects of TCP/IP and HTTP requests on your site’s loading speed, and contains tons of great information to help you build fast sites. To make the most of these slides, you’ll have to work closely with backend and/or deployment people who help manage your sites, but the payoff is well worth it.

Double-whammy from Alex Sexton

Alex has recently published a couple great resources that can actually help you achieve the lofty goals in Ilya’s slides. First is a blog post entitled Deploying JavaScript Applications that talks about building a web application which matches user action patterns, relies on intelligent caching, and employs distributed deployment (like CDNs) whenever possible. He also lists some lower hanging fruit such as reducing use of images before shaving bytes of JavaScript.

Next are his slides from HTML5 Developer Conference that highlight the awesome new Modernizr v3 workflow. Very closely related to deploying JS apps, he outlines how Modernizr v3 will help you build applications which adapt to the browser loading them, allowing for tiered, optimized experiences that match each browser’s capabilities while minimizing the loading of unnecessary assets. Having helped out with the v3 work, I’m especially excited to see people begin to use it.

Prerender in Chrome for Faster Page Loads

A couple weeks back I wrote about a feature of Chromium that Ilya has helped publicize in his upcoming book, High-Performance Browser Networking. It’s a <link> tag which lets you suggest pages that Chrome should prerender in cases where your traffic funnels are extremely predictable. This is a great example of using data to accurately predict user actions rather than directly altering the load time of your pages. Although they still load at the speed of the network, the end result can reduce apparent load times to 0.0s in some cases! Check out my example which is live on Pressflow.org.

Auf Wiedersehen

I hope you enjoyed these links! As I mentioned earlier, these posts won’t always be limited to links, so check back for code snippets, full implementation walkthroughs, and other goodies in the future.

Feel free to leave feedback or topic requests in the comments, and happy perfing!

Prerender in Chrome for instant page loads

In January Ilya Grigorik published a mind-blowing article containing a preview chapter from his book High-Performance Browser Networking. The article, entitled “High Performance Networking in Google Chrome,” details all of the enhancements Google engineers have baked into Chromium to make it feel so fast. The enhancements apply to any build of Chrome, including mobile versions.

Preemptive Action

Network latency is a reality for any computer connected to the web, so one of the main points is that Chrome tries to intelligently guess what you’ll do next in order to pre-empt your actions. When this predictive behavior is correct, things appear to have happened instantaneously even though they happened at the speed of the network.

Although there are many features discussed, one of the simple enhancements highlighted is Chrome’s ability to prerender pages. The syntax is dead simple, although Ilya notes that Chrome does not guarantee this behavior in all situations. Regardless, this technique is a great way to speed up your main traffic funnels.

The article also stresses that if you misuse these tags, you incur huge performance penalties on users (the exact opposite of what we want!) so they are meant to be used carefully and only after confirming their usefulness by collecting analytics data. Do not just slap these tags onto all of your sites or you will slow them down.

When do I use this?

The short answer is: use it when you have data to confirm that it will be helpful. Check your analytics and look for funnels where a large percentage of traffic goes from one page to another. An easy place to start in Google Analytics is Audience » Visitors Flow. That’s where the screenshot below came from.

Unfortunately you can’t use prerender for session-based funnels like shopping carts, where one page depends on information from the previous page. But for sites that feature one prominent CTA, such as a download page, this is a great enhancement.

Google themselves use prerendering for the top search result when you use the Google Instant version of their search engine. When you scan the results and end up clicking the first one, sometimes the page loads surprisingly fast. That’s prerender at work.

An example on Pressflow.org

Pressflow.org is a simple informational site with just a few pages. When this article came out, I looked at our analytics and decided it would be a great site to test this type of performance enhancement.

Looking at the following screenshot, you can see that an overwhelming majority of our traffic hits the homepage, then hits FAQs, then maybe goes elsewhere. Look at the proportion of traffic leading to the 1st Interaction column, where most traffic is clearly proceeding from the homepage to /faq versus the handful of hits leading to other pages:

Screenshot of Google Analytics visitors flow for Pressflow.org

Since this traffic pattern is a prime candidate for a prerender tag, I added one line of code to the homepage of the site:

<link rel="prerender" href="http://pressflow.org/faq" />

This tag suggests that Chrome should prerender the FAQs page before the user requests it, causing the FAQs to load instantaneously from the user’s perspective. Notice I used the word suggests because in Ilya’s article he makes it clear that the tag’s presence does not guarantee the page will prerender. There are other factors involved.

Assuming a prerender does occur, it still loads the page like any other, requesting uncached files at the speed of the network, assembling the page, executing JavaScript, and so forth. But it happens in the background, so when a user finally clicks FAQs, boom. The page is ready.

Of course, this process only works if you can serve pages pretty quickly, so it helps that our example site serves anonymous page loads from Varnish instead of Apache/PHP. But all of this added together makes for an instantaneous feel as you browse to FAQs.
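To make the "only when the data says so" rule concrete, here is a rough sketch of how the decision could be automated. The flow numbers are invented for illustration (they are not Pressflow’s analytics), and the function names and the minShare threshold are assumptions rather than anything from Chrome or Google Analytics.

```javascript
// Hypothetical visitor-flow data: next page -> visits arriving from the homepage.
// These numbers are made up for the example.
var flowFromHome = { '/faq': 820, '/download': 95, '/about': 40 };

// Return the next page worth prerendering, or null if no page dominates.
// minShare is an assumed threshold; tune it against your own data.
function pickPrerenderTarget(flow, minShare) {
  var total = 0;
  var best = null;
  Object.keys(flow).forEach(function (path) {
    total += flow[path];
    if (best === null || flow[path] > flow[best]) {
      best = path;
    }
  });
  if (best === null || total === 0) {
    return null;
  }
  return flow[best] / total >= minShare ? best : null;
}

// Build the tag only when the data supports it; otherwise emit nothing.
function prerenderTag(origin, flow, minShare) {
  var target = pickPrerenderTarget(flow, minShare);
  return target ? '<link rel="prerender" href="' + origin + target + '" />' : '';
}
```

With the sample data and a threshold of 0.6, prerenderTag('http://pressflow.org', flowFromHome, 0.6) produces the same tag shown earlier. A page with evenly split traffic produces an empty string, which is the safe default.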

More goodies

There are more tags that carry out other tasks such as DNS prefetching, which has been codified into a web standard and implemented in many browsers. The Chrome-specific ones seem valuable enough that it probably won’t be too long until they are also adopted by other browsers. Just remember to test, test, test, and only implement when you have data that says you should.

I encourage you to read the entire preview chapter on Ilya’s website and perhaps even buy his book.

Minified JavaScript, on the Fly!

As web applications become richer and more complicated, the amount of JavaScript running them increases. More code means longer download times, which means more waiting before your application or web site is usable. Thankfully there’s an easy solution that’s already widely used in the web development community: minification. Minification strips whitespace and renames variables to produce a smaller download.

Removing whitespace is pretty straightforward, but renaming variables is a more abstract concept. With tools like UglifyJS this is handled for you but here’s an example of what’s going on. Let’s look at this simple function that takes a parameter and alerts a message with that variable:

var sayHi = function(name) {
  alert('Hi ' + name + '!');
};

With script minification the above code will be changed to this:

var sayHi=function(a){alert("Hi "+a+"!")};

without altering the behavior of the function. As you can see, all whitespace that can be removed has been and the variable “name” was changed to “a”. The end result is the removal of 26 characters from the code. It’s not hard to see that the savings from minification will quickly add up!

You can take this one step further by wrapping your code in immediately-invoked function expressions (IIFEs). That’s a pretty scary-looking name, but chances are you’ve seen these before. Here is an example:

(function($) {
  $('h1').css('text-decoration', 'blink');
}(jQuery));

The function wrapped in parentheses is immediately invoked with jQuery as its argument. Within the function jQuery is accessed with the $ variable. Abstracting variables in IIFEs is not required, though. Consider this case:

(function(Drupal) {
  Drupal.behaviors.sayHi = {
    attach: function(name) {
      alert('Hi ' + name + '!');
    }
  };
}(Drupal));

Minifying this code produces the following:

(function(a){a.behaviors.sayHi={attach:function(a){alert("Hi "+a+"!")}}})(Drupal);

Keeping the “Drupal” variable in your script makes sense because that’s the namespace you’re used to accessing it with and it does have meaning in the code, but by including it as an argument to the IIFE UglifyJS will replace any instances of “Drupal” with a shortened variable.
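A quick way to convince yourself that the renaming is safe is to run both versions side by side. The sketch below reuses the two snippets above unchanged. The Drupal object and the alert() function are stand-ins (assumptions for illustration) so that it can run outside Drupal and outside a browser.

```javascript
// Stand-ins so the snippet runs anywhere (assumptions for illustration only).
var Drupal = { behaviors: {} };
var shown = [];
function alert(message) { shown.push(message); }

// Readable version from above.
(function(Drupal) {
  Drupal.behaviors.sayHi = {
    attach: function(name) {
      alert('Hi ' + name + '!');
    }
  };
}(Drupal));
Drupal.behaviors.sayHi.attach('Ada');

// Minified version from above; it overwrites the same behavior.
(function(a){a.behaviors.sayHi={attach:function(a){alert("Hi "+a+"!")}}})(Drupal);
Drupal.behaviors.sayHi.attach('Ada');

// shown now holds the same message twice: the rename changed nothing.
```

Note that the inner function’s parameter is also renamed to "a". That is fine because the inner function never refers to Drupal, so shadowing the outer "a" has no effect.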

What about Drupal?

So, this is pretty cool, right? Unfortunately Drupal 7 doesn’t provide minification out of the box. Thankfully, there are contrib solutions to this problem! The Speedy module provides pre-minified scripts, but unfortunately it needs to be re-released every time core is updated; otherwise you have to re-minify the files yourself on each core update.

I wasn’t completely satisfied with this approach so I built uglify.me and a companion Drupal module, UglifyJS, to do script minification on the fly. uglify.me isn’t the only “minifier-as-a-service” out there, but I wanted to be able to throw something up quickly and be fairly confident that it wouldn’t upset some other poor developer out there hosting their own service.

The uglify.me service accepts POST requests of un-minified JavaScript and returns the minified version. Simple! The UglifyJS module provides an API to expose Drupal scripts that should be minified:

/**
 * Implements hook_uglifyjs_info().
 */
function mymodule_uglifyjs_info() {
  return array(
    drupal_get_path('module', 'mymodule') . '/js/mymodule.js',
  );
}

The downside to this approach is that each script will create another request to the uglify.me service when the site cache is cold. If you expose a lot of scripts to the UglifyJS API this will be time consuming and could cause timeouts.

If you’re running Pressflow 7.20.1+ or apply this patch to core and you have JavaScript concatenation enabled (which you should, if you’re in production) the UglifyJS module will automatically minify the concatenated scripts. This greatly reduces the number of requests to the web service and overhead associated with minification.

Caveats

The biggest issue that I’ve encountered with this module is the requests made to the external web service. If there’s a problem connecting to the remote server the time spent waiting for the response is wasted and can cause timeouts on cold caches. This is mitigated if you’re using Pressflow 7.20.1+ since core will not request files to be rebuilt if the hash of the concatenated scripts did not change.
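The "only rebuild when the hash changes" idea can be sketched in a few lines. This is not the module’s or Pressflow’s actual code: the in-memory cache, the choice of SHA-256, and the minifyCached name are all assumptions, and the minifier is passed in as a function so that the expensive remote call can be swapped for anything.

```javascript
var crypto = require('crypto');

// Hypothetical cache: content hash -> minified output.
var minifiedByHash = {};

// minify is injected; in the setup described above it would be the
// request to the remote service.
function minifyCached(source, minify) {
  var hash = crypto.createHash('sha256').update(source).digest('hex');
  if (!Object.prototype.hasOwnProperty.call(minifiedByHash, hash)) {
    // Only reached when this exact content has not been seen before.
    minifiedByHash[hash] = minify(source);
  }
  return minifiedByHash[hash];
}
```

Calling minifyCached() twice with identical source invokes the minifier once, which is the property that keeps an unreachable remote server from stalling every cache rebuild.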

The uglify.me service originally stripped some header comments from scripts, which could remove copyright or license information if present. As of uglify.me v0.1.0, any comment containing the words “license” or “copyright” (case insensitive), or the common build tags “@preserve” and “@cc_on”, will not be stripped from the source code. This is controlled by a regular expression, so if you find a general case that’s not met by the current regular expression, open a pull request to fix it!
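For a feel of what such a rule looks like, here is a minimal sketch. The pattern below is an assumption reconstructed from the description above, not the regular expression the service actually uses, and shouldKeepComment is a hypothetical name.

```javascript
// Assumed pattern, NOT the service's actual regular expression.
var KEEP_COMMENT = /license|copyright|@preserve|@cc_on/i;

// Decide whether a comment should survive minification.
function shouldKeepComment(commentText) {
  return KEEP_COMMENT.test(commentText);
}
```

Under this sketch a header such as "/*! Released under the MIT License */" is kept, while an ordinary "// TODO" comment is dropped.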

Conclusion

Script minification is a great way to reduce your site’s download footprint and increase usability. If bandwidth and performance are concerns you should be minifying, regardless of your ultimate solution. Informal testing on the recently released Full Plate Living site showed reduction in the front page’s download weight by an average of 110KB*. Not monumental, but not too bad either.

Happy minifying!

* Note: The UglifyJS module is not currently in use on the production Full Plate Living site, but it likely will be soon!

Inlining one-use JavaScript

Everyone does it.

There’s a piece of JavaScript that will only be used on one page, perhaps to provide some unique interactivity. It’s probably attached to a View or maybe a unique node ID. It’s so easy to toss in a drupal_add_js() and move on, or worse, throw the code in your theme. Wouldn’t it be nice if you could inline all these one-use scripts and make them appear only on the page where they’re needed?

Inline on the Fly

Here’s an easy way to inline scripts without losing the ability to edit them easily. We don’t want our code sitting in a PHP string so we create and maintain a real JS file, and use file_get_contents() to grab it whenever the appropriate page is built.

  // Ensure this JS ends up inline at the bottom of the page
  $options = array(
    'scope' => 'footer',
    'type' => 'inline',
  );
 
  // JS lives in its own file but is included inline when page renders
  $js_code = file_get_contents(drupal_get_path('module','my_module').'/my_code.js');
 
  // Add JS to page
  drupal_add_js($js_code, $options);

Optimized pages + organized code

I often find myself using hook_views_post_build() to apply this behavior when a specific Views display needs some custom JS to function properly. That way I don’t have to worry about the path, it just works anytime this View is used.

Avid Features users know it’s much more maintainable to keep the JS in its own file next to the View instead of stuffing it in a Views footer, or worse: tossing vital code for components into the theme’s “main” (read: only) JS file. Putting code in a theme file can seem swell until you copy a Feature for use in another project and just can’t figure out where that JavaScript went.

/**
 * Implements hook_views_post_build().
 */
function my_feature_views_post_build(&$view) {
  $has_run = &drupal_static(__FUNCTION__);
 
  if (!$has_run) {
    switch ($view->name) {
      // Check for the relevant View(s)
      case 'my_view':
        // Check for the relevant display(s)
        if ($view->current_display == 'my_block') {
          // Ensure this JS ends up inline at the bottom of the page
          $options = array(
            'scope' => 'footer',
            'type' => 'inline',
          );
 
          // JS lives in its own file but is included inline when page renders
          $js_code = file_get_contents(drupal_get_path('module','my_feature').'/my_code.js');
 
          // Add JS to page
          drupal_add_js($js_code, $options);
          $has_run = TRUE;
        }
        break;
    }
  }
}

Performance

Inlining a script avoids an HTTP request and is great for frontend performance. However, if you have a page that is uncached and hit continuously, adding disk reads won’t be so great for the actual server’s performance. You can see in the second example there’s a reference to drupal_static(). This is a good way to avoid running a slow Drupal hook more than once per page request. Always make sure to cache the outcome of functions like this one in order to avoid too many disk reads.

Case study: Big gains from small changes

One of our clients came to us with a performance issue on their Drupal 6 multi-site installation. Views were taking ages to save, the admin pages seemed unnecessarily sluggish, and clearing the cache put the site in danger of going down. They reported that the issue was most noticeable in their state-of-the-art hosting environment, yet was not reproducible on a local copy of the site — a baffling scenario as their 8 web heads and 2 database servers were mostly idle while the site struggled along.

Our performance analysis revealed two major issues. After implementing fixes, we saw the average time to save a Drupal view drop from 2 minutes 20 seconds to 4.6 seconds — a massive improvement. Likewise, the time to load the homepage on a warm cache improved from 2.3 seconds to 621 milliseconds. The two bugs that accounted for these huge gains turned out to be very interesting:

1. Intermediary system causes MySQL queries to slow down

Simple queries that are well indexed and cached can see significant lag when delivering packets through an intermediary. This actually has nothing to do with Drupal, as it is reproducible from the MySQL command line utility. (It’s probably a bug in the MySQL libraries but we’re not entirely sure.) It could also be a problem with the intermediary but we’ve reproduced it in two fairly different systems: F5’s load balancer proxy and VMware Fusion’s network drivers/stack.

For example:

SELECT cid, data, created, expire, serialized FROM cache_menu WHERE cid IN (x)

A query like this one should execute in a millisecond or less. In our client’s case, however, we found that 40ms was being added to the query time. The strange part is that this extra delay only occurred when the size of the data payload returned was above a certain threshold, so most of the time, similar queries returned quickly, but around 10–20 of these simple queries had 40ms or more added to their execution time, resulting in significant slowdowns.

We briefly debugged the MySQL process and found it to be waiting for data. Unfortunately, we didn’t pursue this much further as the simple workaround was apparent: reroute the MySQL traffic directly to the database instead of through the F5 load balancer. The net change from applying this simple modification is that the time to save a view was reduced to 25.3 seconds.

2. Database prefixing isn’t designed to scale as the number of prefixes increases

Drupal can be configured to share a database with another system or Drupal install. To do this, it uses a function called db_prefix_tables() to add prefixes to table names so they don’t collide with other applications’ table names. Our client was using the table prefixing system to allow various sub-sites to share data such as views and nodes, and thus they had 151 entries in the db_prefixes list.

The problem is that db_prefix_tables() is not well optimized for this implementation edge case. It will run an internal PHP function called strtr() (string translate) for each prefix, on every database query string. In our case, saving a view executed over 9200 queries, meaning strtr() was called more than 1.4 million times!

We created a fix using preg_replace_callback() which resulted in both the number of calls and execution time dropping dramatically. Our view could now be saved in a mere 10.3 seconds. The patch is awaiting review in the Drupal issue queue, and there’s a patch for Pressflow 6, too, in case someone needs it before it lands in core.
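The difference between the two approaches is easier to see side by side. The sketch below is a JavaScript analog, not the actual patch: it assumes queries wrap table names in curly braces (as Drupal 6 queries do), and the prefix map and function names are hypothetical. With 151 prefixes and roughly 9,200 queries, the per-prefix approach performs about 151 × 9,200 ≈ 1.39 million replacement passes, while the single-pass approach scans each query once.

```javascript
// Hypothetical prefix map: table name -> prefix; "default" covers all other tables.
var prefixes = { 'default': 'main_', views_view: 'shared_', node: 'shared_' };

// Per-prefix approach: one replacement pass over the query for every entry,
// so the work per query grows with the size of the prefix list.
function prefixPerEntry(sql, map) {
  Object.keys(map).forEach(function (table) {
    if (table !== 'default') {
      sql = sql.split('{' + table + '}').join(map[table] + table);
    }
  });
  // Whatever is still wrapped in braces gets the default prefix.
  return sql.replace(/\{/g, map['default']).replace(/\}/g, '');
}

// Single-pass approach: one regex scan, with a callback that looks up
// the prefix for each table name it encounters.
function prefixSinglePass(sql, map) {
  return sql.replace(/\{(\w+)\}/g, function (match, table) {
    var known = table !== 'default' &&
      Object.prototype.hasOwnProperty.call(map, table);
    return (known ? map[table] : map['default']) + table;
  });
}

var query = 'SELECT n.nid FROM {node} n JOIN {users} u ON n.uid = u.uid';
```

Both functions turn the sample query into one that reads from shared_node and main_users. The sketch ignores edge cases such as braces inside string literals, which a production fix would need to consider.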

The final tweaks included disabling the administration menu and the query logger. At that point, we finally reached a much more palatable 4.6 seconds for saving a view — still not as fast as it could be, but given the large number of views in the shared system, a respectable figure.

The CAP theorem is like physics to airplanes: every database must design around it

Back in 2000, Eric Brewer introduced the CAP theorem, an explanation of inherent tradeoffs in distributed database design. In short: you can’t have it all. (Okay, so there’s some debate about that, but alternative theories generally introduce other caveats.)

On Twitter, I recently critiqued a presentation by Bryan Fink on the Riak system for claiming that Riak is “influenced” by CAP. This sparked a short conversation with Justin Sheehy, also from the project. 140 characters isn’t enough to explain my objection in depth, so I’m taking it here.

While I give Riak credit for having a great architecture and pushing innovation in the NoSQL (non-relational database) space, it can no more claim to be “influenced” by CAP than an airplane design can claim influence from physics. Like physics to an airplane, CAP lays out the rules for distributed databases. With that reality in mind, a distributed database designed without regard for CAP is like an airplane designed without regard for physics. So, claiming unique influence from CAP is tantamount to claiming competing systems have a dangerous disconnect with reality. Or, to carry on the analogy, it’s like Boeing making a claim that their plane designs are uniquely influenced by physics.

But we all know Airbus designs their planes with physics in mind, too, even if they pick different tradeoffs compared to Boeing. And traditional databases were influenced by CAP and its ancestors, like BASE (warning: PDF) and Bayou from Xerox PARC. CAP says “pick two.” And they did: generally C and P. This traditional — and inflexible — design of picking only one point on the CAP triangle for a database system doesn’t indicate lack of influence.

What Riak actually does is quite novel: it allows operation at more than one point on the triangle of CAP tradeoffs. This is valuable because applications value different parts of CAP for different types of data or operations on data.

For example, a banking application may value availability for viewing bank balances. Lots of transactions happen asynchronously in the real world, so a slightly outdated balance is probably better than refusing any access if there’s a net split between data centers.

In contrast, transferring from one account to another of the same person at the same bank (say, checking to savings) generally happens synchronously. A bank would rather enforce consistency above availability. If there’s a net split, they’d rather disable transfers than have one go awry or, worse, invite fraud.

A system like Riak allows making these compromises within a single system. Something like MySQL NDB, which always enforces consistency, would either unnecessarily take down balance viewing during a net split or require use of a second storage system to provide the desired account-viewing functionality.
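The banking example can be written down as a per-operation policy. This is purely an illustration of the tradeoff and has nothing to do with Riak’s actual API; the operation names, the policy table, and the handle() function are all hypothetical.

```javascript
// Hypothetical policy: which property each operation favors during a net split.
var policy = { viewBalance: 'availability', transfer: 'consistency' };

// Decide how to respond to an operation, given whether a partition is in effect.
function handle(operation, partitioned) {
  if (!partitioned) {
    return 'ok';
  }
  return policy[operation] === 'availability'
    ? 'ok (possibly stale)'
    : 'refused until the partition heals';
}
```

During a net split, viewing a balance still succeeds with possibly stale data, while a transfer is refused. A store fixed at a single point on the triangle would have to apply one of those answers to both operations.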

Making Drupal and Pressflow more mundane

Drupal and Pressflow have too much magic in them, and not the good kind. On the recent Facebook webcast introducing HipHop PHP, their PHP-to-C++ converter, they broke down PHP language features into two categories: magic and mundane. The distinction is how well each capability of PHP, a dynamic language, translates to a static language like C++. “Mundane” features translate well to C++ and get a big performance boost in HipHop PHP. “Magic” features are either unsupported, like eval(), or run about as fast as today’s PHP+APC, like call_user_func_array().

Mundane

  • If/else control blocks
  • Normal function calls
  • Array operations
  • …and most other common operations

Magic

  • eval()
  • call_user_func_array()
  • Code causing side-effects that depends on conditions like function existence
  • Includes within function bodies
  • Other PHP-isms that make Java and C++ developers cringe

How Drupal and Pressflow can run better (or at all) on HipHop PHP

Prelinking

Currently, we invoke hooks using “magic” (though still HipHop-supported) calls to call_user_func_array(). We don’t have to do that; we could be “prelinking” hook invocations by generating the right PHP for the set of enabled modules. If we generate the right PHP here, HipHop can link the function calls during compilation.

This sort of “prelinking” also cleans up profiling results, making it easier to trace function calls through hooks in tools like KCacheGrind.

Compatibility break? Nope, it should be possible to replace the guts of module_invoke_all() with appropriate branching and calls to the generated PHP.
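The shape of the idea can be shown with a JavaScript analog. JavaScript has no compile step comparable to HipHop’s, so this only contrasts looking up implementations by name on every call with resolving them once up front. The registry, the module list, and the function names are all hypothetical.

```javascript
// Hypothetical "modules" implementing a hook named "menu".
var registry = {
  node_menu: function () { return ['node/add']; },
  user_menu: function () { return ['user/login']; }
};
var enabledModules = ['node', 'user', 'search']; // search has no menu hook

// Dynamic style: build each function name at call time and look it up.
function invokeAllDynamic(hook) {
  var results = [];
  enabledModules.forEach(function (mod) {
    var fn = registry[mod + '_' + hook];
    if (typeof fn === 'function') {
      results = results.concat(fn());
    }
  });
  return results;
}

// "Prelinked" style: resolve the implementations once for the enabled
// modules, then call the resolved functions directly on each invocation.
function prelink(hook) {
  var impls = enabledModules
    .map(function (mod) { return registry[mod + '_' + hook]; })
    .filter(function (fn) { return typeof fn === 'function'; });
  return function () {
    return impls.reduce(function (acc, fn) { return acc.concat(fn()); }, []);
  };
}
```

Both styles return the same results for the "menu" hook. The difference is that the prelinked invoker no longer builds or looks up names per call, which is the part a static compiler can link ahead of time.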

Including files statically

Drupal 6 introduced an optimization to dynamically load files based on which menu path a user is visiting. This won’t fly in HipHop; it’s simply not supported. Fortunately, this is easy to work around: we can either drop the feature (shared hosters without APC are already booing me) or we could, like in the prelinking example, generate a big, static includes file (which is itself included on HipHop-based systems) that includes all possible page callback handlers based on the hook_menu() entries. Sites that include the static includes file would skip the dynamic includes at runtime.

Compatibility break? None, assuming we take the approach I describe above.

Death to eval()

Like dynamic includes, eval() is unsupported on HipHop. Drupal has already relegated core use of eval() to an isolated module, which is great for security. eval() is pretty bad in general: PHP+APC doesn’t support opcode caching for it, so serious code can’t run in eval() sanely. Unfortunately, using the PHP module to allow controlling block display remains quite popular.

We have a few options here:

  • Drop the feature (ouch!)
  • Provide a richer interface for controlling block display, including support for modules to hook in and provide their own extended options
  • Pump out the PHP to functions in a real file, include that, and call those functions to control block display

Compatibility break? Yes, on all but the third option (writing out a PHP file).

Migrate performance-intensive code to C++

I’m looking at you, drupal_render().

This opportunity is exciting. Without the cruft of Zend’s extension framework, we can migrate performance-critical code paths in core to C++ and make use of STL and Boost, two of the most respected libraries in terms of predictable memory usage and algorithm running time.

Compatibility break? There’s no reason to have one, but keeping C++ and PHP behaviors consistent will be a serious challenge.

The takeaway

  • Use real, file-based PHP, avoiding dynamic language features.
  • Profile the system to find the biggest wins versus development cost for migrating core functionality to C++.

I’ll be presenting the “Ultimate PHP Stack” for large-scale applications at PHP TEK-X. Zend PHP, Quercus, and HipHop PHP (source code release pending) will all be contenders.
