David Strauss
April 20, 2009


A discussion recently arose on the Bazaar mailing list asking, “Why isn’t rebase support in core?” Rebase support is currently packaged as a plugin. This plugin is widely distributed, even in the standard Mac OS X installation bundle.

There are boring reasons that rebase support isn’t in core, like the lack of strong test coverage. More interesting are questions about the necessity of rebasing in typical workflows.

What is rebasing, and why should I care?

In large projects, there’s a mainline branch representing the current, global, coordinated development. In Drupal’s case, this is CVS HEAD. This mainline might not always be in perfect condition, but there’s a general sense that the mainline is not a sandbox for untested changes. Many changes are small enough that the developers simply work on and test a patch, but this workflow is inadequate for larger development projects like Fields in Core. Such large features require their own branch for development, a feature branch.

A feature branch allows development of a feature in isolation from the mainline but with the eventual intent of merging the changes back into the mainline. Because feature branches are created to foster long-term, divergent development from the mainline, it’s common for both feature development and mainline development to happen in parallel. This parallel development creates a problem: How do developers on the feature branch prepare for the eventual re-integration of their feature code into the mainline?

There are a few options:

  • Don’t sync changes. This makes merging the feature back into the mainline painful. It also defeats the purpose of developing and testing the feature in isolation, because merging two tested (but divergent) branches often results in one broken (but converged) branch.
  • Merge the feature into the mainline before making any changes to the mainline and then re-branch for more feature work after making mainline changes. Merging an untested or incomplete feature into the mainline makes this option unattractive and impractical. This option is so silly, I only included it for completeness.
  • Periodically update the feature branch from the mainline. This is ideal because the feature branch continually answers the question “What if we merged this feature into the mainline?” and is ready for quick merging into the mainline without any disruption to mainline work.

The third option is the only practical one. But how should it work? What should the feature branch history look like after syncing from the mainline?

Back to rebasing…

Rebasing integrates the updates to the mainline as ancestors of the changes on the feature branch. The commit history is reorganized (read: rebased) as if the feature branch were freshly created from the mainline and all work were done on top of that. There are many theoretical objections to rebasing, and I won’t rehash them here. There’s general consensus that rebasing is sort of icky.
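
To make the difference concrete, here is a toy sketch in Python. It is not Bazaar code: the Rev class, the revision names, and the merge and rebase helpers are all invented for illustration, and a "revision" here is just an id plus its parents. It only aims to show that a merge adds one new revision on top of the existing ones, while a rebase replaces the feature revisions with new copies.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rev:
    """Hypothetical stand-in for a revision: an id and its parent revisions."""
    rev_id: str
    parents: Tuple["Rev", ...] = ()


def ancestry(tip):
    """Return the set of revision ids reachable from tip, including tip."""
    seen = set()
    stack = [tip]
    while stack:
        rev = stack.pop()
        if rev.rev_id not in seen:
            seen.add(rev.rev_id)
            stack.extend(rev.parents)
    return seen


def merge(feature_tip, mainline_tip, rev_id):
    """A merge is one new revision with two parents; nothing is rewritten."""
    return Rev(rev_id, (feature_tip, mainline_tip))


def rebase(feature_revs, new_base):
    """Replay feature revisions (oldest first) on new_base as NEW revisions."""
    tip = new_base
    for rev in feature_revs:
        tip = Rev(rev.rev_id + "'", (tip,))  # the copy gets a new identity
    return tip


# Mainline: m1 <- m2 <- m3. Feature branched at m1: f1 <- f2.
m1 = Rev("m1")
m2 = Rev("m2", (m1,))
m3 = Rev("m3", (m2,))
f1 = Rev("f1", (m1,))
f2 = Rev("f2", (f1,))

merged = merge(f2, m3, "merge-1")
rebased = rebase([f1, f2], m3)

print(sorted(ancestry(merged)))   # ['f1', 'f2', 'm1', 'm2', 'm3', 'merge-1']
print(sorted(ancestry(rebased)))  # ["f1'", "f2'", 'm1', 'm2', 'm3']
```

In the merged history, f1 and f2 are still there. In the rebased history, they have been swapped for f1' and f2', and anyone who already has the originals now has a different history than I do.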

I find that many rebase users use the tool because they’re not aware of better workflows. I’ll address each (supposed) reason to use rebase in its own section.

“I want to keep my feature branch updated from the mainline.”

The better choice is to run bzr merge [mainline] on the feature branch and commit the result. The merge updates the common ancestry between the feature and mainline branches so that the feature branch includes the latest changes from the mainline and is ready for smooth merging back into the mainline.

“I want to view only the revisions that make up the feature I’ve been working on.”

With a rebase, it’s reasonably clear which revisions constitute the feature work: they’re the top ones. But rebasing is not the best choice for reviewing this list. Run bzr missing --mine-only [mainline] from the feature branch, and Bazaar will output all the feature branch’s unique revisions without mangling the actual history (the way rebasing does).
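
Conceptually, the question being answered is “which revisions are in my ancestry but not in the mainline’s?” The sketch below models that as a set difference over a hand-written parent map. The graph, the revision names, and the mine_only helper are assumptions made up for this example; it is not how Bazaar implements the command, and it does not try to reproduce Bazaar’s exact output.

```python
def ancestry(parents, tip):
    """Return every revision id reachable from tip via the parent map."""
    seen = set()
    stack = [tip]
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(parents.get(rev, ()))
    return seen


def mine_only(parents, feature_tip, mainline_tip):
    """Revisions in the feature's ancestry that the mainline does not have."""
    return ancestry(parents, feature_tip) - ancestry(parents, mainline_tip)


# Hypothetical history: the feature branched at m1, merged the mainline
# once (at m2), and then continued with f3. The mainline has moved on to m3.
parents = {
    "m1": [],
    "m2": ["m1"],
    "m3": ["m2"],
    "f1": ["m1"],
    "f2": ["f1"],
    "merge-1": ["f2", "m2"],
    "f3": ["merge-1"],
}

print(sorted(mine_only(parents, "f3", "m3")))
# ['f1', 'f2', 'f3', 'merge-1']
```

The merge from the mainline does not get in the way: the mainline revisions it brought in drop out of the difference, and no history had to be rewritten to get the list.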

“I want a human-readable summary of how merging the feature into the mainline will affect the code.”

For background, a rebase user would run a diff from the oldest feature-specific commit to the latest commit, but there’s a better way. Instead, run bzr diff --old=[mainline] on the feature branch, and Bazaar will provide the net diff for merging the feature into the mainline. Don’t use this diff for anything but human review; to actually integrate the feature, you should still run bzr merge in the mainline branch so that all history is preserved.

Creating a merge directive with bzr send provides the same human-readable diff as the method above, but a merge directive also includes all the binary data Bazaar needs to perform a history-preserving merge.

“I want to maintain a patch set on top of the mainline.”

Rebasing commits is an ugly way to do this because you don’t retain your own history of work on each patch or the history of how rebasing has changed each patch. Bazaar has a plugin called “Looms” that provides direct support for a much better patch set workflow. I’m a touch skeptical of Looms’ stability, so I just do what Looms does under the hood: maintain multiple branches, each derived (branched) from the one below. Each branch represents a patch. This method retains full, original history, including any changes I’ve made to the patches. When the mainline updates, I simply merge the mainline changes up through my patches.
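
Here is a rough sketch of that “merge up through the stack” idea, again as a toy model rather than anything Bazaar-specific. Each branch is reduced to the set of revision ids it contains; the branch names, revision names, and the merge_into helper are hypothetical.

```python
def merge_into(target, source, label):
    """Merge source into target (in place).

    The target gains whatever revisions it was missing, plus one merge
    revision named by label. Nothing already in the target is rewritten.
    """
    incoming = source - target
    if incoming:
        target |= incoming
        target.add(label)
    return incoming


# A two-patch stack: patch-a is branched from the mainline,
# and patch-b is branched from patch-a.
mainline = {"m1"}
patch_a = {"m1", "a1"}
patch_b = {"m1", "a1", "b1"}

# The mainline moves forward.
mainline.add("m2")

# Merge upward: mainline into patch-a, then patch-a into patch-b.
below = mainline
for name, branch in [("patch-a", patch_a), ("patch-b", patch_b)]:
    merge_into(branch, below, f"merge-{name}")
    below = branch

print(sorted(patch_a))  # ['a1', 'm1', 'm2', 'merge-patch-a']
print(sorted(patch_b))  # ['a1', 'b1', 'm1', 'm2', 'merge-patch-a', 'merge-patch-b']
```

Every patch branch still contains its original revisions (a1, b1) after the update, which is exactly what rebasing a patch set would throw away.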

“I want to clean up my commit history prior to submitting my changes to the mainline.”

Rebasing may group the feature commits, but it doesn’t make them coherent or pretty. It’s more effective to do the following:

  1. Run bzr merge [mainline] on the feature branch and commit the result.
  2. Use bzr diff --old=[mainline] on the feature branch to create a net diff.
  3. Get a fresh branch from the mainline.
  4. Apply the net diff as a patch.
  5. Shelve all changes.
  6. Work through unshelving the changes and committing them to create a coherent, pretty history.
  7. Create a merge directive using bzr send.
  8. Submit the merge directive.

“[Your reason here]”

I’d like to hear from users of any distributed version-control system why they use “rebase” in their workflows, even if their reason is one I’ve discussed above.
