After a bit of playing around with three different distributed version control systems, Git, Mercurial, and Bazaar, I’ve decided on Mercurial. For choices like this, especially when none have glaringly obvious problems, it can be difficult to make an objective decision. Sometimes it’s best to just go with your gut, and my gut was telling me Mercurial.

I plan on using Mercurial for new projects and converting a couple of older Subversion repositories. For some projects where I still have to use Subversion, I will be using git-svn, though. This allows me to test drive Git a bit more, too.

Also, once I realized it was pretty easy to convert from one system to another, it made the choice a lot less permanent. If I decide to try Git or Bazaar in six months, I don’t think it’ll be much of an issue to convert the Mercurial repository.

There was a bit more detail to my choice, but it’s better to explain why I didn’t pick the other options. Read on for details.

Why Not Git

The biggest sticking point was poor Windows support. Git was originally written with the core in C and held together with a mix of Perl and shell scripts. I believe the shell scripts were particularly hairy to port to non-Unix machines, like Windows. In addition, some of the C code depended on forking, execing, and piping, which are, again, not directly portable to non-Unix machines. That being said, Git is currently addressing this. One of the Google Summer of Code projects was to replace many of these scripts with more portable (and faster) C code. It’s quite possible that the upcoming 1.5.4 or 1.6.0 releases will bring native Git to Windows. I’ll be keeping my eye on this.

Update: Yes, I know it’ll run under Cygwin, but that’s not practical for most Windows users. Yes, I’m aware of Git on MSYS, but that’s currently a forked project. Once this is merged back in and I can file bugs about the Windows version, then it’ll be “officially” supported in my book.

One of the other annoyances with Git is that the repository requires periodic optimization with git-gc. I don’t know how often this is needed in day-to-day usage, but after importing a small-ish Subversion project (MAME OS X) into Git, the initial repository size was 48M. After running git gc --aggressive, the size was 12M. I don’t know what the best practices are for repository optimizations, but Wincent repacks once a month. I feel that if something like this needs to be done periodically, the tool should just find a way to do it automatically. Otherwise, I’m just gonna forget to do it and get frustrated when things are slow.

The Git devs are definitely aware of this issue, as Linus started a thread about people being unaware of the importance of git-gc just a couple months ago. So, again, this may be a non-issue in six to twelve months.

The final annoyance is that Git revisions are identified by SHA-1 hashes. Thus, if you want to do a diff between two revisions, you have to use long, user-unfriendly hashes. Coming from Subversion, where revisions are increasing integers, this is a pain. While Mercurial uses SHA-1 hashes for revisions, too, it also assigns local integers as aliases to the SHA-1 hashes. This means you can still do things like hg diff -r53 -r60. Because these are local aliases only, though, revision 53 in one repository is not necessarily the same as revision 53 in another, cloned, repository. To refer to the same revision across repositories, you’ll have to use the SHA-1 hash, which identifies the same revision in every clone. I believe Bazaar does something similar to Mercurial.
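
Here’s a toy sketch of the alias idea in Python. It is not how Mercurial is implemented; the ToyRepo class and its methods are made up purely for illustration, and a “changeset” here is just a string. It only shows why a local number can point at different changesets in different clones while the hash stays the same.

```python
import hashlib


class ToyRepo:
    """Hypothetical model: changesets are numbered in local arrival order."""

    def __init__(self):
        self.hashes = []  # position in this list = local revision number

    def add(self, content):
        """Record a changeset and return (local_number, sha1_hash)."""
        digest = hashlib.sha1(content.encode("utf-8")).hexdigest()
        if digest not in self.hashes:
            self.hashes.append(digest)
        return self.hashes.index(digest), digest

    def local_number(self, digest):
        """Look up the local integer alias for a hash."""
        return self.hashes.index(digest)


# Two clones receive the same two changesets, but in a different order.
alice, bob = ToyRepo(), ToyRepo()
_, fix_hash = alice.add("fix typo")
_, feat_hash = alice.add("add feature")
bob.add("add feature")
bob.add("fix typo")

# Same hash in both clones, different local number.
print(alice.local_number(fix_hash), bob.local_number(fix_hash))  # 0 1
```

So the integers are handy while you work in one repository, and the hash is what you’d quote when talking to someone else about a revision.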

What Git has to Offer

One of the coolest parts of Git is git-svn. It’s a bi-directional gateway between Subversion and Git, allowing you to use Git with an existing Subversion repository. This means you can get local commits and easy local branching and merging with these existing repositories. Once all your editing is complete, you can push your changes out to the Subversion repository. The other Subversion users wouldn’t even know you used Git. This is so cool that I plan on using this on a project or two. Mercurial has something similar called hgsvn, but it doesn’t seem as polished as git-svn.

Git also has stronger local branching, in my opinion. With Git, you can create a branch in your repository that is local only for you. When you push out to another repository, these local branches do not go with it. You can easily switch between branches with git-checkout. This is space efficient, since you can create many branches within one local repository.

Mercurial, in contrast, does not have the concept of local branches. Instead, you just clone your local repository to another directory. While this may seem like a waste of space, Mercurial can use hard links so that most of the space is shared between the two clones. This can even be done on Windows, though it requires NTFS and an extra package. The downside to the hard link approach: if you do a pull on one of the clones, the hard links are broken. There is a method to recreate the links, but the solution looks really hacky. Most of the time, this isn’t an issue, but for people that do lots of branching on really large projects, Git may be a better choice.
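
To see what “shared via hard links” and “broken hard links” mean at the filesystem level, here is a small Python sketch. It is not Mercurial code, and I’m not claiming this is exactly what Mercurial does during a pull. The file names are made up, and it assumes a filesystem that supports hard links. It only shows that two names can share one copy of the data, and that replacing one of them with a freshly written file ends the sharing.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.dat")
    clone = os.path.join(tmp, "clone.dat")

    with open(original, "wb") as f:
        f.write(b"x" * 1024)

    # A hard link is a second name for the same data on disk,
    # so the "clone" costs almost no extra space.
    os.link(original, clone)
    shared = os.path.samefile(original, clone)
    link_count = os.stat(original).st_nlink

    # Write a new version to a separate file and rename it into place.
    # The clone's name now points at fresh storage, so nothing is shared.
    replacement = clone + ".new"
    with open(replacement, "wb") as f:
        f.write(b"y" * 1024)
    os.replace(replacement, clone)
    still_shared = os.path.samefile(original, clone)

print(shared, link_count, still_shared)
```

On a filesystem with hard link support, the first check reports the two names as the same file and the last one does not.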

One final possible advantage that Git has over Mercurial is the content tracking vs. file tracking. I believe the promise of this is that you, the user, don’t have to tell Git when you rename a file. You just rename it, and Git will automatically track this. However, I don’t know if this is really true. If it is, I don’t know why git-mv exists. Perhaps this just provides an extra hint to Git that a rename occurred. I can’t say for sure.
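
A rough way to picture content tracking: Git names file content by a SHA-1 of the bytes (plus a small header), and the file name is not part of that hash. The blob_id function below reproduces that hash. The exact_renames helper is my own hypothetical illustration of how a tool could spot a rename after the fact, and it only handles files whose content is byte-for-byte identical, which is a big simplification.

```python
import hashlib


def blob_id(data):
    """SHA-1 that Git assigns to file content (a "blob"); the name is not hashed."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def exact_renames(before, after):
    """Hypothetical helper: pair vanished paths with new paths of identical content."""
    removed = {blob_id(data): path
               for path, data in before.items() if path not in after}
    renames = {}
    for path, data in after.items():
        if path not in before and blob_id(data) in removed:
            renames[removed[blob_id(data)]] = path
    return renames


# Two snapshots of a tiny tree: README was renamed, main.c was untouched.
before = {"README": b"hello\n", "main.c": b"int main(void) { return 0; }\n"}
after = {"README.txt": b"hello\n", "main.c": b"int main(void) { return 0; }\n"}

print(exact_renames(before, after))  # {'README': 'README.txt'}
```

Because the hash depends only on the content, the renamed file keeps the same identifier, which is what makes detecting the rename without an explicit hint possible in principle.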

Why Not Bazaar

As I mentioned in my previous article, the main strike against Bazaar is its smaller market share. DVCSs are still currently fringe enough that I don’t want to be using the fringe of the fringe.

Also, Bazaar’s command line seemed a tad slower. While it’s certainly not a useful benchmark, running bzr status on an empty repository repeatedly took about 0.21 seconds on my MacBook Pro. Mercurial’s hg status repeatedly took about 0.09 seconds. While 0.21 seconds is not slow, per se, it’s enough to explain why Mercurial has a snappier feel.
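
If you want to repeat this kind of quick-and-dirty timing yourself, something like the following Python sketch would do. The average_seconds helper is my own invention, not part of any of these tools, and it measures wall-clock time including process startup, so treat the numbers as a feel test rather than a benchmark.

```python
import subprocess
import sys
import time


def average_seconds(command, runs=10):
    """Run a command several times and return the mean wall-clock time."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(command, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        total += time.perf_counter() - start
    return total / runs


# Stand-in command so the sketch runs anywhere; swap in ["hg", "status"]
# or ["bzr", "status"] inside a repository to repeat the comparison.
baseline = average_seconds([sys.executable, "-c", "pass"], runs=3)
print(f"{baseline:.3f} s per run")
```

Timing a do-nothing command first, as above, gives a rough idea of how much of each measurement is just process startup.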

What Bazaar has to Offer

The ability to version and rename directories is something neither Git nor Mercurial has. You can rename all the files in a directory under both, but since directories are not versioned, it’s a little different. Mark Shuttleworth has a nice post about this:

The number one thing I want from a distributed version control system is robust renaming.

He definitely has a point, but I’m not sure how big of a deal this is in practice.

I also like the ability to use a remote server using just SFTP and HTTP. I don’t think you even need to have Bazaar installed on the server to get remote repositories. Git requires installation on the remote servers, and Mercurial needs to be installed, along with a custom CGI script.

Update: Mercurial does not need a CGI script. You can use it over static HTTP with the static-http URL scheme, but that is described on the wiki as “much slower, less reliable”. As they don’t even tell you how to set it up, we may as well ignore it. Also, Bazaar over HTTP is apparently quite slow. There is a smart server, but that requires another daemon outside of Apache. Also, a faster smart server is a major focus of future versions.

Bazaar has come a long way recently. Just a year ago, it was really slow with an evolving repository format. Now that it just hit 1.0 and these problems seem to be part of the past, we may be hearing a lot more of Bazaar in the future. Again, I’ll be keeping my eye on it.

More to Come

Evaluating these systems has been a bit of a diversion, but it’s actually been quite fun to learn about distributed version control. Now that I’ve decided on Mercurial, I’m sure you’ll be seeing more posts on Mercurial and distributed version control systems.