Mercurial at Unity – Na'Tosha Bard

DISCLAIMER: This blog post is now several years old and it should not be used as a source of current advice. The landscape of version control, and DVCS in particular, has changed a lot in recent years.

Not too long ago, @CorporateShark asked for a blog post about Unity’s experience using Mercurial for version control. This is that blog post.

NOTE: Throughout this post, I use several terms that assume knowledge of DVCS in general and some Mercurial-specific terminology.

A Bit of History

For those who are interested, I’ll give the history of Mercurial at Unity; if you’re not interested in my trip down memory lane, feel free to skip this bit and go on to the sections about what we like and don’t like.

Unity started using Mercurial for version control in early 2011. We had been using Subversion, and knew we wanted “something” better — specifically something that would support branching and (more importantly) merging. We were also becoming a widely distributed team and wanted something that could handle developers being spread around different locations. After some testing and debate about whether to use Mercurial, Git, Bazaar, or Perforce, we landed on Mercurial as the chosen system (perhaps I’ll write a separate blog post about our specific decision-making process at some point).

Before switching to Mercurial, we had a workflow with Subversion that involved developers committing regularly to trunk. This was, in a way, quite convenient for all of us, but as the team and product grew in size and complexity, we encountered problems. More specifically, the chance of a developer breaking something on a platform other than the one he was working on was getting quite high. We wanted better control over what went into the mainline development branch; we wanted developers to be able to do work in branches, test their work, and share it with others only when it was ready.

We knew we wanted/needed a self-hosted solution for Mercurial and we ideally wanted something that had some code review capabilities built in as well. We also had to deal with the problem of large binary files. Distributed version control systems don’t lend themselves very well to storing large binary files, since a clone, by default, contains the entire history for a file, and large binary files are often already compressed and do not diff well; this means the size of a clone can grow very fast if large binary files are committed and changed regularly. After surveying our options, we settled on Kiln as a hosting and code review tool, and it had a built-in solution for large binary files — the kbfiles extension (which was a fork of the original bfiles extension by Greg Ward).

I prepared for the switch by scanning our entire Subversion history for binary files and creating a filemap that would remove these files from the history. This would allow us to have a new Mercurial repository that was not too large and contained the history for the source code files. And in 2011, while many people in R&D were away at GDC and it was relatively quiet in the Copenhagen office, I converted our Subversion repository to a Mercurial repository with the convert extension, and added in the missing binary files as kbfiles in a new commit. The road was a bit bumpy at first (it quickly became apparent we had some bugs to address with the kbfiles extension, so I immediately dove into fixing those), but we were underway. We switched both our version control system and our development model (to a branch-based model) at the same time; whether that was a good or bad idea, I’m not sure, but it worked out in the end (there was a lot of fumbling and confusion for the first few days in particular, though).
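
As a rough illustration of the filemap step, the sketch below turns a list of paths seen in the old history into `exclude` lines in the format read by `hg convert --filemap`. The extension list, the sample paths, and the function names are hypothetical; the post does not say how binary files were actually detected, so classifying by file extension is only an assumption.

```python
from pathlib import PurePosixPath

# Hypothetical set of extensions treated as binary (an assumption, not from the post).
BINARY_EXTENSIONS = {".png", ".psd", ".fbx", ".dll", ".zip"}


def is_binary(path: str) -> bool:
    """Classify a repository path as binary purely by its file extension."""
    return PurePosixPath(path).suffix.lower() in BINARY_EXTENSIONS


def build_filemap(paths) -> str:
    """Return filemap text with one `exclude` line per distinct binary path."""
    excluded = sorted({p for p in paths if is_binary(p)})
    # Paths are quoted on the assumption that the filemap parser treats a
    # quoted string as one token, so names containing spaces survive.
    return "".join(f'exclude "{p}"\n' for p in excluded)


if __name__ == "__main__":
    # Made-up paths standing in for a scan of the Subversion history.
    history_paths = [
        "Runtime/Core/main.cpp",
        "Editor/Icons/logo.png",
        "External/libs/physics.dll",
        "Editor/Icons/logo.png",  # the same file touched in another revision
    ]
    print(build_filemap(history_paths), end="")
```

The resulting text could be saved to a file and passed to `hg convert` with `--filemap`, so that the converted repository carries history only for the remaining source files.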

In 2011, most literature suggested branch-by-cloning as the recommended approach for branching, and Kiln was designed to support this as a workflow, so that’s the approach we took. We had a ‘trunk’ repository, and several ‘topic’ or ‘area’ repositories which were forks of trunk. These had names like ‘android’, ‘ios’, ‘editor’, ‘core’, etc. All developers would clone the repository that they were working on, and commit to the ‘default’ branch, and push their changes. They would ask for a Code Review in Kiln, and run a build verification on their branch (we were using TeamCity for our Continuous Integration server at the time), and after their code was reviewed and tested, they would merge the changes from their ‘default’ branch in their fork to the ‘default’ branch in the trunk repository.

This worked well enough for a while, but after a few months we ran into performance issues with Kiln; our developers and our build machines could not reliably pull changes or make new clones. In late 2011 we switched instead to RhodeCode, which was a purely open-source hosting solution for Mercurial. RhodeCode hadn’t yet grown any code review capabilities, which was a step back for us in the code review department, but since we couldn’t work without being able to pull/clone, we pushed forward.

A big issue for us was that we’d been relying on kbfiles in order to use Kiln, and we needed a similar solution to use Mercurial with a different hosting solution. So I worked with the Mercurial open-source community, as well as Fog Creek (the makers of Kiln and maintainers of kbfiles), to get kbfiles added to Mercurial as a built-in extension. This happened with Mercurial version 2.0, where it was named largefiles, and largefiles exists as a bundled extension in Mercurial today.

We went along with Mercurial using RhodeCode and largefiles and a branch-by-cloning workflow for about a year. During this year, we contracted the RhodeCode developer/maintainer to begin working on Pull Request functionality, which was far from complete or polished, but we were using it and had been able to add peer code reviews back into our development model.

In late 2012 we were running into serious performance problems with TeamCity, which were severely aggravated by the fact that we were using branch-by-cloning, which TeamCity was not designed to support. Additionally, developers were suffering from having to manage multiple clones: SSDs were still pretty small back then, and the size of a Unity clone plus working copy had grown considerably. Some of the braver developers had figured out how to manage multiple branches in the same clone with local bookmarks as a workaround.

So in December of 2012, we made two big changes to how we were using Mercurial:

We switched to named branches instead of branch-by-cloning. This required a bit of re-education and there was a bit of unhappy grumbling (suddenly commands got more complicated; developers had to remember to specify a branch name!), but overall it was received positively and helped a lot with the issues of disk space.
We introduced a ‘Mercurial Mirror’ that the build servers could use to pull from. When a client pulls from Mercurial, some handshaking happens first (for the server to figure out exactly what needs to be sent to the client), then the server re-compresses the data to be sent so as to use less bandwidth; unfortunately this uses CPU. A lot of ongoing connections from build machines can stress the server resource-wise (as well as using up some of the finite number of connections available), and offloading this serving of data to a machine that is not the server developers are using to browse and review code is A Good Thing ™. “The Mirror” (as it’s referred to internally) is updated via push hooks when developers push to the main repository.
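
The post does not describe how the mirror's push hooks are implemented, so the following is only a minimal sketch of one way such forwarding could work: a small script, run from the main repository after new changesets arrive, that pushes them on to the mirror. The mirror URL and the function names are hypothetical.

```python
import subprocess

# Hypothetical mirror location; the real URL is not given in the post.
MIRROR_URL = "ssh://hg@mirror.example.invalid/unity"


def mirror_push_command(mirror_url: str) -> list:
    """Build the `hg push` invocation that forwards new changesets."""
    # --new-branch allows newly created named branches to reach the mirror.
    return ["hg", "push", "--new-branch", mirror_url]


def forward_to_mirror(repo_path: str, mirror_url: str = MIRROR_URL) -> int:
    """Run the push from inside the main repository and return hg's exit code."""
    result = subprocess.run(mirror_push_command(mirror_url), cwd=repo_path)
    return result.returncode
```

A script calling `forward_to_mirror` could be registered under the `[hooks]` section of the server repository's configuration, for example on the `changegroup` hook, which Mercurial runs after a group of changesets has been added to the repository. Whether Unity's setup looks anything like this is not stated.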

In late 2012, I had hired Mads (aka killerix), a member of the Mercurial development community, to work on tasks related to version control at Unity, and he took over the development we needed of RhodeCode and other things. Throughout 2013, RhodeCode grew better code review capabilities, and we continued on with named branches, code reviews inside RhodeCode, and our Mercurial mirror.

In 2014, the RhodeCode project changed its licensing and we switched to the community fork of RhodeCode called Kallithea, which we still use and of which we are one of the primary maintainers.

Current Day

We’ve been using Mercurial for over 4 years now at Unity. Here are some stats (at the time of this writing) about our trunk (yes, it’s still called that) repository:

Clone size: 11 GB (4.6 GB of meta-data in the .hg directory, 6.4 GB of files (most space taken by largefiles) in the working copy)
196,453 files in the working copy
337 largefiles in the working copy
180,144 revisions
2,987 branches
Required extensions: largefiles and eol
Other commonly-used extensions: keyring, histedit, progress, rebase, record, purge

We have a separate ‘unity’ repository that developers actually work out of, but it contains (in addition to the ‘trunk’ branch) branches that are not yet (or never will be) merged to trunk. So that developers don’t have to download all of that abandoned/unmerged/irrelevant work, we recommend that they clone the ‘trunk’ repository, then update their clone’s .hg/hgrc file to point to the developer repository called ‘unity’.
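
To make the repointing step concrete, here is a small sketch that rewrites the `default` entry in the `[paths]` section of a clone's per-repository configuration file (`.hg/hgrc`). The URLs are placeholders, and Python's `configparser` is used only as an approximation of Mercurial's config format: it copes with simple files but drops comments and does not understand Mercurial-specific features such as `%include`.

```python
import configparser
import io


def repoint_default_path(hgrc_text: str, new_url: str) -> str:
    """Return hgrc text whose [paths] default entry points at new_url."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(hgrc_text)
    if not parser.has_section("paths"):
        parser.add_section("paths")
    parser.set("paths", "default", new_url)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    # Placeholder URLs; the real server names are not given in the post.
    original = "[paths]\ndefault = https://hg.example.invalid/trunk\n"
    print(repoint_default_path(original, "https://hg.example.invalid/unity"))
```

After such a change, `hg paths` in the clone should list the new default, and subsequent pulls and pushes would go to the developer repository while the initial clone avoided downloading its unmerged branches.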

Our build farm still uses “The Mirror”, and we still use Kallithea (or, rather a modified-with-special-stuff-only-Unity-would-care-about version of Kallithea called ‘Ono’) for hosting and code reviews.

We also have machines set up in some of the Unity offices around the world that act as “Caching Mercurial Proxy Thingies” (because, well, we never figured out a better name). These work by using the hgwebcachingproxy extension (and users use the dynapath extension) to cache data that is pulled from the central server in Copenhagen for all users in an office to share. This is particularly nice for largefiles, as data only has to be pulled once from Copenhagen to Singapore (for example), but authentication is still handled through the central server in Copenhagen.
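
The "pulled once, shared by the whole office" idea can be sketched as a simple read-through cache. This is not how hgwebcachingproxy is implemented (its internals are not described in the post), and authentication is not modeled at all; the class and parameter names are invented for illustration.

```python
from typing import Callable, Dict


class CachingProxy:
    """Serve blobs from a local cache, asking the upstream server only on a miss."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        self._fetch_upstream = fetch_upstream
        self._cache: Dict[str, bytes] = {}
        self.upstream_requests = 0  # counts slow trips to the central server

    def get(self, key: str) -> bytes:
        """Return the blob for key, fetching it upstream at most once."""
        if key not in self._cache:
            self.upstream_requests += 1
            self._cache[key] = self._fetch_upstream(key)
        return self._cache[key]


if __name__ == "__main__":
    # Stand-in for the central server: returns fake content for any key.
    proxy = CachingProxy(lambda key: f"content of {key}".encode())
    proxy.get("big-texture")  # first user in the office: goes upstream
    proxy.get("big-texture")  # second user: served from the local cache
    print(proxy.upstream_requests)
```

The second request for the same key never leaves the office, which is the property that makes such a proxy attractive for large, rarely changing files.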

Finally, I’ll talk about some of the more major Good and Bad things we have encountered in our experience with Mercurial (keep in mind these are educated opinions and observations based on what I’ve seen — YMMV):

The Good

It’s pretty easy to use, all things considered. The learning curve has generally been pretty shallow for Mercurial. The commands are reasonably easy to use, and whether devs prefer the command line or a GUI tool like TortoiseHG or SourceTree, new devs get the hang of Mercurial pretty quickly. It doesn’t have the complexity of Git’s staging area (though the lack of a staging area can be annoying for more advanced users, especially those coming from Git) or tracking branches, and especially after Mercurial introduced the concept of phases, the chances of developers accidentally shooting themselves in the foot (because of having rebased themselves into an impossible-to-push situation) or otherwise losing data are extremely low. Mercurial does a good job of suggesting what the user should do next, has reasonably complete built-in help, and there is usually only one way to do something, which makes it easy to google a question and find a relevant answer. It also supports a variety of local workflows (patch queues, anonymous branches, bookmarks) for ‘advanced’ users, and users who are not interested or can’t be bothered can get away with just the bare minimum.
All operating systems are first-class citizens. One of the reasons we chose Mercurial was that we have more developers running Windows than any other OS. Mercurial only needs a Python interpreter to run; no Unix shell or anything else is required. GUI tools for Windows were also important to us; when we chose Mercurial, it had TortoiseHG as a GUI tool available for Windows, and this was a big advantage as many of our users were already using TortoiseSVN. Some time later SourceTree was also ported to Windows; however, a recent internal survey shows most Windows users still use TortoiseHG.
It has an easy-to-use extension system. Mercurial has a very well thought-out extension system which has proven invaluable. Extensions can be built-in (such as largefiles, eol, and many others) or 3rd-party (which is what kbfiles was, and what our above-mentioned hgwebcachingproxy and dynapath extensions are). The annoying thing for 3rd-party extensions is that the extension API in Mercurial is not guaranteed to be stable or backwards-compatible, which means for invasive extensions (like kbfiles) it can be difficult to keep up. Still, this easy way to expand Mercurial’s functionality is a big win, especially if your use case is complex, and non-invasive extensions usually don’t break from one version to the next.
Branch names (for named branches) are permanently embedded in commits. When your repository is large and your history is complex, having branch names embedded into your commits is invaluable when trying to see where a change came from, or why. For this reason, repository forensics are made much easier when using Mercurial’s named branches. That does, however, mean you need to pick your branch names wisely!

The Bad

Branch closing is weird. When we first started using named branches, we had a rule that all branches should be closed when they were merged ‘upstream’ (i.e., into trunk, or into a branch that was going to be merged into trunk). It is important, in practice, that if you are going to close a branch, you do it before you merge it. This is because closing a branch actually creates a new commit, and if that commit is made after the merge you can end up with another topological head on your branch, which can later lead to weird stuff. Unfortunately, developers would often forget to close branches and it would have to be done after the fact. We ultimately concluded that closing branches caused more problems than it solved, but it’s annoying now that branch names never disappear from the ‘hg branches’ list.
No shallow or narrow clones. Mercurial supports neither shallow (only the last N revisions) nor narrow (only sub-directory XYZ) clones. This is mostly just an inconvenience for us, but it does become more relevant as our repository grows. Shallow/narrow clones are hard problems to solve, especially in a user-friendly way (assuming you want the clone to be fully functional, provide a way to back-fill missing data on-demand, etc). Git actually does support shallow clones, but the last time I tried them, they had a fair number of limitations, and I’m not aware of any prominent DVCS that supports narrow clones.
Largefiles are a bit bulky and add some overhead. The largefiles extension is listed as a feature of last resort, and for good reason. You may need to use it, but it does add overhead to some commands and it has resulted in a fair number of odd bugs over the years (though it has improved steadily). In hindsight, I think something that handles largefiles at a lower level than using standins in the working copy would be smarter. But largefiles has allowed us to actually use Mercurial, which wouldn’t be possible otherwise, and it’s been nice for us that it’s not been specific to any particular hosting platform or tool.
No ‘uncompressed’ pulling. Mercurial’s clone command has an ‘--uncompressed’ flag that can be specified (though technically it would better be called ‘--no-recompress’, as it doesn’t actually send the data uncompressed, it just doesn’t re-compress it from the format it’s normally stored in to use less bandwidth). This flag is great if you’re on a LAN with high bandwidth (in our case either in Copenhagen, or in an office with a Caching Mercurial Proxy Thingy). Unfortunately the pull command does not have the same flag, so if your clone is severely out of date, it can actually be faster to make a new one than pull the outstanding changes. This is something I’m sure we can solve once it becomes the item most worth our attention.
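
Returning to the branch-closing point above, the close-before-merge ordering can be written down as a sequence of ordinary Mercurial commands. The sketch below only builds the command lists and does not run them; the branch names and commit messages are placeholders, and the helper name is invented.

```python
def close_then_merge_commands(feature: str, target: str) -> list:
    """Return the hg commands, in order, that close a branch and then merge it."""
    return [
        # 1. Check out the tip of the feature branch.
        ["hg", "update", feature],
        # 2. Closing is itself a commit, so make it while still on the branch.
        ["hg", "commit", "--close-branch", "-m", f"Close branch {feature}"],
        # 3. Switch to the branch that receives the work.
        ["hg", "update", target],
        # 4. Merge the feature branch, including its closing commit.
        ["hg", "merge", feature],
        # 5. Record the merge on the target branch.
        ["hg", "commit", "-m", f"Merge {feature} into {target}"],
    ]


if __name__ == "__main__":
    for command in close_then_merge_commands("my-feature", "trunk"):
        print(" ".join(command))
```

Because the closing commit exists before the merge, the merge picks it up and the feature branch is not left with an extra unmerged head afterwards.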

Conclusion

Overall, Mercurial has been a good choice for us, and I still think we made the right choice when we chose it years ago. With large companies like Facebook and Google also using and contributing to Mercurial, it continues to evolve (hey, look, I made a pun!) and develop.