Programming, Software

Even if I’m happy with Subversion?

I’m not going to try to make the case for DVCS, the next generation of source control (Git, Bazaar, Mercurial). There are a host of posts on the subject and you can find them via the magic of Google. In addition there are thriving communities around BitBucket and GitHub, and major repositories for huge open source projects like Linux and OpenJDK. All of this means that the tools are good, mature and support sophisticated development processes.

However I have been surprised that a number of people feel that Subversion is enough for their needs. Now this might be true for some people, but I genuinely think that DVCS is a step change in source control, that ultimately it will be adopted almost universally, and that Subversion will end up in the same place as CVS is now.

I have been trying to explain why I think this to people and I sometimes get the feeling that Subversion has ended up defining source control best practice by enshrining the limitations of the tool. For example a lot of people think that branches of the code should be short-lived. This is true for Subversion, where branching and merging are painful. However in Mercurial and Git I feel that branching, and to a lesser extent merging, is pretty painless. In fact you often find yourself doing it without really thinking. So why limit yourself to short-lived branches? Since it is easy to take changesets into branches, it is easy to push canonical updates into derivative branches whenever you need to bring them up to date.

In truth some branches, like maintenance releases and conditional changes, tended to be long-lived in Subversion repositories anyway. They were just painful to manage. Also you had this set of conventions around branches, tags and trunk in the Subversion repository that really didn’t help manage the history and intent of the code they contained. In the DVCS model those repository concepts become repositories in their own right and are easier to manage in my view.

What about continuous integration? Many people have expressed an opinion that they don’t like to be more than five minutes away from the canonical source tree. However checking in to the parent source line this often means that intermediate work regularly ends up in “trunk” or its equivalent. I think that DVCS opens the possibility of having a very clean and consistent trunk because intermediate products and continuous integration can be pushed closer to the source of change. You still need to push to the canonical code stream frequently, but I think in a DVCS world you actually pull a lot more than you push.

It is certainly my anecdotal experience that code “breaks” are rarer in DVCS as you tend to synchronise in your personal branches and resolve issues there so the view of the code as pushed to the canonical stream is much more consistent.

The recording of change in code files is also much more sophisticated in the changesets of a DVCS. The ability to drill into change in all the leading contenders is amazing, as is the ability to track it between different repositories. Switching from the linearity of revisions to the non-linear compilation of changesets can be a headfuck, but once you’ve mastered it you begin to see the possibilities of constructing the same sets of changes in different ways to result in outcomes that would traditionally be called “branches”. Your worldview changes.
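To make the non-linear idea a bit more concrete, here is a minimal sketch in Python. It is not how Mercurial or Git store history; the `Changeset` class, the `build_history` and `ancestors` helpers and all of the changeset names are hypothetical and exist only for illustration. It models history as changesets that point at their parents, then shows two repositories that combine the same changes through different merges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Changeset:
    """A hypothetical, simplified changeset: an id plus the ids of its parents."""
    node: str
    parents: tuple = ()


def build_history(*changesets):
    """Index changesets by id so that parents can be looked up."""
    return {c.node: c for c in changesets}


def ancestors(history, head):
    """Return the ids of every changeset reachable from head, including head."""
    seen = set()
    pending = [head]
    while pending:
        node = pending.pop()
        if node not in seen:
            seen.add(node)
            pending.extend(history[node].parents)
    return seen


# Two lines of work that both start from the same base changeset.
shared = (
    Changeset("base"),
    Changeset("fix", ("base",)),
    Changeset("feature", ("base",)),
)

# Each repository merges the two lines of work in its own way.
repo_one = build_history(*shared, Changeset("merge-one", ("fix", "feature")))
repo_two = build_history(*shared, Changeset("merge-two", ("feature", "fix")))

# Ignoring the merge changesets themselves, both heads contain the same changes.
changes_one = ancestors(repo_one, "merge-one") - {"merge-one"}
changes_two = ancestors(repo_two, "merge-two") - {"merge-two"}
print(sorted(changes_one))
print(changes_one == changes_two)
```

A linear Subversion-style history is just the special case where every changeset has exactly one parent; what would traditionally be called a “branch” is simply a head with its own set of ancestors.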

Subversion is not what I want to work with anymore. Every time I have to pull an SVN archive across the world via HTTP/WebDAV, or wait for the server to be available before I can create a new copy of trunk or commit a set of changes, I wonder why people are still so happy with Subversion.

Programming, Python

Setting up a Mercurial work environment

So today I had a frustrating experience trying to get a Mercurial environment going for the current project I am working on. I am convinced for the kind of work we are doing the distributed branch model is going to be the right solution and I had been using a local Mercurial instance during my tech spiking earlier on the product.

Having built Python and Mercurial on the server and created a Mercurial user with the SSH keys of the individual developers, I thought it would be easy sailing this morning. Not a bit of it, and all of the problems stemmed from the lack of default support for SSH in Windows.

Like most people I use the PuTTY tool suite for my SSH needs on Windows. However simply aliasing PuTTY to ssh doesn’t work. There are some key command switches that are different (including the trivial but annoying -P instead of -p for port).

Coming from a Java Subversion background I am used to having a portable library to take care of my ssh protocol needs. I also use OSX for my home development and that obviously provides a UNIX ssh command-line implementation, hence hassle-free SSH based source control.

The command line tool turned out to be the root of all my problems. Neither the Eclipse nor the NetBeans Mercurial plugin provides an implementation of the Mercurial client; instead they delegate to the Mercurial client on the host OS, which in turn delegates ssh protocol invocations to the command line SSH.

The solution is simple once you know it: in the Mercurial.ini file you can alias ssh to PuTTY’s plink executable. There is in fact an example in the Mercurial book. Better still, you don’t have to specify the key used if you are using Pageant.
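As a rough illustration of what that setting looks like, the sketch below generates the relevant fragment of Mercurial.ini. The `plink_ssh_setting` helper is hypothetical and the paths are placeholders to replace with wherever plink and your key actually live. The `ssh` entry in the `[ui]` section and plink’s `-ssh` and `-i` switches are the real pieces. With Pageant running and holding your key, the key path can be left out.

```python
def plink_ssh_setting(plink_path, key_path=None):
    """Build the text of a [ui] section that points Mercurial's ssh at plink.

    plink_path and key_path are placeholders supplied by the caller; leave
    key_path as None when Pageant is already holding the key.
    """
    command = f"{plink_path} -ssh"
    if key_path is not None:
        # plink takes the private key file via -i
        command += f' -i "{key_path}"'
    return f"[ui]\nssh = {command}\n"


# Placeholder path: substitute the real location of plink.exe.
print(plink_ssh_setting("C:/path/to/plink.exe"))
```

Paste the printed lines into Mercurial.ini (or merge the `ssh` line into an existing `[ui]` section). A single typo in that line is enough to produce confusing errors on every Mercurial invocation, so it is worth checking it character by character.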

However getting the first workstation to work was incredibly hard. I even downloaded Cygwin just to get a more normal Unix ssh, but by that point I had put a typo into the ssh alias and was getting a very weird error whenever I invoked Mercurial, and I could no longer see the forest for the trees.

At one stage I even gave Bazaar a go, hoping that SFTP might cure all my ills. However Bazaar uses Paramiko for its SSH support and on Windows that was failing due to its inability to find an OS source of entropy.

Server-side Python also let me down a bit, as I was having an issue with the zlib module and after similar experiences with Ruby I knew that this would be because I must have compiled Python before zlib. Despite cleaning and re-configuring Python it still didn’t build the zlib module, and in the end I had to go and run configure within the zlib module manually prior to a full Python build. This is exactly the same issue I have had with Ruby. What is the problem with this library?
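A quick way to find out whether a freshly built interpreter actually has zlib, before Mercurial trips over it, is to ask Python directly. This is only a sanity check under the assumption that you run it with the interpreter you just built; the helper names `has_working_module` and `zlib_round_trip_ok` are made up for this example.

```python
import importlib


def has_working_module(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


def zlib_round_trip_ok(payload=b"mercurial " * 100):
    """Compress and decompress a payload to confirm zlib really works."""
    import zlib

    return zlib.decompress(zlib.compress(payload)) == payload


if has_working_module("zlib"):
    print("zlib available, round trip ok:", zlib_round_trip_ok())
else:
    print("zlib missing: this Python was built without the zlib module")
```

If the second message appears, the interpreter was built without zlib support and needs rebuilding once the library is in place.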

Once you’ve done it once, Mercurial repositories are actually easier to set up and manage than SVN (and that itself is pretty easy once you’ve done it a few times), and if you are working in a UNIX environment then they are extremely compelling.

However on Windows you are currently going to either do a lot of preparatory reading or be ready to set aside six hours to prevent frustration. In my case it probably didn’t help that I am also the one selling the utility of Mercurial. The pioneers always have the arrows in their backs; on the other hand they also get to better places long before anyone else.