
Up-front quality

There has been a great exchange on the London Clojurians mailing list recently about the impact of a good REPL on development cycles. The conversation kicks into high gear with this post from Malcolm Sparks, although it is worth reading from the start (membership might be required, I can’t remember). In his post Malcolm talks about the cost of up-front quality. This, broadly speaking, is the cost of the testing required to put a feature live; it is essentially a way of looking at the cost that automated testing adds to the development process. As Malcolm says later: “I’m a strong proponent of testing, but only when testing has the effect of driving down the cost of change.”

Once upon a time we had to fight to introduce unit testing and automated integration builds and tests. Now it is more or less a given that these are good things; rather like a pendulum, the risk now is swinging too far in the opposite direction. If you’ve ever had to scrap more than one feature because it failed to perform, then the up-front quality cost is something you weigh as closely as the cost of up-front design and production failure.

Now the London Clojurians list is at that perfect time in its lifespan where it is full of engaged and knowledgeable technologists, so Steve Freeman drops into the thread and sensibly points out that Malcolm is also guilty of excess: he values feature mutability to the point of wanting to be able to change a feature in-flight in production, something that is cool but probably in excess of any actual requirement. Steve adds that there are other benefits to automated testing, particularly unit testing, beyond guaranteeing quality.

Steve also mentions the Forward approach, which I subscribe to as well, of creating very small codebases. Then Paul Ingles gets involved and posts the best description I’ve read of how you can use solution structure, monitoring and restrained codebases to avoid dealing with a lot of the issues of software complexity. It’s hard to boil the argument down because the post deserves reading in full, but I would summarise it as: the external contact points of a service are what matter, and if you fulfil the contract of the service you can write a replacement in any technology or stack and put the replacement alongside the original service.
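To make that concrete, here is a minimal sketch of a contract check. The field names and shapes are invented for illustration, not taken from Paul’s post; the point is that any replacement service, in any stack, is acceptable as long as its responses pass the same check.

```python
def satisfies_contract(response):
    """Check a service response against its external contract.

    The contract is the only thing a replacement must honour; the
    implementation behind it is free to change entirely, and extra
    fields the consumer never asked for are simply ignored.
    """
    required = {"id": str, "status": str, "items": list}
    return all(
        key in response and isinstance(response[key], expected)
        for key, expected in required.items()
    )

# Both the original service and its rewrite must satisfy the check:
legacy_reply = {"id": "42", "status": "ok", "items": []}
rewrite_reply = {"id": "42", "status": "ok",
                 "items": [{"sku": "a1"}], "extra": "ignored"}
```

Run both replies through `satisfies_contract` and either implementation can be put live alongside the other.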

One of the powerful aspects of this approach is that it generalises the “throw one away” rule: the current codebase can be discarded whenever your knowledge of the domain or your available tools changes sufficiently to make it possible to write an improved version of the service.

Steve then points out some of the other rules that make this work: being able to track, and ideally change, consumers as well. It’s an argument for always requiring keys on API services, even internal ones, to help you see what is calling your service, something that is moving towards being a standard at the Guardian.
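As a sketch of the idea, here is a minimal WSGI-style middleware that refuses anonymous calls and records which consumer is calling. The header name and the logging are my assumptions for illustration, not a description of the Guardian’s actual setup.

```python
import logging

def require_api_key(app):
    """WSGI middleware sketch: reject anonymous requests and log
    which consumer (identified by an X-Api-Key header) called what."""
    def wrapped(environ, start_response):
        key = environ.get("HTTP_X_API_KEY")
        if not key:
            start_response("401 Unauthorized",
                           [("Content-Type", "text/plain")])
            return [b"missing X-Api-Key header"]
        # This log line is what lets you see who depends on your service.
        logging.info("consumer %s called %s", key,
                     environ.get("PATH_INFO", "/"))
        return app(environ, start_response)
    return wrapped

def service(environ, start_response):
    """Stand-in for any internal service."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]

guarded = require_api_key(service)
```

Once every caller carries a key, changing or retiring a service becomes a matter of reading the logs to find its consumers, rather than guesswork.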

So to summarise, a little thread of pure gold and the kind of thing that can only happen when the right people have the time to talk and share experiences. And when it comes to testing, ask whether your tests are making it cheaper to change the software when the real functionality is discovered in production.


Agile software development defers business issues

My colleague Michael Brunton-Spall makes an interesting mistake in his latest blog post:

much of our time as developers is being completely wasted writing software that someone has told us is important.  Agile Development is supposed to help with this, ensuring that we are more connected with the business owners and therefore only writing software that is important.

Most Agile methodologies actually don’t do what Michael says here. Every one I’ve encountered in the wild treats it as almost axiomatic that there exists someone who knows what the correct business decision is. That person is then given a title, “product owner” for example, and is usually assigned responsibility for three things: deciding the order in which work is done, judging whether the work has been done correctly, and clarifying requirements until they can be reduced to a programming exercise.

That’s why it was liberating to come across Systems Thinking, which does try to take a holistic approach and says that any organisation is only really as good as its worst-performing element. That view does not eliminate the process improvements in development that Agile can provide, but it does illustrate that a great development team doing the wrong thing is a worse outcome than a poor development team doing the right thing.

The invention of the always-correct product owner was a neat simplification of a complex problem, probably designed to avoid having multiple people giving a development team different requirements. By assigning the right to direct the team’s work to one person, the issue of detail- and analysis-oriented developers being blown off course by differing opinions was replaced by squabbling outside the team to persuade the decision maker. Instead of developer versus business, the problem was now business versus business.

Such a gross simplification has grave consequences: the “product owner” is now a massive single point of failure, and few software delivery teams can effectively isolate themselves from the effects of such a failure. I have heard the excuse “we’re working on the prioritised backlog” several times, but I’ve never seen it protect a team from a collective failure to deliver what was really needed.

Most Agile methodologies essentially punt and pray over the issue of business requirements and priorities, deferring the realities of the environment in the hope of tackling a purely engineering issue. Success, however, means doing what Michael suggests: dealing with the messy reality of a situation and providing an engineering solution that can cope with it.


The Cathedral and the Lemonade Stand

Software is big, hard and complicated. It has also traditionally been long-lived, often lasting beyond all the reasonable expectations of its creators.

These realities have driven a lot of software “best practices” in the last few years: suites of tests to make sure the software is easy to change with confidence, the understanding that software needs to be complete in itself rather than accompanied by auto-generated documentation files or 200-page manuals, the understanding that software is written to be read, not just to do things.

It has also led to more difficult arguments that have yet to be won, such as the idea that software needs to be considered more like infrastructure, with ongoing costs. Most people realise that if they don’t clean their buildings they get dirty, and if they don’t service their cars then eventually they stop working. Most people, though, are happy to pay large sums creating software only to avoid paying anything further, thereby creating a system that slowly slips into weed-ridden obsolescence.

It is this tendency to regard software as something you purchase once rather than an ongoing investment that I want to talk about here.

An interesting thing is starting to happen with the arrival of high-productivity languages such as Python and Ruby and flexible NoSQL data solutions like CouchDB and key-value stores. Software is probably easier to create now than ever before, without compromising the practices we’ve come to understand are important. It is also getting easier and easier to deploy applications, with cloud services like Google App Engine and Heroku for the web and EC2 for raw machines.

In short, we can now turn around software very quickly if we choose to. This creates an interesting avenue for tackling the maintenance problem by caving in to what budget holders tend to do naturally. Budgets are generally spent on specific items of functionality; maintaining that functionality as a service tends to come out of a generalised pot, if at all. Normally, arguments about solving this problem have focused on trying to include the true cost of the product in the initial budget.

But why? Why don’t we just create a piece of software and then later, when we want it to do something else, throw it away and start again? Take a lemonade stand. You don’t build a lemonade stand out of stone and marble with a team of master masons. When the hot weather comes round you grab some wood and cardboard and make a stand that will be good enough for the weekend or evening you need it for. If next week it is still hot and you want to sell lemonade again, you just make it again.

Websites are a lot like this: they get old very quickly, and on the front end they probably have a six-month lifecycle before design trends change, new browser standards are implemented or some new way of using the web is discovered. Building a website to last four years (or the 10 or 20 years that key infrastructure software tends to be used for) is a waste of time; it’s very unlikely to repay your effort over its actual lifetime of, maybe, two years.

Even in software that does last 10 years there is often a feeling of regret that, due to the tremendous sunk cost of developing it, it is not viable to move it to commodity hardware or the cloud, or, in the case of banking mainframe code, to make any change of any significance to it.

We can now develop really powerful applications in the timeframe of four to ten weeks. If that application lasts three months with no additional effort and we then spend another four to ten weeks replacing it, are we not actually better off? We are able to implement our lessons learnt sooner. Solutions are much more flexible and mistakes do not come with long-running costs attached. A bad idea can just be left as is.

I’m not arguing for a complete rewrite every three months; just like the lemonade stand, we can reuse the good bits: our sign, or the good piece of wood for the counter perhaps. Things like Guerilla SOA are taking us in this direction anyway, with loosely coupled services using standards-based protocols and interchange formats. If we write a great authentication service we can keep it, or we could replace it with OAuth or OpenID. Our options are wider when we play in the short term rather than the long term.