Software, Work

The Joy of Acceptance Testing: Is my bug fixed yet?

Here’s a question that should be a blast from the past: “Is my bug fixed yet?”.

I don’t know, is your acceptance test for the bug passing yet?

Acceptance tests are often sold as being the way that stakeholders know that their signed-off feature is not going to regress during iterative development. That’s true, they probably do that. What they also do though is dramatically improve communication between developers and testers. Instead of having to faff around with bug tracking, commit-comment buggery and build-artifact change lists, you can have your test runner of choice tell you the current status of all the bugs as well as the features.

Running acceptance tests is one example where keeping the build green is not mandatory. This creates a need for a slightly more complicated build result visualisation. I like to see a simple bar split into green and red with the number of failing tests. There may be a good day or two when that bar is totally green but in general your test team should be way ahead of the developers so the red bar represents the technical deficit you are running at any given moment.
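As a sketch of that split bar, assuming the runner hands back a simple list of pass/fail results (the class and method names here are invented for illustration):

```java
import java.util.List;

// Sketch of the green/red build bar: given acceptance test results,
// render pass/fail proportions plus the failing count (the "deficit").
public class BuildBar {
    static String render(List<Boolean> results, int width) {
        long passed = results.stream().filter(r -> r).count();
        long failed = results.size() - passed;
        int green = (int) Math.round((double) passed / results.size() * width);
        // '=' for green (passing), '#' for red (failing)
        return "=".repeat(green) + "#".repeat(width - green)
                + "  " + failed + " failing (technical deficit)";
    }

    public static void main(String[] args) {
        // 8 passing, 2 failing acceptance tests
        List<Boolean> results = List.of(true, true, true, true, true,
                                        true, true, true, false, false);
        System.out.println(render(results, 10));
    }
}
```

The point is only that the red portion is information, not an alarm: it is the gap between what the test team has specified and what the developers have delivered so far.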

If it starts to grow you have a prompt to decide whether you need to change your priorities or developer quality bar. Asking whether a bug has been fixed or when the fix will be delivered are the wrong questions. For me the right questions are: should we fix it and how important is it?

If we are going to fix a bug we should have an acceptance test for it and its importance indicates what time frame that test should pass in.

Software, Work

Agile must be destroyed

Let’s stop talking about the post-Agile world and start creating it.

Agile must be destroyed not because it has failed, not because we want to return to Waterfall or to the unstructured, failing, flailing projects of the past, but because its maxims, principles and fixed ideas have started to strangle new ideas. All we need to take with us are the four values of the Agile Manifesto. Everything else is just a set of techniques that were better than what preceded them but are not the final answer to our problems.

There are new ideas that help us to think about our problems: a philosophy of flowing and pulling (but not the training courses and books that purport to solve our problems), the principles of software craftsmanship, “constellation” architecture, and principles of Value that put customers first, not our clients. There are many more that I don’t know yet.

We need to adopt them, experiment with them, change and adapt them. And when they too ossify into unchallengeable truisms we will need to destroy them too.


Fine grained access control is a waste of time

One of the things I hate developing most in the world (there are many others) is fine grained control systems. The kind of thing where you have to set option customer.view_customer.customer_delivery_options.changes.change_customer_home_delivery_flag to true if you want a role to be able to click a checkbox on and off.

There are two main reasons for this:

  • Early in my career I helped implement a fine grained system; it took a lot of effort and time. It was never used because configuring the options was too difficult and time-consuming. Essentially the system was switched to always be on.
  • Secondly, when working in a small company I discovered that the people who do the job of dealing with customers, buying stock or arranging short term finance actually did a better job when the IT department didn’t control how they worked. Having IT implement “controls” on their systems is like selling a screwdriver that only allows you to turn it in one direction.

Therefore I was very happy to hear Cyndi Mitchell on Thursday talking about the decision not to implement fine-grained ACL in Mingle. If you record who did what on the system and you make it possible to recover previous revisions of data then you do not need control at a level much finer than user and superuser.
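A minimal sketch of that alternative: record every change against the user who made it and keep prior revisions recoverable, instead of gating each checkbox behind an option. All names here are illustrative, not from any real product:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Audit-and-revert instead of fine grained ACL: every write records who
// did it, and any previous revision can be recovered.
public class AuditedValue {
    record Revision(String user, String value) {}

    private final Deque<Revision> history = new ArrayDeque<>();

    void set(String user, String value) {
        history.push(new Revision(user, value)); // who did what
    }

    String current()    { return history.peek().value(); }
    String lastEditor() { return history.peek().user(); }

    // Recover the previous revision of the data.
    String revert() {
        history.pop();
        return current();
    }
}
```

With accountability and undo in place, a mistaken change is a one-line revert rather than a reason to lock everyone out of the checkbox in the first place.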

Instead you can encourage people to use your system as a business tool and if they decide to use that screwdriver to open paint tins or jam open doors, then good on them.

Programming, Software

The One True Layer Anti-Pattern

A common SQL database anti-pattern is the One True Lookup Table (OTLT). Though laughable, the same anti-pattern often occurs at the application development layer. It commonly occurs as part of the mid-life crisis phase of an application.

Initially all objects and representations are coded as needed and to fit the circumstances at hand. Of course the dynamics of the Big Ball of Mud anti-pattern are such that soon you will have many varying descriptions of the same concept and data. Before long you get the desire to clean up and rationalise all these repetitions, which is a good example of refactoring for simplicity. However, at this point danger looms.

Someone will point out eventually that having one clean data model works so well that perhaps there should be one shared data model that all applications will use. This is superficially appealing and is almost inevitably implemented with a lot of fighting and fussing to ensure that everyone is using the one true data model (incidentally I’m using data models here but it might be services or anything where several applications are meant to drive through a single component).

How happy are we then? We have created a consistent component that is used across all our applications in a great horizontal band. The people who proposed it get promoted and everyone is using the One True Way.

What we have actually done is recreated the n-tier application architecture. Hurrah! Now what is the problem with that? Why does no-one talk about n-tier application architecture anymore? Well the issue is Middleware and the One True Layer will inevitably hit the same rocks that Middleware did and get dashed to pieces.

The problem with the One True Layer is the fundamental fact that you cannot be all things to all men. From the moment it is introduced the OTL must either bloat and expand to cover all possible Use Cases or otherwise hideously hamstring development of the application. If there was a happy medium between the two then someone would have written a library to do the job by now.
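The bloat branch of that dilemma tends to look like this sketch (all class and field names invented): a single shared model that accumulates a field for every consuming application, which every other consumer must then carry and ignore.

```java
// The One True Layer's shared model, after a few consumers have arrived.
// Each application adds the fields it needs; all the others inherit them
// as baggage (and usually as nulls).
public class OneTrueCustomer {
    String name;             // everyone needs this
    String billingAddress;   // only the invoicing app needs this
    String marketingSegment; // only the campaign app needs this
    String warehouseBin;     // only the fulfilment app needs this
    // ...and every new Use Case adds more fields that the
    // existing consumers must now carry and ignore.
}
```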

There is no consistency between which of the two choices will be made; I have seen both and neither of them have happy outcomes. Either way from this point on the layer is doomed: it becomes unusable and before long the developers will be trying to work their way around the OTL as much as possible, using it only when threatened with dismissal.

If the codebase continues for long enough then usually what happens is the OTL sprouts a number of wrappers around its objects that allow the various consumers of its data to do what they need to. When eventually the initial creators of the OTL are unable to force the teams to use the layer then the wrappers tend to suck up all the functionality of the OTL and the library dependency is removed.

In some ways this may seem regressive, we are back at the anarchy of objects. In fact what has been created is a set of vertical slices that represent the data in the way that makes sense for the context they appear in. These slices then collaborate via external API interfaces that are usually presented via platform neutral data transfer standards (HTTP/JSON for example) rather than via binary compatibility.
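A sketch of those vertical slices, with invented names and a crude stand-in for the JSON each slice would expose over HTTP: each slice models the same customer in the way its own context needs, and they share nothing but the wire format and an identifier.

```java
// Two vertical slices representing the "same" customer differently,
// collaborating via a neutral data format rather than a shared binary model.
public class Slices {
    // The billing slice cares about invoicing details only.
    record BillingCustomer(String id, String vatNumber) {}

    // The delivery slice cares about addresses only.
    record DeliveryCustomer(String id, String address) {}

    // Stand-in for the JSON each slice would serve over HTTP.
    static String toJson(String id, String field, String value) {
        return "{\"id\":\"" + id + "\",\"" + field + "\":\"" + value + "\"}";
    }

    public static void main(String[] args) {
        BillingCustomer b = new BillingCustomer("c42", "GB123456789");
        DeliveryCustomer d = new DeliveryCustomer("c42", "1 High Street");
        // Same customer id, two context-specific representations.
        System.out.println(toJson(b.id(), "vatNumber", b.vatNumber()));
        System.out.println(toJson(d.id(), "address", d.address()));
    }
}
```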

My advice is to try to avoid binary dependent interactions between components and try to avoid creating very broad layers of software. Tiers are fine but keep them narrow and try to avoid any tier reaching across more than a few slices (this particularly applies to databases).


Get your own Couch

At Erlounge in London I recently had the chance to catch up with the guys J Chris Anderson and Jan Lehnardt. The conversation was interesting as ever and I was intrigued by J Chris’s idea that CouchDb has turned conventional data storage logic on its head by making it easy to create many databases with relatively small amounts of information: the one database per user concept.

More importantly though, I discovered it was okay to talk about the hosted Couch instances they are offering (they also do CouchDb consulting if you need some help and advice). The free service offers you a Basic authentication account with 100Mb of storage to play around with Couch to your heart’s content. Paying brings more space but also more sophisticated authentication options.

The service is the perfect way to play around with Couch and learn how you could use it, so go get an account today! It’s on the Cloud as well: schema-less data on the Cloud, how buzzword compliant is that?!

On a more serious note, this is an excellent service and one I have been asking for as it allows people who have no desire to build and maintain Erlang and Couch to use the datastore as a web service rather than as a managed infrastructure component.

Programming, Software

Even if I’m happy with Subversion?

I’m not going to try and make the case for the next generation of DVCS (Git, Bazaar, Mercurial). There are a host of posts on the subject and you can search for them via the magic of Google searches. In addition you have thriving communities around BitBucket and GitHub and you have major repositories for huge open source projects like Linux and OpenJDK. All of this means that the tools are good, mature and support sophisticated development processes.

However I have been surprised that a number of people feel that Subversion is enough for their needs. Now this might be true for some people but I genuinely think that DVCS is a step change in source control and that ultimately it will be adopted almost universally and Subversion will end up in the same place as CVS is now.

I have been trying to explain why I think this to people and I sometimes get the feeling that Subversion has ended up defining source control best practice by enshrining the limitations of the tools. For example a lot of people think that branches of the code should be short-lived. This is true for Subversion, where branching and merging are painful. However in Mercurial and Git I find that branching and, to a lesser extent, merging are pretty painless. In fact you often find yourself doing it without really thinking. Therefore why limit yourself to short-lived branches? Since it is easy to take change sets into branches, it is easy to push canonical updates into derivative branches whenever you need to keep them up to date.

In truth some branches, like maintenance releases and conditional changes, tended to be long-lived in Subversion repositories anyway. They were just painful to manage. Also you had this set of conventions around branches, tags and trunk in the Subversion repository that really didn’t help manage the history and intent of the code they contained. In the DVCS model those repository concepts become repositories in their own right and are easier to manage in my view.

What about continuous integration? Many people have expressed an opinion that they don’t like to be more than five minutes away from the canonical source tree. However checking in to the parent source line this frequently means that intermediate work is frequently checked into “trunk” or its equivalent. I think that DVCS opens the possibility of having a very clean and consistent trunk because intermediate products and continuous integration can be pushed closer to the source of change. You still need to push to the canonical code stream frequently but I think in a DVCS world you actually pull a lot more than you push.

It is certainly my anecdotal experience that code “breaks” are rarer in DVCS as you tend to synchronise in your personal branches and resolve issues there so the view of the code as pushed to the canonical stream is much more consistent.

The recording of change in code files is also much more sophisticated in the changesets of DVCSs. The ability to drill into change on all the leading contenders is amazing, as is the ability to track it between different repositories. Switching from the linearity of revisions to the non-linear compilation of changesets can be a headfuck but once you’ve mastered it you begin to see the possibilities of constructing the same sets of changes in different ways to result in outcomes that would traditionally be called “branches”. Your worldview changes.

Subversion is not what I want to work with anymore and every time I have to pull an SVN archive across the world via HTTP/WebDAV or I have to wait for the server to be available before I can create a new copy of trunk or commit a set of changes then I wonder why people are still so happy with Subversion.

Groovy, Java, Programming, Scripting, Software

Working with Groovy Tests

For my new project Xapper I decided to try and write my tests in Groovy. Previously I had used Groovy scripts to generate data for Java tests but I was curious as to whether it would be easier to write the entire test in Groovy instead of Java.

Overall the experience was a qualified “yes”. When I was initially working with the tests it was possible to invoke them within Eclipse via the GUnit Runner. Trying again with the more recent 1.5.7 plugin, the runner now seems to be the JUnit4 one and it says that it sees no tests, to paraphrase a famous admiral. Without being able to use the runner I ended up running the entire suite via Gant, which was less than ideal, because there is a certain amount of spin-up time compared to using something like RSpec’s spec runner.

I would really like all the major IDEs to get smarter about mixing different languages in the same project. At the moment I think Eclipse is the closest to getting this to work. NetBeans and IntelliJ have good stories around this but it seems to me to be a real pain to get it working in practice. I want to be able to use Groovy and Java in the same project without having Groovy classes be one of the “final products”. NetBeans’ use of pre-canned Ant files to build projects is a real pain here.

Despite the pain of running them, though, I think writing the tests in Groovy is a fantastic idea. It really brought home to me how much ceremony there is in conventional Java unit test writing. I felt like my tests improved when I could forget about the type of a result and just assert things about the result.

Since I tend to do TDD it was great to have the test run without having to satisfy the compiler’s demand that methods and classes be there. Instead I was able to work in a Ruby style of backfilling code to satisfy the runtime errors. Now some may regard this as a ridiculous technique but it really does allow you to provide a minimum of code to meet the requirement and it does give you the sense that you are making progress as one error after another is squashed.

So why use Groovy rather than JRuby and RSpec (the world’s most enjoyable specification framework)? Well the answer lies in the fact that Groovy is really meant to work with Java. Ruby is a great language and JRuby is a great implementation but Groovy does a better job of dealing with things like annotations and making the most of your existing test libraries.

You also don’t have the same issue of context-switching between different languages. Both Groovy and Scala are similar enough to Java that you can work with them and Java without losing your flow. In Ruby, even simple things like puts versus println can throw you off. Groovy was created to do exactly this kind of job.

If the IDE integration can be sorted out then I don’t see any reason why we should write tests in Java anymore.

Java, Programming, Software

Programming to Interfaces Anti-Pattern

Here’s a personal bête noire. Interfaces are good m’kay? They allow you to separate function and implementation, you can mock them, inject them. You use them to indicate roles and interactions between objects. That’s all smashing and super.

The problem comes when you don’t really have a role that you are describing, you have an implementation. A good example is a Persister type of class that saves data to a store. In production you want to save to a database while during test you save to an in-memory store.

So you have a Persister interface with the method store and you implement a DatabaseStorePersister class and a InMemoryPersister class both implementing Persister. You inject the appropriate Persister instance and you’re rolling.

Or are you? Because to my mind there’s an interface too many in this implementation. In the production code there is only one implementation of the interface, the data only ever gets stored to a DatabaseStorePersister. The InMemory version only appears in the test code and has no purpose other than to test the interaction between the rest of the code base and the Persister.

It would probably be more honest to create a single DatabaseStorePersister and then sub-class it to create the InMemory version by overriding store.
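That refactoring might look like this minimal sketch, using the class names from the text (the database call is stubbed out, and the field names are invented):

```java
import java.util.HashMap;
import java.util.Map;

// One concrete production class, no premature interface.
public class DatabaseStorePersister {
    public void store(String key, String value) {
        // in production: write to the database
        throw new UnsupportedOperationException("no database in this sketch");
    }
}

// Test-only subclass: the InMemory version overrides store.
class InMemoryPersister extends DatabaseStorePersister {
    final Map<String, String> stored = new HashMap<>();

    @Override
    public void store(String key, String value) {
        stored.put(key, value); // keep the data in memory for assertions
    }
}
```

The test double still substitutes for the production class everywhere it is used, but the code is honest that there is exactly one real implementation.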

On the other hand if your data can be stored in both a graph database and a relational database then you can legitimately say there are two implementations that share the same role. At this point it is fair to create an interface. I would prefer therefore to refactor to interfaces rather than program to them. Once the interaction has been revealed, then create the interface to capture the discovery.

A typical expression of this anti-pattern in Java is a package that is full of interfaces that have the logical Domain Language name for the object and an accompanying single implementation for which there is no valid name and instead shares the name of the interface with “Impl” added on. So for example Pointless and PointlessImpl.

If something cannot be given a meaningful name then it is a sure sign that it isn’t carrying its weight in an application. Its purpose and meaning are unclear, as are its concerns.

Creating interfaces purely because it is convenient to work with them (perhaps because your mock or injection framework can only work with interfaces) is a weak reason for indulging this anti-pattern. Generally if you reunite an interface with its single implementation you can see how to work with it. Often if there is only a single implementation of something there is no need to inject it; you can define the instance directly in the class that makes use of it. In terms of mocking, there are mock tools that mock concrete classes, and there is an argument that mocking is not appropriate here and that instead the concrete result of the call should be asserted and tested.

Do the right thing; kill the Impl.


SiCamp 2008: Working with the Carbon Co-op

Okay so it’s Monday morning, I’m tired as hell and I’m kind of wondering what the point of spending my whole weekend at SiCamp was.

The weekend was really a game of two halves: Saturday was generally pretty good. I got there, learnt about the idea, pushed back on it; the group kicked around an idea that we thought we could get done in a weekend and then we set about making it. The team I was in was Carbon Co-op, which is about using collective buying power to reduce the initial cost of buying and installing energy saving or renewable energy products.

Sunday though was a general fail fest for the team. On Sunday morning the team consisted of just four members, and we had to beg for some help to get the webpages for the site done. Then during the pitch we had a complete fail: the AV system was screwed, the project sponsor seemed nervous about his pitch and there wasn’t enough time to switch laptops and show the site, which meant that all the time we had spent on it was completely wasted. About the only thing that wasn’t a fail was the lunch, which was excellent and far beyond what I was expecting for this kind of thing.

The idea for the project is good and actually unlike a lot of the ideas at the event it had a genuine business model. However the team failed to communicate that or show any of the potential behind the concept. It failed to distil simple messages that could be quickly absorbed, in short: it failed to impress.

Part of the failure was not having gone to one of these events before and not knowing the format, what was expected and what you should be doing. So let’s try and rectify that for the future.

Firstly, the event is advertised as an X-Camp, fully buzzword compliant. However for all the talk of “self-organising”, it is actually a competition. Everything will boil down to 10 minutes in front of a panel of judges. For me the attendees of an X-Camp should be able to decide the criteria for success, not the organisers; Camp Fail.

As you are in a competition: find out who the judges are. I did speak to the judges and what struck me was that they were for the most part concerned about business model and development. Innovation for them meant responding to changing circumstances in the economy. None of them seemed to care about the tech side of things except as an illustration of what the final product might look like. It would have been more cost-effective for me to have hammered out some HTML and Javascript mocks of the site that could have been zipped up and put on the pitcher’s machine. All the working code I had was kind of a vanity project in the end (it is out there as open source though if you are interested). Misguided Effort Fail.

Another important aspect of the judges’ background is that every “mentor” and “adviser” that came through the project door gave really bad advice in the context of winning the competition. This year, if you wanted to win SiCamp, your project should have found something in the existing economy and done it better. Every team seemed to want to “disrupt” things and that message won no friends. If the advisers do not have the same background as the judges: don’t listen to them. If you think they will be helpful to the business: schedule a post-Camp meeting. Focus Fail.

Get a big team and assign roles quickly. I had assumed that everyone was in a similar boat in struggling to get enough people to cover the work (and indeed the winning team was quite small). However at least two of the teams had 10 to 15 people involved. You need people to run your blog, someone to create the presentation, someone to give the pitch, someone to facilitate and someone to project manage. Perform a skills audit quickly and sort out when everyone is available; don’t leave it to Saturday evening to ask whether people will be back tomorrow.

If you decide you are going to try and build a working site on the weekend then you will need a couple of developers with complementary skills, some web designers (nothing fancy: HTML/Javascript/CSS will do but make sure it’s practical experience with some cross-browser work), an infrastructure person (who can be shared with other teams), someone to generate content for the site and ideally someone to test the user experience.

You also need to have done some planning prior to the event. At the very least buy your hostname, buy some hosting, don’t be afraid to make a technology choice if it means your hosting is going to be simpler. Be responsible: Carbon Co-op, for example, requires people to submit an email and postcode, this means data protection issues. You simply can’t ask someone you randomly meet at an event to open their credit card, buy you a name, hosting and then start collecting people’s details. I know everyone at SiCamp is going to encourage you to do this but it’s a really terrible idea and this is meant to be your business not a final year student project. What happens if you fall out with random person after the event? How are you going to get your data back?

If after the skills audit you decide you don’t have a viable team then don’t just plough on. Take it on the chin that your idea hasn’t attracted enough interest and disband the team and go try make someone else’s idea awesome. Alternatively scale back to what you can achieve, which will usually be a really nice slideshow. Accept that a slideshow isn’t going to knock anyone dead when other teams will be launching websites.

Finally a practical point: don’t make videos for your project and presentation unless you are trying to make work for idle hands. Making videos is time-consuming and they don’t impress people. Actually getting someone who had never heard of the project before the weekend to come and talk during your presentation would have totally slain the judges. Think of people who videotape acceptance speeches; the undertone is: “Your event doesn’t matter to me”. If you want to have someone speak who can’t physically get to the venue then use a video chat instead.

If you are a project founder then don’t be tempted to take an active role in the weekend’s work. Your role is to be the project visionary, don’t even be tempted to give the pitch yourself. Get someone else to give the pitch and then have them invite you to speak during the presentation. It’ll make you seem much more important and insightful. If you give yourself a role in the project then remember that all the time you spend on slides or talking to potential investors or managing tasks is time that you are unavailable to your team. Your team needs you to inspire them and make choices about what you want. This is your job.

What about the organisation of SiCamp itself? Well the venue was fantastic, the internet provision was first-rate, catering was excellent. Kudos on this.

Things that were not so good were the lack of facilitators and runners for teams. Each team should have had an experienced Camp hand to provide an idea of what the event was about, what was expected and when things were going wrong. This person should also have mediated who could visit the team rooms. They should also have compared notes with the other Team Camp hands to see whether all the teams were equally balanced.

Although in principle you were meant to be able to swap between teams and look into the other project rooms, I’m not sure who the organisers thought was going to do the work for your team if you went off and had a wander round the other project rooms. I would have liked to have seen the other teams’ efforts and how they organised their teams but most of the weekend was spent looking at an editor on a MacBook screen.

When teams needed help or advice they should have been able to ask the facilitator to send a runner round the other teams’ facilitators to find out whether what they needed was available in the other teams. There was a real confusion around whether people were competing in teams or collaborating in a single event.

However my number one issue was the AV at the final show and tell. The sound didn’t work, the microphone kept cutting out due to low batteries and the VGA connection to the OHP wasn’t working and everything projected purple. In short absolutely nothing worked and the stress for our team was massive as we struggled to get something reasonable going.

My checklist of prep would involve: paying for a decent DVI projector with Mac compatibility, doing a sound check prior to the event, allowing the teams to run through 2 minutes of their pitches in the actual venue, and sorting out a running order ahead of time. In short, don’t make those 10 minutes more painful than they have to be.

So okay, praising and moaning over, do I think it was worth going? Well it was an interesting challenge and I wanted to see what could be done in two days. I’ve also got at least three blog posts out of it so I guess I learned a lot. My first thought on trying it again would be to assemble a team prior to the event so that you could be guaranteed the range of skills you need to really build something in 8 hours.

That’s really focussing on the competition aspect though and thinking outside the competition there are a lot of intangibles that the entrepreneurs involved gained. Some projects got new names, Carbon Co-op got a cool logo. In some ways assembling the shock troopers of project execution would mean that you were not really taking into account what people need to push their projects forward. You would also exclude some people from teams who could benefit from the experience of working in cross-discipline close-knit teams with immediate feedback loops.

I think I would want a more relaxed role next time with time to take in more of the event. I would also rein back from treating the event like a Hack Day with an emphasis on cool stuff working. Faking it before you make it is absolutely fine. To be honest I could also do with a 10:30am start on the weekend. With that in mind I would consider doing it again.


Nulls mean nothing, or something…

My last post touched on something that is a real issue for me: using Null in a Relational Database. Now I’m not Fabian Pascal, but I have had enough problems with Nulls to feel that if you are going to use a Relational Database it is well worth making the effort to eliminate Nulls from your data model.

You really haven’t suffered until you’ve had to add a Decode in a query to add a layer of logic to a column which really never should have been Null in the first place. And then there is learning the special logic for Nulls usually when your predicate has failed to evaluate as expected. Could there be anything finer than saying X = Y OR X IS NULL?
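The reason that predicate surprises you is SQL’s three-valued logic: any comparison with NULL is neither true nor false but UNKNOWN, and a WHERE clause only keeps rows whose predicate is definitely true. A small sketch of that logic in Java (modelling UNKNOWN as a null Boolean, which is my own illustrative choice, not anything from SQL itself):

```java
// Three-valued logic: comparing anything with NULL yields UNKNOWN,
// modelled here as a null Boolean rather than true or false.
public class ThreeValued {
    static Boolean eq(Integer x, Integer y) {
        if (x == null || y == null) return null; // UNKNOWN, not false
        return x.equals(y);
    }

    // A WHERE clause keeps a row only when the predicate is definitely TRUE;
    // both FALSE and UNKNOWN rows are dropped.
    static boolean whereKeeps(Boolean predicate) {
        return Boolean.TRUE.equals(predicate);
    }
}
```

So `X = Y` silently filters out every row where X is NULL, which is exactly why you end up bolting `OR X IS NULL` onto the predicate.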

However usually my problems with nulls come down to the following:


(,2,3,,5) intersection (2, 5)

What do these things mean? They make my head explode in data terms. They don’t really make sense as tuples, vectors or indeed anything other than lists. In the case of lists you can charitably interpret the empty values as being positions that have no assigned value.

Null in programming languages usually has a strong definition. In Java it means that a reference has no associated memory allocation, in Ruby it’s a type of object. In databases people always struggle to say what NULL is.

In the comments on the previous blog post the commenter came up with at least three interpretations of a Null value: unknown, unset and Not A Number (NAN). However really Null is an absence of information; it’s kind of the anti-fact. Its presence tells you nothing and as soon as you try to interpret it the number of potential meanings multiplies exponentially.

Trying to store NAN as a concept in a database is just misguided. If I can say a NUMERIC field is, in fact, NAN then why can’t I store the idea that a Number is not a Lemon? Or a roadsign? If it is really not a number, why are you trying to store it in a numeric field? If it’s the result of a calculation that is either not storable within the allocated storage or infinity then why don’t you store that information some other way?

Relational databases work best without nulls and usually the data model works best if it is seen as a collection of facts, that can be stored in a normalised form with strong relations between the sets of facts.

However the way they often get used is as surrogate (and usually underperforming) hash stores where each entry consists of a set of keys that may or may not have values. This is great for object serialisation, and instead of relational queries you can introduce highly parallelisable search algorithms. However it kind of sucks at everything else. Firstly because the usual RDBMS like MySql or Oracle isn’t expecting to be used as a hash store, so it is doing a lot of things that you aren’t going to need in this model. Secondly because the irregular nature of the hashed data means that trying to get sets of rows back out of the database can underperform, because the system is often forced to brute-force examine the data in a serial fashion.

The whole point of creating tuple arithmetic is so that you can optimise for relational processing and query big sets of data quickly. Completely ignoring it or, worse still, crippling it so that serialising objects is easy is like shooting yourself in the foot… with a shotgun… for no reason.