Programming

Writing code without tests

This post is aimed at people who have mastered test-driven development and ideally also behaviour-driven development and who are familiar with XCheck testing. If you don’t have good basic steps then trying to jump onto some of these techniques are likely to backfire on you as you will probably struggle to assess the risks correctly.

There is a reason TDD was invented, it represents the refinement of good testing practice and the philosophy of good software design. TDD is a relatively simple practice to describe that requires effort to implement. Writing code driven by tests is safer that straight-coding.

Writing untested code is a kind of mastery technique. It is high-risk and relies on the skills and the knowledge of the programmer. I don’t think it is ever responsible if the programmer is not going to be the person supporting the result in production. Without this condition then the programmer’s interests are not properly aligned with the consumers of their code.

So with all those caveats in place what if we want to create code faster because we don’t have to write tests?

Well we have to understand where bugs come from and we will have to write code that doesn’t allow those situations to arise.

There are two important principles to start with. If you can rely on tested library code, then you can rely on the underlying quality of the tested code and leverage it in your own application. Secondly the code you don’t write will not have bugs.

Therefore we should be aiming to write the smallest amount of code possible and we should never try to code what others have coded for us.

The next point is about where bugs occur. I think we’re now at a consensus point that most bugs occur in the way we change and maintain state. In both procedural and functional languages it is rare to get a mistake in the order of steps that something must happen in for example. These kind of problems tend to be misunderstandings of the domain (that get written into test suites as well so testing doesn’t help catch them) rather than genuinely unexpected consequences of the programmer’s code. Object-orientated code is really hard to reason about from this point of view as objects don’t have an implied order of execution.

This is why quick scripts of less than 200 lines tend to do stable sterling service for years whereas larger applications are more tortured in their existence.

Therefore whatever language we are coding in we need to adopt the functional principle of operating only on our parameters and returning values that can be consumed by the caller.

Size matters, a lot, if whole program can fit into a single file and you can pretty much hold the whole thing in your head then it will be easy to reason about what the program is doing and see flaws in the logic of the program. A single complex line of code is better than many lines and is much better than many lines split across many files.

One way to bring down the size of code files is to be ruthless about concerns. For example recently in my Python programming I have been assigning only one purpose to each module: this module renders reports, this one provides JSON endpoints.

Another technique is to not persist any state, this is actually surprisingly easy in web programming since each request is completely separate event and by default you can trade CPU time for isolation.

If you are doing batch or server-side programming then it is worth considering using something like parallel to create many separate bubbles of execution rather than trying to write code yourself to distribute work.

Another aspect of state that causes issues are making global modifications, whether it be to a database or a filesystem. Try and defer all global changes to the final moment of a program and do all the manipulation in-memory instead. If you never change the world then you can run a program over and over again refining what it does.

Assertions are more powerful than logging in writing test-less code, it is better to kill a thread of execution rather than let it do something you weren’t expecting. Logging is really about helping build your intuition about what a program does and how it works.

Assertions allow you to create strong pre and post-conditions on the operation of the program. Essentially they allow you to guarantee the “happy path” execution of your code and avoid having to test all the negative situations that might occur.

Despite this you always want to code for failure, use short-circuit logic to abort code flow early and therefore simplify the context of the code in the rest of the function.

Remember all the basic rules of cyclomatic complexity, don’t nest, don’t do conditionals, do try and express your looping as list comprehensions.

Don’t write generic code, ever. The more potential inputs a function has, the more you end up needing unit-tests to verify the interactions. If something is meant to work on strings don’t try to make it work on strings and integers. Your detection code ends up being a potential source of bugs that needs testing.

If you write dynamic interpreted languages then you are going to have do some manual testing, unless you can remember the names and orders of the functions exactly. Don’t forget to dive into the shell or REPL and play around with the code in isolation. If you can verify the behaviour of individual parts of your program without having to wire together multiple components then you have the right level of granularity for your code.

Re-use code that is already working. Code re-use is generally best achieved by cut and pasting files and then importing the functions you need. Don’t try and synchronise your code, updating some library code ultimately means that you are going to know whether the new library code works as you expect with your functionality and that means you’ll need a test suite.

Don’t refactor your code, rewrite it. Refactoring requires unit tests. Don’t be afraid of things like myfunction2 (although once you have the new functionality you need to delete the old unused stuff). Re-writing allows you to ditch all your assumptions about the code and attempt to express your new understanding of the problem and the requirements as simply as possible.

Don’t work with large numbers of people on the same code base. The more people trying to modify and change the code the more you need tests to try and clarify your different intents for the code base. Again, try divide and conquer on the problem, rather than six people working on the same code can you get three sets of two people collaborating on three smaller codebases.

Finally don’t be afraid to write a test. Writing the right unit-test to prove you can rely on a base piece of functionality means that you then don’t have to write tests for all the pieces of code that use that underlying function. I like to try and write code without tests to maximise the flexibility of the code base when I’m tackling problems with unclear solutions. It is not an ideological thing to have no tests whatsoever, it is rather that when tempted to write a test I think “Could I do this in a way that is trivial and doesn’t require a test?”. Simplicity is the cornerstone of test-free code.

Standard
Software, Web Applications, Work

String Templates, or what I learned from Python and doing nothing

It’s an ill wind that blows no-one any good. The same is true of projects (although money generally helps more here; it’s an ill project that is making no-one any money).

I’m currently meant to be doing some work on Accessibilty for some new HTML pages. I thought it would be pretty easy but I was really wrong and it is changing the whole way I look at the View part of the (deceased) MVC web paradigm.

On my last project I was looking at things like Groovy’s Markup Builder and marvelling at how my collegues managed to put together a 30 line Freemarker template that did some pretty compex HTML assembly. In my spare time I have been looking at Haml as a way of escaping the verbosity and monotony of XHTML and to have the code guarantee the correctness of my page structure to avoid validation grind.

That’s because in those projects I was a developer/web designer. I wanted accurate, compliant HTML with minimum effort and which was easy to style without having awful CSS hacks.

On my current project I’m in the utterly baffling (for me anyway) world of .Net. There is no way that I can understand the huge variety of C#, XML and templating overrides that make up my current project. Having code generate HTML is a massive barrier to me being productive because, while I know a far bit of HTML having to root around an entire Visual Studio project to find the fragment that generates the problematic Div element you actually want to work with means I spend the whole day knowing nothing about .Net rather than applying the knowledge I do have.

Now some people are going to say that having a wacky Component model is different from having a nice templating language but look at something like Haml or Freemarker. The former is concise and fun and full of obscure rules; the latter is tremendously powerful, more firmly rooted in HTML and not much less obscure. For power users I agree, they are the bomb. They are a massive barrier to entry though, in a way that HTML just isn’t. People may do HTML badly but they rarely don’t do it at all.

This experience put combined with using Django/Jinja/Google App Engine is leading to me have a huge rethink about the way templating and views are put together. Passing a map of parameters to a template that is essentially exactly the way the output will look on the final device is obviously the way this problem should be tackled.

To try and get the HTML to generate in the current project I spent a day trying to get: SQLServer, BizTalk, Active Directory and Windows MQ to work together. This is utter madness and can only have been created by programmers who have no idea how to collaborate with non-programmers.

Why should I be trying to install BizTalk when what I want to do is actually generate some sample HTML so we can have a quick check of WAI standards? It should be possible to define some fixture data and then just generate HTML from the templates. It really shouldn’t be hard.

This experience is really changing the way I think about web frameworks. I am already determined to learn String Template, then I am going to look at whether my current favourite frameworks allow me to use it. I’m going to look at frameworks that ask you to put the HTML template next to the Java code. I want to know if I can put those templates in the same heirarchy that the actual website uses.

In short if I need to work with people outside the project team on a web project again, how can I get all the good things about templates and combine them with both simplicity and intuition as to how a website is organised?

Standard