Blogging

Experimenting with Tumblr

I have recently hived off a few bits of posting that used to be in this blog to Tumblr, a startup that ValleyWag described as being, like Twitter, “unencumbered by revenue”. It’s been an interesting experience.

As this blog has become a bit more work-focussed and more formal, I was feeling that writing about Doctor Who wasn’t quite the right thing to mix with the more esoteric tech stuff. I like WordPress a lot and I thought about starting up a second blog here. However I did feel that I wanted something a little lighter and more light-hearted, as the topics were going to be relatively trivial.

Signing up was easy (all very Web 2.0: massive fonts, custom URLs, etc.) but when I saw that you could use Markdown to write up posts rather than a WYSIWYG editor I was sold. Since I know it anyway it saves me a lot of time not frigging around with generated HTML. I also liked the AJAX UI that made it seem quite easy to just post a few thoughts.

In my mind Tumblr occupies a kind of position between Twitter and WordPress: for when you have something to say that is more than a sentence but not a whole lot more than a paragraph. It is the kind of thing that Blogger should have become after it was clear that WordPress had completely whupped it on almost every front.

I have found Tumblr to be fun and also something that entices you into just jotting down a few thoughts. In terms of the experience it is all light, responsive and dynamic up front, but you can dig around behind the scenes to take control of the visual aspects of your site via CSS and HTML (something you have to pay for in WordPress) as well as get more options for posting.

So what do I miss from WordPress? Well the first thing is the Stats crack, obviously. WordPress has a killer feature in telling you exactly how many people are reading your articles and how they came to read them. There are also a lot of features that surround this, like auto-promotion of articles to Google, the related articles list and the Blogs of the Day. Publishing something in WordPress feels like launching it into the world; by comparison Tumblr posts are a much more muted affair. It feels more like a secret club. I know Tumblr does the promotion as well, but I guess WordPress does a better job of closing the feedback loop.

Not having comments on Tumblr is also part of that. Given that comments on your blog can be a very mixed bag I was surprised to find myself missing them. Somehow I must have gotten used to them and their lack now feels like silence. I know some people have used Intense Debate to add in comments but if I was really that bothered about it then I would probably have gone back to WordPress.

So I’m enjoying Tumblr but I am also hoping that they keep it simple and don’t get tempted to add every feature there is from other blogging software.

Java, Programming

Java Library Silver Bullet/Golden Hammer Syndrome

One thing I notice a lot with Java projects is that there is this strong desire to just have One Thing in the code base. One web framework, one testing framework, one mocking library, one logging library, one templating engine and so on and so on.

There is the understandable desire to reduce the complexity required to comprehend the codebase but that often flows over into One True Way-ism. There is One Library because it is The Library that we should all use to Fix Our Problems.

One reason why Java developers argue so fiercely and nervously at the outset of a project is that when they are defining the application stack there is the unspoken assumption that these framework choices are fixed forever once made. Once we choose Spring then we will use Spring forever, with no chance of introducing Guice or PicoContainer. If we make the wrong choice then we will Cripple Our Project Forever.

I actually like Slutty Architectures because they take away this anxiety. If we start out using JUnit 4 and we get to the point where TestNG has a feature that JUnit doesn’t, and having that feature will really help us and save a lot of time, well, let’s use TestNG and JUnit 4 and let’s use each where they are strongest.

Similarly if we are using Wicket and we suddenly want some RESTful Web APIs, should we warp our use of Wicket so we get those features? Probably not; let’s choose a framework like JAX-RS that is going to do a lot of the work for us for the feature we want to create.

Of course Slutty Architecture introduces problems too. Instead of the One True Way and hideous API abuse we do have an issue around communication. If we have two test frameworks then when someone joins the team we are going to have to explain what we test with JUnit 4 and what we test with TestNG. There is a level of complexity, but we are going to deal with it explicitly instead of giving a simple statement like “we use JUnit 4” that has been poisoned by unspoken caveats (“but we write all our tests like JUnit 3 and you have to extend the right base test class or it won’t work”).

We also need to review our choices when a new version of a library gets released: does it include functionality that was previously missing? Perhaps in the light of the new release we should migrate the entire test code to TestNG, for example. That kind of continual reassessment takes time but also stops an early technology choice becoming a millstone for everyone involved.

But please, let’s get over the idea that there has to be one thing that is both floor polish and dessert topping. It’s making us ill.

Java, Programming, Python, Scripting, Web Applications, Work

JMeter or The Grinder, so which one is better, like?

Or for the benefit of Google: Apache JMeter versus The Grinder. Fight!

A while ago I got paid to put these two tools head to head and I think it’s been long enough that the people who put up the money will have got the benefit now.

I had used The Grinder in a previous job, where it had done excellent service finding out what level of peak load our site could handle. It was convenient in that you script it in Jython, which was our chosen scripting language.

JMeter on the other hand was a whole new thing to me. It is a kind of graphical programming interface where you drop controls into a tree structure and the framework generates threads that run through the tree generating interactions with the target application via things like HttpClient.

The scripting versus component structure is a pretty major difference between the two and is probably one of the key things that is going to help you choose between two products that are both, to be honest, pretty mature, capable pieces of software.

If you don’t feel comfortable with Python or Jython then The Grinder is likely to be out. JMeter is much more friendly to non-programming testers. However this decision isn’t as clear-cut as you might think: JMeter actually includes JavaScript functions, access to BeanShell and its own expression language. As we created more complex traffic simulations we found ourselves writing extremely complex expressions that often mixed a scripting language with expression evaluation. In some ways The Grinder is more honest in telling you up front that you are going to have to program some of this stuff yourself.
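
For illustration, the shape of a Grinder 3 Jython script is roughly this (the imports are The Grinder’s own; the test name and URL are made up for the sketch):

# A minimal Grinder 3 worker script sketch.
from net.grinder.script import Test
from net.grinder.plugin.http import HTTPRequest

test = Test(1, "Front page")        # number and describe the transaction
request = test.wrap(HTTPRequest())  # calls through the wrapped object are timed

class TestRunner:
    # The Grinder creates one TestRunner per worker thread and calls it for each run
    def __call__(self):
        request.GET("http://localhost:8080/")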

Another big differentiator is reporting: if you need to generate reports then JMeter is a much better choice, as it has components that collect and collate information and you can dump data to XML for XSLT transformation into pretty much any output format you want. It is also possible to run JMeter headless once you have the test scripts, so it is possible to incorporate it into a continuous integration process. Debugging slightly flaky applications is much easier with JMeter because you can easily capture a lot of result information and then just delete the data gathering component once the test flow is working.
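
For reference, the headless run is just a command-line invocation, something along the lines of jmeter -n -t plan.jmx -l results.jtl, where -n means non-GUI mode, -t names the test plan and -l the file to log results to (the file names here are only placeholders).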

A lot of the functionality of both products overlaps: they can both be used as a browser proxy and be used to capture scripts as a user “drives” around the site. These captures are likely to form the basis of your test plans unless you have a very clear URL scheme with little session state.

Both use thread pools to generate load and both tend to get blocked on your application before they max out their local machine (although both are capable of using 90% of memory and CPU). JMeter has a few nice options around randomly ramping up the thread activation (recreating a spike in usage more closely). The Grinder tends to give more bang for the buck as its runtime is very simple and minimal.

Both are also capable of using distributed agents and collating the data from these agents back to the coordinator.

In many ways there isn’t a “bad” choice to be made between them. So what might sway your choice one way or another? Well JMeter is an Apache project and that might make a difference for some people as you have all those good things like project governance and a good chance the project is going to continue to be developed and enhanced. JMeter was also the only project to have a test suite last time I checked out both code bases.

The Grinder does have one unbeatable quality in my opinion and that’s flexibility. When you look at the features listed on the websites you might think that JMeter does all this cool stuff with HTTP, SOAP and POP3 and the like and that The Grinder is comparatively light in the feature set.

The opposite is actually true: as The Grinder website points out, The Grinder can test anything with a Java API. There really is no limit to what you can make it do or how you can execute your test plan. In fact most of the things I’ve complained about with The Grinder are fixable from within the test scripts (what I am really complaining about is that some things, like capturing HTML output per run, should be available as part of the standard package). On the other hand I felt that creating a new JMeter component was actually quite tricky as you have to deal with both the GUI aspects and the actual functionality you are trying to create in your test component.

If you really want to have total control of your concurrent volume testing then The Grinder is probably the best product for you.

Perhaps the last topic for consideration is The Grinder’s use of Jython. Python is a great language and you get a lot of power for some very concise code. Just as some people are going to find it off-putting others are actually going to find it very compelling.

Okay so yet another weak “horses for courses” post on the InterWeb. Perhaps the easiest summary to end on is that your test teams will probably take a shine to JMeter while your developer teams who are also building scale tests are probably going to like The Grinder; unless they dislike Python or learning a new language.

Programming

The Helper Anti-Pattern

You have a class X, you have a class called XHelper. XHelper contains methods that make it easy to use X.

The problem I have with this anti-pattern is that XHelper does nothing of value. If the methods are truly related to X then they should actually be class methods of X. However if you need “helper” methods to use the API of X, chances are what is really required is a refactoring of X to incorporate the enhancements of XHelper invisibly. You shouldn’t need a helper to use an API.
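
To make that concrete, here is a small sketch in Python rather than Java, with made-up names:

# The anti-pattern: a Report class plus a ReportHelper whose only job is to
# make Report usable.
class Report(object):
    def __init__(self, rows):
        self.rows = rows

class ReportHelper(object):
    @staticmethod
    def total(report):
        return sum(report.rows)

# The refactoring: fold the helper method into the class it was "helping",
# and ReportHelper disappears.
class Report(object):
    def __init__(self, rows):
        self.rows = rows

    def total(self):
        return sum(self.rows)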

Take Rails page helpers. A helper to construct the content of a page contains functionality that would be better marshalled in the controller, prior to view rendering. If multiple controllers perform the action then extract it to a service that controllers can invoke on the requests and delegates they are co-ordinating.

What if the Helper class actually refactors common functionality from classes X and Y and is actually called FooHelper because it helps perform Foo in X and Y?

Well, here we are onto something: we have some common functionality, which is good, and the name of the class reflects its purpose. The same question arises though: could FooHelper’s methods actually reside in Foo? If Foo is purely a function or method call then perhaps all the functionality relating to Foo should be encapsulated in a Foo class that presents the foo method.

Alternatively perhaps there is a better name than “Helper”? As examples, I tend to call collections of class methods that transform instances of one class into instances of another “Transformers”. Similarly, methods that create database connection instances could be called “Providers”. If you cannot make the Helper a private part of the class or classes it is nominally helping, then there is usually a better name for it lurking around somewhere.

Java, Programming, Software

Programming to Interfaces Anti-Pattern

Here’s a personal bête noire. Interfaces are good, m’kay? They allow you to separate function and implementation, you can mock them, inject them. You use them to indicate roles and interactions between objects. That’s all smashing and super.

The problem comes when you don’t really have a role that you are describing; you have an implementation. A good example is a Persister type of class that saves data to a store. In production you want to save to a database while during test you save to an in-memory store.

So you have a Persister interface with the method store, and you implement a DatabaseStorePersister class and an InMemoryPersister class, both implementing Persister. You inject the appropriate Persister instance and you’re rolling.

Or are you? Because to my mind there’s one interface too many in this implementation. In the production code there is only one implementation of the interface: the data only ever gets stored to a DatabaseStorePersister. The InMemory version only appears in the test code and has no purpose other than to test the interaction between the rest of the code base and the Persister.

It would probably be more honest to create a single DatabaseStorePersister and then sub-class it to create the InMemory version by overriding store.
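
Sketched in Python rather than Java (the database call is faked out, and the names just follow the example above):

# Production code: one concrete class, no interface.
class DatabaseStorePersister(object):
    def __init__(self, connection):
        self.connection = connection

    def store(self, record):
        self.connection.execute("INSERT ...", record)  # stand-in for the real SQL

# Test code: the in-memory version is just a subclass overriding store.
class InMemoryPersister(DatabaseStorePersister):
    def __init__(self):
        self.saved = []

    def store(self, record):
        self.saved.append(record)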

On the other hand if your data can be stored in both a graph database and a relational database then you can legitimately say there are two implementations that share the same role. At this point it is fair to create an interface. I would prefer therefore to refactor to interfaces rather than program to them. Once the interaction has been revealed, then create the interface to capture the discovery.

A typical expression of this anti-pattern in Java is a package that is full of interfaces that have the logical Domain Language name for the object, each with an accompanying single implementation for which there is no valid name and which instead shares the name of the interface with “Impl” added on. So for example Pointless and PointlessImpl.

If something cannot be given a meaningful name then it is a sure sign that it isn’t carrying its weight in an application. Its purpose and meaning are unclear, as are its concerns.

Creating interfaces purely because it is convenient to work with them (perhaps because your mock or injection framework can only work with interfaces) is a weak reason for indulging this anti-pattern. Generally, if you reunite an interface with its single implementation you can see how to work with it. Often if there is only a single implementation of something there is no need to inject it; you can define the instance directly in the class that makes use of it. In terms of mocking, there are mock tools that mock concrete classes, and there is an argument that mocking is not appropriate here and that instead the concrete result of the call should be asserted and tested.

Do the right thing; kill the Impl.

Programming, Python, Ruby

Mocking Random

Mocking calls to random number generators is a useful and important technique. Firstly it gives you a way into testing something that should operate randomly in production. Secondly, because random number generation normally comes from built-in or system libraries, it is also a measure of how well your mocking library actually works.

For Ruby I tend to use RSpec and its in-built mocking. Here the mocking is simple, of the form (depending on whether you are expecting or stubbing the interaction):

Receiver.should_receive(:rand)
Receiver.stub!(:rand)

However what is tricky is determining what the receiver should be. In Ruby random numbers are generated by Kernel so rand is Kernel.rand. This means that if the mocked call occurs in a class then the class is the receiver of the rand call. If the mocked call is in a module though the receiver is properly the Kernel module.

So in summary:

For a class: MyClass.should_receive(:rand)
For a module: Kernel.should_receive(:rand)

This is probably obvious if you are a Ruby cognoscente but is actually confusing compared to other languages.

In Python random functions are provided by a module, which is unambiguous, but when using Mock I still had some difficulty working out how to set the expectation. Mock identifies the method to patch by its name as a string, alongside the object that owns it and the mock that stands in for it. This is how I got my test for shuffling working in the end.

from mock import Mock, patch_object
import random
import unittest

mock = Mock()

class MyTest(unittest.TestCase):
    # swap random.shuffle for the mock for the duration of the test
    @patch_object(random, 'shuffle', mock)
    def test_shuffling(self):
        thing_that_should_shuffle()  # the code under test
        self.assertTrue(mock.called, 'Shuffle not called')

You can see the same code as a technicolor Gist.

This does the job and the decorator is a nice way of setting out the expectation for the test. It was confusing, though, whether I should be patching or patch_object’ing, and wouldn’t it be nice if that mock variable could be localised (maybe it can be and I haven’t figured it out yet).

Next time I’m starting some Python code I am going to give mocktest a go. It promises in-built mock verification, which should simplify things. It would be nice to try to combine it with Hamcrest matchers to get good test failure messages and assertions too.

Work

One Year at ThoughtWorks

It’s now been a year since I joined ThoughtWorks, and for me a year at ThoughtWorks is like two at any other company I’ve ever worked at. I have learned so much since I joined that I feel like a very different person from the one who arrived. I’ve met lots of really smart people who are doing really interesting things, all of whom have been generous and unstinting with their knowledge, experience and advice when asked.

This openness is the thing that really sets the culture apart from so many other firms. Knowledge seems to have value only when shared, and people are generally so enthusiastic about the things they know that they are really eager to help you understand them. In most companies knowledge is power, hoarded carefully and divested for maximum gain with the grace of a man having a tooth extracted.

The other cultural aspect that has been really different is that there is tremendous peer pressure to be excellent. If you are cutting corners or hacking something that’s convenient but flawed or just riding on your opinion then someone is going to call you on it.

When I’m working on client-owned projects I often think about different approaches and then wonder how I would feel if I had to justify the solution I chose to my TW colleagues. It’s interesting because it gives you the strength to stand up against weak solutions and weak answers, even if you’re not actually working with anyone from ThoughtWorks.

So it’s been a good year generally and certainly the best I have had working for a company rather than doing my own thing. However ThoughtWorks isn’t perfect because any aggregation of individuals requires compromise and the biggest problem with ThoughtWorks is how you handle that.

One of the obnoxious things you can come across is the idea that you should be grateful to work for ThoughtWorks (particularly held I think amongst those who have only worked at TW or the City and other consultancies). ThoughtWorks has problems and each individual has to balance the benefits of getting to work with so many amazing people against an organisation that can’t really resolve its central dichotomies. Is it the Employee owned company that is delivering excellence or is it the home to the best knowledge workers who are revolutionising IT?

The trouble with not deciding exactly what the company wants to offer its employees as the vision is that every decision ends up infuriating half of your smart, emotionally invested and highly motivated workforce because it doesn’t fit with their interpretation of the ThoughtWorks ideal.

So it has been a great year and an experience I would recommend to anyone. My closing thought about ThoughtWorks is that it is a company that hasn’t said “No” to anything I have wanted to do. There isn’t always a lot of support and it is more forgiveness than permission, but I feel that ThoughtWorks has helped me be a better person because it has given me the chance to do things that other organisations simply shut down in an arbitrary and off-hand way.

So thanks Roy and everyone else who works and has worked at ThoughtWorks for creating that opportunity for me.

Programming

The Locality Problem

One of the key problems to solve if the Carbon Co-op application is to work is the Locality problem. That is: who is near whom? For the SiCamp weekend my approach was going to be simply to hack it by saying that everyone who shared the same first segment of their postcode was in the same Locality.

This is blatantly not true though. In London, Boroughs matter more and can consist of different postcode stems due to the alphabetic numbering system. In small cities like Swansea, Bath and Bristol you can effectively consider most of the city’s codes to be in the same Locality. Swansea postcodes also cover Carmarthen, which is clearly not in the same locality as Swansea.

I have seen this handled in a variety of ways before, usually by creating a hierarchy where a Postcode stem is mapped to a leaf node in a tree which is normally the county, city or metropolitan area the area belongs to. The counties then belong to regions, regions to countries and countries to the UK, then Europe.

However for the Carbon Co-op I didn’t want the hassle of creating and maintaining a hierarchy. You also need to deal with the fact that people sharing the same postcode may have a special significance in terms of the projects they can take part in.

So I thought about creating a Locality model that is simply named and arbitrary and which has many (and at least one) Locality Filters that are actually regular expressions. You then run someone’s postcode through the Filters to see which Localities they belong to. This deals with the situation where someone may be able to take part in Locality-linked Actions for both Scotland and Fife.
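
As a rough sketch of the idea in plain Python (in the real application these would presumably be Django models, and the names and patterns here are only illustrative):

import re

class Locality(object):
    def __init__(self, name, patterns):
        self.name = name
        self.patterns = patterns  # at least one regular expression per Locality

    def matches(self, postcode):
        return any(re.match(p, postcode) for p in self.patterns)

def localities_for(postcode, localities):
    # run the postcode through every Locality's filters
    return [l for l in localities if l.matches(postcode)]

# Not an accurate mapping, just enough to show the Scotland/Fife overlap.
localities = [Locality('Fife', [r'^KY\d']),
              Locality('Scotland', [r'^KY\d', r'^EH\d'])]
print([l.name for l in localities_for('KY1 1AA', localities)])  # ['Fife', 'Scotland']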

Design-wise I’m quite taken with this and would like to implement it to see if it works in practice. However it is a techie solution as only programmers really love Regular Expressions and even then… not so much. To simplify the interface you would probably just have the user type the extent of the postcode they wanted to group. There is also a question of whether this filtering would actually scale. You might have to associate the Devotee with their Localities, which then raises the issue of when it should be regenerated, which makes you wonder whether you can back a Django model with a view, which makes a good case for doing it!

culture

Generational shifts

One interesting thing about SiCamp was getting to legitimately hang out with people who are about ten or fifteen years younger than me. Interesting differences were in views on Facebook Apps and SMS integration. Apparently there is no business proposal or application that cannot be improved by the addition of these two things, both of which I really hate. I enjoy basic Facebook but I’ve never found an application that I really trusted to share my data with, or that even had a compelling reason for existing.

I also hate the idea of web apps SMS’ing me or me, them. I can barely understand why you might want to SMS microblogging sites (after all, SMS wasn’t designed for group messaging) but seriously, why do you want to encourage a stream of textspeak to your website? Are we really such whores for user content? Isn’t there something really sinister about being tracked through time and space by websites? Didn’t people die to try and stop that happening?

It was also clear that the appeal of Google Maps has passed me by, but I am prepared to accept that you ain’t got an app until you’ve geotagged stuff and asked Google for a map with pins and icons on it. It makes your site look the same as everyone else’s and the maps aren’t a coherent part of your design, but what the hell. The API doesn’t seem that complex and users’ critical faculties seem to be completely disabled by the magic of maps with lines showing how you get from one place to another.
