Programming, Python, Ruby

Truly open classes

Here’s an interesting observation: I needed to write a little script to automate some number crunching for me. I was wondering whether to do it in Ruby or Python. I’m doing a lot of Python at the moment so I felt I ought to give Ruby a little go. Share some of the love.

However the solution I had in mind really didn’t work with Ruby because while Ruby has open classes it has a comparatively fixed idea of attributes. In Python you can set attributes very freely on any object so I have got in the habit of creating something and then enhancing by applying a function. Example? Okay.

def make_captain(actor):
    actor.rank = "Captain"
    return actor

class Person:
    pass

captain = make_captain(Person())

So this little trick doesn’t work, or rather is much more difficult to do, in Ruby because Ruby, at its dynamic heart, is a language that believes in object-orientation and that classes should encapsulate rather than being little collections of data. You can use instance_variable_get/set but it lacks the elegance of the Python syntax.

In Ruby it would be easier to define the attributes in the class using the existing metaprogramming constructs and then have a class method to generate the content (effectively encapsulating my script logic).
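For comparison, here is roughly what that class-first shape looks like when written in Python rather than Ruby. It is only a sketch: the captain class method and the name attribute are invented for illustration and were not part of my original script.

```python
class Person:
    """A person whose attributes are declared up front, not bolted on later."""

    def __init__(self, name, rank=None):
        self.name = name
        self.rank = rank

    @classmethod
    def captain(cls, name):
        # The class, not an outside function, decides what a captain looks like.
        return cls(name, rank="Captain")


skipper = Person.captain("Ada")
print(skipper.rank)
```

The trade-off is the one described above: the class now owns the idea of a captain, which is more structure than a throwaway script strictly needs.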

Now this isn’t a straight “Ruby sux, no Python sux more” post. Between Scala, Clojure and Python I have been doing a lot more in a functional style that deprecates objects as anything more than value carriers. The Ruby vision of a class would give me something with a stronger sense of purpose and encapsulation, something that is hard to benefit from in a script for a particular purpose.

What is going to be interesting this year is trying to identify when the value of a piece of code is in the structure of its data-definition (i.e. objects) versus its process (functions). Having had a think about it I should perhaps rewrite my script to use some OO modelling because it may answer similar requirements down the line. However from a strict Lean/Waste point of view I should have gone with the Python solution as Ruby was imposing a restriction on me while providing benefits that I was unlikely to realise.

Programming, Work

The beauty of small things

I am very interested in the idea of “constellation architecture” and microapps as a new model for both web and enterprise architecture. It feels to me like it is a genuinely new way of looking at things that can deliver real benefit.

It is also not a new way of doing things; it is really just an extension of the UNIX tools idea, taking ideas like service-orientated architecture and some of the patterns of domain-driven design to their logical extreme.

If I take ls and I pipe it through grep, you wouldn’t find that particularly exciting or noteworthy. However creating a web application or service that does just one thing and then creating applications by aggregating the output of those many small components does seem novel and slightly adventurous to some.

SOA failed before it began and the DDD silos of vertical responsibility seem poorly understood in practice. Both have good aspects though. However both saw their unit of composition as being something much larger than a single function. An SOA architecture for payments for example tended to include a variety of payment functions rather than just offering one service, authorising a payment for example.

There is a current trend to look at a webpage as being composed of widgets, whether they be written as server-side components or as client operated components. I think this is wrong and we need to see a page as being composed of the output of many different webapps.

Logging in is handled by a web application whose only responsibility is to authenticate users; the most popular pages are delivered by an application whose only responsibility is to determine which pages are popular.
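A minimal sketch of that composition, assuming each small app can be represented by a plain function that returns a fragment of the page. The function names and the list of pages are made up for illustration; in a real deployment each fragment would come from a separate web application rather than a local call.

```python
from functools import partial


def login_box(user):
    # Hypothetical stand-in for an app whose only job is authentication.
    return f"Logged in as {user}" if user else "Please log in"


def popular_pages(limit=3):
    # Hypothetical stand-in for an app that only knows which pages are popular.
    pages = ["/home", "/about", "/blog", "/contact"]
    return "Popular: " + ", ".join(pages[:limit])


def compose_page(fragments):
    # The page owns no domain logic; it only joins what the small apps return.
    return "\n".join(fragment() for fragment in fragments)


page = compose_page([partial(login_box, "alice"), popular_pages])
print(page)
```

The point of the shape is that compose_page knows nothing about users or popularity, much as a shell pipeline knows nothing about what ls or grep do internally.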

These applications should be as small as we can make them and still function. Ideally they should be a few lines of domain code linking together libraries and frameworks. They should have acceptance/behaviour tests to guarantee their external functionality and that’s about it.

It seems to me that the only way we are going to get good large-scale functionality is by aggregating small, useful pieces of functionality. Building large functional stacks takes a lot of time and doesn’t deliver value in proportion to the effort of its creation.

Programming, Scala

Mockito and Scala

Scala is a language that really cares about types and mocking is the art of passing one type off as another so it is not that surprising that the two don’t get along famously. It is also a bit off probably that we are using Java mocking frameworks with Scala code which means you sometimes need to know too much about how Scala creates its bytecode.

Here are two recent problems: the “ongoing stubbing” issue and optional parameters with defaults (which can generally be problematic anyway as they change the conventions of Scala function calling).

Ongoing stubbing is an error that appears when you want to mock untyped (i.e. Java 1.4) code. You can recognise it by the hateful “?” characters that appear in the error messages. Our example was wanting to mock the request parameters of Servlet 2.4. Now we all know that the request parameters (like everything else in an HTTP request) are Strings. But in Servlet 2.4 they are “?” or untyped. Servlet 2.5 is typed and the first thing I would say about an ongoing stubbing issue is to see if there is a Java 1.5 compatible version of whatever it is you are trying to mock. If it is your own code, FFS, add generics now!

If it is a library that you have no control over (like Servlet) then I have some bad news: I don’t know of any way to get around this issue. Scala knows that the underlying type is unknown, so even if you specify Strings in your mock code it won’t compile, and if you don’t specify a type your code still won’t compile. In the end I created a Stub sub-class of HttpServletRequest that returned String types (which is exactly what happens at runtime, thank you very much Scala compiler).

Okay so optional parameters in mocked code? So I love named parameters and default values because I think they are probably 100% (no, perhaps 175%) better at communicating a valid operating state for the code than dependency injection. However when you want to mock and set an expectation on a Scala function that uses a default value you will often get an error message saying that the mock was not called or was not called with the correct parameters.

This is because when the Scala code is compiled you are effectively calling a chain of methods and your mock needs to set matchers for every possible argument, not just the ones that your Scala code is actually passing. The simplest solution is to use the any() matcher on any default argument you will not be explicitly setting. Remember that this means the verification must consist entirely of matchers, e.g. eq() and so on.

What to do when you want to verify that a default parameter was called with an explicit value? I think you do it based on the order of the parameters in the Scala declaration but I haven’t done it myself and I’m waiting for that requirement to become a necessary thing for me to know.

Java, Programming

Python as a post-Java language

I’m a UNIX-based developer and since 2000 I have been working mainly with Java and then JVM languages. When Java 7 slipped I made no real secret of the fact that Java was in a lot of trouble. The post-Oracle world though looks even worse with a lack of clarity of what in the core ecosystem is free, open source and liability free.

Clojure and to a lesser extent Scala are great steps forward so I don’t feel the burning need for a Java 7/8 whatever. However a moribund or tainted JVM is a major problem and so I’m now thinking about what the post-Java escape route looks like. On the web front it is pretty obvious: Python and Ruby are great languages with great frameworks for developing web-based applications. For the server-side heavy lifting it is a lot less clear. People are talking about Google Go but that does feel quite low-level; I’m not sure I’m ready to go back to pointer wrangling even with memory-management. It feels like something you’d build a tool out of, not an application. Mono feels like more of the same problems of wrestling with big companies with vested interests; if you are going to do that then why not try and sort out the OpenJDK?

As the title of the post suggests the language I am most inclined towards right now is Python. It is a really concise but clear language that on UNIX systems comes with an amazingly comprehensive set of libraries and which has a virtual environment and dependency management that is on a par with RVM and gem.

The single issue that comes up is performance. What I have been finding is that for 80% of the work I am doing performance is okay and I’m producing a fraction of the code I would normally have to create. For that last 20% maybe I am going to have to look at something like Go or (god forbid) go back to C but I would much prefer to see a Clojure or Scala that could run on top of something like LLVM. I also have some hope of smarter people than me making progress on a JIT for Python that might take 20% down to a figure where performance would matter so much to me I wouldn’t mind sweating to make it happen.

Programming

Clojure metadata, the compromise solution

One thing I think is really great about Clojure is the deep reflection and runtime manipulation capabilities that are on offer (it is one thing that I really miss in Scala for example). Last night at the London Clojure Dojo (website needs to be created this month I feel) we needed to distinguish a subset of functions. Now ideally we should have put this subset in a different namespace but you know code that is written by many hands doesn’t always make sense and now we are where we are.

We knew it was possible to examine all the functions in our namespace and that the relevant functions would take two arguments with particular names (we are at least consistent in that respect). My initial view was “hell, let’s just find every function taking the right names”; at the other end of the spectrum were people who thought we should explicitly declare the list of valid functions. In the end we settled on adding some metadata to the relevant functions.

This seemed a reasonable compromise, it means that applicable functions are explicitly defined but that in terms of code maintenance we only have to write one function that generates a list of all the valid functions and this will correctly respond to changes in the code as they are made.
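The same compromise can be sketched in Python, which is only an analogy for what we did in Clojure: a decorator attaches a marker attribute (playing the role of the metadata) and one function collects everything that carries it. The names tagged, Handlers, advance, retreat and describe are all hypothetical, and a class is used here purely as a stand-in for a namespace.

```python
def tagged(func):
    # Attach a marker attribute, playing the role of Clojure metadata.
    func.tagged = True
    return func


class Handlers:
    """Used only as a namespace, standing in for a Clojure namespace."""

    @tagged
    def advance(board, player):
        return board + [player]

    @tagged
    def retreat(board, player):
        return [p for p in board if p != player]

    def describe(board, player):
        # Same two argument names, but deliberately not tagged.
        return f"{player} on {board}"


def tagged_functions(namespace):
    # The one function we maintain: it finds whatever is currently tagged.
    return sorted(
        name
        for name, obj in vars(namespace).items()
        if callable(obj) and getattr(obj, "tagged", False)
    )


print(tagged_functions(Handlers))
```

Adding a new tagged function to the namespace changes the result without anyone touching tagged_functions, which was the maintenance property we were after.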

This led to another conversation at work about whether a function’s metadata is mutable or not. Now it seems that as a map it should be possible to edit it but it’s an interesting question.

Software, Work

Comparing Jabber servers

I recently had a trawl around the available Jabber servers looking for something that was suitable for use as a messaging system for a website. My first job was to do a quick review of what is available out there. The first thing that was quite clear is that ejabberd has massive mindshare. There was a definite feeling of “why would you want to try other servers when you could just be running ejabberd?”.

Well there are kind of two answers; first there is the inevitable Erlang objection. This time from sysops who felt uncomfortable with monitoring and support. It is a fair point, I feel Erlang can be particularly obtuse when it’s failing. The second was that ejabberd stubbornly refused to start up on my Fedora test box. Neither the yum copy nor a hand-built version cut the mustard. Ironically I was able to get an instance running in minutes on my own Ubuntu-based virtual machine (hand-built Erlang and ejabberd).

Looking at JVM-based alternatives I tried OpenFire and Tigase. Tigase has lovely imagery but also seemed to have spam all over its comments and the installation was a pig that I gave up on quickly. OpenFire is one of those old-school Java webapps where you are meant to manage everything via a web gui. This makes some kinds of tasks easy but you have to write your own plugins to get programmatic access to the server. I didn’t want to set up an external database (why on earth would I want to do that?) but the internal HSQL store left me with no way of easily tweaking the setup of the server. A command-line tool would have been ultra-helpful because… OpenFire is annoyingly buggy; by this I mean it doesn’t have a lot of bugs but they are really annoying. After running through the gui setup process I tried to login. This was the wrong thing to do. What I needed to do was stop the server and restart it. Now the server was fucked and I needed to manually delete data and tweak config files to allow me to do the setup again.

Once it was running there was also some fun and games getting the BOSH endpoint to work. Do you need a trailing slash or not? I don’t remember but get it wrong and it doesn’t work. If you try to HTTP GET the endpoint then you get an error, this is technically correct but leaves you wondering whether your config is correct or not (if you know the error message you are looking for perhaps it is helpful but I didn’t so it wasn’t). ejabberd (when I got it working) is more helpful in giving an OHAI message that at least confirms you have something to point the client at.

OpenFire did what I needed but seems to make an implicit assumption that there is going to be an admin working at screens to configure and monitor everything. It feels more like a workgroup tool than a workhorse piece of infrastructure.

One oddity I found was Prosody: it sensibly uses the same defaults, URLs and conventions as ejabberd but is written in Lua and is gloriously lightweight. It absolutely hit the spot for development work and actually felt fun. I was able to script everything I needed to do via its ctl script. Of course if Erlang (a big, serious infrastructure language) is a bit of an issue the hipster scripting language might be too.

What the hell, if it was about doing what I need to do quickly and without fuss I would use Prosody and worry about any infrastructure issues later.

Programming

Learning to work atomically

I have been doing a lot of work with MongoDB recently and I have made a few noob mistakes despite being relatively well-grounded in the theory. One of the key mistakes I have made using the Java driver is to not have the driver in the right mode. By default the driver will not block on an insert; you need to be in Safe Mode for that to happen.

What is the impact? Well if you are trying to update a record that you have just inserted and the update neither fails nor is applied then chances are that the update failed to find the record you had just inserted because it wasn’t there when the update query ran. Of course a few milliseconds later it appeared and is there at the end of the batch process.

Updates in Mongo consist of a query and a data change operation and there is an art in getting the query to work on the set of data you want it to. I find myself doing a conditional match in Scala and then thinking “at this point is that still going to be valid?” and then going back tweaking the query so that the update is guaranteed to be valid at the point it happens.
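To make the idea concrete, here is a small in-memory sketch of an update whose condition travels inside the query instead of being checked beforehand in application code. It does not use the Mongo driver at all: update_where, the orders list and its fields are hypothetical stand-ins that only mimic the query-plus-change shape.

```python
def update_where(docs, query, change):
    """Apply `change` to every document that matches `query` right now."""
    matched = 0
    for doc in docs:
        if all(doc.get(key) == value for key, value in query.items()):
            doc.update(change)
            matched += 1
    return matched


orders = [{"_id": 1, "status": "new"}, {"_id": 2, "status": "shipped"}]

# Order 2 has already shipped, so the guarded update matches nothing
# instead of overwriting a state that is no longer valid.
matched = update_where(orders, {"_id": 2, "status": "new"}, {"status": "cancelled"})
print(matched)
```

Because the status check is part of the query, there is no gap between deciding the update is valid and applying it.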

Today I spent a lot of time buggering about trying to avoid writing keys in the document that held no data, after doing it I realised that I could have just written a single remove statement that would have removed the empty keys in one big cleanup after the data had been stored.

Atomic independence also means losing some things that we take for granted like sequence ids. People like numbers but guaranteeing even ascending values can rapidly become a nightmare if you want to avoid contention and a single point of failure.

Cursors are similarly tricksy. I have a long-running batch job and I realised today that it runs long enough that you cannot guarantee a known state by the time it finishes. Instead you have to do these kinds of “loop until there’s nothing left to do” constructs where the loop condition expresses the state of the store you are trying to achieve and you get at least one cursor that has no entries.
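The shape of that loop can be sketched with an in-memory list standing in for the store. Everything here is an assumption for illustration: fetch_pending plays the part of opening a fresh cursor, and the done flag stands for whatever state the batch job is trying to reach.

```python
def fetch_pending(store, batch_size):
    # Stand-in for opening a fresh cursor over documents still needing work.
    return [doc for doc in store if not doc["done"]][:batch_size]


def process_all(store, batch_size=2):
    passes = 0
    while True:
        batch = fetch_pending(store, batch_size)
        if not batch:
            # An empty cursor is the only evidence that the target state holds.
            return passes
        for doc in batch:
            doc["done"] = True
        passes += 1


store = [{"_id": i, "done": False} for i in range(5)]
print(process_all(store))
```

The loop never trusts a count taken at the start; it keeps asking the store what is left until the answer is nothing.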

There’s a lot of stuff about datastores that is ingrained deeper than you realise and it takes more than one difficult experience to start genuinely thinking differently about things.

Software, Work

The Joy of Acceptance Testing: Is my bug fixed yet?

Here’s a question that should be a blast from the past: “Is my bug fixed yet?”.

I don’t know, is your acceptance test for the bug passing yet?

Acceptance tests are often sold as being the way that stakeholders know that their signed-off feature is not going to regress during iterative development. That’s true, they probably do that. What they also do though is dramatically improve communication between developers and testers. Instead of having to faff around with bug tracking, commit comment buggery and build artifact change lists you can have your test runner of choice tell you the current status of all the bugs as well as the features.

Running acceptance tests is one example where keeping the build green is not mandatory. This creates a need for a slightly more complicated build result visualisation. I like to see a simple bar split into green and red with the number of failing tests. There may be a good day or two when that bar is totally green but in general your test team should be way ahead of the developers so the red bar represents the technical deficit you are running at any given moment.
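As a rough illustration of the kind of bar I mean, here is a text-only sketch. The status_bar function, its width and the characters used for green and red are all arbitrary choices of mine, not the output of any particular build tool.

```python
def status_bar(passing, failing, width=20):
    """Render a bar split into passing ('#') and failing ('-') segments."""
    total = passing + failing
    if total == 0:
        return "[no tests]"
    green = round(width * passing / total)
    return "[" + "#" * green + "-" * (width - green) + f"] {failing} failing"


print(status_bar(45, 5))
```

The number alongside the bar matters as much as the proportions, because it is the count of failing tests that tells you how big the deficit is.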

If it starts to grow you have a prompt to decide whether you need to change your priorities or developer quality bar. Asking whether a bug has been fixed or when the fix will be delivered are the wrong questions. For me the right questions are: should we fix it and how important is it?

If we are going to fix a bug we should have an acceptance test for it and its importance indicates what time frame that test should pass in.

Programming

Redis 2, worth the hype

So I’ve flirted a little bit with Redis but never really had anything that fitted its solution profile, until this week!

I needed to cross reference postcodes and their corresponding longitudes and latitudes. I tried a few other solutions but in the end I decided that a postcode was a nice normalisable key (it just needs the country details adding in) and that since I had thousands of records to relate, I really valued speed above everything else.

Redis (in conjunction with the Python client library) tore through the data set in terms of both inserting approximately 1.7M records and reading roughly 300K entries. I also liked using the hash functionality to store the long and lat under the same key rather than having to write my own logic to create the pairing.
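A minimal sketch of the hash approach, not my actual script. The key format is an assumption (a country prefix plus the postcode with spaces removed), the function names are made up, and the client argument is assumed to be a redis-py client created with decode_responses=True so that field names come back as strings.

```python
def location_key(country, postcode):
    # Assumed normalisation: country prefix, no spaces, upper case.
    return f"{country.upper()}:{postcode.replace(' ', '').upper()}"


def store_location(client, country, postcode, lat, lon):
    # One hash per postcode keeps the pair together under a single key.
    client.hset(location_key(country, postcode), mapping={"lat": lat, "long": lon})


def load_location(client, country, postcode):
    fields = client.hgetall(location_key(country, postcode))
    if not fields:
        return None
    return float(fields["lat"]), float(fields["long"])
```

With a running server this would be used as store_location(redis.Redis(decode_responses=True), "GB", "SW1A 1AA", 51.501, -0.1416), and load_location with the same country and postcode returns the pair.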

Redis really helped solve my problem and lived up to its promises. It definitely has a place in my toolkit from now on.

Software, Work

Agile must be destroyed

Let’s stop talking about the post-Agile world and start creating it.

Agile must be destroyed not because it has failed, not because we want to return to Waterfall or to the unstructured, failing, flailing projects of the past but because its maxims, principles and fixed ideas have started to strangle new ideas. All we need to take with us are the four values of the Agile Manifesto. Everything else is just a set of techniques that were better than what preceded them but are not the final answer to our problems.

There are new ideas that help us to think about our problems: a philosophy of flowing and pulling (but not the training courses and books that purport to solve our problems), the principles of software craftsmanship, “constellation” architecture, principles of Value that put customers first not our clients. There are many more that I don’t know yet.

We need to adopt them, experiment with them, change and adapt them. And when they too ossify into unchallengeable truisms we will need to destroy them too.

Echo One

Sequentially arranged sentences composed of words (and punctuation)
