Clojure, Programming, Scala

Horses for courses: choosing Scala or Clojure

So one of the questions after my recent talk comparing Scala and Clojure (something I suspect is going to be an ongoing project as I hone the message and the tone) was about whether the languages have problem domains they are better suited to. That’s an interesting question because I think they do, and I thought it might be interesting to go through some of the decision-making process in a more considered fashion than answering questions after a talk allows.

Some of the calls are obvious: if you want to leverage existing Java frameworks and infrastructure then you definitely want to use Scala. Things like JPA, Spring injection, Hibernate and bean reflection are a lot easier with Scala; in Clojure you tend to be dancing around the expectation these frameworks have that they are working with concrete bean-like entities.

If you are going to work with concurrency or flexible data formats like CSV and JSON I think you definitely want to be using Clojure. Clojure has good multi-core concurrency that is pretty invisible to you as a programmer. The key thing is avoiding functions with side effects and making sure you update dependent state in a single function (transaction). After that you can rely on the language and its attendant frameworks to provide a lot of powerful concurrency.

Similarly LISP syntax and flexible data go hand in hand so writing powerful data transforms seems second nature because you are using fundamental concepts in the language syntax.

Algorithm and closed-domain problems are interesting. My personal view is that I find recursion easier in Clojure due to things like the explicit recur form and the support for variable-arity function definitions. Clojure’s default lazy sequences also make it easier to explore very large problem spaces. On the other hand, if you have problems that can be expressed as state machines or transitions then you might be able to express a solution very effectively in a Scala case class hierarchy.
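To make that last point concrete, here is a minimal sketch of transitions modelled as a sealed case class hierarchy (OrderState, Event and friends are invented purely for illustration):

sealed trait OrderState
case object Placed                      extends OrderState
case class Shipped(trackingId: String)  extends OrderState
case object Delivered                   extends OrderState

sealed trait Event
case class Dispatch(trackingId: String) extends Event
case object Deliver                     extends Event

// Transitions are a simple pattern match over the sealed hierarchies
def transition(state: OrderState, event: Event): OrderState = (state, event) match {
  case (Placed, Dispatch(id)) => Shipped(id)
  case (Shipped(_), Deliver)  => Delivered
  case (other, _)             => other // events that do not apply leave the state unchanged
}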

When it comes to exploring the capabilities of Java libraries I tend to use the Scala console, but for general programming (slide code examples, exploratory programming) I do tend to find myself spending more time in LightTable’s Instarepl.

When it comes to datastore programming both languages are actually pretty clunky because they devolve handling this down to various third-party libraries. Clojure does pretty well with document databases and key-value stores. Scala is great for interacting with the AWS Java libraries and neither deals particularly well with relational data.

For web programming neither is brilliant but Scala definitely has the edge in terms of mature and full-featured web frameworks. Clojure is definitely more in the log cabin phase of framework support currently.

Programming

Refactoring abuse and strong type compiler systems

“Refactoring” is one of the most abused terms in programming. It has a formal meaning but in general use it tends to mean rewriting or restructuring code (or, as I like to refer to it, changing stuff). One interesting new use of refactoring I heard recently was to describe extracting common code; creating a new codebase is perhaps the opposite of refactoring.

So refactoring tends to mean developers are just changing things they have already written. Real refactoring is of course done to code under test, so I was interested in a Stuart Halloway quote about compilation being the weakest form of unit testing. Scala is used a lot at the Guardian and it has a more powerful type system and compiler than Java, which means that if you play along with the type system you actually get a lot of that weak unit testing. In fact, structuring your code to maximise the compiler guarantees and adding the various assertion methods to make sure that you fail fast at runtime are two of the things that help increase your productivity with Scala.
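As a small sketch of the assertion half of that (Percentage is an invented example), require from Predef turns an invalid state into an immediate runtime failure while the explicit types carry the compile-time guarantees:

case class Percentage(value: Int) {
  // fail fast: an out-of-range value throws IllegalArgumentException at construction
  require(value >= 0 && value <= 100, "percentage out of range: " + value)
}

Percentage(50)   // fine
Percentage(175)  // fails immediately rather than corrupting later calculations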

If you’ve seen the Coursera Scala videos you can see Martin Odersky doing some of this “weak refactoring” in his example code where he simplifies chained collection operations by moving or creating simple functionality in his types.

Of course, just like regular refactoring, there have to be a few rules to this. Firstly, weak refactoring absolutely requires explicit function type declarations. Essentially, in a weak refactor what you are doing is changing the body of a function while retaining its parameters and return type. If everything still compiles after you have changed the code you are probably good.
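A hedged sketch of what such a weak refactor might look like (User and activeUserNames are invented names):

case class User(name: String, active: Boolean)

// Before the weak refactor the body was:
//   users.filter(_.active).map(_.name)
// The parameters and the declared return type are retained, so if this still
// compiles the change is probably fine in the weak, compiler-checked sense.
def activeUserNames(users: List[User]): List[String] =
  users.collect { case u if u.active => u.name }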

However, the other critical thing is how much behaviour the return type actually pins down. A return type of Option, for example, is probably a bad candidate for weak refactoring because it is probably critical whether your changed code still returns Some or None for a given set of parameters. Only conventional refactoring can determine whether that is still true.

Scala

What’s the problem with Scala’s “lazy” vals?

Scala has a value modifier called “lazy” and I have a bit of a problem with it. Let’s start with the name: lazy indicates that the value is not evaluated when it is declared but when it is first read. Really it isn’t lazy at all; it is actually a memoisation wrapper around the value.

Now that’s a pretty handy thing and there are definitely some use cases for it, in particular where the calculation of the value is truly invariant and expensive. Some people suggest that it can be used to cache the results of service or database lookups, but that is clearly madness unless the query really does return an invariant.
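Here is a minimal sketch of the sensible case (Config is an invented example): the expensive, invariant calculation runs once, on first read, and the result is memoised.

class Config(path: String) {
  lazy val settings: Map[String, String] = {
    println("parsing " + path)   // visible side effect to show when evaluation happens
    Map("example" -> path)       // stand-in for a genuinely expensive, invariant parse
  }
}

val config = new Config("/etc/app.conf")
config.settings   // prints "parsing /etc/app.conf" and computes the value
config.settings   // returns the memoised value without re-computing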

However I think that there is a very subtle problem with the misapplication of lazy, which is that it introduces state into an otherwise pure function: whether the value has been calculated yet, and what that value turned out to be. To me this state actually reduces the utility of a pure function in terms of reuse, because the consumer has to be very aware of when the value is going to be evaluated and what it is going to be over time.
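To show what I mean by that hidden state, here is a hedged sketch (QuoteView is invented) of lazy applied to something that is not invariant:

class QuoteView(currentPrice: () => BigDecimal) {
  lazy val displayedPrice: BigDecimal = currentPrice()   // memoised on first access
}

var price = BigDecimal(10)
val view = new QuoteView(() => price)
view.displayedPrice   // 10: evaluated and cached at this moment
price = BigDecimal(12)
view.displayedPrice   // still 10: whether and when it was calculated now matters to the consumer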

It’s rare that lazy over-application goes wrong because I mostly see it in short-lived instances. However that very lifespan seems to suggest that there is no real value in lazy evaluation over eager declaration. After all if the function never gets called then it doesn’t cost anything and true constants should be in the companion object. In long-lived objects though I think there is serious potential for lazy to go wrong and therefore it should be applied sparingly where a positive impact can be proved.

Scala

Using function passing to separate concerns

So I had a difficult problem today where I needed to access a service but I couldn’t inject it, due to the service having slightly too many concerns and hence creating a circular dependency in the Dependency Injection. There is a right way of dealing with this, which is to refactor the concerns of the collaborator until the dependency goes away. Unfortunately there is also an incomplete user journey that is holding everyone up unless a quick hacky fix is found.

Enter function passing to the rescue! In my case we are talking about an MVC web application where the Model is generating the circular dependency but it is possible to inject the collaborator into the Controller. The trouble is that my code separates the concerns of the Controller and the Model such that the Controller asks the Model to perform an operation and the Model returns the result of executing the operation. This is clean to develop and read but it also means that the Controller cannot access any of the internal operations of the Model, which is what I need for the invocation of the collaborator.

For a moment I thought I had to break the encapsulation between the two and pass the internal information of the Model operation out with the operation result, however function passing is a more elegant solution that keeps my concerns separate. Essentially the Controller will ask the Model to do something and also provide a function for the Model to execute if it wishes to invoke the collaborator.

Let’s try to illustrate this in code


class Controller(stuff: StuffRepository, messages: MessageDispatcher) {

  post("/handle/stuff/:stuffId") {

    // The callback closes over the injected collaborator; the Model never sees the MessageDispatcher itself
    def messageCallback(id: StuffId, message: Message) {
      messages.send(Messages.StuffUpdated, message, Some(id))
    }

    stuff.storeChange(params("stuffId"), messageCallback)
  }
}

class StuffRepository {

  def storeChange(id: String, callback: (StuffId, Message) => Unit) = {
    makeChanges(id) match {
      // only invoke the callback when the change actually succeeded
      case Success => callback(StuffId(id), Message("Changes completed")); Success
      case Failure => Incomplete
    }
  }
}

Hopefully that’s all relatively obvious. You can also type the callback explicitly for clarity, or make it an Option if you don’t always want the callback invoked. If you don’t like the closure you can re-write it as a function that partially applies the MessageDispatcher and then returns the callback function.
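The partially applied version might look something like this (re-using the invented types from the sketch above):

// A curried function that takes the collaborator first and yields the
// (StuffId, Message) => Unit callback that StuffRepository expects.
def stuffUpdated(messages: MessageDispatcher)(id: StuffId, message: Message): Unit =
  messages.send(Messages.StuffUpdated, message, Some(id))

// In the Controller the collaborator is applied once and the resulting function is passed along
stuff.storeChange(params("stuffId"), stuffUpdated(messages))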

Some of you are going to be thinking this is totally obvious (particularly if you are a Javascript fiend) and that’s cool, but I do think it is an interesting technique for keeping responsibilities separated without very complex patterns. It is also something that is only possible with proper first-class functions.

Scala

Scala can be the replacement for Enterprise Java

Last week I had the chance to take a look at one of my worst fears about Scala: really bad Scala code. As I thought it might, it did include a partial application; I was very grateful it didn’t abuse apply. It also managed to bring in a lot of classic bad behaviour, like catching Exception and then swallowing it.

It has always been one of my fears that when someone went rogue with Scala the resulting mess would be so bad it would be a kind of code Chernobyl, its fearsome incomprehensibility like the Lift code only without the brevity.

When I finally had to confront it I thought: this is actually no worse than really bad Java code.

I know that it could potentially be worse, but it would take someone who was actually really good at Scala to construct the nightmarish worst-case scenario I had imagined. In most cases terrible Scala code is going to look a lot like terrible Java code.

So that is really my last concern out of the way. We might as well use Scala for the good things it offers, like immutability and first-class functions. The boost to productivity is going to outweigh the potential crap that might also be generated.

Programming, Scala

Mockito and Scala

Scala is a language that really cares about types, and mocking is the art of passing one type off as another, so it is not that surprising that the two don’t get along famously. It probably doesn’t help that we are using Java mocking frameworks with Scala code, which means you sometimes need to know too much about how Scala creates its bytecode.

Here are two recent problems: the “ongoing stubbing” issue and optional parameters with defaults (which can generally be problematic anyway as they change the conventions of Scala function calling).

Ongoing stubbing is an error that appears when you want to mock untyped (i.e. pre-generics, Java 1.4) code. You can recognise it by the hateful “?” characters that appear in the error messages. Our example was wanting to mock the request parameters of Servlet 2.4. Now we all know that the request parameters (like everything else in an HTTP request) are Strings, but in Servlet 2.4 they are “?”, or untyped. Servlet 2.5 is typed, so the first thing I would say about an ongoing stubbing issue is to check whether there is a Java 1.5 compatible version of whatever it is you are trying to mock. If it is your own code, FFS, add generics now!

If it is a library that you have no control over (like Servlet) then I have some bad news: I don’t know of any way to get around this issue. Scala knows that the underlying type is unknown, so even if you specify Strings in your mock code it won’t compile, and if you don’t specify a type your code still won’t compile either. In the end I created a stub sub-class of HttpServletRequest that returned String types (which is exactly what happens at runtime, thank you very much Scala compiler).
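One way to build such a stub (a hedged sketch, assuming you only need the single-parameter lookup) is to lean on the servlet API’s own wrapper class:

import javax.servlet.http.{HttpServletRequest, HttpServletRequestWrapper}

// Wraps a real (or mocked) request and answers getParameter with plain Strings,
// sidestepping the untyped signatures that upset the compiler.
class StubParamsRequest(underlying: HttpServletRequest, params: Map[String, String])
    extends HttpServletRequestWrapper(underlying) {

  override def getParameter(name: String): String = params.getOrElse(name, null)
}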

Okay so optional parameters in mocked code? So I love named parameters and default values because I think they are probably 100% (no, perhaps 175%) better at communicating a valid operating state for the code than dependency injection. However when you want to mock and set an expectation on a Scala function that uses a default value you will often get an error message saying that the mock was not called or was not called with the correct parameters.

This is because when the Scala code is compiled you are effectively calling a chain of methods (the default values come from generated methods on the same type), and your mock needs matchers set for every possible argument, not just the ones that your Scala code is explicitly passing. The simplest solution is to use the any() matcher for any default argument you will not be explicitly setting. Remember that this means the verification must then consist entirely of matchers, e.g. eq() and so on.
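Here is a hedged sketch of what that looks like with Mockito (Lookup and find are invented names):

import org.mockito.Mockito.{mock, verify}
import org.mockito.Matchers.{anyInt, eq => meq}   // org.mockito.ArgumentMatchers in newer Mockito releases

trait Lookup {
  def find(key: String, timeout: Int = 30): String
}

val lookup = mock(classOf[Lookup])

// The calling code relies on the default; the compiler fills in the second
// argument by calling a generated default-value method, which on a mock will
// not return 30.
lookup.find("user:1")

// So verify the defaulted argument with a loose matcher rather than the literal value.
verify(lookup).find(meq("user:1"), anyInt())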

What to do when you want to verify that a default parameter was called with an explicit value? I think you do it based on the order of the parameters in the Scala declaration but I haven’t done it myself and I’m waiting for that requirement to become a necessary thing for me to know.

Clojure

Control expressions are functions too

I am kind of simultaneously learning Clojure and Scala, the former through the London Clojure Dojo (come and join us) and the latter through work. It is an interesting experience and I am very glad that Clojure is part of the mix, as it helps me understand a lot of things that seem strange in Scala.

In Clojure everything is essentially either a function or a sequence. It is therefore not surprising to see that an if expression works like a function.

(if false 1 2)

Evaluating the above in a Clojure REPL gives the answer 2: if evaluates its first argument and returns its second argument if the result is truthy, its third if not. (Strictly speaking if is a special form rather than a function, but it composes like one.)

The same is true of Scala, with the additional wrinkle that if the different clauses evaluate to different types the compiler infers a common supertype and you can end up in type hell. Despite its superficial similarity to Java syntax it is in fact quite different, and you can compose it just as you would any other function.

1 + (if(true) 2 else 3)

Evaluated in a Scala REPL gives the result 3.
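The wrinkle about clauses of different types looks something like this in a Scala 2 REPL (a small sketch):

val sameType: Int = if (true) 2 else 3    // both branches are Int, so the expression is an Int
val widened = if (true) 2 else "three"    // branches disagree, so the inferred type widens to Any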

Scala 2.8 seems much better about making everything return a value (and hence act like a function); val definitions and for loops both now return Unit, which is not a useful value but does stop the compiler moaning.

This kind of thing is much easier to appreciate and learn in a pure functional language than in Scala where you never really know whether something is going to operate in a functional or object-orientated way.

Programming

Continuous Testing

Continuous testing is one of those things that has crept up on me slowly. About two years ago I was aware of people using a trigger on their TextMate save to run tests and, if green, commit to git. At the time it felt too much effort for too little gain but it was a cool trick.

Now as we forage out into the post-Java world we are starting to get some pretty cool revisions of familiar tools, and one of the most engaging for me is the continuous build tool. The daddy is clearly SBT, which while simple is also tremendously sophisticated. Adding a REPL to a build is a simple change but has all kinds of nice consequences, my favourite of which currently is the continuous test (~test) target that detects changes in your source and test files, compiles them and runs your tests. SBT cuts out the whole compile-link-run cycle for you: you just make a change, hit save, see the consequences and code again. It’s very fast and far more effective in giving feedback than any of the current IDEs (all of which need to get on this bandwagon fast if they want to stay relevant).

Clojure by comparison has been suffering in this regard, with Leiningen becoming an unfortunate early de facto standard despite standing shoulder to shoulder with the benighted Maven. The key thing that Leiningen does wrong is to stay at the command line and force you to cold-boot a JVM with each new command (the second is dependency resolution, where SBT favours Ivy). Fortunately Lazytest by Stuart Sierra may save us here. Although still alpha, Lazytest is an awesome way of developing Clojure and it’s hard to beat that feeling of smug satisfaction as the tests go green.

It is these kinds of step-change enhancements to development that are going to carry us forward, more than the shopping list of features that the new languages have or lack.

Java

Scala and Voldemort: Types matter

So I have been taking a good look at Java object and key-value stores recently. One of the more interesting (because it is very aligned with Java rather than being language-agnostic) is Voldemort. I got it into my head that it would be a good idea to play around a bit using the Scala console. That involved a lot of classpath tweaking, made simpler because the Voldemort distribution comes bundled with its dependencies. However I still rather strangely needed the Voldemort Test jar to be able to access the Upper Case View class. I’m not sure whether I am doing something wrong here or whether the packaging went wrong during the build.

With the classpath complete I followed the quickstart instructions and quickly found that Scala’s strict type system was revealing something interesting about the Voldemort implementation. If you translate the quickstart example to Scala literally then you end up with a Store Client that cannot store anything, as it is a Client of type [Nothing, Nothing], and if you supply a value you get an error. I realised that I needed to interact with a lower-level version of the API, and was surprised to find that the signature of the constructor seemed to need a Resolver instead of defaulting one, and that I had to supply a Factory instead of the Config that the Factory consumes.

This is the code I came up with (note that you need to paste this into an interactive Scala REPL rather than putting it in a script). However I’m a nub with Voldemort and I’m not exactly fly with Scala either. Any advice or input is appreciated.

After writing this I was wondering how to store arbitrary objects. Initially I thought it would be as simple as declaring the Client to be [String, Any], but that failed due to a ClassCastException in the String Serializer; it looks like the server config comes into play here and says how a value should be serialised and stored. It’s an interesting experience. I have also been looking at Oracle’s BDB, but Voldemort puts a much nicer (read Java 1.5) interface on top of it.

Java, Programming, Work

The Java Developer’s Dilemma

I believe that Java developers are under a tremendous amount of pressure at the moment. However you may not feel it if you believe that Java is going to be around for a long time and you are happy to be the one maintaining the legacy apps in their twilight. Elliotte Rusty Harold has it right in the comments when someone says that there are a lot of Java jobs still being posted. If you enjoy feasting off the corpse then feel free to ignore the rest of this post because it is going to say nothing to you.

Java is in a tricky situation due to competition on all fronts. C# has managed to rally a lot of support. Some people talk nonsense about C# being what Java will look like in the future; C# is what Java would look like if you could break backwards compatibility, and indeed even runtime and development compatibility in some cases (with Service Packs). C# is gaining mind share by leapfrogging ahead technology-wise at the expense of its early adopters. Microsoft also does a far better job of selling to IDE-dependent developers and risk-averse managers.

Ruby and Python have also eaten Java’s lunch in the web space. When I am working on a web project for fun I work with things like Sinatra, Django and Google App Engine. That’s because they are actually fun to work with and highly productive: you get to focus on your problem a lot sooner than you do in Java.

The scripting languages have also done a far better job of providing solutions to the small constant problems you face in programming. Automating tasks, building and deployment, prototyping. All these things are far easier to do in your favourite scripting language than they are in Java which will have to wait for JDK7 for a decent Filesystem abstraction for example.

Where does this leave Java? Well in the Enterprise server-side niche, where I first started to use it. Even there though issues of concurrency and performance are making people look to things like Erlang and JVM alternatives like Scala and Clojure.

While, like COBOL and Fortran, there will always be a market for Java skills and development, the truth is that for Java developers who want to create new applications that lead in their field, a choice about what to do next is fast approaching. For myself I find my Java projects starting to contain more and more Groovy, and I am very frustrated by the lack of support for mixed Java/Groovy projects in IDEs (although I know SpringSource is putting a lot of funding into the Eclipse effort to solve the problem).

If a client asks for an application using the now well-worn combination of Spring MVC and Hibernate, there needs to be a good answer as to why they don’t want to use Grails, which I think would increase productivity a lot without sacrificing the good things about the Java stack. Companies doing heavy lifting in Java ought to be investigating languages like Scala, particularly if they are arguing for the inclusion of properties and closures in the Java language spec.

Oracle’s purchase of Sun makes this an opportune moment to assess where Java might be going and whether you are going to be on the ride with it. It is hard to predict what Oracle will do, except that they will act in their perceived economic interest. The painful thing is that whatever you decide to do there is no clear answer at the moment and no bandwagon seems to be gaining discernible momentum. It is a tough time to be a Java developer.
