Clojure

Clojurescript: is it any better yet?

A while ago I wrote about how Clojurescript was a square peg being hammered into a non-existent hole to complete indifference by everyone.

So has anything changed? Well Clojurescript has continued to be developed and it seems to have lost some of the insane rough-edges it launched with, it seems possible to use it with OpenJDK now for example.

Some people have made some very cool things by using Clojurescript as a dialect for writing NodeJS code. I was particularly struck by Philip Potter's talk on Marconi.

The appearance of core.async for Clojurescript is in my view the first genuine point of interest for non-Lisp fans. It provides a genuinely compelling model for handling events in an elegant way.

David Nolen, while being modest about his contribution, has also contributed some excellent blog posts about the virtues of Clojurescript. The most essential of which is the 101 on how to get a basic Clojurescript application going with core.async. This is an excellent tutorial but also is underpinned by hard work on getting the "out of the box" developer experience slick via the Mies Leiningen template.

Let's be clear about what a difference this makes. At the London Clojure dojo we once had a dojo about using Clojurescript, after a few hours all the teams limped in with various war stories about what bits of Clojurescript they had working and what was failing and why they thought it was failing. The experience was so bad I really didn't want to do Clojurescript as a general exercise again.

In the last dojo we did Clojurescript, we used Mies and David's blogpost as a template and all the teams were able to reproduce the blogpost and some of the teams were creating enhancements based on the basic idea of asynchronous service calls.

When someone pitches a Clojurescript idea in 2014 I'm no longer in fear of a travesty. That is a massive step forward.

And that's the good news.

Clojurescript! Who is good for!

After the dojo there were some serious discussions about using Clojurescript in anger. The conversation turned eventually to GWT. In case you don't remember or have never met it GWT is essentially a browser client kit for Java developers. In London it gets used a lot by financial institutions that need rich UIs for small numbers of people. Javascript developers are unlikely to be hired by those organisations so just like Google they end up with a need but the wrong kind of skills and GWT bridges the gap. A Java developer can use a familiar language and off-the-shelf components and will end up with a perfectly serviceable rich client-side app.

There is no chance in hell that a Javascript developer is going to use GWT to build their applications.

Clojurescript feels like the same thing. Clojure developers and LISP aficionados don't know a great deal of Javascript, they can program in Clojurescript and Google Closure and it is probably going to be okay. Better in fact than if they tried to create something in an unfamiliar language with all manner of gotchas.

But there is no chance in hell that a Javascript developer is going to use Clojurescript to build their applications.

Why Coffeescript failed

The reason I say this is because Coffeescript, a great rationalisation of Javascript programming is still viewed with suspicion by Javascript developers.

I asked some at a recent Javascript tech meeting why some people didn't use it and the interesting answer was that they couldn't really understand its syntax and were effectively translating the forms into Javascript. The terseness of the language was actually off-putting because it made it harder to mentally translate what the Coffeescript program was doing.

Adding LISP and Google Closure into that mix isn't going to make that mental disconnect any easier. The truth would appear to be that Javascript developers are simply not that disenchanted with their language. Clojurescript is going to have to offer something major to get over the disadvantage of a non-curly brace language and machine-optimised generated code.

At the Clojure dojo post-mortem people talked about the fact that Clojurescript helped avoid pitfalls and unusual behaviour in Javascript. That might seem a rationale argument to a Clojurian. However it was never an argument used on the JVM. "Hey Java just has all these edge-cases, why not use a LISP variant instead?".

In both languages practitioners of the language are deeply aware of the corner cases of their language. Since they are constantly working with it they are also familiar with the best practices required to make sure you don't encounter those corner cases.

Clojure on the JVM brought power, simplicity and a model of programming that made reasoning about code paths simple.

Clojurescript has the same advantages of code structure but doesn't really give more functionality over Javascript and still has a painful inter-operation story.

Javascript: the amazing evolving language

Javascript has a strong Scheme inheritance meaning that it already contains a lot of LISP inheritance.

Also unlike Java which ended up with a specification that was in the wilderness for years, Javascript has managed to keep its language definition moving. It's sorted out its split and with aggressive language implementers in the form of the competing browsers it is rapidly adding features to the core of the languages and standards for extensions.

Javascript is almost unique that when lots of people wrote languages that compiled into Javascript the community weren't stuffy about it but actually created a specification, Source Maps, that made it easier to support generated code.

Javascript has a lot of problem areas but it is also rapidly evolving and syntactic sugar and new language constructs are adding power without necessary creating new problems or complexity. It is expansionist and ruthless pragmatic, just like Clojure on JVM in many ways.

Better the devil you know?

Visual Basic isn't the best language in the world, it's certainly not the best language for creating apps on Windows. However for organisations and programmers who have invested a lot of time in a language and a platform it normally takes a lot to get them to change.

Usually, it will take an inability to hire people to replace those leaving or the desertion of major clients before change can really be countenanced.

Javascript developers are in the same sunk-cost quandary but there is nothing on the horizon that is going to force an external change. There may be better alternatives but Javascript is one of the easiest languages to learn. It's highly interactive and its right there in the console window of this very browser!

There's no lack of demand for good looking websites and browser hackery to differentiate one web product from the next.

Regardless of the technical merits of any alternative solution offers, and we are not just talking about Clojurescript, Elm offers a similar set of advantages, the herding effect is powerful. Your investment in Javascript is far more likely to pay off that putting time into one of the alternatives, none of which have momentum.

Sometimes there are real advantages to sticking close to the devil.

Node.js

For me the most interesting area of Clojurescript is being able to write Clojure and treat V8 as an alternative runtime.

People are already noticing some odd performance characteristics where some things run better on Node than they do on JVM, most particularly around Async.

Polyglot developers are a familiar sight in the server side, knowing a variety of languages is advantage for the general programmer. Server-side Javascript is really only for those who are a one-trick pony.

It is still going to be a niche area but it is much more likely to happen than in the client-side.

Standard
Programming

Getting to grips with Generators in Javascript

One of the things that makes Python such an amazing language is its far-sighted inclusion of generators and first-order list comprehensions. Both of these things are now coming to Javascript with a healthy sense of homage in the syntax. All the examples in this post have been tested in the console of Firefox 23.

What are generators?

Generators are functions that can be called multiple times and on each call they resume execution from the state they were in when they last stopped executing. To do this they have a special keyword called yield instead of return.

Let’s consider the following trivial example:

function hello() { return "Hello"; }

hello(); // "Hello"

function lazyHello() { yield "Hello"; }

lazyHello(); // [object Generator]

var g = lazyHello(); g.next(); // "Hello"

This example might seem trivial but it actually encapsulates all the important values of generators and lazy evaluation. The difference between hello and lazyHello is that lazyHello defers execution of itself. Instead of executing it returns a Generator object that can then be evaluated for an answer later, rather like Deferreds.

This lazy evaluation means that you describe very complex or expensive calculations but not have to execute them until you need them. Now for a function that returns a single value that isn’t going to seem very exciting.

Writing infinite sequences with Generators

How can describe the world of all the positive integers? Well one way is to take the value of 1 as the start of arithmetic sum and simply add one to the previous value of the sum.

function postiveIntegers() {
  var i = 1;

  while(true) {
    yield i++;
  }
}

var g = postiveIntegers();
g.next(); // 1
g.next(); // 2

When we call next on the Generator we start executing the function, so we initialise the value of the variable and enter the loop. We then call yield and the function stops running. When we call next again we do not run the function again, we resume from the point where we stopped last time, inside the while loop.

Generators provide strong encapsulation of their data since once we have obtained a generator we can no longer access its state, just the results of its calculations. If you wanted to do this sum computation with a regular function then you would need to hold the state of the last call to the function somewhere outside the function and pass it again when you wanted to calculate the next value.

However that infinite loop is still a bit tricky and if you get it wrong then you will lock up the browser. Here’s a safer version.

function postiveIntegers(maxValue) {
  var i = 1;

  while(i <= maxValue) {
    yield i++;
  }

}

var g = positiveIntegers(2);

g.next(); // 1
g.next() // 2
g.next() // [object StopIteration]

Truly infinite sequences are really powerful but in practice you probably want to use weak-sauce versions that have some kind of bound.

Why are generators awesome?

Well firstly they allow the expression of things that would otherwise be impossible in computer code, things like very long or infinite sequences that would not fit in memory if every member had to be expressed or stored.

As part of this lazy evaluation generators are actually very economic in terms of resources, you can define a lot of rules and potential values but then only express the ones you are interested in, minimising the actual allocations.

Finally they are strongly encapsulated functions which hide their internal implementation without the need to resort to object semantics.

Standard
Programming

Keeping NPM local

I’m pretty sure that at one point NPM might not have required sudo privileges to do things but by default it now seems that people are sudo installing all over the place. It’s the equivalent of just clicking “yes” on every Windows permission dialog.

There is no particularly good reason to use sudo with a package manager. Virtualenv is tres local by default and even gem (via rbenv and rvm) can now run happily in the user space. Even if you want to share package downloads to minimise network calls then you still don’t need to install everything globally as root.

NPM is special in almost every sense of the word and it’s author recommends that you switch permissions on /usr/local to be owned by your login user. That seems kind of crazy and the kind of thing that probably would only work out for OSX users.

In fact if you are willing to build your own NodeJS it really isn’t hard to bet Node and NPM working locally and still retaining all the advantages of the “global” NPM install. Just use the standard build option of –prefix to set Node to use your home directory and just add $HOME/bin to your PATH in .bashrc or the equivalent.

Then you should be able to use npm without ever having to sudo.

My view though is that you shouldn’t have to force everything into the user space just to make sure a package manager doesn’t need sudo privileges. It would be better for everyone if NPM used a directory in home for each user since it seems to be aimed at single-user installs anyway there is not a massive saving by having packages installed in /usr/local. For the few use-cases where that would be useful then it should be an override option.

Standard
Web Applications, Work

Guardian May 2013 Hackday

You can see the reportage in these two liveblogs: Day 1 and Day 2 (note the terrible naming conventions). The theme of the hackday was “growth”. For the most part I took the theme to mean growth hacking and I did a lot of work along those lines which is difficult to talk publicly about.

However my prior lunchtime hacks had revealed to me that one of the fundamental problems the Guardian has is the volume of content it produces. This is not inherently a bad thing but the key thing to understand is that there is vastly more content than can fit onto what are called “fronts” in the jargon. A front is something like the front page of the site or the Environment section. These fronts produce a lot of traffic to content and for regular readers they are the essential navigation tool for the Guardian’s content.

Therefore I was interested in how we consider the dimension of time and perhaps use it to our advantage to help present content. This aspect of my hackday work is more open because actually I need a lot of help to understand to and because I’ve made some effort to try and use the public Content API rather than our internal content.

I called this work the “Time Trilogy” because it consists of three web apps that each use time as a way of accessing Guardian content.

The three apps are Guardian Word Count which was the original and gives you a sense of the challenge of navigating the content. It is also pretty fun to watch during the day and see the words tick up. So the Word Count spawned TickTickTick and Guardian In Review. TickTickTick is really a daily content explorer and was the first tool I needed to start sorting and exploring the breakdown of what we produce. It is a tool at its heart for exploring the daily news cycle. In Review is slightly different, it takes the one hundred most popular pieces of content over the last seven days and renders it. Initially I wanted it to be a kind of automatically generated magazine but actually looking at what people liked meant that I couldn’t make my initial idea work. People really like videos of meteors and Russian car crashes. What it is now is a way to explore material in the medium term, for content that perhaps has left the news cycle but is still relevant.

Neither app is really finished and the way I work is that I am very reliant on having working software to understand what I am doing and what is wrong or right about my approach. TickTickTick is much closer to being a complete product than In Review and it is providing more insight into the nature of the content being produced. For example there is a massive cluster of material between three and five minutes long.

I am going to continue to work on the apps because they help give me feedback into my work and ultimately these prototypes and toys tend to graduate into working components or theory on the main site itself. I may blog a bit more about them individually as I move them closer to something that genuinely creates value. I’m curious about feedback but acting on it is limited by my aims for the apps and realistically the time I have available.

I also wanted to talk a little bit about how I was working this hack day because I decided to reject advice and work solo rather than part of a team (although I did a little bit of backseat driving on the online magazines product and I did come up with the idea that actually won the hackday (and will hopefully be implemented and awesome)). Working alone does mean that your creations are going to be quite rough but it helps cover a lot of ground, I ended up doing five hacks and working on a total of seven. Working with other people means communicating well whereas solo you just need to express what you want very quickly.

My preferred tool for these kinds of hacks is Python on App Engine, which is what I use for my lunchtime hacks and for which I have a standard application template. With each new application that I do I can start to move the common patterns into the template. To avoid having to faff around with testing I use a loosely functional paradigm that I’ve carried over from Wazoku. It generally works quite well but there are a lot of rules to doing it.

This time around I was doing a bit more frontend work than my day job requires because I was working solo. Again having the startup experience was useful because I was more rediscovering a skillset than learning it. Hacks also means selecting your platform and choosing for optimal output.

For that reason I only targeted Firefox and Chrome (Firefox was actually easier to develop for in terms of standards) and I made liberal use of client-side Less and Coffeescript. I was impressed with how good the error-handling was in both. An obscure bug can wipe out all the productivity gains of a higher-order language but both worked great for me.

On top of that I tried experimenting with the new departmental standard of SMACSS (or at least my cherry-picking of it) and I made a lot of use of both Knockout and Bacon.js.

When I say I made use of SMACSS essentially what I did was namespace my classes to produce simple selectors. This did get me out of a problem I had in In Review so while it is truly the ugliest CSS standard and I suspect in time we may come to hate its rejection of rich functionality I concede that it is effective. Expect to see some of it applied to the main website sometime soon.

Knockout isn’t that popular in the department due to performance issues at a particular level of complexity but for me it did a brilliant job of simply syncing the visual DOM to the data feeds. I was really happy with it, other people were using AngularJS for more dynamic applications but they also had a lot more code than I did and again working solo less is so much more.

Bacon.js was really interesting. A lot of my approach to Javascript is functional and event-based but so far the events have been manually worked via jQuery. Bacon made it easier to create event sources with generic handlers and I probably didn’t use 10% of its full features. I’m curious to see what the rest of the department thinks of it but for my hacks it has definitely earned a place.

It was nice to do something outside the run of normal work and one thing that is quite cool about the hackday is that you can use it to tackle a technology that is entirely new to you and not have to worry about whether you succeed or fail.

Next time (May I believe) I think I want to learn about browser plugins as this is a way of producing better functionality for the Guardian without the hassle of having to make it work for the general population of browsers. Some people’s hacks this time around could have been released to the app/plugin stores and we could have been getting valuable user feedback by now.

Standard
Clojure

Why I’m finding Clojurescript underwhelming

I noticed Clojurescript in Github before the big announcement and thought it was an interesting idea. I am a big fan in general about having a Clojure syntax that compiles to Javascript. As a platform it is even more ubiquitous than Java and it would be a great way of simplifying Javascript’s closure and function syntax.

However in practice Clojurescript has been desperately disappointing for me. Firstly there is the weird decision to not have the code run on OpenJDK. This really limits its utility: I don’t seem to have a machine with a compatible setup at the moment despite having various flavours of Javascript interpreters available.

Then while looking for an answer as to how soon this problem is likely to be resolved I discovered this thread which was another level of disappointment. The original post is undiplomatic, perhaps even inflammatory, however the response indicates a level of befuddling clueless-ness.

If you want something to compile into Javascript I think you actually do want it to compile into good idiomatic Javascript unless you have a really good reason not to. You also do want to be able to use really good existing frameworks like jQuery (which really is the defacto standard right now).

The reason I think these are reasonable requests is that Coffeescript seems to manage to do both. Before Coffeescript maybe Clojurescript’s idiosyncrasies would have been forgiveable but being late to the party as well as being less well-mannered makes the defiance in the response seem poorly judged.

I am not sure what Clojurescript is really for (apparently it is aimed at a future community of people that don’t exist yet, which is … helpful). I don’t feel that it is really simpatico with the existing Javascript code that works in the browser and I am not sure it really has a place in the server-side world of Node.js where it might have been a better fit.

I remain open-minded though and would be willing to give Clojurescript a second go once the dust has settled a bit.

Update: I’ve written a follow up to this post

Standard
Programming, Scripting

CouchDB: Querying data

CouchDB allows you to pass a map function to a special view URL to query the data in an ad-hoc way. Views can also be stored as JSON documents with a convention URL (_design on the server, accessed as _view by the client). These can then be obtained via a HTTP request.My functional and Javascript programming are weak but this is what I understand of writing queries in CouchDB. Let’s take an example of a set of library cards, each card represents a book but the amount of information I have on each book varies.

The basic find all function is this:

function(book) {
map(null, book)
}

This defines an anonymous function that takes one parameter, the target document, in this case a book, and returns an array of values. What is in the value list is controlled by the second parameter, in this query I return the entire document. The first parameter controls the sorting or ordering. So I wanted to return the title of all the books in my database then I would use:


function(book) {
map(book.name, book.name)
}

Sorting them by ISBN would go like this:

function(book) {
map(book.isbn, book.name)
}

One important thing to note is that if an object doesn’t have a value it doesn’t respond to the function and will not be included. So if I created some of my entries with a value title instead of name anything with a title and not a name will not be in the query. However if I use a non-existent entry as an ordering criteria the value will count as null and be sorted.

Because I can include any valid Javascript in my function I can actually put a lot of complexity into my queries. For example:

function(book) {
if(book.isbn != null) { map(book.name, {"Name": book.name, "ISBN": book.isbn})
} else { map(book.name, book.name) }
}

So I suspect this will either make you cheer or puke. What this function does is return a JSON object containing the Name and ISBN of the book if they are known or just the Book name as a String otherwise. Unlike SQL the heading of my query is almost completely arbitrary as long as the value on the right of my map function translates to a valid JSON object.

Now at work there are often a lot of debates as to whether things are “rigid” or “structured” or whether they are “flexible” or “formless”. It is a bit like the old meat and poison adage. CouchDB allows a client to construct an almost arbitrarily rich response to a query with almost no restriction on how the data that should be included in that response. In some cases this is going to allow you to easily interact with very complex unstructured data in some cases it is going to be an invitation to create a sprawling dataset with no value. There is no inherent right or wrong choice here but for a particular problem being solved there is probably going to be a wrong and right choice. SQL is powerful because of the restrictions and rules it builds into its grammar. Using Javascript is powerful because it relaxes those restrictions. Programmers and IT folks in general often fall into using the laxest possible implementation for reason of “flexibility” but then either have to impose order themselves or lose the power of the more restrictive choice.

So putting that into a concrete example, if a write a view with SQL I am going to have to follow a set of rules to get the data I want (for example my heading is going to have to be a set of tuples of equal size), using an arbitrary script and JSON means I am going to be able to get exactly the data I want in the form I want it. However since that return structure is customised to my query I might possibly be reducing my reuse by being over-specific or by building too much logic into my view code.

That’s quite a diversion just so I can say it’s horses for courses, so let’s wrap up this quick look at CouchDB views. All of CouchDB’s views are effectively JSON objects that are passed to a separate view server. This is a separate process that interacts with the main server via STDOUT and STDIN pipes. By default this is the view server that is built from the Spidermonkey library (it is called couchjs). However you can write a view parser for any language and plug it into CouchDB by creating an executable and mapping it to a MIME type in the couch.ini file. The view server essentially parses and readies the query function that is associated with the view and is then sent every document in the database as a JSON string. The view server picks up the results of reading every document and sends that back to the query request.

It is a pretty simple system and it works will for the relatively flat documents I have been trying with it. However I suspect that in a project with multiple developers some ground-rules for writing consistent query code would be a must.

Standard