Clojure

Data wrangling with Clojure

Clojure is a great language for wrangling data that is either awkwardly-sized or where data needs to be drawn from and stored in different locations.

What does awkward-sized data mean?

I am going to attribute the term “awkward-sized data” to Henry Garner and Bruce Durling. Awkward-sized data is neither big data nor small data. To avoid defining something by what it is not, I would define it as bigger than would fit comfortably into a spreadsheet and irregular enough that it is not easy to map onto a relational schema.

It is about hundreds of thousands of data points and not millions: data sets that fit into the memory of a reasonably specified laptop.

It also means data where you need to reconcile data between multiple datastores, something that is more common in a microservice or scalable service world where monolithic data is distributed between more systems.

What makes Clojure a good fit for the problem?

Clojure picks up a lot of good data processing traits from its inheritance as a LISP. A LISP after all is a “list processor”: the fundamental structures of the language are data and its key functionality is parsing and processing those data structures into operations. You can serialise data structures to a flat file and back into memory purely through the printer and the reader, without the need for parsing libraries.
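
As a minimal sketch of that round trip, assuming nothing beyond clojure.core and clojure.edn (the film data and the save!/load! helper names are made up for illustration):

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical sample data: plain maps, vectors, sets and keywords.
(def films
  [{:title "Alien" :year 1979 :tags #{:sci-fi :horror}}
   {:title "Heat"  :year 1995 :tags #{:crime}}])

(defn save!
  "Print the data structure as text and write it to path."
  [path data]
  (spit path (pr-str data)))

(defn load!
  "Read the text at path back into Clojure data structures."
  [path]
  (edn/read-string (slurp path)))

;; Round trip through a temporary flat file.
(def tmp-path (.getPath (java.io.File/createTempFile "films" ".edn")))
(save! tmp-path films)
(def restored (load! tmp-path))
```

I have used clojure.edn/read-string rather than clojure.core/read-string because it only reads data and never evaluates code, which matters when the file comes from somewhere you don't control.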

Clojure has great immutable data structures with great performance, a robust set of data processing functions in its core library (along with parallel execution versions) and well-defined transactions on data. Its sequences are, unusually, lazy by default, which means it can do powerful calculations with a minimal amount of memory usage. It has a lot of great community libraries and also Java compatibility if you want to use an existing Java library.
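
A small sketch of what lazy sequences buy you, using only core functions (the names odd-squares and first-five are just for illustration):

```clojure
;; (range) is an infinite lazy sequence; nothing is computed until asked for.
(def odd-squares
  (->> (range)
       (map #(* % %))
       (filter odd?)))

;; Only enough of the sequence to produce five results is realised.
(def first-five (take 5 odd-squares))
```

The pipeline is described over an unbounded input but only the handful of values actually consumed are ever calculated.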

Clojure also has an awesome REPL which means you have a powerful way of directly interacting with your data and getting immediate feedback on the work you are doing.

Why not use a DSL or a specific datastore?

I will leave the argument as to why you need a general purpose programming language to Tommy Hall; his talk about cloud infrastructure DSLs is equally relevant here. There are things you reasonably want to do and you can either add them all to a DSL until it has every feature of a poorly thought-out programming language or you can start directly with the programming language.

For me the key thing that I always want to do is read or write data, either from a datastore, file or HTTP/JSON API. I haven’t come across a single data DSL that makes it easier to read from one datastore and write to another.

Where can I find out more?

If you are interested in statistical analysis a good place to start is Bruce Durling’s talk on Incanter which he gave relatively early in his use of it.

Henry Garner’s talk Expressive Parallel Analytics with Clojure has a name that might scare the hell out of you but, trust me, this is actually a pretty good step-by-step guide to how you do data transformations and aggregations in Clojure and then make them run in parallel to improve performance.

Libraries I like

In my own work I lean on the following libraries a lot.

JSON is the lingua franca of computing and you are going to need a decent JSON parser and serialiser. I like Cheshire because it does everything I need, which is primarily to produce sensible native data structures that are as close to the original JSON structures as possible.

After JSON the other thing that I always need is access to HTTP. When you are mucking around with dirty data the most frustrating thing I’ve found is libraries that throw exceptions whenever you get something other than a status code of 200. clj-http is immensely powerful but you will want to switch off exceptions. clj-http-lite only uses what is in the JDK, which makes for easier dependencies, but again you need to switch off exceptions. Most of the time the lite library is perfectly usable; if you are just using well-behaved public APIs I would not bother with anything more complicated. For an asynchronous client there is http-kit. If you want to make simultaneous requests async can be a great choice, but most of the time it adds a level of complexity and indirection that I don’t think you need. With http-kit you don’t need to worry about exceptions but do remember to add a basic error handler to avoid debugging heartache.

For SQL I love yesql because it doesn’t do crazy things and instead lets you write and test normal SQL and then use it inside Clojure programs. In my experience this is what you want to do 100% of the time, rather than use some weird abstraction layer. While I will admit to being lazy and frequently loading the queries into the default namespace, it is far more sensible to load them via the require-sql syntax.

One thing I have had to do a bit of is parsing and cleaning HTML and I love the library Hickory for this. One of the nice things is that because it produces a standard Clojure map for the content you can use a lot of completely vanilla Clojure techniques to do interesting things with the content.
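
To illustrate the "vanilla Clojure" point without pulling in the library, here is a sketch over a hand-written map that I am assuming is shaped like Hickory's output (nodes with :type, :tag, :attrs and :content keys, and strings for text). The page data and the links function are hypothetical.

```clojure
;; Hand-written stand-in for a parsed page; not produced by Hickory itself.
(def page
  {:type :element :tag :div :attrs nil
   :content [{:type :element :tag :a :attrs {:href "/one"} :content ["One"]}
             "some text"
             {:type :element :tag :p :attrs nil
              :content [{:type :element :tag :a :attrs {:href "/two"}
                         :content ["Two"]}]}]})

(defn links
  "Walk the tree depth-first and collect the href of every :a element."
  [node]
  (->> (tree-seq map? :content node)
       (filter #(= :a (:tag %)))
       (map #(get-in % [:attrs :href]))))

(def found (links page))
```

Nothing here is HTML-specific: tree-seq, filter and get-in are all core functions working on ordinary maps and vectors.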

Example projects

I created a simple film data API that reads content from an Oracle database and simply publishes it as JSON. This uses Yesql and is really just a trivial data transform that makes the underlying data much more usable by other consumers.

id-to-url is a straightforward piece of data munging but requires internal tier access to the Guardian Content API. Given a bunch of internal id numbers from an Oracle database we need to check the publication status of the content and then extract the public URL for the content. Ultimately, in the REPL, I write the URLs to a flat file.

Asynchronous and Parallel processing

My work has generally been IO-bound so I haven’t really needed to use much parallel processing.

However if you need it then Rich Hickey does the best explanation of reducers and why reduce is the only function you really need in data processing. For transducers (in Clojure core from 1.7) I like Kyle Kingsbury’s talk a lot and he talks about Tesser which seems to be the ultimate library for multicore processing.
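
For a flavour of the reducers approach, here is a minimal sketch using clojure.core.reducers, which ships with Clojure (the numbers and total names are only for illustration):

```clojure
(require '[clojure.core.reducers :as r])

;; fold can work in parallel on vectors; other collections fall back to reduce.
(def numbers (vec (range 1000)))

;; r/filter and r/map build up a recipe; no intermediate collection is made.
;; r/fold then reduces the chunks with + and combines the partial sums with +.
(def total
  (->> numbers
       (r/filter even?)
       (r/map inc)
       (r/fold +)))
```

The shape is the same as an ordinary map/filter/reduce pipeline, which is why it is usually painless to try when a sequential version turns out to be too slow.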

For async work Rich, again, does the best explanation of core.async. For IO async ironically is probably the best approach for making the most of your resources but I haven’t yet been in a situation where

Standard
Clojure, Programming

Creating Javascript with Clojure

This post is an accompaniment to my lightning talk at Clojure Exchange 2014 and is primarily a summary with lots of links to the libraries and technologies mentioned in the presentation.

The first step is to use Wisp, a compiler that can turn a Clojure syntax into pure Javascript, with no dependencies. Wisp will translate some Clojure idioms into Javascript but does not contain anything from the core libraries, including sequence handling. Your code must work as Javascript.

One really interesting thing about Wisp is that it supports macros and therefore can support semantic pipelining with the threading macros. Function composition solved!

If you want the core library functionality the logical thing to add in next is a dependency on Mori which will add in data structures and all the sequence library functions you are used to with a static invocation style that is closer to Clojure syntax.

At this point you have an effective Clojure coding setup that uses pure Javascript and requires a 50 to 60K download.

However you can go further. One alternative to Mori is ImmutableJS which uses the JavaScript interfaces (object methods) for Array and Map. If you use ImmutableJS you can also make use of a framework called Omniscient that allows you to develop ReactJS applications in the same way you do in Om.

ImmutableJS can also be used by TransducersJS to get faster sequence operations so either library can be a strong choice.

Standard
Clojure, Programming

Transducers at the November London Clojure Dojo 2014

One of the topics for the November ThoughtWorks dojo was transducers (something I’ve looked at before and singularly failed to get working). Transducers will be coming to clojure.core in 1.7; the code is already in Clojurescript and core.async.

There were two teams looking at transducers, one looked more at the foundations of how transducers are implemented and the other at their performance. These are my notes of what they presented back at the dojo.

How do transducers work?

One of the key ideas underpinning transducers (and their forebears reducers) is that most of the sequence operations can be implemented in terms of reduce. Let’s look at map and filter.

(defn my-map-1 [f coll]
  (reduce
     (fn [acc el] (conj acc (f el))) [] coll))

(defn my-filter-1 [pred coll]
  (reduce
     (fn [acc el]
       (if (pred el)
         (conj acc el)
         acc))
   [] coll))

Now these functions consist of two parts: the purpose of the function (transformation or selection of values) and the part that assembles the new sequence representing the output. Here I am using conj but conj can also be replaced by an implementation that uses reduce if you want to be purist about it.

If we replace conj with a reducing function (rf) that can be supplied to the rest of the function we create these abstractions.

(defn my-map-2 [f]
  (fn [rf]
    (fn [acc el]
      (rf acc (f el)))))

(defn my-filter-2 [pred]
  (fn [rf]
    (fn [acc el]
      (if (pred el)
        (rf acc el)
        acc))))

And this is pretty much what is happening when we call the single-arity versions of map and filter in transducers. We pass a function that is the main purpose of the operation, then a reducing function, and then finally we need to do the actual transducing. Here I am using reduce again but transduce does the same thing.


((my-map-2 inc) conj) ; fn
(reduce ((my-map-2 inc) conj) [] (range 3)) ; [1 2 3]

(reduce ((my-filter-2 odd?) conj) [] (range 7)) ; [1 3 5]
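
For comparison, here is a sketch of the same idea using the built-in single-arity map and filter (available from Clojure 1.7). The names xf, with-transduce, with-into and with-reduce are just for illustration.

```clojure
;; comp composes transducers left to right: inc first, then the odd? filter.
(def xf (comp (map inc) (filter odd?)))

;; transduce applies the composed transformation while reducing with conj.
(def with-transduce (transduce xf conj [] (range 10)))

;; into accepts a transducer as well.
(def with-into (into [] xf (range 10)))

;; Applying xf to conj by hand gives a reducing function usable with reduce.
(def with-reduce (reduce (xf conj) [] (range 10)))
```

All three produce the same vector; the hand-rolled reduce only works this simply because neither transducer holds state that needs a completion step.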

The team’s notes have been posted online.

How do transducers perform?

The team that was working on the performance checking compared a transduced set of functions that were composed with comp to the execution of the same functions pipelined via the thread-last macro (->>).

The results were interesting: for two or three functions performance was very similar between both approaches. However the more functions there are in the chain the better the transduced version performs, until in the pathological case there is a massive difference.

That seems to fit the promises of transducer performance as the elimination of intermediate sequences would suggest that performance stays flat as you add transforms.
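
I don't have the team's benchmark code, so the following is only a sketch of the two styles being compared, with made-up steps and data. The threaded version builds a lazy sequence at each step; the transduced version makes a single pass with no intermediate sequences.

```clojure
(def data (range 1000))

;; Thread-last pipeline: each step produces an intermediate lazy sequence.
(def threaded
  (->> data
       (map inc)
       (filter even?)
       (map #(* 2 %))
       (reduce +)))

;; The same steps composed into one transducer and run in a single pass.
(def xf (comp (map inc) (filter even?) (map #(* 2 %))))
(def transduced (transduce xf + data))
```

Wrapping each form in time, or better a proper benchmarking library, and adding more steps to the chain is one way to reproduce the kind of comparison described above.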

There was some discussion during the dojo as to whether rewriting the historical sequence functions was the right approach, and whether it would have been better either to make transducers the default or to allow programmers to opt into them explicitly by importing the library as you do for reducers. The team showed that performance was consistently better with transducers (if sometimes by small margins) but also that existing code does not really need to be modified unless you previously had performance issues, in which case transducers allow a simpler, more direct approach to transformation chaining than was previously possible.

Closing thoughts

I suggested the transducers topic as I had singularly failed to get to grips with them by myself and I was glad it sparked so much investigation and discussion. I certainly got a much better understanding of the library as a result. My thanks go to the dojo participants, particularly James Henderson.

Standard
Clojure, Programming

London Clojure unconference July 2014 report

For the first session I was interested in trying to continue the discussion about the Clojure “sweet spot” we had had on the mailing list. But there was only a smattering of interest so we rolled it up with the discussion on how to convince people in investment banks to use Clojure.

I think Jon Pither’s approach to this is still the best which is to find a business problem and then say that you’re going to address the problem and use Clojure to solve the real problem. A pure technical argument is not really going to get buy-in from outside the developers.

A lot of organisations want to have an approved list of technologies and for institutions that have chronic and acute technical problems like banks then perhaps that is appropriate given the need for external regulation. Where these things exist I usually think it is a case of going through the bureaucratic hoops.

The approval system is not there to be opinionated but to provide oversight. Where individuals have “weaponised” the approval process to advance their view of “right” technology you need to tackle the root problem not just sneak things in as jars.

My personal view is that financial institutions have profound technology problems but that they have no incentive to address them while they continue to make a lot of money. Really their problems should be providing opportunities for new approaches but as the existing institutions have created massive barriers to entry it doesn’t happen and we’re all really just waiting for the next financial crisis to happen, maybe then…

However in the session there was a lot of discussion about whether it is appropriate for managers to determine technology choices: on the one side you want to devolve decisions to the people close to the problem, on the other programmers commonly change jobs in a shorter period than the lifespan of the software they create.

One thing I took away was that before conservative organisations adopt Clojure they will need to see widespread adoption in the companies they see as good leading indicators and the presence of a large hiring population. In these respects Scala is literally years ahead.

Our final conclusion as a group was simply that the easiest way to approve the use of Clojure was to get into management and leadership first and then do it.

For the second session I went to the discussion on React and Om. I’m looking at React currently and there were a lot of questions about what Om layers on top of the basic JS library. Anna Pawlicka provided a number of the Om answers and others chipped in with bits of React and reactive JS knowledge. I was reminded to go and look at the current state of Om and also the new tutorials. There was also some interesting talk of how to define React components, Anna used Sablono but is there still a need for JSX?

The final session of the evening was on Riemann, which in addition to being a basic introduction to what it does was a helpful reminder of the functionality that Riemann has but that I haven’t used personally. Jason Neylon mentioned that every new service they set up has a Riemann instance attached so you can just dump all events somewhere and then build dashboards dynamically as you go along (a lot better than our approach with Graphite).

Tom Crayford introduced me to the effect of clock skew on Riemann (events from the “future” relative to the Riemann server clock are dropped) and then pointed out that clock skew can actually be monitored via Riemann! Also some interesting stuff about pumping logs into Riemann and some personal experience of crazy volumes of events being successfully handled.

Just before the end of the event I dropped in to the Gorilla REPL session to see Jony Hudson demoing his amazing notebook repl that he has been using to share assignments and research with students and colleagues in his department. A really interesting application and I suspect once we get our heads round it a really interesting way of sharing problems and potential solutions as developers.

Mind slightly blown, I was personally really happy with the event and felt that I’d got a mix of advice and the kind of innovation that makes the Clojure community so interesting.

Standard
Clojure

Clojurescript: is it any better yet?

A while ago I wrote about how Clojurescript was a square peg being hammered into a non-existent hole to complete indifference by everyone.

So has anything changed? Well Clojurescript has continued to be developed and it seems to have lost some of the insane rough-edges it launched with, it seems possible to use it with OpenJDK now for example.

Some people have made some very cool things by using Clojurescript as a dialect for writing NodeJS code. I was particularly struck by Philip Potter's talk on Marconi.

The appearance of core.async for Clojurescript is in my view the first genuine point of interest for non-Lisp fans. It provides a genuinely compelling model for handling events in an elegant way.

David Nolen, while being modest about his contribution, has also contributed some excellent blog posts about the virtues of Clojurescript. The most essential of which is the 101 on how to get a basic Clojurescript application going with core.async. This is an excellent tutorial but also is underpinned by hard work on getting the "out of the box" developer experience slick via the Mies Leiningen template.

Let's be clear about what a difference this makes. At the London Clojure dojo we once had a dojo about using Clojurescript, after a few hours all the teams limped in with various war stories about what bits of Clojurescript they had working and what was failing and why they thought it was failing. The experience was so bad I really didn't want to do Clojurescript as a general exercise again.

In the last dojo we did Clojurescript, we used Mies and David's blogpost as a template and all the teams were able to reproduce the blogpost and some of the teams were creating enhancements based on the basic idea of asynchronous service calls.

When someone pitches a Clojurescript idea in 2014 I'm no longer in fear of a travesty. That is a massive step forward.

And that's the good news.

Clojurescript! Who is it good for?

After the dojo there were some serious discussions about using Clojurescript in anger. The conversation turned eventually to GWT. In case you don't remember or have never met it GWT is essentially a browser client kit for Java developers. In London it gets used a lot by financial institutions that need rich UIs for small numbers of people. Javascript developers are unlikely to be hired by those organisations so just like Google they end up with a need but the wrong kind of skills and GWT bridges the gap. A Java developer can use a familiar language and off-the-shelf components and will end up with a perfectly serviceable rich client-side app.

There is no chance in hell that a Javascript developer is going to use GWT to build their applications.

Clojurescript feels like the same thing. Clojure developers and LISP aficionados don't know a great deal of Javascript, they can program in Clojurescript and Google Closure and it is probably going to be okay. Better in fact than if they tried to create something in an unfamiliar language with all manner of gotchas.

But there is no chance in hell that a Javascript developer is going to use Clojurescript to build their applications.

Why Coffeescript failed

The reason I say this is because Coffeescript, a great rationalisation of Javascript programming, is still viewed with suspicion by Javascript developers.

I asked some at a recent Javascript tech meeting why some people didn't use it and the interesting answer was that they couldn't really understand its syntax and were effectively translating the forms into Javascript. The terseness of the language was actually off-putting because it made it harder to mentally translate what the Coffeescript program was doing.

Adding LISP and Google Closure into that mix isn't going to make that mental disconnect any easier. The truth would appear to be that Javascript developers are simply not that disenchanted with their language. Clojurescript is going to have to offer something major to get over the disadvantage of a non-curly brace language and machine-optimised generated code.

At the Clojure dojo post-mortem people talked about the fact that Clojurescript helped avoid pitfalls and unusual behaviour in Javascript. That might seem a rational argument to a Clojurian. However it was never an argument used on the JVM: "Hey Java just has all these edge-cases, why not use a LISP variant instead?".

In both languages practitioners of the language are deeply aware of the corner cases of their language. Since they are constantly working with it they are also familiar with the best practices required to make sure you don't encounter those corner cases.

Clojure on the JVM brought power, simplicity and a model of programming that made reasoning about code paths simple.

Clojurescript has the same advantages of code structure but doesn't really give more functionality over Javascript and still has a painful inter-operation story.

Javascript: the amazing evolving language

Javascript has a strong Scheme inheritance, which means it already contains a lot of LISP ideas.

Also, unlike Java, which ended up with a specification that was in the wilderness for years, Javascript has managed to keep its language definition moving. It's sorted out its split and with aggressive language implementers in the form of the competing browsers it is rapidly adding features to the core of the language and standards for extensions.

Javascript is almost unique in that when lots of people wrote languages that compiled into Javascript the community weren't stuffy about it but actually created a specification, Source Maps, that made it easier to support generated code.

Javascript has a lot of problem areas but it is also rapidly evolving, and syntactic sugar and new language constructs are adding power without necessarily creating new problems or complexity. It is expansionist and ruthlessly pragmatic, just like Clojure on the JVM in many ways.

Better the devil you know?

Visual Basic isn't the best language in the world, it's certainly not the best language for creating apps on Windows. However for organisations and programmers who have invested a lot of time in a language and a platform it normally takes a lot to get them to change.

Usually, it will take an inability to hire people to replace those leaving or the desertion of major clients before change can really be countenanced.

Javascript developers are in the same sunk-cost quandary but there is nothing on the horizon that is going to force an external change. There may be better alternatives but Javascript is one of the easiest languages to learn. It's highly interactive and it's right there in the console window of this very browser!

There's no lack of demand for good looking websites and browser hackery to differentiate one web product from the next.

Regardless of the technical merits that any alternative solution offers (and we are not just talking about Clojurescript, Elm offers a similar set of advantages), the herding effect is powerful. Your investment in Javascript is far more likely to pay off than putting time into one of the alternatives, none of which have momentum.

Sometimes there are real advantages to sticking close to the devil.

Node.js

For me the most interesting area of Clojurescript is being able to write Clojure and treat V8 as an alternative runtime.

People are already noticing some odd performance characteristics where some things run better on Node than they do on JVM, most particularly around Async.

Polyglot developers are a familiar sight on the server side, where knowing a variety of languages is an advantage for the general programmer. Server-side Javascript is really only for those who are a one-trick pony.

It is still going to be a niche area but it is much more likely to happen than in the client-side.

Standard
Clojure, Java, Programming

Clojure versus Java: Why use Clojure?

I just gave an introductory talk on Clojure and one of the questions after the event was when a Java programmer might want to switch to using Clojure.

Well Java is a much more complex language than Clojure and requires a lot of expert knowledge to use properly. You really have to know Effective Java pretty well, as Java still contains every wrinkle and crease from version 1 onwards. By comparison Clojure is a simpler and more consistent language with less chance of shooting yourself in the foot. However as a new language Clojure does not have the same number of tutorials, FAQs and Stack Overflow answers. It also has a different structure to curly brace languages, so it feels quite different to program in than Java. If you are a proficient Java programmer and you are adept with the standard build tools and IDEs then why would you consider changing?

One example is definitely concurrency, even if you were going to do Java then you’re probably going to let Doug Lea handle the details via java.util.concurrent. However Doug Lea didn’t get to rewrite the whole of Java to be concurrent. In Clojure you are letting Rich Hickey handle your concurrency and the whole language is designed around immutability and sensible ways of sharing state.

Another is implementing algorithms or mathematical programming. A lot of mathematical functions are easy to translate into LISP expressions, and Clojure supports multiple-arity functions for recursion and recursion in constant stack space via the recur special form. In Java you either end up using poor-man’s functions via static class methods or modelling the expression of the rule into objects, which is a kind of context disconnect.
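
As a small sketch of both features together (the factorial function is my own illustrative example, not something from the talk):

```clojure
(defn factorial
  "Factorial of n. The one-argument arity seeds the accumulator and the
  two-argument arity does the work."
  ([n] (factorial n 1N))
  ([n acc]
   (if (<= n 1)
     acc
     ;; recur jumps back to the top of this arity without growing the stack.
     (recur (dec n) (* acc n)))))

(def small (factorial 5))

;; Deep enough to risk a StackOverflowError with plain self-recursion.
(def big (factorial 20000))
```

The 1N seed makes the accumulator a BigInt, so large results grow rather than overflow.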

Similarly data processing and transformation work really well in Clojure (as you might expect of a list processing language!). If you want to take some source of data, read it, normalise it, apply some transform functions, perhaps do some filtering, selection or aggregation then you are going to find a lot of support for the common data functions in Clojure’s sequence library. Only Java 8 has support for applying lambda functions to collections and even then it has a more complex story than Clojure for chaining those applications together over a stream of data.
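
A sketch of that read, normalise, filter and aggregate shape using only the core sequence functions and clojure.string (the raw lines and the parse-line helper are invented for the example):

```clojure
(require '[clojure.string :as str])

;; Hypothetical messy input: name,city with inconsistent case and spacing.
(def raw [" Alice ,london" "bob,Paris" "CAROL,London" ",Berlin"])

(defn parse-line
  "Split a line on the comma and normalise both fields."
  [line]
  (let [[name city] (map str/trim (str/split line #","))]
    {:name (str/capitalize name) :city (str/capitalize city)}))

;; Normalise, drop rows without a name, then count people per city.
(def by-city
  (->> raw
       (map parse-line)
       (remove #(str/blank? (:name %)))
       (map :city)
       frequencies))
```

Each step is an ordinary function over a sequence, so steps can be added, removed or reordered at the REPL without restructuring anything else.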

You might also find Clojure’s lazy sequences helpful for dealing with streams based on large quantities of data. Totally Lazy offers a port of a lot of Clojure’s functionality to Java but it is often easier to just go direct to the source than try and jury-rig a series of ports together to recreate the same functionality.

A final point to consider is how well your Java code is working out currently. If you are having a lot of problems with memory leaks, full GC and so on then it might be easier to work with Clojure than put in the effort to find what pieces of your code are causing the problems. Not that Clojure is a silver bullet, but if you use side-effect free functional programming your problems are going to be limited to the execution path of just the function. In terms of maintaining clear code and reasoning about what is happening at run time you may find it a lot easier. Again this isn’t unique to Clojure but Clojure makes it easier to write good code along these lines.

Standard
Java

How do you change or remove a Heroku buildpack?

A weird edge case came up today around a buildpack that turned out to be unneeded for my application. How do you change the Buildpack once you’ve associated it with your application? It turns out that a Buildpack is essentially just a piece of config, so you can see it and change it by using the heroku config command and the config variable name BUILDPACK_URL.

In my case I just needed to remove the configured Buildpack with a simple heroku config unset command:

heroku config:unset BUILDPACK_URL
Standard
Clojure

Getting started with Riemann stream processing

Riemann is a great application for dealing with event processing but it doesn’t have a lot of documentation or newbie friendly tutorials. There are some cool pictures that explain the principles of the app but nothing beyond that. At some point I want to try and contribute some better documentation to the official project but in the meantime here’s a few points that I think are useful for getting started.

I’m assuming that you’ve followed these instructions to get a working Riemann installation and you’ve followed the instructions on how to submit events to Riemann via the Ruby Riemann client interface.

At this point you want to start making your own processing rules and it is not clear how to start.

Well the starting point is the idea of streams: when an event arrives in Riemann it is passed to each stream that has been registered with what is called the core. Essentially a stream is a function that takes an event (and may pass it on to some child streams), and these functions are stored in a list in the core atom under the keyword :streams.

Okay let’s look at an example. The first obvious thing you want to do is print out the events that you are sending to Riemann. If you’ve got the standard download, open the etc/riemann.config file and set the syntax for the file to be Clojure: it is read into a Clojure environment in the riemann.config namespace, so you can use full Clojure syntax in it. Add the following at the end of the config file, then either run the server or, if it is already running, reload the config file with kill -HUP <Riemann PID>.

(streams prn)

prn is Clojure’s built-in print function; since a stream is just a function that takes an event, using prn as a stream prints every event that arrives.

In irb let’s issue an event:

r << {host: "rrees.me", service: "posts", metric:  5}

You should see some output in the Riemann log along the following lines.


#riemann.codec.Event{:host "rrees.me", :service "posts", :state nil, :description nil, :metric 5, :tags nil, :time 1366450306, :ttl nil}

I’m going to assume this has worked for you. So now let’s see what happens when more than one stream is registered. Change the streams call to the following and reload the config.

(streams prn prn)

Now when we send the event it should get printed twice! Simples!

Okay now let’s look at how you can have multiple processing streams working off the same event. If we add a second print stream we should get three prints of the event.

(streams prn prn)

(streams prn)

Each stream that is registered can effectively process the event in parallel so some streams can process an event and send it to another system while another can write it to the index.

Let’s change one of our prints slightly so we can see this happen.

(streams (with :state "normal" prn) prn)

(streams prn)

We should now get three prints of the event and in one we should see that the event has the state of “normal”. Okay great! Let’s break this down a bit.

Every parameter of streams is a stream and a stream can take an event and child streams. So when an event occurs it is passed to each stream, each stream might specify more streams that the transformed event should be passed to. That’s why we pass prn as the final parameter of the with statement. We’re saying add the key-value pair to the event and pass the modified event to the prn stream.

Let’s try implementing this by ourselves. There is a bit of magic left here: call-rescue is a built-in function that will send our event to other streams; you can think of it as a variant of map:

(defn change-event [& children]
  (fn [event]
    (let [transformed-event (assoc event :hello :world)]
      (call-rescue transformed-event children))))

(streams (change-event prn))

If this works then we should see an event printed out that has the “hello world” key-value pair in it. change-event is a stream handler that takes a list of “children” streams and returns a function that handles an event. If the function does not pass the event onto the children streams then the event stream stops processing, which is a bit like a filter. The event is really just a map of data like all good Clojure.
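
The filter-like behaviour can be tried at a plain Clojure REPL without Riemann. In this sketch call-children is a hypothetical stand-in for call-rescue (it simply calls each child, with none of the real function's error handling), and only-service and seen are names I have made up.

```clojure
(defn call-children
  "Hypothetical stand-in for Riemann's call-rescue: pass event to each child."
  [event children]
  (doseq [child children]
    (child event)))

(defn only-service
  "Stream that passes an event on to its children only when the event's
  :service matches; anything else stops here."
  [service & children]
  (fn [event]
    (when (= service (:service event))
      (call-children event children))))

;; Collect whatever reaches the child stream so we can inspect it.
(def seen (atom []))
(def stream (only-service "posts" #(swap! seen conj %)))

(stream {:host "rrees.me" :service "posts" :metric 5})
(stream {:host "rrees.me" :service "comments" :metric 1})
```

Inside a real riemann.config you would swap call-children for call-rescue and register the stream with something like (streams (only-service "posts" prn)).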

At this point you actually have a good handle on how to construct your own streams. Everything else is going to be a variation on this pattern of creating a stream function that returns an event handler. The next thing to do now is go and have a look at the source code for things like with, prn and call-rescue. Peeking behind the curtain will take a certain amount of Clojure experience but it really won’t be too painful, I promise: the code is reasonable and magic is minimal. Most of the functions are entirely self-contained, so everything you need to know is in the function code itself.

Clojure, Programming

Leiningen doesn’t compile Protocols and Records

I don’t generally use records or protocols in my Clojure code, so the fact that the Clojure compiler doesn’t seem to detect changes in the function bodies of either took me by surprise recently. Googling turned up this issue for Leiningen. After reading through the issue I ended up listing all the namespaces containing these structures under the :aot key in the Leiningen project.clj. This meant that those namespaces were recompiled every time, but that seemed the lesser of two evils compared to the clean and build approach.
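
As a rough sketch of that workaround, a project.clj might look something like the following. The project name, version numbers and the two namespace names are placeholders rather than anything from a real project; the only part that matters is the :aot vector.

```clojure
(defproject my-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.5.1"]]
  ;; Hypothetical namespaces that define records and protocols.
  ;; Listing them here asks Leiningen to compile them ahead of time.
  :aot [my-app.records my-app.protocols])
```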

Where this issue really stung was in the method-like function specifications in the records. As usual, it felt like structure and behaviour were getting muddled up again when ideally you want to keep them separate.

Clojure, Programming, Web Applications

A batteries included Clojure web stack

Inspired by the developer experience of the Play framework, as well as that of Django and Ruby on Rails, I’ve been giving some thought to what a “batteries included” experience might be for Clojure web development. Unlike something like Pedestal, which focuses on keeping LISPers happy and writing LISP as much as possible, I’m approaching this from the point of view of what would be attractive to frontend developers who choose between things like Rails, Sinatra or Express.

First let’s focus on what we already have. Leiningen 2 gives us the ability to create application templates that define the necessary dependencies and directory structures, as well as providing an excellent REPL. This should allow us to build a suitable application with a single command. The Compojure plugin already does a lot of the setup necessary to quickstart an application. It downloads dependencies and fires up a server that auto-reloads as the application changes.

The big gap though is that the plugin creates a very bare bones application structure, useful for generating text on the web but not much else. To be able to create a basic (but conventional) web app I think we need to have some standard things like a templating system that works with conventional HTML templates and support for generating and consuming JSON.

Based on my experience and people’s feedback I think it would be worth basing our package on the Mustache templating language via Clostache and using Cheshire to generate and parse the JSON (I like data.json’s lack of dependencies but this is web programming for hackers so we should favour what hackers want to use).

I also think we need to set up some basic static resources within the app like Modernizr and jQuery. A simple, plain skin might also be a good idea unless we can offer a few variations within the plugin such as Bootstrap and Foundation which would be even better.

Supporting a datastore is probably too hard at the moment due to the lack of consensus about what a good all-round database is. However, I think it would be sensible to offer some instructions on how to back the app with Postgres, Redis and MongoDB.

I would include Friend by default to make authentication easy and because it’s difficult to do that much interesting stuff without introducing some concept of a user. However, I think it is important that the stack is essentially stateless by default, so authentication needs to be cookie-based, with an easy way of switching between persistence schemes such as memory and memcache.

Since webapps often spend a lot of time consuming other web services I would include clj-http by default as well. Simple caching that can be backed by memcache also seems important, since wrapping Spymemcached is painful and the current Clojure wrappers over it don’t seem to work well with the environment constraints of cloud platforms like Heroku.

A more difficult requirement would be asset pipelining. I think by default the application should be capable of compiling and serving LESS and CoffeeScript, with reloading, for development purposes. Ideally, though, during deployment we want to extract all our static resources and output the final compiled versions for serving out of a static handler or, alternatively, a static resource host. I hate asset fingerprinting because of the ugliness it introduces into URLs. I would prefer an ETag solution, but fingerprinting is going to work with everything under the sun, so I think it should be the default, with an option to use ETags as an alternative.
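
To show the difference between the two approaches, here is a small sketch in plain Clojure using only the JDK. The function names (content-hash, fingerprinted-path, etag-response) are invented for illustration, the request and response are just Ring-style maps rather than calls into any particular library, and MD5 is an arbitrary choice of digest. Fingerprinting bakes the content hash into the URL, whereas the ETag approach keeps the URL clean and compares the hash against the If-None-Match request header.

```clojure
(import 'java.security.MessageDigest)

(defn content-hash
  "Hex-encoded MD5 digest of a string's UTF-8 bytes."
  [^String content]
  (let [digest (.digest (MessageDigest/getInstance "MD5")
                        (.getBytes content "UTF-8"))]
    (apply str (map #(format "%02x" (bit-and % 0xff)) digest))))

(defn fingerprinted-path
  "Fingerprinting: put the hash in the file name. Assumes path has an extension."
  [^String path content]
  (let [dot (.lastIndexOf path ".")]
    (str (subs path 0 dot) "-" (content-hash content) (subs path dot))))

(defn etag-response
  "ETag approach: same URL every time; reply 304 if the client's copy is current."
  [request content]
  (let [etag (str "\"" (content-hash content) "\"")]
    (if (= etag (get-in request [:headers "if-none-match"]))
      {:status 304 :headers {"ETag" etag} :body nil}
      {:status 200 :headers {"ETag" etag} :body content})))
```

With fingerprinting any cache or CDN can hold the file indefinitely, because a change in content produces a new URL. With ETags the client still has to make a conditional request each time, which is the price of the tidier URL.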

If there was a lein plugin that allowed me to create an application like this with one command I would say that we’re starting to have a credible web development platform.
