Clojure

Getting started with Riemann stream processing

Riemann is a great application for event processing, but it doesn’t have much documentation or many newbie-friendly tutorials. There are some cool pictures that explain the principles of the app but nothing beyond that. At some point I want to try and contribute some better documentation to the official project, but in the meantime here are a few points that I think are useful for getting started.

I’m assuming that you’ve followed these instructions to get a working Riemann installation and you’ve followed the instructions on how to submit events to Riemann via the Ruby Riemann client interface.

At this point you want to start making your own processing rules and it is not clear how to start.

Well, the starting point is the idea of streams. When an event arrives in Riemann it is passed to each stream that has been registered with what is called the core. Essentially a stream is a function that takes an event and, usually, some child streams. These functions are stored in a list in the core atom under the keyword :streams.

Okay, let’s look at an example. The first obvious thing you want to do is print out the events that you are sending to Riemann. If you’ve got the standard download, open the etc/riemann.config file and set the syntax for the file to Clojure: the file is read into a Clojure environment in the riemann.config namespace, so you can use full Clojure syntax in it. Add the following at the end of the config file, then either start the server or, if it is already running, reload the config with kill -HUP <Riemann PID>.

(streams prn)

prn is Clojure’s built-in printing function; used as a stream it simply prints each event it receives.

In irb let’s issue an event:

r << {host: "rrees.me", service: "posts", metric:  5}

You should see some output in the Riemann log along the following lines.


#riemann.codec.Event{:host "rrees.me", :service "posts", :state nil, :description nil, :metric 5, :tags nil, :time 1366450306, :ttl nil}

I’m going to assume this has worked for you. So now let’s see how an event reaches more than one stream. Change the streams call to the following and reload the config.

(streams prn prn)

Now when we send the event it should get printed twice! Simples!

Okay, now let’s look at how you can have multiple processing streams working off the same event. If we add a second streams call containing another print stream we should get three prints of the event.

(streams prn prn)

(streams prn)

Each stream that is registered can effectively process the event in parallel, so one stream can process an event and send it to another system while another writes it to the index.
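To make the idea of registered streams concrete, here is a toy model in plain Clojure. It is emphatically not Riemann's implementation: the core atom, register-streams! and dispatch! are all names I have made up for illustration, and the streams are called one after another rather than in parallel. It only shows that repeated registrations accumulate and that every registered stream is handed the same event.

```clojure
;; Toy model only: core, register-streams! and dispatch! are hypothetical
;; names, not part of Riemann.
(def core (atom {:streams []}))

(defn register-streams!
  "Append stream functions to the core, the way repeated (streams ...) calls accumulate."
  [& fs]
  (swap! core update-in [:streams] into fs))

(defn dispatch!
  "Hand the same event to every registered stream, one after another."
  [event]
  (doseq [stream (:streams @core)]
    (stream event)))

;; Record what each stream sees instead of printing it.
(def seen (atom []))

(register-streams! #(swap! seen conj [:first %])
                   #(swap! seen conj [:second %]))
(register-streams! #(swap! seen conj [:third %]))

(dispatch! {:host "rrees.me" :service "posts" :metric 5})
;; @seen now holds three entries, one per registered stream.
```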

Let’s change one of our prints slightly so we can see this happen.

(streams (with :state "normal" prn) prn)

(streams prn)

We should now get three prints of the event and in one we should see that the event has the state of “normal”. Okay great! Let’s break this down a bit.

Every parameter of streams is a stream, and a stream can take an event and child streams. So when an event occurs it is passed to each stream, and each stream might specify more streams that the transformed event should be passed to. That’s why we pass prn as the final parameter of the with statement. We’re saying add the key-value pair to the event and pass the modified event to the prn stream.

Let’s try implementing this ourselves. There is a bit of magic left here: call-rescue is a helper built in to Riemann that will send our event to other streams; you can think of it as a variant of map:

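;; A stream builder: takes child streams and returns the event handler.
(defn change-event [& children]
  (fn [event]
    ;; Events behave like maps, so assoc adds the :hello :world pair.
    (let [transformed-event (assoc event :hello :world)]
      ;; Pass the modified event on to every child stream.
      (call-rescue transformed-event children))))

(streams (change-event prn))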

If this works then we should see an event printed out that has the “hello world” key-value pair in it. change-event is a stream handler that takes a list of “children” streams and returns a function that handles an event. If the function does not pass the event onto the children streams then the event stream stops processing, which is a bit like a filter. The event is really just a map of data like all good Clojure.
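The filter idea can be sketched in the same shape. So that it runs outside Riemann, this version uses a hypothetical call-children helper as a stand-in for call-rescue (my assumption being that the real helper calls each child and copes with exceptions); only-service is likewise a made-up name. Inside riemann.config you would presumably use call-rescue itself.

```clojure
;; call-children is a hypothetical stand-in for Riemann's call-rescue.
(defn call-children
  "Call each child stream with the event, logging rather than propagating failures."
  [event children]
  (doseq [child children]
    (try
      (child event)
      (catch Exception e
        (println "child stream failed:" (.getMessage e))))))

(defn only-service
  "Stream builder (hypothetical): pass an event on only if its :service matches."
  [service & children]
  (fn [event]
    ;; When the service does not match we do nothing, so processing stops here.
    (when (= service (:service event))
      (call-children event children))))

;; Record what gets through instead of printing it.
(def passed (atom []))
(def posts-only (only-service "posts" #(swap! passed conj %)))

(posts-only {:host "rrees.me" :service "posts" :metric 5})
(posts-only {:host "rrees.me" :service "comments" :metric 2})
;; Only the "posts" event ends up in @passed.
```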

At this point you actually have a good handle on how to construct your own streams. Everything else is going to be a variation on this pattern of creating a stream function that returns an event handler. The next thing to do now is go and have a look at the source code for things like with, prn and call-rescue. Peeking behind the curtain will take a certain amount of Clojure experience but it really won’t be too painful, I promise: the code is reasonable and magic is minimal. Most of the functions are entirely self-contained with no magic, so everything you need to know is in the function code itself.

Clojure, Programming

Leiningen doesn’t compile Protocols and Records

I don’t generally use records or protocols in my Clojure code, so the fact that the Clojure compiler doesn’t seem to detect changes in the function bodies of either took me by surprise recently. Googling turned up this issue for Leiningen. Reading through the issue I ended up specifying all the namespaces containing these structures in the :aot definition in the lein project.clj. This meant that the namespace was re-compiled every time but that seemed the lesser of two evils compared to the clean and build approach.

Where this issue really stung was in the method-like function specifications in the records, and as usual it felt that structure and behaviour were getting muddled up again when ideally you want to keep them separate.
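For reference, the :aot setting looks roughly like this in project.clj. The project name, version and namespace names are invented for the example; only the :aot key itself is the point.

```clojure
;; project.clj (sketch): myapp, myapp.records and myapp.protocols are made-up names.
(defproject myapp "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.5.1"]]
  ;; Ahead-of-time compile the namespaces that define records and protocols.
  :aot [myapp.records myapp.protocols])
```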

Clojure, Programming, Web Applications

A batteries included Clojure web stack

Inspired by the developer experience of the Play framework as well as that of Django and Ruby on Rails, I’ve been giving some thought to what a “batteries included” experience might be for Clojure web development. Unlike things like Pedestal, which focuses on trying to keep LISPers happy and writing LISP as much as possible, I’m approaching this from the point of view of what would be attractive to frontend developers who choose between things like Rails, Sinatra or Express.

First let’s focus on what we already have. Leiningen 2 gives us the ability to create application templates that define the necessary dependencies and directory structures as well as providing an excellent REPL. This should allow us to build a suitable application with a single command. The Compojure plugin already does a lot of the setup necessary to quickstart an application. It downloads dependencies and fires up a server that auto-reloads as the application changes.

The big gap though is that the plugin creates a very bare bones application structure, useful for generating text on the web but not much else. To be able to create a basic (but conventional) web app I think we need to have some standard things like a templating system that works with conventional HTML templates and support for generating and consuming JSON.

Based on my experience and people’s feedback I think it would be worth basing our package on the Mustache templating language via Clostache and using Cheshire to generate and parse the JSON (I like data.json’s lack of dependencies but this is web programming for hackers so we should favour what hackers want to use).

I also think we need to set up some basic static resources within the app like Modernizr and jQuery. A simple, plain skin might also be a good idea unless we can offer a few variations within the plugin such as Bootstrap and Foundation which would be even better.

Supporting a datastore is probably too hard at the moment due to the lack of consensus about what a good all-round database is. However I think it would be sensible to offer some instructions as to how to back the app with Postgres, Redis and MongoDB.

I would include Friend by default to make authentication easy and because it’s difficult to do that much interesting stuff without introducing some concept of a user. However I think it is important that by default the stack is essentially stateless, so authentication needs to be cookie-based by default with an easy way of switching between persistence schemes such as memory and memcache.

Since webapps often spend a lot of time consuming other web services I would include clj-http by default as well. Simple caching that can be backed by memcache also seems important since wrapping Spymemcached is painful and the current Clojure wrappers over it don’t seem to work well with the environment constraints of cloud platforms like Heroku.

A more difficult requirement would be asset pipelining. I think by default the application should be capable of compiling and serving LESS and Coffeescript, with reloading, for development purposes. However ideally during deployment we want to extract all our static resources and output the final compiled versions for serving out of a static handler or alternatively a static resource host. I hate asset fingerprinting due to the ugliness it introduces into urls, I would prefer an ETag solution but fingerprinting is going to work with everything under the sun. I think it should be the default with an option to use ETags as an alternative.

If there was a lein plugin that allowed me to create an application like this with one command I would say that we’re starting to have a credible web development platform.

Clojure, Programming, Scala

Horses for courses: choosing Scala or Clojure

So one of the questions after my recent talk trying to compare Scala and Clojure (something that I suspect is going to be an ongoing project as I hone the message and the tone) was about whether the languages had problem domains they were more suited to. That’s an interesting question because I think they do, and I thought it might be interesting to go through some of the decision making process in a more considered fashion than answering questions after a talk allows you to do.

So some of the obvious applications are that if you want to leverage some Java frameworks and infrastructure then you definitely want to use Scala. Things like JPA, Spring-injection, Hibernate and bean-reflection are a lot easier with Scala; in Clojure you tend to be dancing around the expectations these frameworks have that they are working with concrete bean-like entities.

If you are going to work with concurrency or flexible data formats like CSV and JSON I think you definitely want to be using Clojure. Clojure has good multi-core concurrency that is pretty invisible to you as a programmer. The key thing is avoiding functions with side effects and making sure you update dependent state in a single function (transaction). After that you can rely on the language and its attendant frameworks to provide a lot of powerful concurrency.
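As a small sketch of what "update dependent state in a single function (transaction)" can look like, here is a transfer between two refs using Clojure's built-in software transactional memory. The account names and amounts are invented for the example; the ten plain Java threads are only there to show the transfers racing each other.

```clojure
;; Two pieces of dependent state: money leaving one must arrive in the other.
(def checking (ref 100))
(def savings  (ref 0))

(defn transfer!
  "Move amount between two refs; both changes commit together or not at all."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

;; Run ten transfers of 10 on separate threads and wait for them all to finish.
(let [threads (doall (repeatedly 10 (fn [] (Thread. (fn [] (transfer! checking savings 10))))))]
  (doseq [t threads] (.start t))
  (doseq [t threads] (.join t)))

;; No locks were written by hand, yet the total is preserved.
(+ @checking @savings) ;; => 100
```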

Similarly LISP syntax and flexible data go hand in hand so writing powerful data transforms seems second nature because you are using fundamental concepts in the language syntax.
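A rough illustration of the kind of transform I mean: turning CSV text into maps and then summarising it. The data and function names are invented, and the parsing is deliberately naive (no quoted fields or escaping), so treat it as a sketch rather than a CSV library.

```clojure
(require '[clojure.string :as str])

;; Made-up sample data.
(def csv-text
  "host,service,metric\nrrees.me,posts,5\nrrees.me,comments,12\nexample.org,posts,3")

(defn parse-csv
  "Naive CSV parsing: split lines on commas and use the header row as map keys."
  [text]
  (let [[header & rows] (map #(str/split % #",") (str/split-lines text))
        ks (map keyword header)]
    (map #(zipmap ks %) rows)))

(defn metric-total-by-host
  "Sum the :metric column for each :host."
  [rows]
  (reduce (fn [totals {:keys [host metric]}]
            (update-in totals [host] (fnil + 0) (Long/parseLong metric)))
          {}
          rows))

(metric-total-by-host (parse-csv csv-text))
;; => {"rrees.me" 17, "example.org" 3}
```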

Algorithm and closed-domain problems are interesting. My personal view is that I find recursion easier in Clojure due to things like the explicit recur form and the support for variable-arity function definitions. Clojure’s default lazy sequences also make it easier to explore very large problem spaces. On the other hand if you have problems that can be expressed by state machines or transitions then you might be able to express the solution to a problem very effectively in a Scala case class hierarchy.

When it comes to exploring the capabilities of Java libraries I tend to use the Scala console but for general programming (slide code examples, exploratory programming) I do tend to find myself spending more time in LightTable’s Instarepl.
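To show the Clojure features I have in mind, here is a small sketch: a factorial that uses a second arity as its accumulator and recur for the tail call, plus a lazy, conceptually infinite Fibonacci sequence. The function names are just examples.

```clojure
(defn factorial
  "Multi-arity definition: the one-argument form seeds the accumulator."
  ([n] (factorial n 1))
  ([n acc]
   (if (<= n 1)
     acc
     ;; recur makes the tail call explicit and does not grow the stack.
     (recur (dec n) (* acc n)))))

(factorial 5) ;; => 120

;; A lazy sequence: nothing is computed until elements are asked for.
(def fibs
  (map first (iterate (fn [[a b]] [b (+ a b)]) [0 1])))

(take 10 fibs) ;; => (0 1 1 2 3 5 8 13 21 34)
```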

When it comes to datastore programming both languages are actually pretty clunky because they devolve handling this down to various third-party libraries. Clojure does pretty well with document databases and key-value stores. Scala is great for interacting with the AWS Java libraries and neither deals particularly well with relational data.

For web programming neither is brilliant but Scala definitely has the edge in terms of mature and full-featured web frameworks. Clojure is definitely more in the log cabin phase of framework support currently.

Clojure

London Clojure Maze solver dojo

Last month we had another team code competition, this time centered around writing code that tries to solve a maze. Clojure seems quite apt for creating these kinds of challenges as it has a lot of support for dynamic code evaluation and the functional paradigm makes writing callbacks a lot easier.

Just like the Battleships dojo it was interesting in that the random strategy was a good local maximum. However, one revelation later (that the maze wasn’t cyclic), left-wall hugging was kicking everyone’s ass. That then left dead-end elimination as the only possible way to produce a faster solver. Which our team failed to do sadly. Right idea, wrong turning table.

We also got bogged down on a Clojure issue which has come up a few times at the dojo. I’ll summarise it here: should you be using Clojure 1.4? Your library syntax and server compatibility depend on the answer to this and there is no good error message that is going to tell you that the language syntax has changed.

The competitive dojo is an interesting environment where only the best work process and most pragmatic code can thrive. It is an interesting critique of hammock-style as the result of all that thinking and beard-stroking had better be orders of magnitude better than the obvious answer.
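For anyone who hasn't met left-wall hugging, here is a minimal sketch of it. The dojo's actual maze format and player API aren't described here, so the grid of strings, the S and G markers and all the function names below are my own assumptions. The solver keeps its left hand on the wall: at each step it tries to turn left, then go straight, then right, and only then turns back.

```clojure
;; Hypothetical maze: # is a wall, S the start, G the goal. It has no cycles.
(def maze ["#######"
           "#S#   #"
           "# # # #"
           "#   #G#"
           "#######"])

;; Headings in clockwise order: north, east, south, west.
(def directions [[-1 0] [0 1] [1 0] [0 -1]])

(defn find-cell [maze ch]
  (first (for [r (range (count maze))
               c (range (count (nth maze r)))
               :when (= ch (get-in maze [r c]))]
           [r c])))

(defn open? [maze pos]
  (let [cell (get-in maze pos)]
    (and (some? cell) (not= cell \#))))

(defn step [[r c] dir]
  (let [[dr dc] (directions dir)]
    [(+ r dr) (+ c dc)]))

(defn solve-left-wall
  "Return the cells visited from S to G, or nil if a step limit is hit."
  [maze]
  (let [start     (find-cell maze \S)
        goal      (find-cell maze \G)
        max-steps (* 4 (count maze) (count (first maze)))]
    (loop [pos start, dir 1, path [start], steps 0]
      (cond
        (= pos goal)         path
        (>= steps max-steps) nil
        :else
        ;; Preference order relative to the current heading: left, straight, right, back.
        (let [candidates (map #(mod (+ dir %) 4) [3 0 1 2])
              next-dir   (first (filter #(open? maze (step pos %)) candidates))]
          (when (some? next-dir)
            (let [next-pos (step pos next-dir)]
              (recur next-pos next-dir (conj path next-pos) (inc steps)))))))))

(last (solve-left-wall maze)) ;; => [3 5], the position of G
```

Dead-end elimination would then be a matter of pruning the parts of the path that double back on themselves, which this sketch does not attempt.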

We also got to see a good example of beard-stroking abstraction this month with Chris Ford’s introduction to the theory of music and its abstractions in a general purpose computing language. An amazing talk which combined education with an amazing abstraction over music itself.

Clojure

EuroClojure Day 2

Okay so this post may be happening a little later than Friday but in my defence there were some excellent conversations to go with the after-conference drinks.

Day 2 featured two talks by Rich Hickey. I had already seen some of the Datomic stuff from QCon and the web so I found the stuff on the new reducers library more engaging. I have never thought of map having an implicit ordering promise.

Meikel Brandmeyer gave a historical review of lazy seq which was really helpful for understanding laziness (something I have a bit of a problem with). One of the real highlights though was Chris Ford’s talk about canon music. It started with a good gag about sheet music being a DSL for using the finite state machine otherwise known as a musician. However the really amazing thing was Chris’s abstraction of the score and subsequent transformations of the abstract score to end up with variations on the base canon he had chosen. Really amazing. Chris’s talk really shouldn’t have been a lightning talk; it is about the only quibble I had with the programming.
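As I understood the point, the contrast is something like the sketch below (my own example, not one from the talk). The ordinary map hands back a sequence whose elements are in input order and are reduced front to back, while the reducers version describes the transformation without promising a traversal order, which lets fold split the vector up and combine the pieces, assuming the combining function is associative.

```clojure
(require '[clojure.core.reducers :as r])

(def numbers (vec (range 1 1001)))

;; Sequence version: map produces an ordered lazy seq, reduced front to back.
(reduce + (map inc numbers))   ;; => 501500

;; Reducers version: r/map builds a recipe, and r/fold is free to divide
;; the vector into chunks and combine partial sums, since + is associative.
(r/fold + (r/map inc numbers)) ;; => 501500
```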

Sam Newman also had an excellent closing line in his lightning talk on Riemann, which was if people want Clojure to be adopted widely then the secret is to create great things with Clojure.

Java

EuroClojure 2012 Day 1

So there were definitely two big themes in the talks on the first day of the conference.

The first has been about how to use event-based systems to create flexible aggregate data models. All the speakers seem to have settled on a reduce or foldLeft approach for creating the aggregate, but two models have been put forward already: CQRS and a kind of aggregate query bus. Really, though, accepting event data and allowing querying of and access to aggregated views seem to be responsibilities of the same system.

The other theme has been creating query systems using logical predicates. There were no fewer than three generic query systems put forward: core.logic for low-level flexible implementations that identify either data or results, and two general query libraries, one from Datomic and the other from Cascalog.
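The reduce approach is easy to sketch. The events and field names below are invented rather than taken from any of the talks: the aggregate is simply the result of folding each event, in order, into an accumulated view.

```clojure
;; Made-up event log.
(def events
  [{:type :deposited :account "a" :amount 100}
   {:type :withdrawn :account "a" :amount 30}
   {:type :deposited :account "b" :amount 50}])

(defn apply-event
  "Fold one event into the aggregate (a map of account to balance)."
  [balances {:keys [type account amount]}]
  (case type
    :deposited (update-in balances [account] (fnil + 0) amount)
    :withdrawn (update-in balances [account] (fnil - 0) amount)
    ;; Unknown event types leave the aggregate unchanged.
    balances))

(reduce apply-event {} events)
;; => {"a" 70, "b" 50}
```

Replaying the same events with a different apply-event function gives a different aggregate view of the same data, which is where the flexibility comes from.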

Clojure, Programming

January’s London Clojure Dojo

January meant Battleships. More specifically battling battleships. Five teams created players and duked it out during the dojo with a tremendously narrow margin of victory. So what did we learn?

Well first of all randomly placing ships and shooting is actually a pretty good strategy. This is what the default player does and any deviation from it can be pretty badly punished by it.

One simple thing that people did to start improving over the random start was restricting placement of ships to a single half or quarter of the board. Doing this allowed most teams to start beating the initial strategy.

However clustering your ships is only effective against random shot placement, so when people start implementing targeting you actually become more vulnerable. The first effective targeting strategy was surprisingly simple: if you hit something, choose an adjacent square as your next target.

The team that squeezed to the top refined this by choosing an adjacent square that hadn’t already been fired at. The next level of improvement would probably be a non-trivial look at the probability that another ship square lay in the adjacent squares by looking at the information surrounding them.

There was a lot of work around the concepts of adjacency and whether the square had been fired at and the teams all seemed to converge towards the clojure.set library (if they were aware of it).
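The adjacency and already-fired-at logic falls out of clojure.set quite neatly. This is a sketch of the idea rather than any team's actual player: the board size, the coordinate representation and the function names are my assumptions, since the dojo's player API isn't shown here.

```clojure
(require '[clojure.set :as set])

;; Assumed board: 10x10, squares as [x y] vectors.
(def board-size 10)

(defn adjacent
  "The set of on-board squares directly next to the given square."
  [[x y]]
  (set (for [[dx dy] [[0 1] [1 0] [0 -1] [-1 0]]
             :let [nx (+ x dx)
                   ny (+ y dy)]
             :when (and (< -1 nx board-size) (< -1 ny board-size))]
         [nx ny])))

(defn candidate-targets
  "Squares next to any hit that have not been fired at yet."
  [hits fired]
  (set/difference (apply set/union (map adjacent hits)) fired))

(candidate-targets #{[0 0]} #{[0 0] [0 1]})
;; => #{[1 0]}
```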

I’m now thinking of what fiendish problem would force an exploration of this library as it seems incredibly powerful for all different kinds of problems.

Clojure

Why I’m finding Clojurescript underwhelming

I noticed Clojurescript in Github before the big announcement and thought it was an interesting idea. I am a big fan in general of having a Clojure syntax that compiles to Javascript. As a platform it is even more ubiquitous than Java and it would be a great way of simplifying Javascript’s closure and function syntax.

However in practice Clojurescript has been desperately disappointing for me. Firstly there is the weird decision to not have the code run on OpenJDK. This really limits its utility: I don’t seem to have a machine with a compatible setup at the moment despite having various flavours of Javascript interpreters available.

Then while looking for an answer as to how soon this problem is likely to be resolved I discovered this thread which was another level of disappointment. The original post is undiplomatic, perhaps even inflammatory, however the response indicates a level of befuddling clueless-ness.

If you want something to compile into Javascript I think you actually do want it to compile into good idiomatic Javascript unless you have a really good reason not to. You also do want to be able to use really good existing frameworks like jQuery (which really is the de facto standard right now).

The reason I think these are reasonable requests is that Coffeescript seems to manage to do both. Before Coffeescript maybe Clojurescript’s idiosyncrasies would have been forgivable but being late to the party as well as being less well-mannered makes the defiance in the response seem poorly judged.

I am not sure what Clojurescript is really for (apparently it is aimed at a future community of people that don’t exist yet, which is … helpful). I don’t feel that it is really simpatico with the existing Javascript code that works in the browser and I am not sure it really has a place in the server-side world of Node.js where it might have been a better fit.

I remain open-minded though and would be willing to give Clojurescript a second go once the dust has settled a bit.

Update: I’ve written a follow up to this post

Programming

Intuitive versus Reasoning Programmers

During the last year I’ve been helping run a monthly series of dojos for the London Clojure User Group. In the course of it I have had the chance to watch a lot of people grapple with functional programming. As a result of this and also looking at the way a lot of my colleagues at ThoughtWorks work I think programmers can roughly be divided into two groups: Intuitive and Reasoning.

To characterise both of them a little, I think that Intuitive programmers tend to use domain language a lot, rely heavily on tests and TDD, often find it difficult to articulate what they are doing in their work, they like small source code files because they like to see everything a file does at a glance, they prefer outside-in problem solving and like the opportunity to go back and revise their work.

Reasoning programmers like to discuss a problem before coding and explore edge-cases and what-ifs, they like to work in the REPL or have in-editor code evaluation, they quickly move from a domain to an abstraction and then work in that abstraction, they don’t mind a thousand line source file as long as it is logically structured (that’s what the search function is there for), they prefer “bottom-up” coding where they distil their abstraction to its essence and having implemented that create the required behaviour by composing their abstractions.

Both sets of programmers can be great at their work but often if they are unaware of the characteristics of how they like to work then there can be massive amounts of tension as the two wrestle back and forth. The Intuitive developers feel angsty when the Reasoning programmers start hacking out code rather than refactoring; Reasoning developers get frustrated at the aimlessness and game playing of the Intuitives’ attempts to generate the minimum amount to make their tests pass.

Reasoning programmers probably feel that in terms of communicating they have the superior style as they are able to advance arguments and logical constructs that can be interrogated. Those articulated models though can feel arid and irrelevant to Intuitive programmers, what does some obscure mathematical formula have to do with trying to make sure the frog doesn’t get run over when it crosses the road?

Intuitive programmers seem to be better at switching contexts and adapting to change as they can quickly see the “outlines” of any problem and their techniques are about honing that initial perception into a functional solution. By contrast a Reasoning programmer is wary of uncertainty and is unhappy drawing unjustified analogies between different situations.

In FP terms Reasoning programmers have their behaviour emerge as a logical consequence of the operation of lower level abstractions; their code looks like algebra and domain data is passed in to the top of their processing chain. Intuitive programmers on the other hand, fix behaviour in their tests and fill their code with domain language that aims to match the natural language of the organisation, the depth of their function calls is usually smaller and they are swifter to bind calculations to intermediate variables.
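A caricature of the two styles, using an invented order-total example. Both functions compute the same thing; the first composes general-purpose pieces and never names the domain, the second binds each step to a domain-named intermediate value.

```clojure
;; "Reasoning" style: behaviour emerges from composing small abstractions.
(def total-of
  (comp (partial reduce + 0)
        (partial map (fn [{:keys [price quantity]}] (* price quantity)))))

;; "Intuitive" style: shallow calls, domain language, intermediate names.
(defn order-total [order-lines]
  (let [line-totals (map (fn [line] (* (:price line) (:quantity line))) order-lines)
        total       (reduce + 0 line-totals)]
    total))

(def order [{:price 10 :quantity 2} {:price 5 :quantity 1}])

(total-of order)    ;; => 25
(order-total order) ;; => 25
```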

I think I am an Intuitive programmer (and therefore I worry that I am miscasting Reasoning programmers through my lack of understanding) and in my work most of my colleagues are as well, it is in the nature of consultancy to have to adapt to constantly changing domains, code bases and expectations. We do have a few Reasoning programmers though who do exactly the same work but do it via thinking deeply through problems and drawing logical inferences.

If there are issues between the two groups then I think it occurs when there are no concessions between the two; common flashpoints being testing, writing defensive/guard code and giving time to discuss problems and problem solving strategies. The two also agree on a lot of things for very different reasons: for example both like to rewrite code, refactoring is a formalised practice for Intuitive developers whereas Reasoning developers often want to apply new insight or learned ideas (“This could all just be monads!”).

An important point is that neither approach is “right”: both types of programmer arrive at the same results if their experience, ability and other factors are equal. They are purely styles of working.
