Programming

Keeping NPM local

I’m pretty sure that at one point NPM didn’t require sudo privileges to do things, but now people seem to be sudo-installing all over the place by default. It’s the equivalent of just clicking “yes” on every Windows permission dialog.

There is no particularly good reason to use sudo with a package manager. Virtualenv is strictly local by default, and even gem (via rbenv or rvm) can now run happily in user space. Even if you want to share package downloads to minimise network calls, you still don’t need to install everything globally as root.

NPM is special in almost every sense of the word, and its author recommends that you change the ownership of /usr/local to your login user. That seems kind of crazy, and the kind of thing that would probably only work out for OS X users.

In fact, if you are willing to build your own NodeJS, it really isn’t hard to get Node and NPM working locally while retaining all the advantages of the “global” NPM install. Just use the standard --prefix build option to point Node at your home directory, and add $HOME/bin to your PATH in .bashrc or the equivalent.
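Roughly, the build looks like this (a sketch only: the version number and tarball URL are placeholders for whatever is current):

wget http://nodejs.org/dist/v0.10.5/node-v0.10.5.tar.gz
tar xzf node-v0.10.5.tar.gz
cd node-v0.10.5
./configure --prefix=$HOME   # install under ~/bin, ~/lib and so on
make && make install

# then in .bashrc or the equivalent:
export PATH=$HOME/bin:$PATH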

Then you should be able to use npm without ever having to sudo.

My view, though, is that you shouldn’t have to force everything into user space just to stop a package manager needing sudo privileges. It would be better for everyone if NPM used a directory in each user’s home by default: it seems to be aimed at single-user installs anyway, so there is no massive saving in having packages installed in /usr/local. For the few use-cases where a shared install would be useful, it should be an override option.

Standard
Web Applications, Work

Guardian May 2013 Hackday

You can see the reportage in these two liveblogs: Day 1 and Day 2 (note the terrible naming conventions). The theme of the hackday was “growth”. For the most part I took the theme to mean growth hacking, and I did a lot of work along those lines that is difficult to talk about publicly.

However, my prior lunchtime hacks had revealed to me that one of the fundamental problems the Guardian has is the volume of content it produces. That is not inherently bad, but the key point is that there is vastly more content than can fit onto what are called “fronts” in the jargon. A front is something like the front page of the site or the Environment section. These fronts drive a lot of traffic to content, and for regular readers they are the essential navigation tool for the Guardian’s content.

Therefore I was interested in how we consider the dimension of time, and perhaps use it to our advantage in presenting content. This aspect of my hackday work is more open, partly because I need a lot of help to understand it and partly because I’ve made some effort to use the public Content API rather than our internal content.

I called this work the “Time Trilogy” because it consists of three web apps that each use time as a way of accessing Guardian content.

The first of the three apps, Guardian Word Count, was the original, and gives you a sense of the challenge of navigating the content. It is also pretty fun to watch during the day and see the words tick up. The Word Count spawned TickTickTick and Guardian In Review. TickTickTick is really a daily content explorer and was the first tool I needed to start sorting and exploring the breakdown of what we produce; at its heart it is a tool for exploring the daily news cycle. In Review is slightly different: it takes the one hundred most popular pieces of content over the last seven days and renders them. Initially I wanted it to be a kind of automatically generated magazine, but looking at what people actually liked meant that I couldn’t make my initial idea work. People really like videos of meteors and Russian car crashes. What it is now is a way to explore material in the medium term: content that has perhaps left the news cycle but is still relevant.

Neither of the newer apps is really finished, and the way I work I am very reliant on having working software to understand what I am doing and what is wrong or right about my approach. TickTickTick is much closer to being a complete product than In Review, and it is providing more insight into the nature of the content being produced: for example, there is a massive cluster of material between three and five minutes long.

I am going to continue to work on the apps because they give me feedback on my work, and ultimately these prototypes and toys tend to graduate into working components or theory on the main site itself. I may blog a bit more about them individually as I move them closer to something that genuinely creates value. I’m curious about feedback, but acting on it will be limited by my aims for the apps and, realistically, the time I have available.

I also wanted to talk a little about how I worked this hackday, because I decided to reject advice and work solo rather than as part of a team (although I did a little backseat driving on the online magazines product, and I did come up with the idea that actually won the hackday, which will hopefully be implemented and awesome). Working alone does mean that your creations are going to be quite rough, but it helps you cover a lot of ground: I ended up doing five hacks and working on a total of seven. Working with other people means communicating well, whereas solo you just need to be able to express what you want very quickly.

My preferred tool for these kinds of hacks is Python on App Engine, which is what I use for my lunchtime hacks and for which I have a standard application template. With each new application that I do I can start to move the common patterns into the template. To avoid having to faff around with testing I use a loosely functional paradigm that I’ve carried over from Wazoku. It generally works quite well but there are a lot of rules to doing it.

This time around I was doing a bit more frontend work than my day job requires because I was working solo. Again the startup experience was useful, because I was rediscovering a skillset rather than learning it. Hacking also means selecting your platform with optimal output in mind.

For that reason I targeted only Firefox and Chrome (Firefox was actually the easier of the two to develop for in terms of standards), and I made liberal use of client-side Less and Coffeescript. I was impressed with how good the error-handling was in both: an obscure bug can wipe out all the productivity gains of a higher-order language, but both worked great for me.

On top of that I tried experimenting with the new departmental standard of SMACSS (or at least my cherry-picking of it) and I made a lot of use of both Knockout and Bacon.js.

When I say I made use of SMACSS, essentially what I did was namespace my classes to produce simple selectors. This got me out of a problem I had in In Review, so while it is truly the ugliest CSS convention, and I suspect in time we may come to hate its rejection of rich functionality, I concede that it is effective. Expect to see some of it applied to the main website sometime soon.

Knockout isn’t that popular in the department due to performance issues at a certain level of complexity, but for me it did a brilliant job of simply syncing the visual DOM to the data feeds. I was really happy with it. Other people were using AngularJS for more dynamic applications, but they also had a lot more code than I did, and when working solo less is so much more.
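By way of illustration, here is a minimal sketch of the kind of binding I mean (the view model and field names are invented for this post, not taken from the actual hacks):

// a minimal Knockout view model; wordCount is an invented example field
function FeedViewModel() {
  this.wordCount = ko.observable(0);
}

var model = new FeedViewModel();
ko.applyBindings(model);

// any element marked data-bind="text: wordCount" in the page
// now re-renders whenever the feed pushes a new value:
model.wordCount(42187);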

Bacon.js was really interesting. A lot of my approach to Javascript is functional and event-based, but so far the events have been handled manually via jQuery. Bacon made it easier to create event sources with generic handlers, and I probably didn’t use 10% of its full features. I’m curious to see what the rest of the department thinks of it, but for my hacks it has definitely earned a place.
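Again, a rough sketch of what I mean by generic event sources and handlers (the selector, the interval and the fetchContent function are all assumptions for illustration):

// turn a DOM event and a timer into two streams of events
var refreshClicks = $('#refresh').asEventStream('click');
var ticks = Bacon.interval(60 * 1000, {});

// one generic handler subscribed to the merge of both sources
refreshClicks.merge(ticks).onValue(function () {
  fetchContent(); // hypothetical function that re-fetches the content feed
});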

It was nice to do something outside the run of normal work and one thing that is quite cool about the hackday is that you can use it to tackle a technology that is entirely new to you and not have to worry about whether you succeed or fail.

Next time (May I believe) I think I want to learn about browser plugins as this is a way of producing better functionality for the Guardian without the hassle of having to make it work for the general population of browsers. Some people’s hacks this time around could have been released to the app/plugin stores and we could have been getting valuable user feedback by now.

Standard
Clojure

Why I’m finding Clojurescript underwhelming

I noticed Clojurescript on Github before the big announcement and thought it was an interesting idea. I am in general a big fan of the idea of a Clojure syntax that compiles to Javascript: as a platform Javascript is even more ubiquitous than Java, and it would be a great way of simplifying Javascript’s closure and function syntax.

However, in practice Clojurescript has been desperately disappointing for me. Firstly there is the weird decision not to have the code run on OpenJDK. This really limits its utility: I don’t seem to have a machine with a compatible setup at the moment, despite having various flavours of Javascript interpreter available.

Then, while looking for an answer as to how soon this problem is likely to be resolved, I discovered this thread, which was another level of disappointment. The original post is undiplomatic, perhaps even inflammatory, but the response indicates a befuddling level of cluelessness.

If you want something to compile into Javascript, I think you actually do want it to compile into good idiomatic Javascript unless you have a really good reason not to. You also want to be able to use really good existing frameworks like jQuery (which really is the de facto standard right now).

The reason I think these are reasonable requests is that Coffeescript manages to do both. Before Coffeescript, maybe Clojurescript’s idiosyncrasies would have been forgivable, but being late to the party as well as less well-mannered makes the defiance in the response seem poorly judged.

I am not sure what Clojurescript is really for (apparently it is aimed at a future community of people that don’t exist yet, which is … helpful). I don’t feel that it is really simpatico with the existing Javascript code that works in the browser, and I am not sure it really has a place in the server-side world of Node.js, where it might have been a better fit.

I remain open-minded though and would be willing to give Clojurescript a second go once the dust has settled a bit.

Update: I’ve written a follow-up to this post.

Standard
Programming, Scripting

CouchDB: Querying data

CouchDB allows you to pass a map function to a special view URL to query the data in an ad-hoc way. Views can also be stored as JSON documents under a conventional URL (_design on the server, accessed as _view by the client). These can then be obtained via an HTTP request.

My functional and Javascript programming are weak, but this is what I understand of writing queries in CouchDB. Let’s take an example of a set of library cards: each card represents a book, but the amount of information I have on each book varies.

The basic find all function is this:

function(book) {
  map(null, book)
}

This defines an anonymous function that takes one parameter, the target document (in this case a book), and returns a list of values. What goes into the value list is controlled by the second parameter to map; in this query I return the entire document. The first parameter controls the sorting or ordering. So if I wanted to return the title of all the books in my database, I would use:

function(book) {
  map(book.name, book.name)
}

Sorting them by ISBN would go like this:

function(book) {
  map(book.isbn, book.name)
}

One important thing to note is that if a document doesn’t have the value used in the function, the function produces nothing for it and it will not be included. So if I created some of my entries with a value of title instead of name, anything with a title and not a name will not be in the query. However, if I use a non-existent entry as an ordering criterion, the value counts as null and is still sorted.

Because I can include any valid Javascript in my function I can actually put a lot of complexity into my queries. For example:

function(book) {
  if (book.isbn != null) {
    map(book.name, {"Name": book.name, "ISBN": book.isbn})
  } else {
    map(book.name, book.name)
  }
}

So I suspect this will either make you cheer or puke. What this function does is return a JSON object containing the Name and ISBN of the book if they are known, or just the book name as a String otherwise. Unlike SQL, the heading of my query is almost completely arbitrary, as long as the value on the right of my map function translates to a valid JSON object.
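For example, the rows that come back might look roughly like this (the books are invented, and I am eliding the metadata CouchDB wraps around the rows):

{"rows": [
  {"key": "Dune", "value": {"Name": "Dune", "ISBN": "0450011844"}},
  {"key": "Emma", "value": "Emma"}
]}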

Now at work there are often debates as to whether things are “rigid” or “structured”, or whether they are “flexible” or “formless”. It is a bit like the old adage about one man’s meat being another man’s poison. CouchDB allows a client to construct an almost arbitrarily rich response to a query, with almost no restriction on the data that can be included in that response. In some cases this is going to allow you to easily interact with very complex unstructured data; in other cases it is going to be an invitation to create a sprawling dataset with no value. There is no inherent right or wrong choice here, but for a particular problem there is probably going to be a right and a wrong choice. SQL is powerful because of the restrictions and rules it builds into its grammar. Javascript is powerful because it relaxes those restrictions. Programmers, and IT folk in general, often reach for the laxest possible implementation in the name of “flexibility”, but then either have to impose order themselves or lose the power of the more restrictive choice.

Putting that into a concrete example: if I write a view in SQL, I am going to have to follow a set of rules to get the data I want (for example, my heading is going to have to be a set of tuples of equal size); using an arbitrary script and JSON means I am going to get exactly the data I want in the form I want it. However, since that return structure is customised to my query, I might be reducing reuse by being over-specific or by building too much logic into my view code.

That’s quite a diversion just to say it’s horses for courses, so let’s wrap up this quick look at CouchDB views. All of CouchDB’s views are effectively JSON objects that are passed to a separate view server: a separate process that interacts with the main server via STDIN and STDOUT pipes. By default this is the view server built from the Spidermonkey library (it is called couchjs), but you can write a view server for any language and plug it into CouchDB by creating an executable and mapping it to a MIME type in the couch.ini file. The view server essentially parses and readies the query function associated with the view, and is then sent every document in the database as a JSON string. The view server collects the results of reading every document and sends them back to the query request.
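To give a sense of the stored form, a design document looks roughly like this (the id and view name are invented, and the exact layout has varied between CouchDB versions):

{
  "_id": "_design/books",
  "language": "javascript",
  "views": {
    "all": "function(book) { map(null, book) }"
  }
}

The client would then query it at a URL along the lines of /dbname/_view/books/all, per the _design/_view convention mentioned above.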

It is a pretty simple system and it works well for the relatively flat documents I have been trying it with. However, I suspect that in a project with multiple developers some ground-rules for writing consistent query code would be a must.

Standard