The web is a graph

Last week I gave a talk on how I have been creating web applications that very lightly wrap an underlying graph to provide not just content for a page but also the workflow and state of the user’s current interaction with the application.

As part of the talk I have created two demo apps that are available on Heroku. Crumbly Castle is inspired by Dark/Demon Souls and allows you to explore a castle that is populated by the ghosts of everyone who has ever played it. The other offers a questionnaire system that generates characters in the style of the Elder Scroll or Fallout games. The code for the applications is on Github so you can fork it and deploy it for yourself. Both use the hosted Neo4J addon for Heroku which provides hassle-free hosting but is currently only available to beta program members.

You can obviously use both on your local machine.

Both of the demos are metaphors for more serious kinds of enterprise applications but I think it is often easier to produce prototypes or demos that are based on immediately engaging concepts. It certainly helps to have something that the audience can play with during the talk!

So briefly I just wanted to summarise the points I try to make during the talk and explain why you might want to look at using a graph as your web application store. So my major point is that web application development is usually page-centric, when you hit a page the controller tends to examine the whole state of the application to find out why you came to the page. Are you logged in? Were you trying to look at something? Is there a session associated with you?

I posit that we should instead be looking at the journeys between the pages as being the interesting things. Given where you are in the journey graph where can you go next? Essentially I am taking the same logic as a state machine or rule engine uses and instead expressing it as a relationships in a graph.

The most common trick the applications use is to assign a fixed url to a user session that identifies a node in the graph. Then with each transition I change the relationships the node has to other data based on the user’s actions and then simply send a redirect back to the fixed url which will then render a different result based on the current state of the node.

This means that the web application becomes very simple to write and the controller simply has to select the template and the related nodes that are needed to generate links and actions.

I think it is a really interesting approach that is a really natural fit for simplifying a lot of session-state heavy apps.

Programming

Thoughts on Neo4J REST server 1.3

So the new Neo4J GPL release 1.3 (with added REST) is the release that kind of moves Neo4J out of its niche and into the wider arena. There are less barriers to experimenting with it in production situations and now it is no longer necessary to embed it you don’t have any language barriers, it can just be another piece of infrastructure.

Last week I used the new release to prototype an application and these are my observations. The big thing is that the REST server is easy to use out of the box and the API is well-thought out and practical and it looks like there’s some nice wrappers already for various languages (I used this one for Python).

However compared to other web-based stores there are still a few wrinkles. The first is that it would be nice to clearly be able to mark nodes that you consider “roots” in the graph. It’s almost the first thing you want to do and there should be a facility to support it in the API.

Indexing is still cronky, I think there is a reasonable expectation that Neo4J should be able to have a dynamic index that indexes the properties of nodes as they change without having to explicitly add the indexable fields to the index. I don’t mind if I need to do some explicit work to switch on this mode but having to manage it all explicitly is a huge pain. Particularly because you don’t have any root node support so you are almost inevitably going to want to search before moving over to relationship following.

It would be nice to have list and map properties on nodes, at the moment we’re getting around that by using JSON in the property string but again it feels like a value add that’s necessary if you want to use Neo4J for a big chunk of your data processing.

This is a personal peeve but I do think it would be nice to be able to override the id of a node. One usecase I can definitely think of is when you’re translating a document database into a graph form and you want to cross-reference the two (particularly if you’re using the richer data on the document rather than tranferring it into the node).

One other feature that is a bit strange (but I might be doing this wrong) is that the server seems to correspond to just one database. It would seem pretty easy to incorporate a database-style data-partition name into the REST scheme and allow one server to serve many different databases.

Now that the GPL version is available I am also looking forward to Neo4J entering the Linux package distributions in the same way that CouchDB has become a ubiquitous deployment choice.

So in short Neo4J has left it’s niche but it still hasn’t fully left the ghetto but I have high hopes for the next release.

Java

Playing around with Neo4J and Groovy

After hearing about Neo4J at Ruby Manor I decided to have a play around with the graph database but for me playing doesn’t mean creating a whole Java project anymore. Since using Python/Ruby/Scala I want to be doing it in an interactive session.

I had quite a few issues getting Neo4J to run but the summary is that for convenience you want all three jars from the distribution in your classpath and you pass a String representing a directory path to the EmbeddedNeo constructor. Once you do have it running make sure to shutdown the database on an exception otherwise you will have to shutdown the JVM (i.e. close the whole Groovy Console session) to unlock the underlying file resources.

Okay so once you have the right jars you can now start playing around with Neo4J. I immediately felt that the library is actually quite heavy with a lot of ceremony to get things done. Some of the feedback I have been hearing from Neo Technology and the Neo4J list is that Neo4J is more of a low-level infrastructure component that is meant to be wrapped up in higher-level APIs.

Working with Groovy it should be possible to cut that ceremony down a bit and put a nicer front-end on things. The first thing I’ve tried is using closures to execute code in Neo database and transaction contexts.

If you have the three jars in your .groovy/lib directory you should be able to run this script from the Groovy Console and have it create a node in your directory. It will be the same node each time but I have some ideas for using builders for both nodes and traversers (which allow you to search the graphs) and I am going to work on (and post) them later.

Echo One

Sequentially arranged sentences composed of words (and punctuation)

Tag Archives: graph database

The web is a graph

Thoughts on Neo4J REST server 1.3

Playing around with Neo4J and Groovy