Programming

Google Cloud Functions

I managed to get onto the Google Cloud Functions (GCF) alpha so I’ve had a chance to experiment with it for a while. The functionality is now in beta and seems to be open to anyone who wants to try it.

GCF is a cloud functions, or functions-as-a-service, offering that competes with AWS Lambda. However, thanks to launching after Lambda, it has had the chance to refine the offering rather than simply clone it.

The major difference between GCF and Lambda is that GCF allows functions to be bound to HTTP triggers trivially and exposes HTTPS endpoints almost without configuration. There’s no messing around with API Gateway here.

The best way I can describe the product is that it brings together the developer experience of App Engine with the on-demand model of Lambda.

Implementing a Cloud Function

The basic HTTP-triggered cloud function is based on Express request handling. Essentially the function is just a single handler. Therefore creating a new endpoint is trivial.
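As an illustration, here is a minimal sketch of such a function (the file name and exported name are my own assumptions for the example, not taken from Google’s docs):

// index.js -- the entire function: GCF calls the exported handler
// with Express-style request and response objects.
exports.helloHttp = function (req, res) {
    res.status(200).send('Hello ' + (req.query.name || 'world'));
};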

Dependencies are automagically handled by use of a package.json file in the root of the function code.
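For instance, if the handler needed a library (lodash here purely as an example), a minimal package.json next to the function code declares it like any other Node project:

{
  "name": "hello-http",
  "version": "0.0.1",
  "dependencies": {
    "lodash": "^4.17.0"
  }
}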

I haven’t really bothered with local testing, partly because I’ve been hobby-programming but also because each function is so dedicated that its functionality should be trivial to verify.

For JSON endpoints you write a module that takes input and generates a JSON-compatible object and test that. You then marshal the arguments in the Express handler and use the standard JSON response to send the result of the module call back to the user.
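Sketched out, with hypothetical file and function names, the pattern looks like this:

// greeting.js -- a pure module: takes input, returns a JSON-compatible
// object, and can be unit tested without any HTTP machinery.
exports.buildGreeting = function (name) {
    return { greeting: 'Hello ' + name };
};

// index.js -- the handler just marshals the arguments and sends the
// module's result back with Express's standard JSON response.
var greeting = require('./greeting');

exports.greet = function (req, res) {
    res.json(greeting.buildGreeting(req.query.name || 'world'));
};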

Programming

Migrating Neo4J Python apps on Heroku

Okay, this is quite specialised, but for the four or five of you who will have the same problem I wanted to save you some time and suffering.

So Neo Technologies have been incubating a plugin for their excellent graph database on Heroku for a while. Until now the plugin was only available in beta, but now anyone can have it. This is excellent news and I would recommend it as a way of getting started with graph-based web programming. However, if you were in the beta program then you now need to migrate from the beta plugin to the new one.

The instructions that went out to the beta program implied that this was simply a case of dumping a backup zip, switching out the plugins and then uploading your zip. Well, the good news is that exporting and importing the zips works exactly as advertised, but the bad news is that the two plugins are quite different in terms of the environment they expose. The beta plugin had an extensive list of variables that exposed the various parts of your hosted environment. The new one exposes just one variable, NEO4J_URL, which is a URL for the server with an embedded username and password.

Now the new variable does actually encode all the information that the original manifest did, but in a very compact form, and your library is going to have to work quite hard to correctly construct the base URLs and requests required to access the REST API. I’m not sure which libraries do handle it (I presume the Java ones do) but neither of the Python ones does.

I’m going to describe what you need to do for neo4j-rest-client, which is the one I use in my apps, but it will probably be similar for py2neo, which you might prefer if you want to write a lot of Cypher.

So the simplest way to explain the solution is with code.

import os
import urlparse

from neo4jrestclient.client import GraphDatabase

def extract_credentials(url):
    # The new NEO4J_URL embeds the credentials in the url itself,
    # e.g. http://username:password@host:7474
    parsed_url = urlparse.urlparse(url)

    if parsed_url.username and parsed_url.password:
        return (parsed_url.username, parsed_url.password)

    return None

# Fall back to a local server if the Heroku variable isn't set
GRAPH_URL = os.environ.get('NEO4J_URL', 'http://localhost:7474') + '/db/data/'

credentials = extract_credentials(GRAPH_URL)

if credentials:
    db = GraphDatabase(GRAPH_URL, username=credentials[0], password=credentials[1])
else:
    db = GraphDatabase(GRAPH_URL)

So the neo4j-rest-client library supports username and password credentials but doesn’t parse them out of the URL itself. Fortunately urlparse makes this pretty trivial. The conditional pieces of the code deal with the situation where we are running locally: if we can’t see the Heroku environment variable we fall back to the local case (most Heroku stuff works this way).

A more frustrating issue is the difference between the URL of the server and that of the root resource for the REST API. Naturally these are not the same, and few libraries handle being given the wrong URL gracefully. Since the host URL does respond successfully you usually get some failure about parsing or unpacking the root document. Submitting a patch to detect whether the URL ends in db/data would seem to be the logical solution.
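As a minimal sketch of the kind of check such a patch might add (normalise_graph_url is a hypothetical helper, not part of either library):

def normalise_graph_url(url):
    # Point the url at the REST root document (/db/data/) rather
    # than at the bare server host.
    url = url.rstrip('/')
    if not url.endswith('/db/data'):
        url += '/db/data'
    return url + '/'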

So the credentials code above should get you a working wrapper around the REST interface, and your app should work again.

Except that there seems to be another issue in the registering and deregistering of the plugin manifests. What I have observed is that heroku config lists the beta environment variables and not the new values. So even if you do all this, the library still gets 404 errors on the root document (because it is looking for a Neo4J environment that has been deallocated).

So the best way to migrate your app in my view is:

  • go to your current app and download a database backup
  • create a new app with a temporary name (or something like my-app-2)
  • carry out your code changes as described above
  • load your new code into your instance
  • upload your backup into the new instance
  • if the app is working rename your old app (to something like my-app-old)
  • name your new app whatever your old app was called

This seems easier and less hassle than migrating in place. Once the beta plugin is turned off you should be able to delete the old app.
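For reference, the app swap looks roughly like this with the Heroku command-line client (the app names are placeholders and the exact rename syntax varies between client versions):

heroku create my-app-2                          # the temporary new app
git push heroku master                          # deploy the migrated code
heroku apps:rename my-app-old --app my-app      # free up the original name
heroku apps:rename my-app --app my-app-2        # take it over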

This process has allowed me to migrate my two demo apps successfully (pending bug reports): Crumbly Castle and Flow Demo.

Web Applications

Heroku versus GAE & GAE/J

At the Geek Night last night there were a lot of questions about how Heroku compares with Google App Engine (and, implicit in that, how it compares to GAE/J with the JRuby gem). Since it came up several times I thought it would be interesting to capture some of the answers.

Firstly, the similarities: both treat the application as the base unit of cloud scaling. Heroku offers several services that are similar to GAE’s, e.g. mailing, scheduled and background jobs. Some aspects, like Memcache, are in beta with Heroku, but fundamentally the two services cover much the same ground.

Both are free to start developing with, and then the pricing varies a bit between the two. GAE might be cheaper, but it also has a more complex set of variables that go into the price. Heroku prices on data size, the web processes required and then options, so it’s easier to project your costs.

Differences then: well, a big one is that Heroku uses a standard SQL database and also has excellent import/export facilities. GAE uses Google BigTable, which is great for scaling and availability, but data export and interaction are very limited. Heroku has an amazing deployment mechanism that involves simply pushing to a remote branch. GAE is slightly more involved, but it is still relatively easy compared with deploying your own applications.

GAE provides access to a variety of handy Google services, such as Google Account integration and image manipulation. GAE also has XMPP integration, which is pretty much unique if you have that requirement. However, these do tie you to the platform: if you wanted to migrate you’d have to avoid using the Google webapp framework and the APIs. If flexibility is an issue you are probably going to prefer Heroku’s use of standard Ruby libraries and frameworks.

Heroku has few restrictions on what can and can’t be done in a web process, and you also have quite powerful access to the user space your application runs in. GAE, and particularly GAE/J, has a lot of restrictions, which can be quite painful, particularly as there can be a discrepancy between the development and deployment environments.

However running a JVM in the cloud provides a lot of flexibility in terms of languages that can be used and deployed. Heroku is focussing on Ruby for now.

From my point of view, choose Heroku if you need uncomplicated Rails/Rack application deployment, a conventional datastore that you want good access to, and a simple pricing model. Choose GAE if you want out-of-the-box access to Google’s services and a complete scaling solution.

Programming, Ruby, Web Applications

Why does Heroku matter?

Heroku is an amazing cloud computing service that I think is incredibly important, but which really isn’t getting the visibility it should at the moment.

Why does Heroku matter?

Firstly, it gets deployment right. What is deployment in Heroku? It is pushing a version of your app to the cloud. That’s it. The whole transformation into a deployable application slug is handled by Heroku; pushing it out to your dynos is handled by Heroku. Everything is handled by Heroku except writing your application.

Secondly, it is the first cloud/scalable computing targeted directly at Ruby that has the right pricing structure to encourage experimentation and innovation. It has the low barrier to entry that Google App Engine has but requires no special code and imposes no special restrictions.

Thirdly it does the right thing with your data. It respects your right to access, mine and manipulate your data and provides a mechanism for doing this easily.

For me Heroku has set a new benchmark, not just for cloud computing but for build and deployment in the enterprise as well.
