Programming, Scripting

CouchDB: Querying data

CouchDB allows you to pass a map function to a special view URL to query the data in an ad-hoc way. Views can also be stored as JSON documents with a convention URL (_design on the server, accessed as _view by the client). These can then be obtained via a HTTP request.My functional and Javascript programming are weak but this is what I understand of writing queries in CouchDB. Let’s take an example of a set of library cards, each card represents a book but the amount of information I have on each book varies.

The basic find all function is this:

function(book) {
map(null, book)
}

This defines an anonymous function that takes one parameter, the target document, in this case a book, and returns an array of values. What is in the value list is controlled by the second parameter, in this query I return the entire document. The first parameter controls the sorting or ordering. So I wanted to return the title of all the books in my database then I would use:


function(book) {
map(book.name, book.name)
}

Sorting them by ISBN would go like this:

function(book) {
map(book.isbn, book.name)
}

One important thing to note is that if an object doesn’t have a value it doesn’t respond to the function and will not be included. So if I created some of my entries with a value title instead of name anything with a title and not a name will not be in the query. However if I use a non-existent entry as an ordering criteria the value will count as null and be sorted.

Because I can include any valid Javascript in my function I can actually put a lot of complexity into my queries. For example:

function(book) {
if(book.isbn != null) { map(book.name, {"Name": book.name, "ISBN": book.isbn})
} else { map(book.name, book.name) }
}

So I suspect this will either make you cheer or puke. What this function does is return a JSON object containing the Name and ISBN of the book if they are known or just the Book name as a String otherwise. Unlike SQL the heading of my query is almost completely arbitrary as long as the value on the right of my map function translates to a valid JSON object.

Now at work there are often a lot of debates as to whether things are “rigid” or “structured” or whether they are “flexible” or “formless”. It is a bit like the old meat and poison adage. CouchDB allows a client to construct an almost arbitrarily rich response to a query with almost no restriction on how the data that should be included in that response. In some cases this is going to allow you to easily interact with very complex unstructured data in some cases it is going to be an invitation to create a sprawling dataset with no value. There is no inherent right or wrong choice here but for a particular problem being solved there is probably going to be a wrong and right choice. SQL is powerful because of the restrictions and rules it builds into its grammar. Using Javascript is powerful because it relaxes those restrictions. Programmers and IT folks in general often fall into using the laxest possible implementation for reason of “flexibility” but then either have to impose order themselves or lose the power of the more restrictive choice.

So putting that into a concrete example, if a write a view with SQL I am going to have to follow a set of rules to get the data I want (for example my heading is going to have to be a set of tuples of equal size), using an arbitrary script and JSON means I am going to be able to get exactly the data I want in the form I want it. However since that return structure is customised to my query I might possibly be reducing my reuse by being over-specific or by building too much logic into my view code.

That’s quite a diversion just so I can say it’s horses for courses, so let’s wrap up this quick look at CouchDB views. All of CouchDB’s views are effectively JSON objects that are passed to a separate view server. This is a separate process that interacts with the main server via STDOUT and STDIN pipes. By default this is the view server that is built from the Spidermonkey library (it is called couchjs). However you can write a view parser for any language and plug it into CouchDB by creating an executable and mapping it to a MIME type in the couch.ini file. The view server essentially parses and readies the query function that is associated with the view and is then sent every document in the database as a JSON string. The view server picks up the results of reading every document and sends that back to the query request.

It is a pretty simple system and it works will for the relatively flat documents I have been trying with it. However I suspect that in a project with multiple developers some ground-rules for writing consistent query code would be a must.

Standard

4 thoughts on “CouchDB: Querying data

  1. cozumelkid says:

    Maybe the programs have just gotten too complicated. I have been playing with computers since Tandy marketed the TRS-80. In those days the computer was slow. As time went by, the computers have increased in speed and decreased in size. I once worked with an IBM AS400 with 11 gig hard drives, and RPG. The program was huge. I think it was called IBM Office. The system could handle a tremendous amount of data, for that time. Now, computers are incredably faster. Is it really necessary to go beyond basic programming when our computers are so fast?

  2. I think the commonest answer would be that as the capabilities of our hardware platform rise so do our expectations of what can be achieved. Modern web architectures have the possibility of serving data around the entire globe. It is an intriguing possibility which takes us a long way onwards from what one physical machine can do, no matter how powerful.

  3. Randy says:

    “groking” the power and flexibility of the CouchDB view, map, and reduce is the key for this technology. It does take some getting used to.

    I think a straight forward client library will make this technology much easier to use. (Ruby’s RelaxDb for example).

  4. Pingback: couchdb: couchdb 101 | breaker of stuff, destroyer of things

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s