I have been looking at a lot of the different databases that are working outside the traditional SQL style RDBMS (and I want to point out that having a true RDBMS might actually be something very different to what we have now). I have spent a lot of time with CouchDB compared to the others because it seems to occupy a special niche in data storage. It also builds in a REST interface right off the bat which effectively decouples the storage mechanism from the delivery and makes the whole system very language neutral.
CouchDB represents the first serious contact I have had with JSON and in deciding whether CouchDB is right for your project you really need to understand what JSON does and doesn’t offer. The first important thing is that JSON offers a kind of compromise from the heavy data definition in SQL or XML Schema and totally typeless data of something like flat files or Amazon SimpleDB. However to achieve that simplicity there are two consequences. Firstly data integrity is palmed off onto client applications and you will need to check data going in and coming out. I don’t personally believe that there is ever an application that doesn’t have some explicit or implicit data integrity. Second JSON documents can be very rich and complex but they cannot have detailed field structures, a telephone number is simply going to decompose into two numbers and its hard to get much finer grain description of the data. As a rule of thumb if you would normally create a well-formed XML document for the data or you would define a SQL data definition that would simply declare all the types of the column using all the defaults and never specifying a NOT NULL column then you probably have a good match.
JSON also favours dynamic languages over static types like Java. You need to decide if you are going to be able to take advantage of the full set of features in your chosen client language. XML might remain a better choice for static languages as you describe document instances declaratively.
The negative features are tricky because we obviously have an alpha project here that is probably closer to the start of the public testing phase than the end. Some of these cons may well be addressed before the first release candidate. The first obvious omission for a server product is that there is no security built into the server. Just supporting optional HTTP Authentication would be enough to make it practical to start running some experimental servers for something more than sandbox exercises. One thing that is a major difference to the feature set in the standard pseudo-RDBMS is that there is no way of interacting with sets of data. There is a method for changing multiple documents in one pass but what you really want to be able to do is apply a set of changes to documents identified by a query, the equivalent of SQL’s UPDATE.
Related to this is the fact that data in CouchDB is currently heavily based on silos. If I have a set of data referring to authors and a set of data relating to books then I currently need to duplicate data in both. One of the problems that the developers are going to face in evolving CouchDB is how to address this without introducing a solution so complex that you ask why you are not using an RDBMS? Similarly I notice in the roadmap there is an item about data validation. If you start introducing data validation and rules for validating data then before long there is going to be a question as to why you don’t simply use one of the existing document systems as all the current simplicity will have gone.