Learning to work atomically

I have been doing a lot of work with MongoDB recently and I have made a few noob mistakes despite being relatively well-grounded in the theory. One of the key mistakes I made using the Java driver was not having the driver in the right mode. By default the driver will not block on an insert; you need to be in Safe Mode for that to happen.

What is the impact? Well, if you are trying to update a record that you have just inserted and the update neither fails nor is applied, then chances are that the update failed to find the record because it wasn’t there when the update query ran. Of course a few milliseconds later it appeared, and it is there at the end of the batch process.
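To make that race concrete, here is a toy model in Python. It is only a sketch: `ToyStore`, its `safe` flag and `apply_pending` are made-up names for illustration, not the Java driver's API, and the "delay" is simulated by holding writes in a list until they are applied.

```python
class ToyStore:
    """In-memory stand-in for a collection; NOT a real MongoDB driver."""

    def __init__(self):
        self.docs = {}      # writes the "server" has applied
        self.pending = []   # writes that were sent but not yet applied

    def insert(self, doc, safe=False):
        self.pending.append(dict(doc))
        if safe:
            # Assumed "safe mode" behaviour: block until the write has landed.
            self.apply_pending()

    def apply_pending(self):
        for doc in self.pending:
            self.docs[doc["_id"]] = doc
        self.pending.clear()

    def update(self, doc_id, changes):
        """Return the number of documents matched (0 or 1)."""
        doc = self.docs.get(doc_id)
        if doc is None:
            return 0  # no error is raised, and nothing is changed
        doc.update(changes)
        return 1


def insert_then_update(safe):
    store = ToyStore()
    store.insert({"_id": 1, "status": "new"}, safe=safe)
    matched = store.update(1, {"status": "processed"})
    store.apply_pending()  # the insert shows up "a few milliseconds later"
    return matched, store.docs[1]["status"]
```

With `safe=False` the update matches nothing, yet the record is present (and unmodified) by the end, which is exactly the confusing state described above. With `safe=True` the update finds the record.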

Updates in Mongo consist of a query and a data change operation and there is an art in getting the query to work on the set of data you want it to. I find myself doing a conditional match in Scala and then thinking “at this point is that still going to be valid?” and then going back tweaking the query so that the update is guaranteed to be valid at the point it happens.
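As a rough illustration of moving the condition into the query, here is a small Python sketch against an in-memory list. The `update_one` helper and the job fields (`state`, `owner`) are hypothetical and only mimic the query-plus-change shape of an update; a real driver call would look different.

```python
def update_one(collection, query, changes):
    """Apply `changes` to the first document whose fields equal `query`.

    Returns the number of documents modified (0 or 1).
    """
    for doc in collection:
        if all(doc.get(key) == value for key, value in query.items()):
            doc.update(changes)
            return 1
    return 0


def claim(collection, job_id, worker):
    # The state check lives in the query, so it is evaluated at the moment
    # the update runs rather than earlier in application code.
    modified = update_one(
        collection,
        {"_id": job_id, "state": "queued"},
        {"state": "running", "owner": worker},
    )
    return modified == 1


jobs = [{"_id": 1, "state": "queued", "owner": None}]
first = claim(jobs, 1, "worker-a")   # the job is still queued, so this wins
second = claim(jobs, 1, "worker-b")  # the query no longer matches
```

The second claim does nothing because the condition is no longer true at the point the update happens, which is the guarantee the tweaked query is meant to give.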

Today I spent a lot of time buggering about trying to avoid writing keys in the document that held no data. After doing it I realised that I could have just written a single remove statement that stripped out the empty keys in one big cleanup after the data had been stored.
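One way such a cleanup could be expressed is with MongoDB's `$unset` update operator, one pass per field. The Python sketch below only builds the filter and update documents; the helper name, the field names and the choice of what counts as "empty" (`None` or an empty string) are assumptions for illustration.

```python
def empty_key_cleanups(field_names, empty_values=(None, "")):
    """Build one (filter, update) pair per field.

    Each pair targets documents where the field holds an "empty" value
    and removes that key with $unset.
    """
    return [
        ({name: {"$in": list(empty_values)}}, {"$unset": {name: ""}})
        for name in field_names
    ]


# Hypothetical field names, purely for illustration.
cleanups = empty_key_cleanups(["nickname", "notes"])
```

Each pair would then be handed to the driver's multi-document update call after the load has finished, so the insert code never has to worry about skipping empty keys.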

Atomic independence also means losing some things that we take for granted, like sequence ids. People like numbers, but guaranteeing even ascending values can rapidly become a nightmare if you want to avoid contention and a single point of failure.
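The usual trade is to give up ordering and let each writer mint its own identifier. A minimal Python sketch, assuming a random UUID is an acceptable id (the `new_record` helper is hypothetical):

```python
import uuid


def new_record(payload):
    """Attach a client-generated id; no shared counter is consulted."""
    return {"_id": uuid.uuid4().hex, **payload}


a = new_record({"name": "first"})
b = new_record({"name": "second"})
```

Nothing here needs a round trip to a central sequence, so there is no contention, but the ids say nothing about insertion order.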

Cursors are similarly tricksy. I have a long-running batch job and I realised today that it runs long enough that you cannot guarantee a known state by the time it finishes. Instead you have to use this kind of “loop until there’s nothing left to do” construct, where the loop condition expresses the state of the store you are trying to achieve and you keep going until you get at least one cursor that has no entries.
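The shape of that loop can be sketched in Python against an in-memory list. Everything here is a stand-in: `find_pending` plays the part of the query, the `done` flag is an assumed marker for processed documents, and the batch size is arbitrary.

```python
def find_pending(store, limit):
    """Stand-in for a query: up to `limit` documents not yet processed."""
    return [doc for doc in store if not doc["done"]][:limit]


def drain(store, process, batch_size=2):
    """Re-run the query until it comes back empty; return the pass count."""
    passes = 0
    while True:
        batch = find_pending(store, batch_size)
        passes += 1
        if not batch:
            # An empty result is the exit condition: the store has reached
            # the state the query describes.
            return passes
        for doc in batch:
            process(doc)
            doc["done"] = True


store = [{"_id": i, "done": False} for i in range(3)]


def process(doc):
    if doc["_id"] == 0:
        # Simulate new work arriving while the job is still running.
        store.append({"_id": 99, "done": False})


passes = drain(store, process)
```

A single cursor opened at the start would not have seen the late arrival; re-running the query until it returns nothing picks it up.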

There’s a lot of stuff about datastores that is ingrained deeper than you realise and it takes more than one difficult experience to start genuinely thinking differently about things.

