Programming, Python

Django and JSON stores: a match in heaven

My current project is using CouchDB as its store and Django to implement the web frontend. When you have a JSON store such as CouchDB then Python is a natural complement due to its brilliant parsing of JSON into native data structures and its powerful dictionary data type that makes it really easy to work with maps.

In a previous project using Python and Mongo we used Presentation objects to provide domain logic on top of the raw data maps but this time around I wanted to try and cut out a layer and just work with maps as much as possible (perhaps the influence of the Clojure programming I’ve been doing on the side).

However this still leaves two problems that need addressing. Firstly Django templates generally handle dictionaries like a dream allowing to address them with the standard dot syntax. However both Mongo and Couch use leading underscores to indicate “special” variables and this clashes with the Python convention of having a leading underscore indicate a private member of the class. The most immediate time you encounter this is when you want to use the id of a document in a url and the naive doc._id does not work.

The other problem is the issue of legitimate domain logic. In Wazoku we want to use people’s names if they have supplied them and fallback to their email if they haven’t supplied their name.

The answer to both of these problems (without resorting to an intermediary object) is Django’s filters. The necessary logic can be written in a few lines of Python that simply examines the dictionary and does the necessary transformation to derive the id or user’s name. This is much lighter than the corresponding Presentation solution.

Programming, Python, Ruby

Truly open classes

Here’s an interesting observation, I needed to write a little script to automate some number calculating for me. I was wondering whether to do it in Ruby or Python. I’m doing a lot of Python at the moment so I felt I ought to give Ruby a little go. Share some of the love.

However the solution I had in mind really didn’t work with Ruby because while Ruby has open classes it has a comparatively fixed idea of attributes. In Python you can set attributes very freely on any object so I have got in the habit of creating something and then enhancing by applying a function. Example? Okay.

def make_captain(actor): actor.rank = "Captain" return actor

class Person: pass

captain = make_captain(Person())

So this little trick doesn’t work, or rather is much more difficult to do in Ruby as Ruby, at is dynamic heart, is a language that believes in object-orientation and that classes should encapsulate rather than being little collections of data. You can use instance_variable_get/set but it lacks the elegance of the Python syntax.

In Ruby it would be easier to define the attributes in the class using the existing metaprogramming constructs and then have a class method to generate the content (effectively encapsulating my script logic).

Now this isn’t a straight “Ruby sux, no Python sux more” post. Between Scala, Clojure and Python I have been doing a lot more in a functional style that depreciates objects as anything more than value carriers. The Ruby vision of a class would give me something with a stronger sense of purpose and encapsulation, something that is hard to benefit from in a script for a particular purpose.

What is going to be interesting this year is trying to identify when the value of a piece of code is in the structure of it’s data-definition (i.e. objects) versus its process (functions). Having had a think about it I should perhaps rewrite my script to use some OO modelling because it may answer similar requirements down the line. However from a strict Lean/Waste point of view I should have gone with the Python solution as Ruby was imposing a restriction on me while providing benefits that I was unlikely to realise.

Python, Web Applications, Work

Declare for simplicity

I was struggling with an issue today on a template that was blowing up due to missing keys in a template data hash. At first I tried to write some conditional code in the template. That ended up being quite ugly so then I was trying to find a way to condition the template on whether the key was present.

Then it struck me that the issue was really the missing key. As the hash is prepared in Python code the original implementation frugally avoided creating the entry if the underlying data wasn’t present. While this is clever and minimal it actually just pushes the complexity out of the Presenter logic and into the template where it is much worse because you can only interact with the Presentation’s abstraction of the original data.

The problem here is that dynamic languages sometimes allow you to be too clever. In a declared structure system you have to declare all your data and provide sensible defaults. This makes writing the templating solution at lot easier as you never have a missing data problem to deal with (you still have headaches with duff defaults but that is a different post).

So I went back to the presenter and declared an empty list under the key that had been causing me grief. Bingo! My failure went away and the default behaviour of not generating any content for the empty list kicked off, reducing my template back to the unconditional single-line I had originally.

Python, Web Applications

Juno, a micro web framework for Python

I love microframeworks and I love Python to it therefore follows that I love the minimal web frameworks you can get for Python like CherryPy. However as soon as I heard about Juno, a clone of my favourite microframework Sinatra I had to give it a go.

For me the archetypal application for Sinatra is Git-Wiki and therefore to kick the tires on Juno I decided to try and do the same thing with Bazaar. Hence: Bzr-Wiki on Juno!

You need Juno, Jinja2 and SQLAlchemy as well the source from Launchpad. Once you have it all then got to the repo directory and type bzr init. You should then be good to fire up wiki.py and muck around with it.

It is slightly ugly and lacks access to revision history, diffs and user tracking but apart from that it is a surprisingly functional wiki. It is also bi-directional in that you can add files to the Bazaar repo and they get reflected in the app.

So what was Juno like to work with? Well overall I thought this was the best Python microframework I have used so far. I really like the idea of decorating methods to avoid having to generate a mapping table. The syntax is terse and comprehensible, the conventions around the framework made sense to me.

By comparison with its progenitor I think I really missed the autoloading/dynamic evaluation that allows you to change code in Sinatra and have it immediately take effect. The function of the Request decorator was initially quite obscure (it binds all HTTP verbs to that method, if you want to map GET to another method you must specify all verbs independently) and I am still not sure it is right. I think the most specific decorator should take precedence. Other than that I think the framework ports a lot of the concepts from Sinatra in a sympathetic way.

The dependency on SQL Alchemy is also really clunky. If you specify that you are not using a database (as is the case here) then it is annoying to have to download a dependency and makes installation on Windows a pain I didn’t even want to try and tackle.

Juno is really promising though and I look forward to it developing. I think it would be a real delight to use it in an environment like Google App Engine.

Java, Programming, Python, Scripting, Web Applications, Work

JMeter or The Grinder, so which one is better, like?

Or for the benefit of Google: Apache JMeter versus The Grinder. Fight!

A while ago I got paid to put these two tools head to head and I think it’s been long enough that the people who put up the money will have got the benefit now.

I had used The Grinder in a previous job where it had done excellent service finding out what level of peak load our site could handle. It was convenient in that you script it in Jython because Jython was our chosen scripting language.

JMeter on the other hand was a whole new thing to me. It is a kind of graphical programming interface where you drop controls into a tree structure and the framework generates threads that run through the tree generating interactions with the target application via things like HttpClient.

The scripting versus component structure is a pretty major difference between the two and is probably one of the key things that is going to help you choose between two products that are both, to be honest, pretty mature, capable pieces of software.

If you don’t feel comfortable with Python or Jython then Grinder is likely to be out. JMeter is much more friendly to non-programming testers. However this decision isn’t as clear as you might think, JMeter actually includes Javascript functions, access to BeanShell and its own expression language. As we created more complex traffic simulations we found ourselves having extremely complex expressions that often mixed a scripting language with an expression evaluation. In some ways The Grinder is more honest in telling you up front that you are going to have to program some of this stuff yourself.

Another big differentiator is in reporting, if you need to generate reports then JMeter is a much better choice as it has components that collect and collate information and you can dump data to XML for XSLT transformation into pretty much any output format you want. It is also possible to run JMeter headless once you have the test scripts so it is possible to encorporate it into a continuous integration process. Debugging slightly flaky applications is much easier with JMeter because you can easily capture a lot of result information and then just delete the data gathering component once the test flow is working.

A lot of the functionality of both products overlaps: they can both be used a browser proxy and be used to capture scripts as a user “drives” around the site. These captures are likely to form the basis of your test plans unless you have a very clear URL scheme with little session state.

Both use thread pools to generate load and both tend to get blocked on your application before they max out their local machine (although both are capable of using 90% of memory and CPU). JMeter has a few nice options around randomly ramping up the thread activation (recreating a spike in usage more closely). The Grinder tends to give more bang for the buck as its runtime is very simple and minimal.

Both are also capable of using distributed agents and collating the data from these agents back to the coordinator.

In many ways there isn’t a “bad” choice to be made between them. So what might sway your choice one way or another? Well JMeter is an Apache project and that might make a difference for some people as you have all those good things like project governance and a good chance the project is going to continue to be developed and enhanced. JMeter was also the only project to have a test suite last time I checked out both code bases.

The Grinder does have one unbeatable quality in my opinion and that’s flexibility. When you look at the features listed on the websites you might think that JMeter does all this cool stuff with HTTP, SOAP and POP3 and the like and that The Grinder is comparatively light in the feature set.

The opposite is actually true, as The Grinder website points out the The Grinder can test anything with a Java API. There really is no limit to what you can make it do or how you can execute your test plan. In fact most of the things I’ve complained about with The Grinder are fixable from within the test scripts (what I am really complaining about is that some things like capturing HTML output per run is something that should be available as part of the standard package). On the other hand I felt that creating a new JMeter component was actually quite tricky as you have to deal with both the GUI aspects and the actual functionality you are trying to create in your test component.

If you really want to have total control of your concurrent volume testing then The Grinder is probably the best product for you.

Perhaps the last topic for consideration is The Grinder’s use of Jython. Python is a great language and you get a lot of power for some very concise code. Just as some people are going to find it off-putting others are actually going to find it very compelling.

Okay so yet another weak “horses for courses” post on the InterWeb. Perhaps the easiest summary to end on is that your test teams will probably take a shine to JMeter while your developer teams who are also building scale tests are probably going to like The Grinder; unless they dislike Python or learning a new language.

Programming, Python, Ruby

Mocking Random

Mocking calls to random number generators is a useful and important technique. Firstly it gives you a way into testing something that should operate randomly in production and because random number generation comes from built-in or system libraries normally it is also a measure of how well your mocking library actually works.

For Ruby I tend to use RSpec and its in-built mocking. Here the mocking is simple, of the form (depending on whether you are expecting or stubbing the interaction):

Receiver.should_receive(:rand)
Receiver.stub!(:rand)

However what is tricky is determining what the receiver should be. In Ruby random numbers are generated by Kernel so rand is Kernel.rand. This means that if the mocked call occurs in a class then the class is the receiver of the rand call. If the mocked call is in a module though the receiver is properly the Kernel module.

So in summary:

For a class: MyClass.should_receive(:rand)
For a module: Kernel.should_receive(:rand)

This is probably obvious if you a Ruby cognoscenti but is actually confusing compared to other languages.

In Python random functions are provided by a module, which is unambiguous but when using Mock I still had some difficultly as to how I set the expectation. Mock uses strings for the method called by the instance of the item for the mock anchor. This is how I got my test for shuffling working in the end.

from mock import Mock, patch_object
import random

mock = Mock()
class MyTest(unittest.TestCase):
    @patch_object(random, 'shuffle', mock)
    def test_shuffling(self):
            thing_that_should_shuffle()
            self.assertTrue(mock.called, 'Shuffle not called')

You can see the same code as a technicolor Gist.

This does the job and the decorator is a nice way of setting out the expectation for the test but it was confusing as to whether I am patching or patch_object’ing and wouldn’t it be nice if that mock variable could be localised (maybe it is and I haven’t figured it out yet).

Next time I’m starting some Python code I am going to give mocktest a go. It promises to combine in-built mock verification which should simplify things. It would be nice to try and combine it with Hamcrest matchers to get good test failure messages and assertions too.

Python, Web Applications

Django in 24 hours

Last Saturday afternoon I decided to learn Django. It was 2pm on the first day of SiCamp 2008 in London and being the only developer in the room at that point I decided that I should do whatever I felt would be the best option to get an application running by 2pm the next day.

Previously I have done some Google App Engine and the experience convinced me to give Django a go after I found myself, by intuition, creating a GAE project structure of handlers.py (views), models.py and a directory called templates that contained templates. I was then disappointed to find the whole world had got there before me.

So, Django in 24 hours, baptism of fire. What do I think now looking back on the experience?

Everyone has told me that the Django documentation is good and I think I have to concur. Not everything is so clear that when you’re speed reading in one window and typing in another it works first time but importantly nothing in the documentation is actually wrong. When stuff is not working a second, careful look at the documentation got me back on track.

Importantly Django’s core model of web development is sound and intuitive. My editor had around ten files open for the project and the flow of adding something to the application did naturally flow from url to handler to view to model. Maybe the only quibble I have is that the views.py file is deceptively named in MVC terms.

The core of the framework is amazingly concise, I spent the majority of my time thinking about the problem not about the framework API. Binding a URL to function made sense, having to specify a template instead of having one inferred from the method name was maybe my one criticism in the method handler but on the plus side it does allow for flexibility in handling requests. Handing off from one request handler to another was very easy.

Django templates are both amazing and annoying. The syntax and principles are amazing, it was easy to play around with the pages and the template inheritance was really powerful for avoiding duplication. However when I transferred the application from self-serving to mod_python the template generation was very wobbly when compiling changes from the file system. Of course this could also have been mod_python but it was the latest 3 series stable source compiled for the machine. I’ve used Jinja2 previously when generating HTML in Python and might be tempted to stick with it in future.

Django models are great, I hate ORM but I really liked syntax for defining persistence properties and I liked the way that you don’t have the fact that you are really dealing with SQL hidden away from you. It genuinely seemed a more convenient way of expressing the data model rather than an OO wallpaper over relational data storage. I didn’t feel the need to add domain logic to the models but I felt like it wasn’t really polluting the model to do that either.

One thing that didn’t work at all was changing the relations between models; it took me two or three attempts to finally model the relationships between the data concepts. Each time I changed a Foreign Key or Many to Many relationship I ended up deleting the database (SQLite3) as I couldn’t figure how to migrate from the old schema to the new.

One reason for choosing Django was the idea that I wouldn’t have to write the backend code as the admin stuff would be right there for me. It took me a while to get the 1.0 admin to fire up but once it was running it did perform as advertised. One of the attractive things about the application was that the data model followed the conceptual language of the solution in a really powerful way. You could use the admin interface to have a Devotee perform Devotion to an Action. My geek excitement peaked anyway, YMMV.

So them’s the highlights of the experience. Overall Django delivered me a rapid web development process in an intuitive, powerful way and lived up to nearly all of the claims made on the tin. Deploying to Apache/mod_python was painful but most of the pain surrounded the infrastructure of my box (multiple versions of Python, Apache config files) and my lack of mad Apache admin skillz.

I would happily tackle another project in it again.

Perhaps of interest is how the Django development experience matches up against Rails or GAE which would have been the other obvious choices. GAE would have been very similar but the deployment would have been better and I wouldn’t have had any automatic admin. In retrospect it may have been a better choice for a hack party type event. It certainly would have been my choice for personal projects for easy of deployment but now I have one Django app running perhaps that isn’t as relevant any more. Certainly the thing that has kept me from GAE before, the pain of data migration, doesn’t seem that better in Django (except that you control the datastore and its contents).

Compared to Rails?

Admin is much more awesome than scaffolding.
Django ORM is much less complex than Active Record, all the data required to create, deploy and use the object is in one place. Django doesn’t have Migrations but has its own brand of database versioning pain.
RSpec is awesome (despite its monkey patching of Object) so you aren’t going to beat Rails for easy testing.
Django templates are more powerful and easier to use than Erb but you have a lot of Ruby templating options so it’s hard to make a complete comparision. They probably both similar in the sense that you can find a template library that suits your preferences. Django is the purely solution out of the box.
Routing and Controllers are much less involved in Django than Rails.
Django is less opinionated about how you structure your application directories, which I like.
Django doesn’t bake-in AJAX components but is “batteries included”, Rails probably generates better Web2.0 style apps for less effort.
Finally Django uses only a few code generators because its basic structure is far less involved. It also generates far less “stuff” for each MVC element which I quite like as I don’t tend to use everything Rails generates.

Okay detailed analysis over, what’s the high-level view? Django and Rails are similar experiences but I think the major differences between them are almost what you could say about Python and Ruby. In Django you are going to get simplicity, clarity and a real choice of how you plug your infrastructure components together. In Rails you are going to get magic up front which is cool but also magic at the back end, which is not cool. Ultimately I think the answer is how opinionated you like your software. Well punk? How opinionated do you like it?

Programming, Python

Setting up a Mercurial work environment

So today I had a frustrating experience trying to get a Mercurial environment going for the current project I am working on. I am convinced for the kind of work we are doing the distributed branch model is going to be the right solution and I had been using a local Mercurial instance during my tech spiking earlier on the product.

Having built Python and Mercurial on the server and created a Mercurial user with the SSH keys of the individual developers I thought it would be easy sailing this morning. Not a bit of it and all of the problems stemmed from the lack of by default support for SSH in Windows.

Like most people I use the Putty Tool Suite for my SSH needs on windows. However simply aliasing putty to ssh doesn’t work. There are some key command switches that are different (including the trivial but annoying -P instead of -p for port).

Coming from a Java Subversion background I am used to having a portable library to take care of my ssh protocol needs. I also use OSX for my home development and that obviously provides a UNIX ssh command-line implementation, hence hassle-free SSH based source control.

The command line tool turned out to be the root of all my problems, both Eclipse and NetBeans Mercurial plugins don’t provide an implementation of the Mercurial client, instead they delegate it to the Mercurial client on the host OS, that in turns delegates ssh protocol invocations to the command line SSH.

The solution is simple once you know it, in the Mercurial.ini file you can alias ssh to Putty’s plink executable. There is in fact an example in the Mercurial book. Better still you don’t have to specify the key used if you are using Pageant.

However getting the first station to work was incredibly hard work. I even downloaded Cygwin just to get a more normal Unix ssh but by that point I had put a typo into the ssh alias and was getting a very weird error whenever I invoked Mercurial and I’d lost the forest in the trees.

At one stage I even gave Bazaar a go, hoping that SFTP might cure all my ills. However Bazaar uses Paramiko for it’s SSH support and on Windows that was failing due to its inability to find an OS source of entropy.

Server-side Python also let me down a bit as I was having an issue with the zlib module and after similar experiences with Ruby I knew that this would be because I must have compiled Python before zlib. Despite cleaning and re-configuring Python it still didn’t build the zlib module and in then end I had to go and run configure within the zlib module manually prior to a full Python build. This is exactly the same issue I have had with Ruby, what is the problem with this library?

Once you’ve done it once Mercurial repositories are actually easier to setup and manage than SVN (and that itself is pretty easy once you’ve done it a few times) and if you are working in a UNIX environment then they are extremely compelling.

However on Windows you are currently going to either do a lot of preparatory reading or be ready to set aside six hours to prevent frustration. In my case it probably didn’t help that I am also the one selling the utility of Mercurial. The pioneers always have the arrows in their backs; on the other hand they also get to better places long before anyone else.

Java, Python, Scripting

Sun recruits Jython leads

I know this isn’t exactly breaking news but it is further evidence that Sun is aiming to turn the JVM in a language agnostic platform. It’s also good news for the Jython project which has suffered a long period of hibernation and which has fallen far behind CPython in terms of its compatibility.

I like Python as a language and its clarity is great for one time scripts (which never really are). I would really like it to be a full member of the JVM-compatibles.

Groovy, Java, Python, Ruby, Scripting

Groovy or JRuby?

Martin Fowler blogged about the question a couple of days ago and ever since I have pondering that maybe it is not really the right question to ask.

I currently toodle between Jython, JRuby and Groovy for various reasons and I am an expert in none. The interesting thing I have found is that it is hard to pick one and just focus on that. To some extent they overlap heavily: they are all cross-platfrom, they are all dynamic, they all integrate with Java API stack I’ve committed to memory.

The first thing to say is that I am interested in scripting languages for prototyping and admin style scripting. I have never used Rails and the Grails data-model means that you need a specific kind of project to work on. If you want to use a particular product and that is only on one platform then that kind of makes your decision for you.

Each language has its own strengths, from my point of view I would categorise them in the following way. Jython has Python’s readability and solid language design, JRuby has Sun’s support, excellent community contributed library code and is very dynamic, Groovy is mini-Java, so it’s easy to learn and most importantly it has a functioning interactive console.

The last point might seem a bit weird, what about jirb and jython‘s interactive mode? Well Martin makes a very important point in his post about the purpose of these ports. Both JRuby and Jython aim to stay faithful to their source languages and be able to run code from their parent C implementations while expanding the API by accessing the Java libraries. Groovy on the other hand stays close to Java syntax and is the only one of the three that allows you to cut and paste code from a regular Java application and then play around with it in an interactive session. That is a very powerful and compelling feature.

Almost all the Groovy I do either comes from wanting to leverage or understand a piece of Java code.

All my Jython work on the other hand is about wanting to automate administration or manual tasks in a clear and concise fashion. Python’s dynamic data structures help, but so does zxJDBC the Jython specific database library that mixes DBI with JDBC to create a highly portable but simple database connectivity solution with no boilerplate!

JRuby on the other hand is something that only really comes up because Alpha Geeks love it. The syntax is gnarled and there is a significant learning curve before someone from a Java background can get “The Ruby Way” of things. The new integration of JRuby into NetBeans though makes developing in the language a comparative snap and I would suspect that JRuby will be a valid choice of application development language alongside Java now. The choice will be driven by the problems you are trying to solve not because one language is inherently “better” than the other.

Echo One

Sequentially arranged sentences composed of words (and punctuation)

Category Archives: Python