Programming

London Django Meetup May 2023

Just one talk this time, and it was more of a discussion of the cool things you can do with Postgres JSON fields. These are indeed very cool! Everything I historically wanted to do with NoSQL is now available in a relational database without compromising on performance or functionality, which is an amazing achievement by the Postgres team.

The one thing I did learn is that all the coercion and encoding information is held in the Django model and query logic, which means you only have basic JSON types in the column itself. I previously worked on a codebase that used SQLAlchemy with a custom encoder and decoder, which split custom types into a string field carrying the Python type hint (e.g. Decimal, UUID) alongside the underlying value. Compared with the Django implementation, which appears to just use strings, that is a leaky abstraction where the structure of the stored data is compromised by the type hint.
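For reference, a minimal sketch of what this looks like in Django 3.1+ (the Event model and field names are made up): DjangoJSONEncoder turns Decimal, UUID and datetime values into plain strings, so the stored JSON only ever contains basic types.

from django.core.serializers.json import DjangoJSONEncoder
from django.db import models

class Event(models.Model):
    # custom Python types (Decimal, UUID, datetime) are serialised to strings
    payload = models.JSONField(encoder=DjangoJSONEncoder, default=dict)

# querying still goes through the ORM, and the value in the column is just a string
recent = Event.objects.filter(payload__amount="19.99")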

The Django approach would have been easier to work with when running direct SQL against the database, and it follows the principle of least surprise.

The speaker was trying to make a case for performing aggregate calculations in the database, but via the Django ORM query language, which wasn’t entirely convincing. Perhaps it makes sense if you have a small team, but the resulting ORM code was more complex than the underlying query and quite tied to the Postgres implementation, so it felt that a view would have been a better approach unless you have very dynamic calculations that are only applied for a fixed timespan.
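To give a flavour of the trade-off, here is a hedged sketch (the model and key names are hypothetical, reusing the Event model above) of summing a numeric value held inside a JSON column via the ORM, with the work pushed down to Postgres:

from django.db.models import DecimalField, Sum
from django.db.models.fields.json import KeyTextTransform
from django.db.models.functions import Cast

# extract payload->>'amount', cast it to a decimal and sum it in the database
totals = Event.objects.aggregate(
    total=Sum(
        Cast(
            KeyTextTransform("amount", "payload"),
            output_field=DecimalField(max_digits=12, decimal_places=2),
        )
    )
)

The equivalent SQL is essentially SUM((payload->>'amount')::numeric), which is roughly the complexity gap the talk was wrestling with.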

It was based on an experience report, so it clearly worked for the implementing group, but it felt like the approach strongly coupled the database, the web framework and the query language.

Python

London Django Meetup April 2023

I’m not sure whether I’ve ever been to this Meetup before but it is definitely the first since 2020. It was hosted by Kraken Energy in their offices which have a plywood style auditorium with a nice AV setup for presentations and pizza and drinks (soft and hard) for attendees.

There were two talks: one on carbon estimates for websites built using Django and Wagtail; the other about import load times when loading a Django app into a shell (or more generally expensive behaviour in Python module imports).

Sustainable or low-impact computing is a topic that is slowly gaining traction in the wider development community, and in the case of the web there are some immediate quick wins to be had in the form of content negotiation on image formats, lazy loading and caching.

One key takeaway from the talk is that the end-user side is where most of the savings are possible. Using large-scale cloud hosting means you are already benefiting from energy efficiencies, so things like the power required for a mobile phone screen matter, because the impact of inefficient choices in content delivery is multiplied by the size of your audience.

There was a mention in passing that if a web application could be split into a Functions as a Service (FaaS) deployable then, for things like Django that have admin paths and end user paths, you can scale routes independently and save on overprovisioning. If this could be done automatically in the deployment build it would be seamless from the developer’s point of view. I think you can do this via configuration in the Serverless framework. It seems an interesting avenue for making more efficient deployments but at a cost in complexity for the framework builders.

There was quite an interesting research opportunity mentioned in the talk around serverless-style databases. For sites with intermittent or cyclical usage, not having an “always on” database represents a potentially big saving in both cost and carbon. There was mention of the service neon.tech, which seems to have a free personal tier that might be perfect for hobby sites where usage is very infrequent and a spin-up time would be acceptable.

The import time talk was interesting; it focused on the developer experience of the Django shell boot time (although, to be honest, a Python shell for any major framework has the same issues). There were some practical tips on avoiding libraries that do too much during the import phase, but really the issue of Python code doing expensive eager work at import time has been a live one for a long time.
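The usual mitigations are worth a sketch (the library and function names here are just illustrative): profile start-up with CPython’s built-in import profiler and defer heavyweight imports into the code paths that actually need them.

# profile where import time is going:
#   python -X importtime manage.py shell 2> imports.log

def render_report(rows):
    # pandas is imported only when a report is actually rendered,
    # so an interactive shell doesn't pay the cost at start-up
    import pandas as pd
    return pd.DataFrame(rows).describe()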

I attended a talk about cold starts with Python AWS Lambdas in 2019 that essentially boiled down to the same issues (something addressed, but not very well, in this AWS documentation on imports). Little seems to have improved since: assumptions about whether a process is going to be short- or long-lived ultimately come down to the library implementer, and the web/data-science split in Python means that code runs in very different contexts, making it hard to share libraries across the two use cases.

The core language implementation is getting faster, but a consensus on good practice for import-time behaviour is not a conversation that seems to be happening between the major library maintainers.

The performance enhancements for core Python actually linked the two talks because getting existing code onto more efficient runtimes helps reduce compute demands across all usage.

Programming, Python

Django and JSON stores: a match in heaven

My current project is using CouchDB as its store and Django to implement the web frontend. When you have a JSON store such as CouchDB then Python is a natural complement due to its brilliant parsing of JSON into native data structures and its powerful dictionary data type that makes it really easy to work with maps.

In a previous project using Python and Mongo we used Presentation objects to provide domain logic on top of the raw data maps but this time around I wanted to try and cut out a layer and just work with maps as much as possible (perhaps the influence of the Clojure programming I’ve been doing on the side).

However this still leaves two problems that need addressing. Firstly, Django templates generally handle dictionaries like a dream, allowing you to address them with the standard dot syntax. However both Mongo and Couch use leading underscores to indicate “special” variables, and this clashes with the Python convention that a leading underscore indicates a private member of the class. The most immediate place you encounter this is when you want to use the id of a document in a URL and the naive doc._id does not work.

The other problem is the issue of legitimate domain logic. In Wazoku we want to use people’s names if they have supplied them and fall back to their email if they haven’t.

The answer to both of these problems (without resorting to an intermediary object) is Django’s template filters. The necessary logic can be written in a few lines of Python that simply examine the dictionary and do the necessary transformation to derive the id or the user’s display name. This is much lighter than the corresponding Presentation solution.
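As an illustration (the filter and file names here are my own, not necessarily what we shipped), the filters look something like this:

# templatetags/couch_filters.py
from django import template

register = template.Library()

@register.filter
def doc_id(doc):
    """Work around doc._id being unreachable from a template."""
    return doc.get("_id", "")

@register.filter
def display_name(doc):
    """Use the supplied name, falling back to the email address."""
    return doc.get("name") or doc.get("email", "")

In a template that becomes {{ idea|doc_id }} and {{ idea|display_name }} after a {% load couch_filters %}.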

Programming

The Locality Problem

One of the key problems to solve if the Carbon Co-op application is to work is the Locality problem. That is: who is near who? For the SiCamp weekend my approach was going to be simply to hack it by saying that everyone who shared the same first segment of their postcode was in the same Locality.

This is blatantly not true though. In London, boroughs matter more and can consist of different postcode stems due to the alphabetic numbering system. In small cities like Swansea, Bath and Bristol you can effectively consider most of the city’s codes to be in the same Locality. Swansea postcodes, however, also cover Carmarthen, which is clearly not in the same Locality as Swansea.

I have seen this handled in a variety of ways before, usually by creating a hierarchy where a postcode stem is mapped to a leaf node in a tree, normally the county, city or metropolitan area it belongs to. The counties then belong to regions, the regions to countries, and the countries to the UK and then Europe.

However for the Carbon Co-op I didn’t want the hassle of having to create and maintain a hierarchy. You also need to deal with the fact that a group of people sharing the same postcode may itself have special meaning in terms of the projects they can take part in.

So I thought about creating a Locality model that is simply a named, arbitrary grouping and which has many (and at least one) Locality Filters that are actually regular expressions. You then run someone’s postcode through the Filters to see which Localities they belong to. This deals with the situation where someone may be able to take part in Locality-linked Actions for both Scotland and Fife.
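A rough sketch of how that might look as Django models (the field names and the matching helper are my own invention):

import re

from django.db import models

class Locality(models.Model):
    name = models.CharField(max_length=100)

class LocalityFilter(models.Model):
    locality = models.ForeignKey(Locality, related_name="filters", on_delete=models.CASCADE)
    pattern = models.CharField(max_length=100)  # a regular expression, e.g. ^SA[1-9]

def localities_for(postcode):
    """Return every Locality whose filter patterns match the given postcode."""
    return [
        f.locality
        for f in LocalityFilter.objects.select_related("locality")
        if re.match(f.pattern, postcode.upper())
    ]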

Design-wise I’m quite taken with this and would like to implement it to see if it works in practice. However it is a techie solution, as only programmers really love regular expressions and even then… not so much. To simplify the interface you would probably just have the user type the extent of the postcode they wanted to group. There is also a question of whether this filtering would actually scale. You might have to associate the Devotee with their Localities, which then raises the issue of when that association should be regenerated, which makes you wonder whether you can back a Django model with a database view, which makes a good case for doing it!

Python, Web Applications

Django in 24 hours

Last Saturday afternoon I decided to learn Django. It was 2pm on the first day of SiCamp 2008 in London and being the only developer in the room at that point I decided that I should do whatever I felt would be the best option to get an application running by 2pm the next day.

Previously I had done some Google App Engine work, and the experience convinced me to give Django a go after I found myself, by intuition, creating a GAE project structure of handlers.py (views), models.py and a directory called templates that contained templates. I was then disappointed to find the whole world had got there before me.

So, Django in 24 hours, baptism of fire. What do I think now looking back on the experience?

Everyone has told me that the Django documentation is good and I have to concur. Not everything is so clear that it works first time when you’re speed-reading in one window and typing in another, but importantly nothing in the documentation is actually wrong. When stuff wasn’t working, a second, careful look at the documentation got me back on track.

Importantly, Django’s core model of web development is sound and intuitive. My editor had around ten files open for the project, and adding something to the application flowed naturally from URL to handler to view to model. Maybe the only quibble I have is that the views.py file is deceptively named in MVC terms.

The core of the framework is amazingly concise; I spent the majority of my time thinking about the problem, not about the framework API. Binding a URL to a function made sense. Having to specify a template instead of having one inferred from the handler name was maybe my one criticism, but on the plus side it does allow flexibility in handling requests. Handing off from one request handler to another was very easy.
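For anyone who hasn’t seen it, the flow being described is roughly this (shown in today’s URL syntax rather than the 1.0-era patterns; the names are invented):

# urls.py
from django.urls import path
from . import views

urlpatterns = [
    path("actions/<int:action_id>/", views.action_detail, name="action_detail"),
]

# views.py
from django.shortcuts import get_object_or_404, render
from .models import Action

def action_detail(request, action_id):
    action = get_object_or_404(Action, pk=action_id)
    # the template has to be named explicitly rather than inferred from the view name
    return render(request, "actions/detail.html", {"action": action})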

Django templates are both amazing and annoying. The syntax and principles are amazing: it was easy to play around with the pages, and template inheritance was really powerful for avoiding duplication. However, when I moved the application from the built-in development server to mod_python, template generation was very wobbly at picking up changes from the file system. Of course this could also have been mod_python, but it was the latest 3-series stable source compiled for the machine. I’ve used Jinja2 previously when generating HTML in Python and might be tempted to stick with it in future.

Django models are great. I hate ORMs, but I really liked the syntax for defining persistence properties, and I liked the way that the fact you are really dealing with SQL is not hidden away from you. It genuinely seemed a more convenient way of expressing the data model rather than OO wallpaper over relational data storage. I didn’t feel the need to add domain logic to the models, but I felt it wouldn’t really have polluted the model to do so either.
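The model syntax in question looks roughly like this (using the Devotee/Devotion/Action names that come up below; the specific fields are guesses):

from django.db import models

class Action(models.Model):
    title = models.CharField(max_length=200)

class Devotee(models.Model):
    name = models.CharField(max_length=100, blank=True)
    email = models.EmailField()
    actions = models.ManyToManyField(Action, through="Devotion")

class Devotion(models.Model):
    devotee = models.ForeignKey(Devotee, on_delete=models.CASCADE)
    action = models.ForeignKey(Action, on_delete=models.CASCADE)
    pledged_on = models.DateField(auto_now_add=True)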

One thing that didn’t work at all was changing the relations between models; it took me two or three attempts to finally model the relationships between the data concepts. Each time I changed a ForeignKey or ManyToMany relationship I ended up deleting the database (SQLite3) as I couldn’t figure out how to migrate from the old schema to the new.

One reason for choosing Django was the idea that I wouldn’t have to write the backend code as the admin stuff would be right there for me. It took me a while to get the 1.0 admin to fire up but once it was running it did perform as advertised. One of the attractive things about the application was that the data model followed the conceptual language of the solution in a really powerful way. You could use the admin interface to have a Devotee perform Devotion to an Action. My geek excitement peaked anyway, YMMV.
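Wiring the admin up is a couple of lines per model (again shown with today’s registration API; the models are the sketch above):

from django.contrib import admin

from .models import Action, Devotee, Devotion

admin.site.register(Action)
admin.site.register(Devotee)
admin.site.register(Devotion)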

So them’s the highlights of the experience. Overall Django delivered me a rapid web development process in an intuitive, powerful way and lived up to nearly all of the claims made on the tin. Deploying to Apache/mod_python was painful but most of the pain surrounded the infrastructure of my box (multiple versions of Python, Apache config files) and my lack of mad Apache admin skillz.

I would happily tackle another project in it again.

Perhaps of interest is how the Django development experience matches up against Rails or GAE, which would have been the other obvious choices. GAE would have been very similar, but the deployment would have been better and I wouldn’t have had any automatic admin. In retrospect it may have been a better choice for a hack-party-type event. It certainly would have been my choice for personal projects for ease of deployment, but now I have one Django app running perhaps that isn’t as relevant any more. Certainly the thing that has kept me from GAE before, the pain of data migration, doesn’t seem much better in Django (except that you control the datastore and its contents).

Compared to Rails?

  • Admin is much more awesome than scaffolding.
  • The Django ORM is much less complex than Active Record: all the data required to create, deploy and use the object is in one place. Django doesn’t have Migrations, but it has its own brand of database versioning pain.
  • RSpec is awesome (despite its monkey patching of Object) so you aren’t going to beat Rails for easy testing.
  • Django templates are more powerful and easier to use than ERB, but you have a lot of Ruby templating options so it’s hard to make a complete comparison. They are probably both similar in the sense that you can find a template library that suits your preferences; Django’s is simply the one you get out of the box.
  • Routing and Controllers are much less involved in Django than Rails.
  • Django is less opinionated about how you structure your application directories, which I like.
  • Django doesn’t bake in AJAX components but is “batteries included”; Rails probably generates better Web 2.0-style apps for less effort.
  • Finally Django uses only a few code generators because its basic structure is far less involved. It also generates far less “stuff” for each MVC element which I quite like as I don’t tend to use everything Rails generates.

Okay detailed analysis over, what’s the high-level view? Django and Rails are similar experiences but I think the major differences between them are almost what you could say about Python and Ruby. In Django you are going to get simplicity, clarity and a real choice of how you plug your infrastructure components together. In Rails you are going to get magic up front which is cool but also magic at the back end, which is not cool. Ultimately I think the answer is how opinionated you like your software. Well punk? How opinionated do you like it?
