Month notes

June 2025 month notes

Foreign keys and link tables in SQLAlchemy

Foreign keys are surprisingly difficult to define. Whereas a basic foreign key is normally unidirectional, with a parent and a child relationship, SQLAlchemy often needs you to define the attribute on both models. That creates a circular dependency, which then has to be resolved by using strings to define the object relationships as Python doesn’t support forward references yet.
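
The forward-reference problem itself can be seen without SQLAlchemy. Below is a minimal sketch in plain Python, using hypothetical Parent and Child dataclasses rather than real SQLAlchemy models: the first class has to name the second as a string because it has not been defined yet, and the string can only be resolved once both classes exist.

```python
from dataclasses import dataclass, field
from typing import List, Optional, get_type_hints


@dataclass
class Parent:
    name: str
    # Child does not exist yet at this point in the module, so the
    # annotation names it as a string (a forward reference).
    children: List["Child"] = field(default_factory=list)


@dataclass
class Child:
    name: str
    # Parent is already defined, so it can be referenced directly.
    parent: Optional[Parent] = None


def add_child(parent: Parent, name: str) -> Child:
    """Link both sides of the relationship by hand."""
    child = Child(name=name, parent=parent)
    parent.children.append(child)
    return child


# The string can only be turned into the real class after Child is defined.
resolved = get_type_hints(Parent, localns={"Child": Child})["children"]

parent = Parent("root")
child = add_child(parent, "leaf")
```

Both sides of the relationship end up pointing at each other, which is the circularity that the string reference papers over.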

Indexed Constraints in Postgres

Primary keys and unique constraints both generate associated indexes automatically. Foreign keys, while they need to reference indexed columns, do not automatically get an index of their own and potentially don’t need one until your query planner tells you that is the bottleneck. I found this last idea a bit counter-intuitive but on reflection I think it must make sense given that lookups of the parent rows already use the index on the referenced column. I guess the index may matter more if the relationship is one to many with potentially large numbers of children.
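
The split between automatic and manual indexes can be poked at locally. This is a rough sketch using Python’s built-in sqlite3 module rather than Postgres, so it only illustrates the idea; SQLite happens to treat primary keys, unique constraints and foreign keys the same way in this respect. The parent and child tables, their columns and the ix_child_parent_id index name are all made up for the example.

```python
import sqlite3


def indexed_columns(conn, table):
    """Return the set of column names covered by any index on a table."""
    columns = set()
    indexes = conn.execute(f"PRAGMA index_list({table})").fetchall()
    for index in indexes:
        index_name = index[1]
        for info in conn.execute(f"PRAGMA index_info({index_name})"):
            columns.add(info[2])  # third field is the column name
    return columns


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parent (
        id TEXT PRIMARY KEY,
        email TEXT UNIQUE
    );
    CREATE TABLE child (
        id TEXT PRIMARY KEY,
        parent_id TEXT REFERENCES parent (id)
    );
    """
)

# The primary key and unique constraint were indexed automatically.
parent_indexed = indexed_columns(conn, "parent")

# The foreign key column has no index until one is created explicitly.
child_before = indexed_columns(conn, "child")
conn.execute("CREATE INDEX ix_child_parent_id ON child (parent_id)")
child_after = indexed_columns(conn, "child")
```

In Postgres the equivalent check would be done against the system catalogs or with the query planner, which is not shown here.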

Thoughts on FastAPI

I finally did a lot of work on a FastAPI codebase recently, my first use of that framework. It is a lot like Flask but its support for WebSockets and async routes means that, depending on what you’re working with, it might be the only practical choice for your application.

FastAPI is actually an aggregate over other libraries, in particular Starlette, and as you dig through the layers you see that the key foundations are maintained by a few individuals with a very opaque governance structure.

I’ve used more obscure frameworks before but I didn’t really think it was a smart idea then and I don’t think it is great now. In theory the FastAPI layer could switch between implementations without breaking consumers but it all seems a bit more fragile than I had realised, especially when you add on some of the complications of the Python async ecosystem.

It’s made me wonder whether you’re better off sticking to synchronous until you have a situation where that can’t possibly work.

Coding with LLMs

LLM-generated code means that some relatively uncommon idioms in programming languages come up more and more often when looking at colleagues’ code. The effect is quite incongruous when you have relatively inexperienced programmers using quite sophisticated or nuanced constructions.

This month in Python I noticed use of the else clause in loops, which runs only when the loop finishes without hitting a break (a little like a finally block for iterations). It isn’t widely used because (I think) the else clause can be hard to relate to the enclosing loop and is easily confused visually with conditional clauses inside the loop.
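
For anyone who hasn’t met the idiom, here is a small sketch of a loop with an else clause. The first_negative function is a made-up example; the point is that the else block belongs to the for statement and runs only when the loop was not ended by break.

```python
def first_negative(numbers):
    """Return the first negative number, or None if there isn't one."""
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Reached only when the loop ran to completion without a break,
        # including the case where numbers is empty.
        result = None
    return result


found = first_negative([3, -1, 2, -5])
missing = first_negative([1, 2, 3])
```

The indentation is the only thing that tells you the else pairs with the for rather than the if, which is probably why it is easy to misread.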

Another was the use of sum instead of len or count. Python’s sum allows you to pass a generator expression directly to it instead of having to use an intermediate list comprehension or similar. This means you can save a bit of memory. Some programmers use this habitually but I’ve only ever really seen it where people are trying to get a bit of extra performance in special cases. Most of the iterables in code contain too few items for it to matter much and, compared to the performance gain of moving to a more recent version of Python, I’m not sure the gain is really noticeable.
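
A quick sketch of the two styles side by side, using a made-up list of words. All three lines count the same thing; only the first builds a temporary list.

```python
words = ["alpha", "be", "gamma", "pi", "epsilon"]

# Builds an intermediate list just to measure its length.
count_with_list = len([w for w in words if len(w) > 2])

# Feeds a generator expression straight into sum, so no list is created.
count_with_sum = sum(1 for w in words if len(w) > 2)

# Variant that relies on True counting as 1 and False as 0.
count_with_bools = sum(len(w) > 2 for w in words)
```

With five items the memory saving is obviously irrelevant, which is rather the point made above.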

Reading list

Programming, Python

London Python Coding Dojo February 2025

The Python coding dojo is back and this time it allows AI-assisted coding. That means some of the standard katas become trivial, so the challenges have to change: either they combine different problems in an interesting way or they pose a very hard problem that doesn’t have a standard solution.

The team I was in worked on converting image files to ASCII art (with a secondary goal of trying to create an image that would work with the character limit of old-school Twitter).

We used ChatGPT and ran the code in Jupyter notebooks. To be honest ChatGPT one-shotted the answer; clearly this is a thing that has many implementations. Much of the solution was as you would expect: reading the image and converting it to greyscale. The magic code is this line (this is regenerated from Mistral rather than the original version).

ascii_chars = "@%#*+=-:. "

This string is used to map the value of each pixel to a character. It is really the key to a good solution in terms of the representation of the image. It was also the bit of the code that went wrong when we tried to refine the solution by adding more characters, as the generated code tends not to understand that the pixel mapping depends on the length of this string. A couple of versions of the code had an indexing issue because they kept the original mapping calculation but changed the size of the string.
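
The indexing issue is easier to see in code. The following is a minimal sketch, not the code ChatGPT or Mistral produced: pixel_to_char and row_to_ascii are hypothetical helper names, and the pixel values are assumed to be 8-bit greyscale (0 to 255). Because the index is scaled by len(chars), swapping in a longer or shorter string keeps the lookup in range.

```python
ASCII_CHARS = "@%#*+=-:. "


def pixel_to_char(value, chars=ASCII_CHARS):
    """Map a greyscale value (0-255) to a character, darkest first."""
    # Scale by the length of the string so the index is always valid,
    # whatever characters are used.
    index = value * len(chars) // 256
    return chars[index]


def row_to_ascii(row, chars=ASCII_CHARS):
    """Convert one row of greyscale pixel values to a line of text."""
    return "".join(pixel_to_char(value, chars) for value in row)


line = row_to_ascii([0, 128, 255])
short_line = row_to_ascii([0, 128, 255], chars="@#. ")
```

A hard-coded divisor such as value // 26 only works for a ten-character string: with a shorter string it raises an IndexError and with a longer one the extra characters are never used.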

On the one hand the experience was massively deflating: we were probably done in 15 or 20 minutes. On the other, some of the team hadn’t used code assistance this way before so they got something out of it. Overall though I’m not sure what kind of learning experience we were having and whether the dojo format really helps build learning if you allow AI assistance.

If the problems become harder to allow for the fact that anything trivial is already in the AI databank then the step up into understanding the problem as well as the output is going to be difficult for beginners.

There’s lots to think about here and I’m not sure there are any easy answers.

Month notes

February 2025 month notes

Winter viruses knocked me about a bit this month so I wasn’t able to get out to all the tech events I had hoped to get to, and there were a few bed-bound days which were pretty disappointing.

I also have a bit of a backlog on writing up the things that I did attend this month.

Synchronising

While I pay for a few synchronisation services (SpiderOak and pCloud) their Linux integration is a bit cumbersome so I’ve been looking for simpler ways to share files between my local machines. I read a great tutorial about SyncThing. The project’s documentation on getting started with SyncThing was also pretty great.

It took less than an hour to get everything going and now I have two of my laptops sharing content in a single folder so it’s possible to move things like credential files around between them simply and hopefully securely. It doesn’t seem to be taking up any meaningful system resources so far.

I also want to spend more time with LocalSend which looks like an app version of the PWA PairDrop (cute name, based on Snap Drop). All the functionality looks good and it seems to be lightweight. I’m not quite sure why the app makes a difference over the PWA version.

Zettelkasten

This month I had a bunch of headaches with Roam Research failing logins on Linux and AppImage having a bug which meant that Obsidian and Logseq have to be run outside the sandbox. Getting things working properly was frustrating and while Roam is web-based it has no real mobile web version.

So instead I’d like to stop subscribing to Roam and figure out what I’m using it for. The knowledge connecting is still the most valuable thing compared to pure outliners or personal wikis. Both Logseq and Obsidian are good for this and currently my preference is for Logseq but I think Obsidian is better maintained and has a bigger community.

The other thing I was doing was dropping links into the daily journal for later sorting and processing. I’ve created a little web app to make this easier. Currently I’m just building a backlog but it will be interesting to see what I find useful when I do want to dig up a link.

I also started using Flatnotes deployed via PikaPods to have an indie web way of taking a note on the phone but editing and refining it on laptops.

It’s interesting that it has taken so many different services to replace Roam. Maybe that’s a sign of value but I think that I was overloading it with different functionality and I’m refining things into different workloads now.

Eleventy

Eleventy is a very cool static website builder that I would like to use as my main website generator in the long run. For now though I am still trying to learn the 11ty way (I currently use Jekyll); this month I was trying to figure out how to use data files and tags, things that power a lot of tricks in my current site.

Eleventy is ridiculously powerful because you can define data files as executable JavaScript files that read URLs or the filesystem and generate data that is then passed on to the page generation context. As an example you can read the directory where the data file is located, read the contents, filter out the directories and then generate a derived value from the directory name and use that as a data value in the rendered page.

In the past I’ve tended to use templates and front-matter in Markdown posts but with Eleventy you can use a mix of shared templates, including inheritance, and a Nunjucks page using these powerful data files and not really need to use Markdown or front-matter so much. You can also share snippets between the Nunjucks pages to get consistency where you need it but have a lot more flexibility about the way a page renders.

It is amazing how flexible the system is but it also means that as there are multiple ways to do things there can be a lot of reading to do to figure out what the best way to do something is for your context. Documentation of the basics is good but examples of approaches are spread out across example repos and people’s blogs.

Power is great but so is one obvious way of doing things.

Interesting links

It’s not a fun subject but my former colleague Matt Andrew’s post about coping with redundancy was a good read with good advice for any kind of job seeking regardless of the cause.

Ofcom is making a dog’s dinner of applying the Online Safety Act (OSA) to small communities and it seems to be down to community members to try and engage them in the problems. This writeup gives examples of the problems and pointers on how the regulator can improve.

Month notes

November 2024 month notes

Rust tools

Rust seems to be becoming the de facto standard for tooling, regardless of the language being used at a domain level. This month I’ve talked to people from Deno who build their CLI with it, and switched to the just command runner and the ruff code formatter.

It’s an interesting trend, both in terms of other language communities being more comfortable about having their tooling written in a different language and in terms of why Rust seems to have a strong showing in this area.

GitLab pipelines

I have been working a lot with GitLab CI/CD this month, my first real exposure to it. Some aspects are similar to GitHub Actions: you’re writing shell script in YAML and debugging is hard.

Some of the choices in the GitLab job environments seem to make things harder than they need to be. By default the job checks out the commit hash of the push that triggered the build in a detached (fetch) mode. Depending on the nature of the commit (in a merge request, to a branch, to the default (main) branch) you seem to get different sets of environment variables populated. Choose the wrong type and things just don’t work, hurrah!

I’ve started using yq as a tool for helping validate YAML files but I’m not sure if there is a better structural tool or linter for the specific GitLab syntax.

Poetry

I’ve also been doing some work with Poetry. As everyone has said, the resolution and download process is quite slow and there doesn’t seem to be a huge community around it as a tool. Its partial integration with pyproject.toml makes it feel more standard than it actually is, with things under the Poetry key requiring a bit of fiddling to be accessible to other tools. Full integration with the later standard is expected in v2.

Nothing I’ve seen so far is convincing me that it can really make it in its current form. The fragmentation between the pure Python tools seems to have taken its toll and each one (I’ve typically used pipenv) has problems that they struggle to solve.

RSS Feeds

One of the best pieces of advice I was given about the Fediverse was that you need to keep following people until your timeline fills up with interesting things. I’ve been trying to apply that advice to programmers. Every time I read an interesting post I’m now trying to subscribe. Despite probably tripling the number of feeds I have subscribed to my unread view is improved but still dominated by “tech journalism”. I guess real developers probably don’t post that frequently.

Lobsters has been really useful for highlighting some really good writers.

CSS

Things continue to be exciting in the CSS world with more and more new modules entering into mainstream distribution (although only having three browsers in the world is probably helping). I had a little play around with Nested Selectors and while I don’t do lots of pseudo-selectors it is 100% a nice syntax for them. In terms of scoping rules, these actually seem a bit complex but at least they are providing some modularity. I think I’m going to need to play more to get an opinion.

The Chrome developer relations team have posted their review of 2024.

Not only is CSS improving but Tailwind v4 is actually going to support (or improve support for) some of these new features such as containers. And of course its underlying CSS tool is going to be Rust-powered, natch.

Month notes

October 2024 month notes

For small notes, links and thoughts see my Prose blog.

Web Components versus frameworks

Internet drama erupted over Web Components in what felt like a needless way. Out of what often felt like wasted effort there were some good insights. Lea Verou had a good overview of the situation, along with an excellent line about standards work being “product work on hard mode”.

Chris Ferdinandi had a good response talking about how web components and reactive frameworks can be used together in a way that emphasises their strengths.

One of my favourite takes on the situation was by Cory LaViska who pointed out that framework designers are perhaps not the best people to declare the future of the platform.

Web Components are a threat to the peaceful, proprietary way of life for frameworks that have amassed millions of users — the majority of web developers.

His call to iterate on the standard and try to have common parts to today’s competing implementations was echoed in Lea’s post.

The huge benefit of Web Components is interoperability: you write it once, it works forever, and you can use it with any framework (or none at all). It makes no sense to fragment efforts to reimplement e.g. tabs or a rating widget separately for each framework-specific silo, it is simply duplicated busywork.

The current Balkanisation of component frameworks is really annoying and it is developers’ fear and tribalism that has allowed it to happen and which has sustained it.

Postgres generated UUIDs

In my work I’ve often seen UUIDs be generated in the application layer and pushed into the database. I tried this in a hobby project this month and rapidly came to the conclusion that it is very tedious when you can just have the database handle it. In Postgres a generated UUID can just be the column default and I don’t think I’m going to do anything else in future if I have a choice about it.

Python 3.13

I’ve started converting my projects to the new Python version and it seems really fast and snappy even on projects that have a lazy container spin-up. I haven’t done any objective benchmarking but things just feel more responsive than 3.11.

I’m going to have a push to set this as the baseline for all my Python projects. For my Fly projects extracting out the Python version number as a Docker variable has meant migrating has been as simple as switching the version number so far.

For the local projects I’ve also been trying to use asdf for tool versioning more consistently and it has made upgrading easier where I’ve adopted it but it seems I have quite a few places where I still need to convert from either language specific tools or nothing.

uvx

uvx is part of the uv project. I started using it this month and it’s rapidly becoming my default way to run Python CLIs. The first thing I started using it with was pg-cli but I found myself using it to quickly run pytest over some quick scripting code I’d done as well as running ad-hoc formatters and tools. It’s quick and really handy.

There’s still the debate about whether the Python community should go all-in on uv. Looking at the messy situation in Node, where all manner of build and packaging tools could potentially be used (despite the ubiquity of npm), the argument for having a single way to package and run things is strong.

Python

Django October 2024 Hackfest

This session was a little more informal than I thought it was going to be but it wasn’t time wasted as it provided an incentive to switch some of my projects over to Python 3.13 (which has been a great idea so far by the way).

As part of the suggested activities at the session I tried testing a Django template formatting tool called dJade (pronounced just Jade) (introductory post). It worked and seemed pretty good to me, although I don’t really have any complicated projects to work with and had to use some off the internet for the testing.

I used uvx to run the formatter and felt that there was something strange going on when I’m running a Rust tool to run a Rust tool and the only Python elements were a PyPI listing and the fact that it formats Python code.

The suggestions also included helping out on Narwhals, which I hadn’t heard of before but which aims to be a compatibility layer between different dataframe implementations. It seemed an interesting project but not one I have the right background to help with.

Month notes

August 2024 month notes

Co-pilot

Ever late to the party I’ve finally been using AI assisted coding on a work project. It’s been a really interesting experience, sometimes helpful and sometimes maddening.

Among the positives are that it was easy to get the LLM to translate between different number systems like rgb and hex or pixels, rems and Tailwind units.

It was pretty good at organising code according to simple rules like lexical sorting but it was defeated by organising imports according to linting rules. This makes it a great tool for organising crufty code that hasn’t been cared for in a while and has often been more powerful than pure AST-based refactoring.

At one point it correctly auto-populated stub airport code data into a test data structure, which felt like something I hadn’t seen in assistance before.

It also helped me write a bash script in a fraction of the time it would normally take. The interesting thing here was that I know a reasonable amount of bash but can never remember the proper bracketing and spacing. Although I tweaked every line that was produced it was much quicker than Googling the correct syntax or running and repeating.

What wasn’t so great was that the Co-pilot and Intellisense suggestions aren’t really differentiated in the UI, so it was really unclear which completions were the result of reflection or inference from the code and which ones were based on probability. If you’re having a field name suggested then that should only be via reflection in my view. All too often the completion resulted in an immediate check error due to the field having a slightly different name or not existing at all.

I’m almost at the point of switching off Co-pilot suggestions because they aren’t accurate enough right now.

Would I pay for this myself right now? No, I don’t think this iteration has the right UX and ability to understand the context of the code. However there will be a price point that is right in the future for things like the script writing.

Atuin

I started a new job recently and probably the most useful tool I’ve used since starting is Atuin which gives you a searchable shell history. I’ll probably write up more about my new shell setup but I think being able to pull back commands quickly has made it massively easier to cope with a new workflow and associated commands and tools.

Form Data

This little web standards built-in was the best thing to happen to my hobby coding this month. I can’t believe I’ve gone this long without having ever used it. You can pass it a DOM reference and access the contents of the form programmatically or you can construct an instance and pass it along to a fetch call.

It’s incredibly useful and great for using in small frontends.

Reading list

Gotchas in using SQLite in production: https://blog.pecar.me/sqlite-prod

Practical SVG has been published for free on the internet after publisher A Book Apart stopped distributing its catalogue.

Let’s bring about the end of countless hand-rolled debounce functions: https://github.com/whatwg/dom/issues/1298

Python packaging tool uv had a major release this month. Simon Willison shared a number of interesting observations over at the Lobsters thread on the release. I’m still uncertain about the wisdom of trying to fund developer tooling with venture capital as I don’t believe the returns are there. However I did come round to people’s arguments that the tools could be brought into community stewardship if needed. Thinking of recent licensing forks the argument seems persuasive.

I’m currently happily mimbling along with pipenv but I need to update some hobby apps to Python 3.12/3.13 soon so I think I’m going to give uv a go and see what happens.

I also started a small posts blog this month so I’m probably going to post these items there in the future.

Month notes

July 2024 month notes

Dockerising Python

Fly have changed their default application support to avoid buildpacks and provide a default Dockerfile when starting new projects. I’ve been meaning to upgrade my projects to Python 3.12 as well and when one of my buildpack projects stopped deploying I ended up spending some time on how to best package Python applications for a PaaS deployment.

I read about which distribution to use as your base image but I haven’t personally encountered those problems and my image sizes are definitely smaller with Alpine.

Docker’s official documentation is a nightmare with no two Dockerfiles being consistent in approach. This page has some commented example files under the manual tabs but there doesn’t seem to be an easy way to generate a direct link to it, which actually seems typical of my documentation experience.

There also doesn’t seem to be a consistent view as to whether an application should use the system Python or a virtual environment within the container. The latter seems more logical to me and is what I was doing previously but the default Fly configuration isn’t set up that way.

Services

I have quite a few single-user hobby web projects and I’ve been wondering if they wouldn’t work a lot better with a local SQLite datastore, but it is actually often easier to use a cloud Postgres service than it is to have a secure read-write directory available to an app and manage backups and so on yourself.

Turso is taking this idea one step further to try and solve the multi-tenancy issue by providing every client with a lightweight database.

I gave Proton Docs a whirl this month and they are pretty usable with the caveat that I haven’t tried sharing and collaboratively editing them yet. The one thing that is missing for me at the moment is keyboard shortcuts which seem pretty necessary when you’re typing.

I had previously tried de-Googling with Cryptpad, which is reasonable for spreadsheets but has a really clunky document interface compared to Google Docs, and which I ended up using more out of principle than because it was an equivalent product.

Reading list

It’s possible to get hung up on what good image description looks like but this WAI guide to writing alt text for images is straightforward and breaks down the most common cases with examples.

Smolweb is a manifesto for a smaller, lighter web which aligns for me with the Sustainable Web initiatives. There are a few interesting ideas in the manifesto such as using a Content Security Policy to stop you from including content from other sites (such as CDNs).

Following up on this theme is a W3C standard for an Ethical Web which also felt very inspiring. Or maybe depressing that some of these things need to be formulated in a common set of principles.

I also found out about the hobby Spartan protocol this month which seems like it would be a fun thing to implement and is closer to the original HTTP spec, which was reasonably easy for people to follow and implement.

Month notes

May 2024 month notes

Updating CSS

My muscle memory on CSS is full of left and right, top and bottom. The newer attributes of -inline and -block use start and end qualifiers to avoid confusion with right to left languages. This month I made an effort to try and convert my older hobby code over to the new format to try and get the new names ingrained in my memory.

Another example of things in web development that now have to be unlearnt is that target="_blank" is now safe by default. Guarding against it used to be something that was drilled into web developers.

Learning with LLMs

I had my first positive experience using an LLM-based model to learn to code something this month. It was an interesting set of circumstances that led to it really working for me where it hadn’t before.

  • I didn’t know much about the topic, therefore I didn’t know how to formulate search queries that gave me good results
  • The official documentation was complete but poorly written and organised; exploring text can be the perfect task for an LLM
  • Information was scattered over several sites, including Medium. There wasn’t one article or site that really had a definitive answer so synthesising across several sources really helped. I wanted the text of the official documentation combined with the working code from a real person’s blog post.

I used a couple of different systems but Codemate was the most helpful, followed by Google’s Gemini.

Previously I’ve been searching for information on topics that I know quite well, so instead of getting a lot of value from the information relative to any hallucinated misses, the mistakes just irritated me. Summarising data from multiple sources is genuinely an LLM superpower so this consolidation of several not great sources was probably right in its sweet spot.

URL exploring and saving

I needed to build up some queries on a system’s API this month. I decided to give Slumber a go after trying some local Postman-style clones.

The tool is a TUI and uses a YAML file as its store and dynamically syncs the UI when the file is saved. There were a couple of issues; for example it would be helpful to be able to save the content of a response to file and if something is marked sensitive (like the bearer token) then I would prefer to see it masked in the UI.

Overall though I got what I needed done and the system was a lot easier than most web-based GUI tools that I’ve used as the underlying storage and its relation to the interface is really clear.

Also a shout out to chains. Initially these seemed to be an example of making simple things complicated but as I understood them more I found they are amazingly powerful for coordinating setups for calls.

Community events

I went to the May Day Data Science event for the first time. It seems the best talks were in rooms that had the least capacity and there was a strict no standing rule. Despite this I did pick up some useful bits and pieces, in particular around prompt design.

I also went to the Django Meetup held at the Kraken offices and was really struck by what a great engineering team they have built up there. Dave Seddon gave a great introduction to the “native library escape hatch” that exists in Python, this time showing how to bring in Rust code to help execution time.

I also went to the Python Meetup this month and spent a day in Milton Keynes at the Juxt 24 conference which had a lot of interesting talks and where I could have spent a lot more time at the afterparty.

Python

London Python meetup May 2024

The meetup was held at Microsoft’s Reactor offices near Paddington which have a great view down the canal towards Maida Vale. Attendees got an email with a QR code to get in through the gate which all felt very high-tech.

The first talk was not particularly Python related but was an introduction to vector databases. These are having a hot moment due to the way that machine learning categorisation maps easily into flat vectors that can then be stored and compared through vector stores.

These can then be used to complement LLMs through Retrieval Augmented Generation (RAG), which combines the LLM’s ability to synthesise and summarise content with more conventional search index information.
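
The comparison step at the heart of a vector store can be sketched in a few lines. This is a toy illustration in plain Python, not how any particular vector database or langchain works: the three-dimensional embeddings and document labels are made up, whereas real embeddings come from a model and have far more dimensions. In a RAG setup the retrieved documents would then be handed to the LLM as extra context.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical document embeddings, keyed by a label for the document.
STORE = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.2, 0.1],
    "tax returns": [0.0, 0.1, 0.9],
}


def nearest(query, store, k=1):
    """Return the k labels whose vectors are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda label: cosine_similarity(query, store[label]),
        reverse=True,
    )
    return ranked[:k]


pet_matches = nearest([1.0, 0.0, 0.0], STORE, k=2)
finance_matches = nearest([0.0, 0.0, 1.0], STORE)
```

A real store replaces the sorted scan with an approximate index so that it does not have to compare the query against every stored vector.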

It was fine as it went and helped demystify the way that RAG works but probably this langchain tutorial is just as helpful as to the practical application.

The second talk was about langchain but was from a Microsoft employee who was demonstrating how to use Bing as an agent augmentation in the Azure hosted environment. It was practical but the agent clearly spun out of control in the demo and while the output was in the right ballpark I think it illustrated the trickiness of getting these things to work reliably and to generate reliable output when the whole process is essentially random and different each run.

It was a good shop window into the hosted langchain offering but could have done more to explore the agent definition.

The final talk was by Nathan Matthews, CTO of Retrace Software. Retrace allows you to capture replay logs from production and then reproduce issues in other environments. Sadly there wasn’t a demo but it is due to be released as open source soon. The talk went through some of the approaches that had been taken to get to the release. Apparently there is a “goldilocks zone” for data capture that avoids excessive log size and performance overhead. This occurs at the library interface level with a proxy capture system for C integration (and presumably all native integration). Not only is lower-level capture chatty but capturing events at a higher level of abstraction makes the replay process more robust and easier to interact with.

The idea is that you can take the replay of an issue or event in production, replay it on a controlled environment with a debugger attached to try and find out the cause of the issue without ever having to go onto a production environment. Data masking for sensitive data is promised which then means that the replay logs can have different data handling rules applied to them.
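
To make the record and replay idea concrete, here is a very rough sketch of capturing calls at an interface boundary. It is not Retrace’s implementation, which wasn’t shown: Recorder, Replayer, Counter and handle_request are all hypothetical names, and the sketch only handles simple method calls on a single object.

```python
class Recorder:
    """Proxy that forwards calls to a real object and logs the results."""

    def __init__(self, target, log):
        self._target = target
        self._log = log

    def __getattr__(self, name):
        method = getattr(self._target, name)

        def call(*args, **kwargs):
            result = method(*args, **kwargs)
            self._log.append((name, args, kwargs, result))
            return result

        return call


class Replayer:
    """Stand-in that answers calls from a recorded log, in order."""

    def __init__(self, log):
        self._pending = list(log)

    def __getattr__(self, name):
        def call(*args, **kwargs):
            rec_name, rec_args, rec_kwargs, result = self._pending.pop(0)
            if (rec_name, rec_args, rec_kwargs) != (name, args, kwargs):
                raise RuntimeError(f"replay diverged at call to {name!r}")
            return result

        return call


class Counter:
    """Hypothetical stateful dependency whose answers change every call."""

    def __init__(self):
        self._value = 0

    def next_id(self, prefix):
        self._value += 1
        return f"{prefix}-{self._value}"


def handle_request(ids):
    """Application code that only talks to the dependency's interface."""
    return [ids.next_id("order"), ids.next_id("invoice")]


log = []
live_result = handle_request(Recorder(Counter(), log))  # the "production" run
replayed_result = handle_request(Replayer(log))  # no real Counter involved
```

The replayed run produces the same output from the log alone, so it could be stepped through with a debugger without touching the original dependency.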

Nathan pointed out that our current way of dealing with unusual and intermittent events in production is to invest heavily in observability (which often just means shipping a lot of low-level logging to a search system). The replay approach seems to promise a much simpler way of analysing and understanding unusual behaviour in environments with access controls.

It was interesting to hear about poking into the internals of the interpreter (and the OS) as it is not often that people get a chance to do it. However the question of what level of developer access to production is appropriate is the bigger problem to solve and it would be great to see some evidence of how this works in a real environment.
