Month notes

December 2024 month notes

Not a whole lot to report on due to this being the holiday season.

Colab

I started using Google’s Colab for quick Python notebooks. It’s pretty good and the notebook files integrate into regular Drive. Of course there is always the fear that Google will cancel it at a moment’s notice, so I might look at the independent alternatives as well.

I’ve been looking at simulation code recently and it has been handy to run things outside a local setup and across laptops.

tRPC

You can’t spend much time in the TypeScript world without using Zod somewhere in your codebase. Zod was created by Colin McDonnell, and this month I read an old blog post of his introducing the ideas behind tRPC. The essay is really interesting as it identifies a lot of problems that I’ve seen with GraphQL usage in projects (and, to be fair, with some OpenAPI-generated code as well).

It is quite rare to see a genuine REST API in the commercial world; it is more typical to see a REST- and HTTP-influenced one. GraphQL insists on separating the concepts of read (query) and write (mutation), which makes it more consistent than most REST-like interfaces, but it completely fails to make use of HTTP’s rich semantics, which leaves things like error handling as a bit of a joke.

Remote Procedure Calls (RPC) preceded both REST and GraphQL, and while the custom protocols and stub generators were dreadful, the mental model associated with RPC is actually pretty close to what most developers do with both REST and GraphQL: they execute a procedure and get a return result.

Most commercial-world REST APIs are actually a kind of RPC over HTTP using JSON. See the aside in the post about GraphQL being RPC with a schema.

Therefore the fundamental proposition of the post seems pretty sound.

The second strong insight is that sharing type definitions is far preferable to, and less painful than, sharing generated code or creating interface code from external API definitions (I shudder when I see a comment in a codebase that says something like “the API must be running before you build this code”). This is a powerful insight, but one that doesn’t have a totally clean answer in the framework.

Instead the client imports the type definition directly from the server, which requires the server codebase to be available locally in some agreed location. I do think this is better than scraping a live API, and sharing type code is clearly less coupled than sharing data structures, but I’m not sure it is quite the panacea being claimed.
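A minimal sketch of the type-sharing idea, assuming a single shared repository; this is not tRPC’s actual API, and the route names and in-process “transport” are invented for illustration:

```typescript
// --- server side (imagine this lives in the server codebase) ---
// The implementations stay on the server; only the *type* crosses over.
const routes = {
  greet: (name: string) => ({ message: `hello ${name}` }),
  add: (a: number, b: number) => ({ sum: a + b }),
};
export type Routes = typeof routes;

// --- client side (imagine: import type { Routes } from "../server/routes") ---
// `import type` is erased at compile time, so no server code is bundled;
// the client only gains compile-time knowledge of the signatures.
type AddResult = ReturnType<Routes["add"]>; // { sum: number }

// A toy caller, type-checked against the server's signature. In reality the
// transport would be an HTTP request; here it is faked in-process.
function callAdd(a: number, b: number): AddResult {
  return routes.add(a, b);
}

console.log(callAdd(2, 3).sum); // prints 5
```

Because only a type is imported, renaming a server procedure or changing a parameter type becomes a compile error in the client rather than a runtime surprise.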

What it undoubtedly does improve on is generated code. Generated code is notoriously hard to read, leads to arguments about whether it should be version controlled or not, and when it goes wrong there is almost inevitably the comparison dance between developers who have working generated code and those who don’t. Having a type definition that is version controlled and located in one place is clearly a big improvement.

I’ve only seen a few mentions of commercial use of tRPC and I haven’t used it myself. It is a relatively small, obscure project, but I’d be interested in reading production experience reports because, on the face of it, it does seem to be a considered improvement over pseudo-REST and GraphQL interfaces.

God-interfaces

The article also reminded me of a practice that I feel might be an anti-pattern, but which I haven’t had enough experience of so far to say for sure: taking a generated type of an API output and using it as the data type throughout the client app. This is superficially appealing: it is one consistent definition shared across all the code!

There are generally two problems I see with this approach. The first is protocol cruft (which seems to be more of a problem with GraphQL and automagic serialisation tools), which is really just a form of leaky abstraction. The second is that if the data type is the response structure of a query-style endpoint, then it often has a mass of optional fields that continuously accrue as new requirements arrive.

You might be working on a simple component to do a nicely formatted presentation of a numeric value, but what you’re being passed is twenty-plus fields, any of which might be absent or have complex dependencies between one another.

What I’ve started doing, and now prefer, is to isolate the “full fat” API response at the root component or a companion service object. Every other component in the client should use a domain-typed definition of its interface.

Ideally the naming of the structures in the API response and the client components would allow each domain interface to be a subset of the full response (or responses, if the component is used across different endpoints).

In TypeScript terms this means components define interfaces for their props; thanks to structural typing, passing the full response object to the component still works, but the code only needs to describe the data actually being used.
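As a sketch, with invented field names: the wide response type stays at the root, and the formatting component declares only the two fields it reads. TypeScript’s structural typing means the full response object is still assignable to the narrower props type.

```typescript
// The "full fat" response type, as a generated API type might look
// (fields are illustrative; real ones tend to number twenty-plus).
interface AccountResponse {
  id: string;
  balance?: number;
  currency?: string;
  createdAt?: string;
  marketingOptIn?: boolean;
}

// The component declares only what it actually uses.
interface BalanceProps {
  balance?: number;
  currency?: string;
}

function formatBalance({ balance, currency }: BalanceProps): string {
  if (balance === undefined) return "n/a";
  return `${balance.toFixed(2)} ${currency ?? ""}`.trim();
}

// Structural typing: the wide response is accepted where the narrow
// props type is expected, with no casting or mapping code.
const response: AccountResponse = { id: "a1", balance: 12.5, currency: "GBP" };
console.log(formatBalance(response)); // prints "12.50 GBP"
```

The extra fields on `response` are invisible inside `formatBalance`, so adding yet another optional field to the API type cannot break the component.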

My experience is that this has led to code that is easier to understand, is easier to modify and is less prone to breaking if the definition of the API response changes.

The death of the developer

I’ve also been reading Steve Yegge’s post The Death of the Stubborn Developer a lot. Steve’s historical analysis has generally been right, which gives me a lot of pause for thought here. He is obviously quite invested in the technology that underpins this style of development, though, and I worry that it is the same kind of sales hustle that was involved in crypto: if people don’t adopt this, how is the investment in this kind of assisted coding going to be recouped?

Part of what I enjoy about coding is the element of craft involved in putting together a program and I’m not sure that the kind of programming described in the post is the kind of thing I would enjoy doing and that’s quite a big thing given that it has been how I’ve made a living up until now.

Standard
Programming, Work

August 2023 month notes

I have been doing a GraphQL course that is driven by email. I can definitely see the joy of having autocompletion on the types and fields of the API interface. GraphQL seems to have been deployed way beyond its initial use case, and it will be interesting to see whether it’s a golden hammer or genuinely works better than REST-based services outside its original role as an abstraction layer for frontends. It is definitely a complete pain in the ass compared to HTTP/JSON for hobby projects: having to ship a query executor and client is just way too much effort compared to REST, and more so again if you’re not building a JavaScript app interface.

I quite enjoyed the course, and would recommend it, but it mostly covered creating queries so I’ll probably need to implement my own service to understand how to bind data to the query language. I will also admit that while it is meant to be quite easy to do each day I ended up falling behind and then going through half of it on the weekend.

Hashicorp’s decision to change the license on Terraform has caused a lot of anguish on my social feeds. The OpenTerraform group has already announced that it will be creating a fork and is also promising to have more maintainers than Hashicorp. To some extent the whole controversy seems like a parade of bastards, and it is hard to pick anyone as being in the right, but it makes most sense to use the most open execution of the platform (see also Docker and Podman).

In the past I’ve used CloudFormation and Terraform; if I were just using AWS I would probably be feeling smug in the security of my vendor lock-in, but Terraform’s extensibility via its provider mechanism means you can control a lot of services via the same configuration language. My current work uses it inconsistently, which is probably the worst of all worlds, but for the most part it is the standard for configuring services and does have some automation around its application. Probably the biggest advantage of Terraform is to people switching clouds (like myself), as you don’t have to learn a completely new configuration process, just the differences in the providers and the format of the stanzas.

The discussion of the change made me wonder if I should look at Pulumi again, as one of the least attractive things about Terraform is its bizarre status as not quite a programming language, not quite Go, and not quite a declarative configuration. I also found out about Digger, which is attempting to avoid having two CI infrastructures for infrastructure changes. I’ve only ever seen Atlantis used for this, so I’m curious to find out more (although it is such an enterprise-level thing that I’m not sure I’ll do much more than have an opinion for a while).

I also spent some time this month moving my hobby projects from Dataset to basic Psycopg. I’ve generally loved using Dataset as it hides away the details of persistence in favour of passing dictionaries around. However, it is a layer over SQLAlchemy, which is itself going through some major point revisions, so the library in its current form is stuck with older versions of both the data interaction layer and the driver itself. I had noticed that for one of my projects queries were running quite slowly, and comparing query times run directly against the database with those going through the interface, it was notable that some queries were taking seconds rather than microseconds.

The new version of Psycopg comes with a reasonably elegant set of query primitives that work via context managers, and it also allows results to be returned in a dictionary format that is very easy to combine with NamedTuples, which makes it quite easy to keep my repository code consistent with the existing application code while completely revamping the persistence layer. Currently I have replaced a lot of the inserts and selects, but the partial updates are proving a bit trickier, as Dataset is a bit magical in the way it builds up the update code. I think my best option would be to create a small SQL builder or adapt something like PyPika, which I’ve used in another of my projects.

One of the things that has surprised me in this effort is how much the official Python documentation does not appear in Google search results. Tutorial-style content farms have started to dominate the first page of results, and you have to add a search term like “documentation” to surface it now. People have been complaining about Google’s losing battle with content farms, but this is the first personal evidence I have of it. Although I always add “MDN” to my JavaScript and CSS searches, so maybe this is just the way of the world now: you have to know what the good sites are to find them…
