Data wrangling with Clojure

Clojure is a great language for wrangling data that is either awkwardly sized or needs to be drawn from and stored in different locations.

What does awkward-sized data mean?

I am going to attribute the term “awkward-sized data” to Henry Garner and Bruce Durling. Awkward-sized data is neither big data nor small data and, to avoid defining something by what it is not, I would define it as bigger than would fit comfortably into a spreadsheet and irregular enough that it is not easy to map onto a relational schema.

It is hundreds of thousands of data points rather than millions: data sets that fit into memory on a reasonably specified laptop.

It also means data where you need to reconcile data between multiple datastores, something that is more common in a microservice or scalable service world where monolithic data is distributed between more systems.

What makes Clojure a good fit for the problem?

Clojure picks up a lot of good data processing traits from its heritage as a LISP. A LISP after all is a “list processor”: the fundamental structures of the language are data and its key functionality is parsing and processing those data structures into operations. You can serialise data structures to a flat file and back into memory purely through the printer and reader, without the need for parsing libraries.

Clojure has great immutable data structures with great performance, a robust set of data processing functions in its core library (along with parallel execution versions) and well-defined transactions on data. Its sequences are, unusually, lazy by default, which means it can do powerful calculations with a minimal amount of memory usage. It has a lot of great community libraries and also Java compatibility if you want to use an existing Java library.
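The memory benefit of laziness is not specific to Clojure. As a rough illustration only, here is a Python analogue in which generators stand in for Clojure's lazy sequences; the function names and the sample records are made up for the sketch.

```python
from itertools import islice

def readings():
    """Yield an endless stream of made-up records, one at a time."""
    n = 0
    while True:
        yield {"id": n, "value": n * 3 % 7}
        n += 1

def high_values(records, threshold):
    """Lazily keep only records whose value is at or above the threshold."""
    return (r for r in records if r["value"] >= threshold)

# Nothing is computed until items are requested, so an infinite source is fine
# and only one record is held in memory at a time.
first_three = list(islice(high_values(readings(), 5), 3))
print([r["id"] for r in first_three])
```

Only as many records as are needed to produce three matches are ever generated, which is the same property that lets lazy Clojure pipelines run over data larger than you would want to hold in memory at once.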

Clojure also has an awesome REPL which means you have a powerful way of directly interacting with your data and getting immediate feedback on the work you are doing.

Why not use a DSL or a specific datastore?

I will leave the argument as to why you need a general purpose programming language to Tommy Hall; his talk about cloud infrastructure DSLs is equally relevant here. There are things you reasonably want to do and you can either add them all to a DSL until it has every feature of a poorly thought-out programming language or you can start directly with the programming language.

For me the key thing that I always want to do is read or write data, either from a datastore, file or HTTP/JSON API. I haven’t come across a single data DSL that makes it easier to read from one datastore and write to another.

Where can I find out more?

If you are interested in statistical analysis a good place to start is Bruce Durling’s talk on Incanter which he gave relatively early in his use of it.

Henry Garner’s talk Expressive Parallel Analytics with Clojure has a name that might scare the hell out of you but, trust me, this is actually a pretty good step-by-step guide to how you do data transformations and aggregations in Clojure and then make them run in parallel to improve performance.

Libraries I like

In my own work I lean on the following libraries a lot.

JSON is the lingua franca of computing and you are going to need a decent JSON parser and serialiser. I like Cheshire because it does everything I need, which is primarily to produce sensible native data structures that are as close to native JSON structures as possible.

After JSON the other thing that I always need is access to HTTP. When you are mucking around with dirty data the thing I’ve found most frustrating is libraries that throw exceptions whenever you get something other than a status code of 200. clj-http is immensely powerful but you will want to switch off exceptions. clj-http-lite only uses what is in the JDK, so makes for easier dependencies, but again you need to switch off exceptions. Most of the time the lite library is perfectly usable; if you are just using well-behaved public APIs I would not bother with anything more complicated. For an asynchronous client there is http-kit. If you want to make simultaneous requests async can be a great choice but most of the time it adds a level of complexity and indirection that I don’t think you need. With http-kit you don’t need to worry about exceptions but do remember to add a basic error handler to avoid debugging heartache.

For SQL I love yesql because it doesn’t do crazy things and instead lets you write and test normal SQL and then use it inside Clojure programs. In my experience this is what you want to do 100% of the time and not use some weird abstraction layer. While I will admit to being lazy and frequently loading the queries into the default namespace it is far more sensible to load them via the require-sql syntax.

One thing I have had to do a bit of is parsing and cleaning HTML and I love the library Hickory for this. One of the nice things is that because it produces a standard Clojure map for the content you can use a lot of completely vanilla Clojure techniques to do interesting things with the content.

Example projects

I created a simple film data API that reads content from an Oracle database and simply publishes it as JSON. This uses Yesql and is really just a trivial data transform that makes the underlying data much more usable by other consumers.

id-to-url is a straightforward piece of data munging but requires internal tier access to the Guardian Content API. Given a bunch of internal id numbers from an Oracle database we need to check the publication status of the content and then extract the public url for the content. Ultimately, in the REPL, I write the URLs to a flat file.

Asynchronous and Parallel processing

My work has generally been IO-bound so I haven’t really needed to use much parallel processing.

However if you need it then Rich Hickey does the best explanation of reducers and why reduce is the only function you really need in data processing. For transducers (in Clojure core from 1.7) I like Kyle Kingsbury’s talk a lot and he talks about Tesser which seems to be the ultimate library for multicore processing.

For async work Rich, again, does the best explanation of core.async. For IO async ironically is probably the best approach for making the most of your resources but I haven’t yet been in a situation where


First impressions of Kotlin

Kotlin is one of the next-generation languages that builds on top of Java. It’s kind of a post-Scala and Groovy language that comes from JetBrains and therefore has a lot of static functionality that enables great tooling to be built on top of it.

It has been in development for a while but it is now getting a big push in terms of marketing as it approaches version one. I have noticed this a lot in terms of Android development, where Google and Oracle’s legal wrangle over the JDK code used in Android applications offers an opportunity for people who want great bytecode compatibility and post-Java 6 features but who cannot upgrade their Java version.


This blog post is purely based on going through the tutorials and koans for Kotlin and not on any production experience. It is more a summary of my initial evaluation of whether to spend more time with this language.

Key features

Kotlin aims to have great interoperability with Java but aims to reduce boilerplate coding and eliminate certain classes of error within pure Kotlin code.

The Java legacy

Kotlin’s symbiotic relationship with Java means that fundamentally you have a language that has all of Java’s quirks and legacy and adds to it a new layer of syntax and complexity. Essentially Kotlin is syntax-sugar on Java so deep that it is like the inch-high frosting on a cupcake.

Scala has also had a strong influence on Kotlin but disappointingly this means that many of the quirky aspects of Scala have been transplanted to Kotlin, most particularly Scala’s val and var system for maintaining compatibility with Java’s fundamentally mutable variable system.

Like a lot of object-orientated languages with lambda support, functions like filter or map are on the data and take a lambda. So you chain operations together in a trainwreck-style or if you don’t like that then you have to introduce intermediate variables. I prefer collection manipulations to be their own standalone functions which take a sequence or iterable and the lambda. This allows partial or deferred application.
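To make the preference for standalone functions concrete, here is a small sketch in Python rather than Kotlin, since Python's built-in map and filter already take the function and the iterable as plain arguments. The helper names are invented for the example.

```python
from functools import partial

def is_even(n):
    return n % 2 == 0

def square(n):
    return n * n

# Because filter and map are standalone functions, the operation can be fixed
# now and the data supplied later (partial application).
keep_evens = partial(filter, is_even)
square_all = partial(map, square)

def pipeline(numbers):
    """Apply the deferred steps to whatever iterable eventually arrives."""
    return list(square_all(keep_evens(numbers)))

result = pipeline(range(1, 7))
print(result)
```

With method-style chaining the collection has to exist before the first call can be written, whereas here keep_evens and square_all are ordinary values that can be passed around and combined before any data turns up.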

What’s good about Kotlin?

Kotlin has all the higher-order function functionality that you would expect along with a straight-forward declaration and package-style namespacing.

It has some “annotation functions” that allow you to package data objects in the same way as Scala case classes.

If you limit yourself to functions and data then you have a compact language with the power to do meaningful work.

It reminds me a lot of Groovy but is typed and compiled and is more in the camp of “if it compiles it will work”.

Unsurprisingly the tooling in IntelliJ is excellent and it is easy to write and navigate around the code.

The extension functions allow a way of enhancing or bespoking code you don’t own that is more elegant than Scala’s implicit magic. The function declarations attach to the type and compiler magic introduces an implicit this. By comparison with implicit there is much less runtime magic and if you are using IntelliJ then the declarations are easy to navigate.

The type system

Over half the koans are concerned with type-compatibility with Java, in particular issues with generics and extension methods. Type inference didn’t seem that good or bad; you have to declare the types of parameters and the return type of functions, which is par for the course. I didn’t come across any confusing type errors although the extension methods sometimes had confusing scoping issues if I didn’t declare them correctly.

Rather like Groovy, Kotlin has decided to retain null compatibility with Java but uses nullable types (written T?) and some built-in operators to allow some type-safety around nulls. I found the new operators to be more confusing than simple null-checking as they do some type-changing from T? to T conditional on the value being non-null, otherwise the expression doesn’t get evaluated.

In theory this means you write code that accesses nested, potentially null attributes of an object in a single line without risking a Null Pointer Exception. In practice though it seemed just as likely that the code execution would get vetoed which meant that you have a subtle code branch after each use of a null-checking operator.

I’m not sure the special operators added any real value over the normal API of an Option type such as Scala’s; they are less explicit in their behaviour and they really seem more concerned with reducing line count when interacting with legacy code.

So most of Kotlin’s typing seems concerned with retro-fitting fixes to the underlying Java type system. It certainly doesn’t seem to have a declared interest in having more sophisticated or powerful types.

Final thoughts: Scala versus Kotlin

Scala in many ways is much more ambitious than Kotlin but in outcomes they are very similar. Both fundamentally want to retain compatibility with Java including mutable variables, null, mutable collections and the Java type system. Both add higher-order functions and a system for extending code that you don’t own.

Obviously Scala is the earlier language and therefore a lot of what Kotlin is doing is feature matching.

The thing that separates them is really what purpose you are using them for. If you are looking for an actively developed language that is fundamentally an enhanced Java with modern features then Kotlin has better tooling and a more explicit extension system.

If you are looking for a richer type system that allows you to express behaviour as the result of the application of types or you are into category theory then Kotlin isn’t going to do anything for you and Scala is still the better choice.


The inevitability of ad-blocking

As I work in the content industry I’ve always felt bad about installing ad-blocking software. I’ve always felt that adverts were part of the deal of having free content.

Recently I have started to use ad-blockers in some of my browser sessions and the reason is almost purely technical: adverts were wrecking my power consumption and hogging my CPU.

The issue is naturally acute on smartphones, which is why Apple is starting to allow ad-blocking on iOS Safari, but my recent problems have actually been on laptops. I have an aging Chromebook which you might expect to have problems but I have also found that in the last six months my pretty powerful dev laptop has also been going into full-fan power drain mode, often resulting in less than two hours of battery life.

At first I thought the issue was simply that I am a total tab monster, keeping open loads of pages and referring to them while coding or researching things.

However by digging into the developer tools and the OS monitors it became apparent that just a few of my tabs were causing all these problems (swap file paging I still have to put my hands up to) and all of them were running visually innocuous ads that were taking up vast quantities of CPU and memory.

With no way of telling whether any given webpage is going to kill my computer or not, the only sane response is to not take the risk and install an ad-blocker.

Since installing them (I’ve been using uBlock) I have indeed obtained longer battery life and fewer memory crashes on my Chromebook.

While I am still worried about how we can pay for high-quality open web content in a world without ads there is no tenable future for an open web that clients cannot viably run.

In my personal web usage I prefer to pay for the services I use and rely on. For those that I’m uncertain of I’m happy to trial and therefore to be the product rather than the customer.

In these situations though I am really dealing with the web as an app delivery platform. For content production there needs to be something better than the annual fundraising drive.

Frustratingly there is also a place for ads. Without advertising everything becomes (online) word of mouth. There’s a positive case to be made for awareness-based advertising. I want to do it myself around recruitment as part of my work.

These adverts though are really nothing more than pictures and words. They shouldn’t be things that are taxing the capabilities of your hardware.

Advertisers are bringing this change on themselves. If they can’t find a way to square their needs and those of the people they are trying to reach then there isn’t going to be an online advertising market in nine months’ time and that might mean some big changes to the way the web works for everyone.


An overview of Javascript reactive frameworks

This post is only meant to be a snapshot of the current state of the various DOM virtualising webframeworks that are around. I’m partly publishing it to try and discover more that I may not be aware of.

Many of these frameworks trace an ancestry back to Om and React. However each one tries to deal with perceived problems with the original frameworks. The most common being that React is too heavy and opinionated while not providing a consistent data model for components. Om on the other hand is in Clojurescript and therefore represents too much to learn in terms of a new language and build process.


Most of the libraries build on a few common building blocks that I’m not going to elaborate on here. Virtualdom was an early attempt to separate the core idea of React from the rest of the library code. Virtualdom is only concerned with creating, manipulating and stringifying DOM structures in-memory. Browser DOM APIs involve linking to the actual rendered document so managing virtual DOM is more efficient and simpler because you’re not interacting with these underlying libraries.

ImmutableJS provides a Javascript-idiom interpretation of the Clojure data structures that Om uses (and which are available as the standalone library Mori).


The first interesting framework to discuss is Omniscient, which as its name suggests is heavily influenced by Om but is written in Javascript and therefore does not require you to learn Clojure to use the same techniques that Om uses. Omniscient is built on top of React and ImmutableJS and uses its own library Immstruct to add reference cursors to ImmutableJS structures. Reference cursors allow a component to observe and change sections of a data structure without having to manipulate the whole thing. So for example a component can be given a single sub-key in an object that represents its state and it cannot access or change anything that is not under that key. The code can also be simplified to behave as if the sub-key was actually just the whole data object.
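The reference cursor idea can be sketched in a few lines. This is a hypothetical Python illustration of the concept, not Immstruct's API: Immstruct works over immutable ImmutableJS structures, while this sketch uses a plain mutable dict to keep things short, and the Cursor class and its methods are invented names.

```python
class Cursor:
    """Hypothetical reference cursor: a view onto one sub-key of a shared tree."""

    def __init__(self, root, path):
        self._root = root
        self._path = tuple(path)

    def _node(self):
        # Walk from the root down to the sub-tree this cursor is scoped to.
        node = self._root
        for key in self._path:
            node = node[key]
        return node

    def get(self, key):
        return self._node()[key]

    def set(self, key, value):
        self._node()[key] = value

app_state = {"sidebar": {"open": False}, "editor": {"text": "hello"}}

# The sidebar component only ever receives this cursor, never app_state itself.
sidebar = Cursor(app_state, ["sidebar"])
sidebar.set("open", True)
print(app_state)
```

The component code reads and writes "open" as if the sidebar dict were the whole state, and the interface it is handed offers no route to the "editor" branch.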

Omniscient doesn’t suggest an alternative to Om’s CSP, instead providing a mechanism for passing event flow functions down the component tree. You’re free to choose your own event libraries. It also means that you’re free to make your own mistakes here as no guidance is really given as to how to structure your event scheme appropriately.

Omniscient is one of the earliest frameworks to re-implement Om and therefore has one of the better sets of documentation on its Github pages. That said there’s not a lot of documentation and the framework does not have a massive community. The situation is worse in most of the other frameworks though so this might tip you over in favour of Omniscient.


This is a bit of a Guardian shout out as the primary developer Rich Harris is a Guardian interactive developer.

Ractive (Github) is a little bit different from the other frameworks as you can essentially think of it as Mustache templates backed by Observables. You declare a data-binding and write templates in normal Mustache syntax but behind the scenes Ractive is driven by changes in the data and then writes new sections of DOM in-memory according to what has changed rather than DOM diff’ing.

Also Ractive sticks with two-way databinding rather than unidirectional data flow so failures in synchronisation or rendering can be problematic.

If what you want to do is render content over a Javascript data model then there is a lot in Ractive that is very compelling. It uses templates with a standard syntax that is well understood and is a soup-to-nuts framework that sticks to core Javascript syntax and features. However if you want to use your own event or data model you are out of luck.


Mercury on the other hand prides itself on modularity. A microframework, it attempts to create a glue layer that allows other libraries to interact in a sensible and consistent way. The default components are Virtualdom and its own observer pattern to wrap state.

Mercury’s biggest problem right now is its lack of documentation. There is an expectation that you are going to read the source code to understand what the framework is doing and how to interact with the API. I frankly think this is unrealistic. The project doesn’t currently supply the incentive to do that. Unless you have a very particular desire to avoid any framework lock-in or you want to use a very specific combination of libraries that is not supported elsewhere it’s hard to understand why you would invest your effort here rather than in frameworks that offer more support.


Cycle is similarly experimental. Its biggest claim is that it is truly reactive and that the rendered page is purely the result of change in state. The introduction is couched in computer science theory but it would seem that at its heart Cycle wraps RxJS and Virtualdom in a glue layer that has the programmer writing the transform sequence between the event and the DOM structure.

I think it is a positive feature that Cycle re-uses a popular library to manage its state-transitions rather than implementing yet another custom version of the Observable pattern. It also makes the framework easier to get started with if you are familiar with Rx.

Using established libraries also makes the lack of documentation more acceptable as the Cycle readme only needs to explain how the glue works in the framework.

As something built on reactivity you have to get used to dealing with intermediate state, which can be a bit difficult for the beginner.

Essentially any event where the user would expect feedback means you need to write the conditional structure in the output. So if the user types a character in an input box then you need to write the value of the input box to be the characters the user has typed so far. Most frameworks work at a higher level of abstraction or rather they map closer to the DOM APIs, so getting a working application means grokking the way the dataflow works.

If you’re looking for purity (and a resulting simplicity in implementation) but don’t want to have to learn a bespoke API, Cycle is nicely positioned.


WebRx is similarly built on top of RxJS Observables but is a much fuller-fat framework that is much more a spiritual successor to Knockout than owing much to the influence of Om or React.

Rather like React, WebRx doesn’t really provide generalised event handling but instead has special sauce bindings for DOM events and a MessageBus system built over Rx.

It is also written in Typescript and generally looks to play well within the Microsoft ecosystem. It’s interesting to me as an example of how different a language has to be before it’s regarded as a barrier. Clearly the use of Typescript means there are people who will refuse to use the framework regardless of whether it works for their use case. Other people are going to be attracted exactly because it uses Typescript.


Language choices are also interesting in Deku which is another attempt to re-implement React in a superficial way.

Deku makes use of ES6 and 7 features and doesn’t aim to support a broad range of browsers (unlike say Ractive). Again that is going to rule it out for some people but this is more interesting as now we are within dialects of the same core language. Language choice for implementing frameworks is not straightforward. What are you looking for? Conciseness? Editor support?

Deku aims to take the dom diffing approach but avoid getting caught in React’s framework and approach. In particular components are defined just as Javascript objects rather than classes and instances, something I think makes it more elegant than normal React Components.

It does however still use JSX which is quite interesting as the framework claims to be taking a functional approach but actually uses a DSL for all its DOM construction.

The lifecycle hooks are slightly different with more hooks for different stages of the process and Deku uses some interesting function passing to send changed data down the tree to components.

Deku doesn’t take much influence from Om though. It doesn’t have sophisticated event handling and uses mutable data with generous access and callbacks on data write to do re-renders. This means bugs and state issues are no less likely to happen than with any other framework. It does adopt the single atom idea with a single tree representing the app and the app renderer being bound to the body element.

As such if you like the idea of React but don’t want to be bound into its concept of how a Component should be defined but do like JSX and trust the implementors to create a better dom diff than Facebook or Virtualdom, this is the project for you.


I’ve only chosen a handful of frameworks to look at here, mostly based on the ones I know. I’m expecting people to point out more in the comments. I also haven’t used all of these frameworks. Road-testing all of them would be a bigger task than just trying to describe the design choices they’ve made.

The most common pattern is to try and improve the rendering time versus React by using different virtual dom difference algorithms. Usually this is combined with Observed variables that provide a Reactive component that allows changes in the data model to be conveyed to the DOM model with no coding required.

Few of the frameworks engage with the functional reactive programming paradigm by building abstract event streams or indeed any abstraction over discrete events.

The idea that the app should be a single data structure that represents the whole page seems to be gaining significant traction with several of the frameworks recommending this as an approach.

The explosion of frameworks resulting from the release of React is, I think, a positive thing. Initially it seems really daunting that you have all these choices but when you look at the real level of difference between them you can see that they are actually quite tightly coupled around a few common and core ideas and that mostly they express differences about the concerns that a framework should have which feeds into the wider conversation about micro or comprehensive frameworks.


No-one loves bad ideas

Charles Arthur has an interesting piece of post-Guardian vented frustration on his blog. His argument about developers and journalists sitting together is part-bonkers opinion and partly correct. Coders and journalists are generally working on different timeframes and newsroom developers generally don’t focus enough on friction in the tools that they are creating for journalists.

Journalists however focus too much on the deadline and the frenzy of the news cycle. I often think newsroom developers are a lot like the street sweepers who clean up after a particularly exuberant street market. Everything has to be tidied up and put neatly away before the next day’s controlled riot takes place.

The piece of the article I found most interesting was something very personal though. The central assumption that runs through Arthur’s narrative is that it is valuable to let readers pre-order computer games via Amazon. One of the pieces of work I’ve done at the Guardian is to study the value of the Amazon links in the previous generation of the Guardian website. I can’t talk numbers but the outcome was that the expense of me looking at how much money was earned resulted in all the “profits” being eaten up by cost of my time. You open the box but the cat is always dead.

Similarly Arthur’s quixotic quest meant that he spent more money in developers’ time than the project could ever possibly earn. Amazon referrals require huge volumes to be anything other than a supplement to an individual’s income.

His doomed attempt to get people to really engage with his idea really reflected the doomed nature of the idea. British journalism favours action and instinct and sometimes that combination generates results. Mostly however it just fails and, regardless of who is sitting next to whom, who can get inspired by a muddle-minded last-minute joyride on the Titanic except deadline-loving action junkies?


Python: Preferring Named Tuples over Classes

One of the views that I decided to take in my recent Python teaching is that named tuples and functions are preferable to class-based data structures.

Python's object-orientated (OO) code is slightly strange anyway since it is retrospectively applied to the original language and most programmers find things like the self reference confusing compared to OO idioms in languages like Ruby or Java.

On top of this Python's dynamic nature means that objects are actually "open" (i.e. can take new attributes at runtime) and have few strong encapsulation guarantees. Most of this is going to be surprising to OO-programmers who would expect the type to be binding.

Named-tuples on the other hand are immutable so their values cannot be changed and they cannot be expanded or reduced by adding or removing attributes. Their behaviour is much more defined while retaining syntax-sugar access to the attributes themselves.
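The contrast between an "open" object and a named tuple is easy to see in a short sketch. The Film and OpenFilm types below are made up for the illustration.

```python
from collections import namedtuple

Film = namedtuple("Film", ["title", "year"])

class OpenFilm:
    """Ordinary class: instances accept new attributes at runtime."""

    def __init__(self, title, year):
        self.title = title
        self.year = year

film = Film("Alien", 1979)
open_film = OpenFilm("Alien", 1979)
open_film.rating = 8  # allowed: the object is "open"

rejected = []
try:
    film.year = 1980  # changing an existing field is rejected
except AttributeError:
    rejected.append("set")
try:
    film.rating = 8  # adding a new field is rejected too
except AttributeError:
    rejected.append("add")

# Attribute-style access still works on the tuple.
print(film.title, film.year, rejected)
```

The named tuple keeps the convenient film.title syntax while refusing both kinds of change that the ordinary class silently accepts.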

Functions that operate on tuples and return tuples have some nice properties in terms of working with code. Firstly you know that there are no sequencing issues. A function that takes a tuple as an argument cannot change it so any other function is free to consume it again as an argument.

In addition you know that you are free to consume the tuple value generated by a function. As the value cannot be changed it is safe to pass it around the codebase.
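Here is a minimal sketch of what that looks like in practice, using an invented Account tuple. A function that "changes" a value actually returns a new tuple, so the original stays valid for any other caller.

```python
from collections import namedtuple

Account = namedtuple("Account", ["owner", "balance"])

def deposit(account, amount):
    """Return a new Account; the one passed in is left untouched."""
    return account._replace(balance=account.balance + amount)

def describe(account):
    return f"{account.owner}: {account.balance}"

original = Account("sam", 100)
updated = deposit(original, 50)

# Both values remain valid, so they can be consumed again in any order.
summary = (describe(original), describe(updated))
print(summary)
```

Calling describe before or after deposit gives the same answer for original, which is the absence of sequencing issues in concrete form.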

I think the question should be: where are classes appropriate in ways that tuples are not?

The most common valid use of classes and inheritance is to provide a structure in a library where you expect other programmers to supply appropriate behaviour. Using classes you can simply allow the relevant methods to be implemented in the inheriting implementation. A number of Python web frameworks use this Template pattern to allow the behaviour of handlers to be defined.

Even then this is not the definitive solution. Frameworks such as Flask use decorators instead, which fits with the functional approach.

So in general I think it is simpler and easier to maintain programs that consist of functions taking and generating immutable data structures like tuples. Using Python's object-orientation features should be considered an advanced technique and used only when necessary.
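The two styles can be put side by side in a toy example. Everything here is hypothetical: the Handler base class, the route decorator and the routes registry are invented for the sketch and are not how Flask or any particular framework is implemented.

```python
class Handler:
    """Template pattern: the framework owns handle(), subclasses fill in body()."""

    def handle(self, name):
        return f"<h1>{self.body(name)}</h1>"

    def body(self, name):
        raise NotImplementedError

class GreetingHandler(Handler):
    def body(self, name):
        return f"Hello {name}"

# Decorator style: plain functions are registered against a path.
routes = {}

def route(path):
    def register(func):
        routes[path] = func
        return func
    return register

@route("/greet")
def greet(name):
    return f"<h1>Hello {name}</h1>"

class_result = GreetingHandler().handle("Ada")
function_result = routes["/greet"]("Ada")
print(class_result, function_result)
```

Both produce the same output, but the decorator version needs no inheritance: the user of the library only ever writes a function.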


/dev/winter 2015

The Dev Sessions are a Cambridge tech conference organised by the same people who do FPDays. The conference was free, held on a Saturday and was based in the Moeller Centre near the Churchill College campus. The only practical way to and from the station was via taxi (befriend those on expenses, thank you John Stevenson).

The talks were on broad topics relating to development. I had pitched a talk on Developer Autonomy, something I'm engaged with in the day job.

Misjudging the train times I arrived a little late and jumped in to the talk on using graph databases in game design. This turned out to be a much more general talk about how the speaker had created tooling to support the game designers in his job. Being a fellow tool provider my interest was immediately piqued.

The game the team were building was some weird monster trapping game, something like Pokemon but more complicated. To trap monsters you need a trap, a lure or bait and you would need to craft both so acquiring recipes and components. Trapped animals provide you with components for other baits and traps and a monetary reward.

The talk was pretty wide-ranging. They were using Neo4J to analyse circular dependencies in "quests" to capture monsters. When designers changed the game data it would get loaded into the graph and all the dependencies were checked to make sure they formed a tree (flowing forward) rather than having inter-dependencies (circular references).
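The talk did this check inside Neo4J and I don't have their queries, but the underlying idea is a standard cycle check on a dependency graph. Below is a plain Python sketch using depth-first search; the function name and the sample game data are invented for the illustration.

```python
def find_cycle(dependencies):
    """Return one circular chain as a list of names, or None if there is none.

    `dependencies` maps each item to the items it requires.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    stack = []

    def visit(node):
        state[node] = IN_PROGRESS
        stack.append(node)
        for needed in dependencies.get(node, ()):
            seen = state.get(needed, UNSEEN)
            if seen == IN_PROGRESS:
                # We have walked back onto our own path: a circular reference.
                return stack[stack.index(needed):] + [needed]
            if seen == UNSEEN:
                found = visit(needed)
                if found:
                    return found
        stack.pop()
        state[node] = DONE
        return None

    for node in dependencies:
        if state.get(node, UNSEEN) == UNSEEN:
            found = visit(node)
            if found:
                return found
    return None

tree_like = {"dragon trap": ["iron", "rope"], "rope": ["fibre"]}
circular = {"bait": ["slime"], "slime": ["slug"], "slug": ["bait"]}
print(find_cycle(tree_like), find_cycle(circular))
```

Returning the offending chain rather than just True or False matters for a designer-facing tool, because it tells the designer which recipes to change.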

It was also possible to generate a "map" of everything in the game and what elements of the game were central and which were on the periphery (which should be the high-level monsters near the end of the game).

All the game data is in text files that are stored in Git. The developers had built a tool over the VCS that simplified the presentation of the many JSON files but it was also possible for designers to edit them directly with whatever editor they favoured.

All the game data then gets built, validated and packed so it can be shipped off to the servers to power the game.

I think, if I understood the talk correctly, that the build also includes the localised text which is then powered from the server rather than updating a binary datafile on the client.

The final really interesting part of the talk involved the use of genetic algorithms to try and create game data. Data is captured from the game indicating what percentage of the players have captured a particular monster. The designer can then enter the percentage of players that they intend to capture the monster and the program goes off and tries to generate variations on the monster stats and trap requirements that it predicts will be more achievable by players. If any suitable combinations are found the designer can review them and choose the one they prefer.
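I don't know the details of their implementation, so the following is only a sketch of the general technique under heavy assumptions. The capture-rate model, the stat names and all the numbers are made up; a real tool would predict from the captured player data instead.

```python
import random

def predicted_capture_rate(stats):
    """Toy stand-in model: tougher monsters and pricier traps lower the rate."""
    difficulty = stats["health"] / 100 + stats["trap_cost"] / 50
    return 1 / (1 + difficulty)

def error(stats, target):
    """Distance between the predicted rate and the designer's target."""
    return abs(predicted_capture_rate(stats) - target)

def mutate(stats, rng):
    """Return a copy with each number nudged by up to 10% either way."""
    return {k: max(1.0, v * rng.uniform(0.9, 1.1)) for k, v in stats.items()}

def crossover(a, b, rng):
    """Pick each stat from one parent or the other."""
    return {k: rng.choice((a[k], b[k])) for k in a}

def evolve(start, target, generations=40, size=20, seed=1):
    rng = random.Random(seed)
    population = [start] + [mutate(start, rng) for _ in range(size - 1)]
    for _ in range(generations):
        population.sort(key=lambda s: error(s, target))
        parents = population[: size // 4]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(size - len(parents))
        ]
        # Parents survive unchanged, so the best candidate never gets worse.
        population = parents + children
    return min(population, key=lambda s: error(s, target))

current = {"health": 400.0, "trap_cost": 120.0}
suggestion = evolve(current, target=0.30)
print(suggestion, predicted_capture_rate(suggestion))
```

In the workflow described in the talk the program would present several such suggestions and leave the final choice to the designer rather than applying the best one automatically.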

Again having selected some changes these are applied to the data files via the tool and then packed and shipped.

It was a really interesting talk about how engineers can make a real difference by building tools and was completely undersold by its title.

The Mixcloud talk on scaling on a bootstrap budget was very interesting as most talks on scaling are about reliability, volume and throughput. It is very rare to get one that focuses purely on trying to create the lowest cost stack.

One of the key things they do to achieve this is a lot of capacity planning with just-in-time rental, buying capacity just ahead of rising usage, something that is much easier when you have a focused product with a limited scope that all your engineers can focus on.

They were also using some interesting hacks like ruthlessly using their right to renew contracts to make sure their applications ran on the newest hardware that was being brought into the datacentre instead of staying on the older blades. A few of the other things I'd heard of before, like setting your requirements so you require individual boxes and therefore do not share your infrastructure with someone else, instead of building smaller services with numerous deployments.

There were a few blanket statements that I didn't agree with. For example S3 was condemned as being "expensive" when it's really not. The more nuanced statement is that S3 bandwidth is expensive and it really is more of a storage solution than something you use to directly serve the public at scale.

One of the big domain-specific issues was around streaming audio files. One intriguing problem was that when you serve the files the connection is so fast that you serve the whole asset to the browser when the user is perhaps only going to listen to ten seconds to see if they like it.

A lot of the talk was really about building a single point of presence CDN on the cheap. I did wonder if there wasn't something smart to be done with servers that regulated the downloads more evenly or using a custom player and streaming format.

I stopped by the Julia introduction and there were some interesting points but it was very slow. Julia is quite an interesting language though and I should spend more time with it.

The final talk of the day was on "smells" in automated testing. I thought this would be an interesting topic because I think automated testing is hard but a combination of obscure slide illustrations, fairly old testing strategies and dodgy OO-code examples at the end of the day resulted in a talk that was side-tracked. Testing is hard, and since test code is code it does not seem worth calling out tests as something special within a codebase. Writing good test code means writing good code and applying the same scrutiny of solution design to the test code just makes sense.

Two things that were not mentioned in the talk but which I think matter when you are talking about the subject as a whole are monitoring and generative testing. I think any talk about testing now needs to cover an approach to generative testing. The old world of testing examples and specifications might be helpful for illustrating code but should not be considered as really being proper test code.
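For anyone who hasn't seen generative testing, here is a bare-bones sketch using only the Python standard library. The run-length functions are just an invented subject under test; dedicated generative testing libraries add better input generators and shrinking of failing cases, which this sketch does not attempt.

```python
import random

def run_length_encode(text):
    """Collapse runs of repeated characters into (char, count) pairs."""
    pairs = []
    for ch in text:
        if pairs and pairs[-1][0] == ch:
            pairs[-1] = (ch, pairs[-1][1] + 1)
        else:
            pairs.append((ch, 1))
    return pairs

def run_length_decode(pairs):
    return "".join(ch * count for ch, count in pairs)

def check_round_trip(trials=500, seed=0):
    """Generate random inputs and check a property instead of fixed examples."""
    rng = random.Random(seed)
    for _ in range(trials):
        text = "".join(rng.choice("ab ") for _ in range(rng.randint(0, 20)))
        if run_length_decode(run_length_encode(text)) != text:
            return text  # a counter-example for the developer to inspect
    return None

counter_example = check_round_trip()
print(counter_example)
```

The test states a property (decoding an encoding gives back the input) and lets the machine hunt for inputs that break it, rather than relying on the handful of examples the author happened to think of.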

Things that can be extremely difficult to test might be trivial to monitor. Time spent understanding the performance of code in production can be just as valuable as investing a lot of time in creating complex test code.

The whole day was full of interesting talks and bits and pieces and I'm definitely interested in trying to make the trip to the summer version of the event.