Gadgets

The Doxie handheld scanner

Doxie makes handheld scanners, which seemed the perfect replacement for my flatbed scanner that is currently without working drivers for any modern OS.

I opted for the DoxieOne even though my scanning is primarily on Linux, where I wasn’t going to be able to use all the integration and software the device comes with.

Doxie’s failure to integrate with Linux is a bit baffling but the basic behaviour is absolutely acceptable. It connects as a standard storage device by USB and you can just use all your usual tools against the device’s drive.

The scans produced are good quality and unlike a flatbed it will scan any length of document as long as it is roughly A4 or less in width. Of course you can’t use it to scan books (although there is a mini-flatbed version) if that matters to you.

The only real difficulty in using the Doxie is that a slight misalignment when you first engage the roller can get very pronounced by the end of the scan and the result is a picture at an angle that is hard to fix up in an image editor. If I realise that something is feeding incorrectly I tend to rescan as it is easier.

I was looking for a portable scanner that is easy to store and interact with and the Doxie fits the bill in an elegant way.

Programming

The state of microservices

One of the liveliest sessions at Scale Summit was the one on microservices where opinions flowed free and fast in a rapidly rotating fishbowl.

There were several points of interest that I took away and I thought I would write them up here as some of the key questions. In another post I’ll talk about the problems with microservices that haven’t been solved yet.

Are we doing microservices yet?

Some people in the room had already adopted microservices. The reasons given included breaking down monolithic codebases or trying to change their functionality, trying to scale up existing applications or architectures, and treating microservices as an existing best practice that should be applied more generally.

What is a microservice?

A few people wanted to get back to this, showing that while it is a handy term it isn’t universally understood or agreed.

I personally think a microservice is a body of code whose purpose and implementation is easy to understand when studied, adheres to the Unix philosophy of doing one thing well and which can be re-implemented quickly if needed.

Are microservices just good practice or righteous SOA?

Well naturally you can regard every new concept as being a distillation of previous good practice. However the point of naming something is to try and make it easy to triangulate on a meaning in conversation.

Won’t microservices just be corrupted by consultants and architects?

Yes, everything popular gets corrupted in time. I’m okay with that because in the interval we have a handy term to describe some useful patterns of solution design.

Don’t microservices complicate operations?

One attendee put it well: microservices are recursive so if the operations team are going to support them then they should be in the business of creating a service that deploys services.

Some people felt that for microservices to work developers and teams had to have full stack ownership and responsibility but I felt that was trying to smuggle devops in under the microservices banner.

I think microservices are best deployed on a platform and that platform defines what a deployable service is and can be responsible for shutting down misbehaving services.

Such a scheme allows for other aspects of the Unix way to be applied such as man pages, responding to --help and other useful conventions.

The platform can check whether these conventions have been met before deploying the service.
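As a minimal sketch of that kind of pre-deployment check, assume a hypothetical platform that treats a service as a command it can run. The function name `responds_to_help` and the two stand-in services are illustrative only and not part of any real platform; the check simply asks whether the command exits cleanly and prints something when given `--help`.

```python
import subprocess
import sys


def responds_to_help(command, timeout=5):
    """Return True if `command --help` exits with status 0 and prints something.

    `command` is a list of arguments, as accepted by subprocess.run.
    This is a hypothetical convention check, not a real platform API.
    """
    try:
        result = subprocess.run(
            command + ["--help"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The command could not be started or never returned.
        return False
    return result.returncode == 0 and bool(result.stdout.strip())


# A stand-in "service" built on argparse, which answers --help by default.
GOOD_SERVICE = [
    sys.executable,
    "-c",
    "import argparse; argparse.ArgumentParser(description='demo').parse_args()",
]
# A stand-in "service" that ignores the convention and just fails.
BAD_SERVICE = [sys.executable, "-c", "import sys; sys.exit(2)"]

if __name__ == "__main__":
    for name, command in [("good", GOOD_SERVICE), ("bad", BAD_SERVICE)]:
        verdict = "deploy" if responds_to_help(command) else "reject"
        print(name, verdict)
```

A real platform would presumably check more than one convention, but the shape stays the same: a list of small yes/no checks run before anything is deployed.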

Isn’t microservices just a way to discuss the granularity of a service?

Yes, in a way. Although there are a few other practices that make up a successful application of microservices, you can just think about it as a way of checking the responsibility boundaries of your service and how easy it would be to replace.

If your service has multiple responsibilities and is difficult to replace easily then it might have problems.

A lot of people wanted to use AWS as an example of good services without microservices. I think AWS is a good example of service implementation: each part has good boundaries and there are a few implementations of the key APIs. Things like the AWS security functionality are a good example of how you have to work hard to avoid having services rely on other services, and the result isn’t elegant.

I would argue that public-facing APIs are probably where you want to composite microservices to provide a consistent facade onto the functionality though.

As other delegates pointed out, isolating services makes them easier to scale. If starting a server in the EC2 API requires more resources than shutting it down you might prefer to scale up just the creation service rather than many instances of the whole API which are unlikely to be used or consume resources.

As ever, horses for courses, you’re going to know your domain better than I do.

Don’t microservices cause problems as well as solve them?

Absolutely, choosing the wrong solution at the wrong time is going to cause problems. Zealously over-applying microservices or applying the idea to absurd levels is not going to have a happy outcome.

I guess a good point is that we know our problems with existing service implementations. We don’t know what problems there are with microservices or whether they have logical and simple solutions. However they are helping us solve some of our known problems.

Aren’t microservices simply REST-ful services done right?

The most common form of microservice today is probably one implemented via HTTP and JSON. However this form isn’t prescriptive. ProtocolBuffers might be a better choice for an internal exchange format and ZeroMQ might be a better choice for a transport.

I also think that message queues are a good basis for microservices with micro consumers and producers focussing on tight message types.
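To make "tight message types" a little more concrete, here is a rough sketch in which an in-process `queue.Queue` stands in for a real message broker. The `ScanRequested` message and the `produce` and `consume` functions are invented for illustration; the point is only that each producer and consumer deals with exactly one small, well-defined message shape.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanRequested:
    """One tight message type: the only thing this consumer understands."""
    document_id: str
    pages: int


def produce(outbox, document_id, pages):
    """Micro producer: it only ever emits ScanRequested messages."""
    outbox.put(ScanRequested(document_id, pages))


def consume(inbox):
    """Micro consumer: drain the queue and return the total pages requested."""
    total = 0
    while True:
        try:
            message = inbox.get_nowait()
        except queue.Empty:
            return total
        # Anything that is not a ScanRequested is a bug, not a case to handle.
        assert isinstance(message, ScanRequested), message
        total += message.pages


if __name__ == "__main__":
    bus = queue.Queue()
    produce(bus, "invoice-1", 2)
    produce(bus, "invoice-2", 3)
    print(consume(bus))
```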

See also my mini-list of microservice myths which has more on this subject.

Should we be doing microservices?

I suspect that doing microservices for the sake of ticking a solution buzzword is bad. However I think microservices seem a pretty good solution to a problem I know I’ve seen in a fast-moving domain where you are trying to innovate without creating a maintenance burden or large legacy.

Programming

Scale Summit 2014

Scale Summit is the new Scale Camp, an unconference aimed at bringing the same kind of topics as you might expect at Velocity.

This was the first Scale Summit. The venue was excellent, as was the food (especially the bacon rolls, from Eden apparently) and supply of drink. Scale Summit happens under Chatham House rules so there’s no attribution of what is said which allows the attendees to be really frank and also for people to be free with what they really know rather than hedging and trying to be “on message”. It makes for a fascinating gathering.

The sessions varied in their organisation but all focussed on discussion between the participants. I managed to go to the Elasticsearch session, which was interesting for the practical boundaries that people were finding and also the operational knowledge. On the subject of using ES as the primary application store, the feeling seemed to be “not yet”, but there were also some words of wisdom about separating out document stores and search functionality and not finding a superficial unity in the two purposes.

The microservices session was a fast and furious fishbowl, easily the liveliest event and one that is going to require a post in its own right. It was interesting to see that the room split into practitioners and people who were sceptical that microservices were a thing or held value over conventional service development.

After lunch I sat in on what can be done to get frontend testing off the critical path to production (not much now but clearly more effort needs to be made), distributed DOS attacks on transactional sites (not as scary as I imagined but again we have to be thinking about how this works), distributed data stores (good war stories, felt better informed for going), getting ops and developers to work together and Linux containers (definitely going to try Docker now).

I had quite a few questions going into the event and while I didn’t get all the answers I hoped for I did at least establish that smart people don’t have simple answers to them either which is reassuring. It’s hard to tell in the heat of it all whether you’re on the edge of things doing things that are pushing the boundaries or simply over-complicating your situation.

The attendees were nicely mixed and from a range of backgrounds, ops, architecture and developers were all well-represented so you felt you were seeing a rounded situation.

The unconference format left me wanting more rather than feeling I had had enough. The openness was amazing and I am planning on being there next year.

Programming

Authenticity and appropriation

I know I’m not a hacker. I don’t describe myself that way and I know it’s not what I do. I may indulge in hacks of both programming and other kinds but hacks do not make the hacker.

“Hack” and its derivatives are very popular though: “hackdays”, prototypes getting described as “hacks”, and people self-identifying as “hackers”. Learning something in 24 hours is old hat now, why not just “hack” it instead?

This kind of wholesale appropriation of sub-culture is nothing new. Look at punk, hip-hop or skateboarding. In a way this theft of technologists’ jargon is a backhanded compliment, a validation of its worth and validity.

Appropriation brings with it the question of authenticity. Authenticity brings with it the whole field of identity politics. It is a cascade of events that brings us to the point where arguments erupt as to who is capable of determining who is truly a “hacker”.

Until recently I didn’t feel this argument had affected me very much. Since I have an instinct for people who meet the archetype of the hacker (by trade) and I don’t seek the title for myself it has felt like a fight I don’t have a dog in.

The latest wave of hacker appropriation renders a useful concept useless. As a good post-modernist I don’t weep for the hacker. The thing about all appropriated sub-cultures is that if they are going to thrive they are going to evolve; change, renew and protect themselves. Witness the rise of the “brogrammer” as a way of delineating those inside and outside the tent.

However when I hear the accusation that authenticity in technology is a matter of white male privilege rather than an attempt for a community to express and recognise an identity, I think we have an argument that seems to serve no-one very well.

I’m not saying that technology communities aren’t sexist or male-dominated. They self-evidently are. Unlike a lot of communities, though, technology is something where a meritocracy can function. While meritocracies are clearly shaped by peer pressure and conventional wisdom the simple fact is that a programmer is going to evaluate the utility of a piece of code entirely on how well it serves their own needs and not the gender of the person who wrote it. In fact in an internet world of handles and shared code the real identity of the person you collaborate with is often unknown to you, an irrelevance.

“Good code” is a cultural artefact, shaped by the constitution of the community. Useful code is not.

Appropriating hacking may seem a good idea. But when you do it, dismissing criticism of the authenticity of the result is self-defeating. Anything appropriated is devalued.

Attempting to liberate or seize control of the language of technologists might seem a good idea in the name of a diverse community. But anything done without consent will result in resistance.

Let’s tackle sexism and exclusion by all means, particularly in user groups and conferences where identity is concrete.

But let’s not think that cultural politics can substitute for code contributed to and valued by the community. Let the work speak for the individual, let’s value utility, humility and modesty more than any one disputed signifier.

Programming

Writing code without tests

This post is aimed at people who have mastered test-driven development and ideally also behaviour-driven development and who are familiar with XCheck testing. If you don’t have good basic steps then trying to jump onto some of these techniques is likely to backfire on you as you will probably struggle to assess the risks correctly.

There is a reason TDD was invented: it represents the refinement of good testing practice and the philosophy of good software design. TDD is a relatively simple practice to describe that requires effort to implement. Writing code driven by tests is safer than straight-coding.

Writing untested code is a kind of mastery technique. It is high-risk and relies on the skills and the knowledge of the programmer. I don’t think it is ever responsible if the programmer is not going to be the person supporting the result in production. Without this condition then the programmer’s interests are not properly aligned with the consumers of their code.

So with all those caveats in place what if we want to create code faster because we don’t have to write tests?

Well we have to understand where bugs come from and we will have to write code that doesn’t allow those situations to arise.

There are two important principles to start with. If you can rely on tested library code, then you can rely on the underlying quality of the tested code and leverage it in your own application. Secondly the code you don’t write will not have bugs.

Therefore we should be aiming to write the smallest amount of code possible and we should never try to code what others have coded for us.

The next point is about where bugs occur. I think we’re now at a consensus point that most bugs occur in the way we change and maintain state. In both procedural and functional languages it is rare, for example, to get a mistake in the order of steps that something must happen in. These kinds of problems tend to be misunderstandings of the domain (that get written into test suites as well so testing doesn’t help catch them) rather than genuinely unexpected consequences of the programmer’s code. Object-orientated code is really hard to reason about from this point of view as objects don’t have an implied order of execution.

This is why quick scripts of less than 200 lines tend to do stable sterling service for years whereas larger applications are more tortured in their existence.

Therefore whatever language we are coding in we need to adopt the functional principle of operating only on our parameters and returning values that can be consumed by the caller.
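A small sketch of what that looks like in Python, using an invented `add_vat` function and prices held as whole pence: the function reads only its parameters, changes nothing outside itself and hands back a new value for the caller to use.

```python
def add_vat(net_pence, rate_percent):
    """Return a new list of gross prices in pence; the input list is untouched."""
    return [price * (100 + rate_percent) // 100 for price in net_pence]


net = [1000, 2000]
gross = add_vat(net, 20)

if __name__ == "__main__":
    print(net, gross)
```

Because nothing outside the function changes, you can call it again with different arguments in a REPL and compare results without resetting anything first.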

Size matters, a lot. If the whole program can fit into a single file and you can pretty much hold the whole thing in your head then it will be easy to reason about what the program is doing and see flaws in the logic of the program. A single complex line of code is better than many lines and is much better than many lines split across many files.

One way to bring down the size of code files is to be ruthless about concerns. For example recently in my Python programming I have been assigning only one purpose to each module: this module renders reports, this one provides JSON endpoints.

Another technique is to not persist any state. This is actually surprisingly easy in web programming since each request is a completely separate event and by default you can trade CPU time for isolation.

If you are doing batch or server-side programming then it is worth considering using something like parallel to create many separate bubbles of execution rather than trying to write code yourself to distribute work.

Another aspect of state that causes issues is making global modifications, whether it be to a database or a filesystem. Try and defer all global changes to the final moment of a program and do all the manipulation in-memory instead. If you never change the world then you can run a program over and over again refining what it does.
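One way to picture this, as a sketch with made-up names (`build_report`, `commit`) and a temporary directory standing in for the real destination: everything up to the last line is in-memory work you can rerun freely, and the filesystem is touched exactly once.

```python
import os
import tempfile


def build_report(rows):
    """In-memory step: turn (name, total) rows into the full report text."""
    lines = [f"{name},{total}" for name, total in rows]
    return "\n".join(lines) + "\n"


def commit(path, text):
    """The only function that changes the world, called once at the very end."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


rows = [("alice", 3), ("bob", 5)]
report = build_report(rows)  # safe to rerun and inspect as often as you like

target = os.path.join(tempfile.mkdtemp(), "report.csv")
commit(target, report)  # the single global modification
```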

Assertions are more powerful than logging in writing test-less code: it is better to kill a thread of execution than to let it do something you weren’t expecting. Logging is really about helping build your intuition about what a program does and how it works.

Assertions allow you to create strong pre and post-conditions on the operation of the program. Essentially they allow you to guarantee the “happy path” execution of your code and avoid having to test all the negative situations that might occur.
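As an illustration only, here is a hypothetical `split_bill` function with its pre and post-conditions written as plain `assert` statements. The example and its names are mine rather than anything from a real project.

```python
def split_bill(total_pence, people):
    """Split a bill into whole-pence shares that add back up to the total."""
    # Pre-conditions: stop here rather than carry on with nonsense input.
    assert isinstance(total_pence, int) and total_pence >= 0, total_pence
    assert isinstance(people, int) and people > 0, people

    base, remainder = divmod(total_pence, people)
    # The first `remainder` people each pay one penny more.
    shares = [base + 1 if i < remainder else base for i in range(people)]

    # Post-condition: nothing was lost or invented along the way.
    assert sum(shares) == total_pence, (shares, total_pence)
    return shares


if __name__ == "__main__":
    print(split_bill(1000, 3))
```

One caveat worth knowing: Python drops `assert` statements when run with the `-O` flag, so this style relies on the program not being run in optimised mode.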

Despite this you always want to code for failure, use short-circuit logic to abort code flow early and therefore simplify the context of the code in the rest of the function.

Remember all the basic rules of cyclomatic complexity: don’t nest, don’t do conditionals, do try and express your looping as list comprehensions.
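The last two points can be sketched together. The `active_emails` function and the shape of the user dictionaries are assumptions made for the example: the early return deals with the empty case up front, and a single comprehension replaces a loop with a nested conditional.

```python
def active_emails(users):
    """Return the lower-cased emails of active users."""
    # Short-circuit: bail out early so the rest of the function has one context.
    if not users:
        return []

    # One comprehension instead of a for loop wrapping an if block.
    return [user["email"].lower() for user in users if user.get("active")]


if __name__ == "__main__":
    sample = [
        {"email": "A@Example.com", "active": True},
        {"email": "b@example.com", "active": False},
    ]
    print(active_emails(sample))
```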

Don’t write generic code, ever. The more potential inputs a function has, the more you end up needing unit-tests to verify the interactions. If something is meant to work on strings don’t try to make it work on strings and integers. Your detection code ends up being a potential source of bugs that needs testing.
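A tiny sketch of the non-generic style, with an invented `slugify` helper: it works on strings and nothing else, and rather than detecting other types and branching, it refuses them outright.

```python
def slugify(title):
    """Turn a title string into a URL slug. Strings only, by design."""
    # No type detection or conversion: anything else is the caller's mistake.
    assert isinstance(title, str), type(title)
    return "-".join(title.lower().split())


if __name__ == "__main__":
    print(slugify("Writing code without tests"))
```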

If you write dynamic interpreted languages then you are going to have to do some manual testing, unless you can remember the names and orders of the functions exactly. Don’t forget to dive into the shell or REPL and play around with the code in isolation. If you can verify the behaviour of individual parts of your program without having to wire together multiple components then you have the right level of granularity for your code.

Re-use code that is already working. Code re-use is generally best achieved by cut and pasting files and then importing the functions you need. Don’t try and synchronise your code: updating some library code ultimately means that you are going to need to know whether the new library code works as you expect with your functionality, and that means you’ll need a test suite.

Don’t refactor your code, rewrite it. Refactoring requires unit tests. Don’t be afraid of things like myfunction2 (although once you have the new functionality you need to delete the old unused stuff). Re-writing allows you to ditch all your assumptions about the code and attempt to express your new understanding of the problem and the requirements as simply as possible.

Don’t work with large numbers of people on the same code base. The more people trying to modify and change the code the more you need tests to try and clarify your different intents for the code base. Again, try divide and conquer on the problem: rather than six people working on the same code, can you get three sets of two people collaborating on three smaller codebases?

Finally don’t be afraid to write a test. Writing the right unit-test to prove you can rely on a base piece of functionality means that you then don’t have to write tests for all the pieces of code that use that underlying function. I like to try and write code without tests to maximise the flexibility of the code base when I’m tackling problems with unclear solutions. It is not an ideological thing to have no tests whatsoever, it is rather that when tempted to write a test I think “Could I do this in a way that is trivial and doesn’t require a test?”. Simplicity is the cornerstone of test-free code.

Programming

Clojurescript at London FunctionalJS

At January’s London FunctionalJS meetup the technology under discussion and use was Clojurescript. There was an introduction to the language basics from Thomas Kristensen of Forward, which was really much more about the basics of the syntax. We then went into the dojo exercises: the choices were implementing the Todo list SPA (the Javascript world’s Pet Store), using Clojurescript with an existing Javascript framework people were already used to working with or doing some 4Clojure.

Everyone ended up doing the Todo list which is interesting in its own way. Clearly the SPA is seen as the benchmark for evaluating these kinds of technologies.

Most of the teams were able to get the basic Todo functionality done in terms of adding and removing things from the list and re-rendering it. Most teams seemed to abstract the rendering but most put the list management into the callback for the event.

Again I was interested to see that most people grasped the idea of an atom and were able to manipulate its value. Because that kind of stuff is second-nature to me now I was wondering if it would cause issues in terms of creating a modifying function rather than directly manipulating the value. The example in the setup functions of the dojo code using conj seemed to be straight-forward enough for everyone.

Identifying and deleting items seemed more problematic. Some people wanted to do it by index but for the most part matching the text of the todo-item seemed to be popular. Probably the sensible way to actually manage the items is to uuid the items to allow their underlying state to change away from the identity.

Laziness definitely caught people out, including myself! I’ve moaned about the fact that using map purely for side-effects in fact results in the form not executing. Despite this I fell into the trap again, however fortunately having encountered it before I could reverse into a quick doall.
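The same trap exists outside Clojure. As an analogy only (this is Python, not the dojo’s Clojurescript), Python 3’s `map` is also lazy, so mapping a side-effecting function over a list does nothing until something consumes the result:

```python
seen = []

# map() builds a lazy iterator; nothing has been appended yet.
pending = map(seen.append, ["buy milk", "write post"])
before = list(seen)

# Consuming the iterator forces the side effects, much as doall does in Clojure.
for _ in pending:
    pass
after = list(seen)

if __name__ == "__main__":
    print(before, after)
```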

Other teams imaginatively re-implemented doall using loop, which I guess is testament to how easy it is to do things in a LISP.

One thing that was hellish in our team’s code and which I think cropped up in the other teams as well was the amount of set! we were applying to build up very low-level DOM calls. Right at the end I remembered that Google Closure was available to abstract some of that work away. However it still means that your knowledge of Clojure needs to be heavily supplemented by low-level DOM APIs as well as what is available in the Google Closure library (which is not the best known of libraries).

I was also wondering whether doto might not have cleaned up our code a lot. It’s an issue that a lot of Javascript mutable state is not easy to wrangle with things like threading macros that normally ease the pain. I’ve seen this in the WebGL dojos as well.

The final ugly issue of the evening was the project template that managed to both run on my machine and not run on my machine. The template was more complex than the standard SPA template as it used Compojure and Clojurescript (presumably using the former to serve static assets on localhost). Leiningen skeleton projects have to work and be reliable, otherwise potential adopters just get frustrated and quit.

The reactions were interesting, a guy on our team at the end asked why he would want to use Clojurescript. Good question. People who were doing things like building HTML5 games seemed to see the potential and advantage much more. This is an area I hadn’t really considered before but it does make a lot of sense as regular Clojure has already had a lot of success in implementing animation and complex state machines.

For me the alternating between high-level Clojure and low-level DOM APIs was painful. I’m going to be more interested in having wrappers that allow high-level programming consistently in a project. And I am going to be thinking about games more!
