Software, Web Applications, Work

String Templates, or what I learned from Python and doing nothing

It’s an ill wind that blows no-one any good. The same is true of projects (although money generally helps more here; it’s an ill project that is making no-one any money).

I’m currently meant to be doing some work on Accessibilty for some new HTML pages. I thought it would be pretty easy but I was really wrong and it is changing the whole way I look at the View part of the (deceased) MVC web paradigm.

On my last project I was looking at things like Groovy’s Markup Builder and marvelling at how my collegues managed to put together a 30 line Freemarker template that did some pretty compex HTML assembly. In my spare time I have been looking at Haml as a way of escaping the verbosity and monotony of XHTML and to have the code guarantee the correctness of my page structure to avoid validation grind.

That’s because in those projects I was a developer/web designer. I wanted accurate, compliant HTML with minimum effort and which was easy to style without having awful CSS hacks.

On my current project I’m in the utterly baffling (for me anyway) world of .Net. There is no way that I can understand the huge variety of C#, XML and templating overrides that make up my current project. Having code generate HTML is a massive barrier to me being productive because, while I know a far bit of HTML having to root around an entire Visual Studio project to find the fragment that generates the problematic Div element you actually want to work with means I spend the whole day knowing nothing about .Net rather than applying the knowledge I do have.

Now some people are going to say that having a wacky Component model is different from having a nice templating language but look at something like Haml or Freemarker. The former is concise and fun and full of obscure rules; the latter is tremendously powerful, more firmly rooted in HTML and not much less obscure. For power users I agree, they are the bomb. They are a massive barrier to entry though, in a way that HTML just isn’t. People may do HTML badly but they rarely don’t do it at all.

This experience put combined with using Django/Jinja/Google App Engine is leading to me have a huge rethink about the way templating and views are put together. Passing a map of parameters to a template that is essentially exactly the way the output will look on the final device is obviously the way this problem should be tackled.

To try and get the HTML to generate in the current project I spent a day trying to get: SQLServer, BizTalk, Active Directory and Windows MQ to work together. This is utter madness and can only have been created by programmers who have no idea how to collaborate with non-programmers.

Why should I be trying to install BizTalk when what I want to do is actually generate some sample HTML so we can have a quick check of WAI standards? It should be possible to define some fixture data and then just generate HTML from the templates. It really shouldn’t be hard.

This experience is really changing the way I think about web frameworks. I am already determined to learn String Template, then I am going to look at whether my current favourite frameworks allow me to use it. I’m going to look at frameworks that ask you to put the HTML template next to the Java code. I want to know if I can put those templates in the same heirarchy that the actual website uses.

In short if I need to work with people outside the project team on a web project again, how can I get all the good things about templates and combine them with both simplicity and intuition as to how a website is organised?

Groovy, Programming, Software

Using Gant to build Java Projects

I know I said I would be taking a step by step introduction to Gant in my last post on the subject but sometimes the devil drives and you need things done in a less systematic way.

Recently I have been building Java and Scala projects with Gant. I think it has been a successful exercise so I am just going to jump on and show you some example buildfiles. The first one is going to be a Java project. This project is obviously toy code but I think if you just download the sample project and start filling in your own code (it’s an Eclipse project, Intellij and NetBeans should both import it successfully) you will be happy with how little you have to change the build file.

First have a look at the Gant file itself and then I am going to talk about the things that I think make Gant such a powerful and productive tool. The first thing to point out is the line count, this represents a complete build file for a Java SE project in under 100 lines. That includes the Hello World target I’ve left in as an echo test. Gant might have a slightly weird syntax if you are unfamiliar with Groovy but it isn’t verbose.

Targets and dependencies were in the last post so this time the new thing is using AntBuilder. Any method call that starts with Ant is a invocation of an Ant task. These are nothing but thin wrappers around the normal Ant task (and in fact I usually write them using the Ant Task documentation). Things that are attributes in Ant XML become hash properties in the parameter list. Things that would normally be nested elements are calls to the enclosing builder.

One area where Gant wins big is the way you can mix normal variables, Groovy string interpolation and Ant properties. Declaring the directory paths as Groovy variables near the head of the file allows me to create new paths via interpolation and assign the variable to the properties of an Ant task. Ant XML has properties and macro interpolation but this is both clearer and easier.

I am also using the Gant built-in clean task and in the course of using it, Groovy’s operator overloading for lists. I’m not a huge fan of operator overloading but if it’s clear enough here then it is great to have a DRY list assignment.

I also like the way that the directories are created in the init task. This kind of closure looping over a list again shows some of the power and conciseness that can be achieved by a language rather than a configuration file. If you don’t read Groovy then the each method iterates over the list its attached to and each item resulting from the iteration is stored in the whimsical “it” local variable.

Java, Software

Announcing: Xapper

Okay so here’s my first proper (i.e. finished) open source project. It is an implementation of an XHTML Builder on top of the excellent XOM library.

It is called Xapper (XHTML Wrapper) and it’s quite lightweight and (I think) easy to use.

Software

Google Chrome: what browses what?

Okay, so nearly a month after it lauched how is Google Chrome changing the way we browse? Well for Linux and OSX users, not very much. However on Windows, Chrome is finding a place into my day to day browsing. Firstly I have started to tend to use it with Google products. There’s nothing rational about this, it seems to be just brand fetishism.

I have also started to use it for any site that uses Gears. Since Gears is built in to the browser it just seems to make sense. I liked Gears a lot before Chrome and although I have it installed in Firefox I figure it is easier to use the features when they are integrated into the browser and have the advantage of the V8 Javascript engine.

I also use Google Chrome on sites where I actually expect a lot of Flash, script and Fail. Being able to kill poorly programmed sites while keeping on trucking with the browser is a pretty killer feature.

Finally I also use it to view links where I want to look at something briefly and then do nothing more with it. I don’t know whether it really makes a difference but I always wonder how much stuff Firefox caches when I am briefly checking a link for something a blog post.

Software, Work

Offshoring means your job too

I quite often come across a strange attitude amongst people in favour of offshoring development work. It reached a kind of apogee in a recent quote that “The thinking will remain in the UK but the work will be done abroad where it is cheaper.”

Don’t kid yourself! If you outsource the work then you also outsource the thinking. If you are an architect, or a designer and your implementation team is offshored then your work has also been offshored. Without the vital feedback from your implementers your high-level input is very quickly going to go stale. The understanding of how to do a job lies with those who do it, not those who raise the purchase orders.

Don’t flatter yourself! For some reason a lot of people in the UK seem to think that the IT staff in places like China, Eastern Europe and India are incapable. I know from experience that there are more smart people in all of these countries than there are in the UK. It’s a simple case of numbers. Don’t confuse offshoring to a lowest cost bidder with being incompetent. You pay so little for your code that it sends a message: we don’t care about our software. If you don’t care why is the wage slave that is eventually contracted to produce it?

So if you are involved in “higher food chain” work, don’t get caught up in the hype. The thinking will always end up following the doing, and then, ultimately, the purchasing does as well.

Java, Software

Jude, the Java Documentation Browser

I recently bought a license to Dave Flanagan’s Jude. I have enjoyed Dave’s books on Javascript, Ruby and of course the Java in a Nutshell series. I was a little disappointed that the Nutshell Java book didn’t get an update to reflect Java 6 and looking for more information I took a trip to Dave’s site and saw an updated version of Jude. I had tried it before but preferred the paper version of the book (it’s initial sections on Java features are still models in the genre).

However with no new edition in the offing I decided to buy Jude and process my Java 6 JDK. Jude basically reads the Javadoc and code in the target JDK and then serves that information out via an in-application HTTP server which you point your browser at.

Recently I have been switching between several “post Java” languages: JRuby, Groovy, Scala and Jython. In all of them though I have needed to know what exactly the details of the Java Library API are so I can access them via the host language. The other day I was filing a bug report and needed to know the details of the Charset object, as I was flicking between between Jude and the report I wondered, “When did this app become indispensible?”.

It might be argued that all you need is Google and the Sun Javadoc but Jude has a few features that make it much more useful. Obviously it useable offline, that’s not something to be sniffed at. Secondly its search and browsing features are intuitive and “right” for the domain. I find it a lot easier to browse through hierachies and leap between classes and packages using Jude’s dedicated tools than via Google. It has also replaced the not inconsiderable heft of Java in a Nutshell from my work bag.

If I could make one change to make it even more useful I would make the default search much more liberal than it is now. The search accepts Java RE which is great for power using but you shouldn’t have to enter /.*http.*/i to find every instance of a class with any variation of HTTP in it, you want to be able to just type “http” Google-style.

Software, Work

What’s the difference between Simple and Stupid in interview code?

Kent Beck (I believe) said: “Do the simplest thing that works, not the stupidest”. What does that mean in the context of showcase code (i.e. code you have written as part of some kind of application process)?

Well firstly think of what the point of showcase code is. What is the reviewer looking for as part of the application process? They are looking for evidence of a train of thought or method of problem solving that chimes with the kind of thought processes that they think are effective. Ideally these would be the “best” processes but you have to accept that a lot of processes are about finding people who agree with what an organisation thinks is best practice rather than respecting originality and clever solutions. You should never try to tailor your showcase code to an organisation because it is too hard to double guess what someone is looking for. You can only express yourself as clearly as you can and hope that it fits well with what is being sought.

A good piece of showcase code will appropriately abstract the problem being set, illustrating the way the candidate finds solutions to problems. It will neither be too elaborate, abstracting too much of the problem, nor too literal, coding “between the lines” of the problem being set.

Let’s take an example. If the problem is to model a set of automobile configurations in an object then the problem set may explain that a car is sold with seat configurations of 2, 4 or 6. In my view a stupid solution would be to have a list of just 2, 4, 6. Clearly the reviewer is going to want some flexibility here, the solution should be able to deal with a new configuration of 5.

An obvious solution is to just model the number of seats as an integer value. You can encapsulate the data by providing an API that allows you query how many seats a configuration has and provide a way of setting that value in an immutable way (once a configuration is set then the number of seats it has is unlikely to change in the lifetime of the object, that’s simplicity not stupidity). This solution just exposes that a configuration has a number of seats which is an integer, which is logical. It is an entirely sensible solution that keeps it simple.

How might a nervous candidate overcomplicate this situation? Well the number of seats really is immutable unless the problem set says otherwise, you don’t need to supply a mutator. Similarly the number of seats is never going to be a Real number.

What about modelling the number of seats as a list of Seat objects? Well yes that works because the encapsulation should hide the implementation and the number of seats is now just the number of Seats in the Seat list. Domain-driven design fans might even say this is a better solution that an integer because it is a closer description of the domain. However I think that simplicity would tend to demand that until we have something else to model about a Seat (say, whether it has a seatbelt) then it is hard to justify having a class whose only purpose is to be counted in a list. For all the difference it makes I could put Turtle objects into the Seat list. In terms of showcase code using this technique actually gives the reviewer a bit of a headache because instead of just ticking the box and moving on the reviewer now has to ask whether the new level of abstraction introduced is valid or not, is it a good idea to have a class with no functionality or data? What if actually the seat configurations are better represented by enumerated constants? The effort of creating the Seat class is just wasted and has to be replaced.

This solution at least stays within the bounds of an existing paradigm. If a candidate starts to abstract wildly then reviewer is going to give up in frustration. If the candidate starts abstracting the model to the point where it could model a motorcycle or truck just as well as car then they have just failed the process. Unless the problem says something about having a motorcycle then you don’t want to see Vehicle With An Engine type classes. After all if you start to generalise then where do you stop? Sure you can model a motorcycle and a truck but why not a plane? Or a rocket?

Software, Work

Don’t hate the RDBMS; hate the implementation

I read through this post with the usual sinking feeling of despair. How can people get so muddled in their thinking? I am not sure I can even bear to go through the arguments again.

More programmers treat the database as a dumb store than there are situations where such treatment is appropriate. No-one is saying that Twitter data is deep, relational and worthy of data mining. However not all data is like Twitter micro blog posts.

The comments on the post were for the most part very good and say a lot of what I would have said. However looking at CouchDB documentation I noticed that the authors make far less dramatic claims for their product that the blog post does. A buggy alpha release of a hashtable datastore is not going to bring the enterprise RDBMS to its knees.

I actually set up and ran Couch DB but I will save my thoughts for another day, it’s an interesting application. What I actually want to talk about is how we can get more sophisticated with our datastores. It is becoming apparent to me that ORM technologies are really making a dreadful hash of data. The object representation is getting shafted because inheritance is never properly translated into a relational schema. The relational data is getting screwed by the fact the rules for object attributes is at odds with the normal forms.

The outcome is that you end up with the bastard hybrid worst of all worlds solution. What’s the answer?

Well the first thing is to admit that application programmers think the database is a big dumb datastore and will never stop thinking that. The second is that relational data is not the one true way to represent all data. They are the best tool we have at the moment for representing rich data sets that contain a lot of relational aspects. Customer orders in a supply system is the classic example. From a data mining point of view you are going to be dead on your feet if you do not have customers and their orders in a relational data store. You cannot operate if you cannot say who is buying how much of what.

If you let developers reimplement a data mining solution in their application for anything other than your very edge and niche interests then you are going to be wasting a lot of time and money for no good reason. You simply want a relational datastore, a metadata overlay to reinterpret the normalised data in terms of domain models and a standard piece of charting and reporting software.

However the application programmers have a point. The system that takes the order should not really have to decompose an order taken at a store level into its component parts. What the front end needs to do is take and confirm the order as quickly as possible. From this point of view the database is just a dumb datastore. Or rather what we need is a simple datastore that can do what is needed now and defer and delegate the processing into a richer data set in the future. From this point of view the application may store the data in something as transient as a message queue (although realistically we are talking about something like an object cache so the customer can view and adjust their order).

Having data distributed in different forms across different systems creates something of headache as it is hard to get an overall picture of what is happening on the system at any given moment. However creating a single datastore (implemented by an enterprise RDBMS) as a single point of reference is something of an anti-pattern. It is making one thing easier, the big picture. However to provide this data is being bashed by layering technologies into all kinds inappropriate shapes and various groups within the IT department are frequently in bitter conflict.

There needs to be a step back and IT people need to accept the complexity and start talking about the whole system comprising of many components. All of which need to be synced and queried if you want the total information picture. Instead of wasting effort in fitting however many square pegs into round holes we need to be thinking about how we use the best persistence solution for a given solution and how we report and coordinate these many systems.

It is the only way we can move forward.

Software

Running Virtual Ubuntu

So at work I need to be able to have access to a personal UNIX playground and the form that you have to fill in to get a licensed VMFusion instance is a nightmare so I decided to look at the alternatives. I already had Parallels installed on my MacBook Pro but I had not done anything with it. I also decided to try and get Ubuntu running on my Windows Vista machine using the free (to download) VMWare Player.

VMWare Player requires a special image (I used this one) however once the software and the image was downloaded (the images are sensibly torrented although the player software itself does not seem to be), getting the system running was extremely easy. You just click on the image, it loads up and you update within Ubuntu as normal.

Getting Parallels working was not as as easy. I tried a standard DVD from a Linux Magazine, that failed with an X error where the X window could not be started. So I downloaded a text based installer and ran through that. It had the same problem and after reading this item in the Parallels Knowledge Base I took a guess at the problem and set the resolution during the text installation to be 1024 by 768. That sorted the issue and after that the major problem was networking. The Parallels installation did not seem able to share my wireless connection. Once I connected my Ethernet cable then the instance updated fine. Oddly once I set the VM to use Shared networking I could use the Wireless connection but counter-intuitively setting the Ubuntu instance to use the Wired connection. I guess at that point Parallels was able to weave a little magic and make the connection available and the issue of whether the physical hardware was Ethernet or Wireless was completely irrelevant.

Both systems run their virtual machines very quickly but VM Player seems to be the better suited to rapidly stopping and starting the machine. It works pretty much like a normal application, you fire it up and close the window when you are done. The Parallels application is much less seamless. Both applications use a similar amount of space to save their state, VM Player perhaps runs a bit fatter from my experience.

VM Player is pretty amazing for a product that is offered for free and is definitely a well-done teaser product. If you have never run a virtual machine before I would definitely recommend giving it a spin. Parallels is a slick and excellent program but its focus on running Windows under OS X seems to have led it to not being able to create a trouble-free installation experience for the leading desktop Linux distribution of today. That is a big mistake and even Parallels’ relatively low price tag of £50 to £60 does not excuse it. Some things should just work. After all at some point you are going to appreciate having the flexibility to install a OS how you like and at that point you may be more tempted to upgrade your existing solution than switch to a new application altogether.

Software, Work

The cruel young men and their DSLs

When faced with the question about how people are meant to learn more and more languages some pundits say that perhaps people shouldn’t be programmers if they cannot learn new languages. When you’re young, bright and brilliant that may seem a reasonable answer. However the truth is that no matter how high you try to set the bar on programming, the amount of programming to be done is far in excess of the capacity of the relatively small number of brilliant people in the world who are inclined to do it. Telling the people who make a living trying to answer this demand, with less stellar qualifications perhaps, that they should shape up or ship out isn’t going to win any friends.

It’s also pointlessly antagonistic. Getting to learn many languages should be seen as a chance to broaden and enhance skills. However that is not going to be attractive if organisations continue to provide incentives in terms of pay and opportunities to specialists. To respond negatively to the suggestion that you discard your hard-won investment in your language of choice is both natural and rational if you run the risk of earning less than the single-focus individual. DSLs will die a death unless they can be incorporated within the scope of an existing big beast language or employers adopt a capability rather than knowledge-based metric for pay rewards.

I also think that DSL aficionados often fail to point out to the broader audience of programmers that learning a DSL or even a variety of languages (most probably meaning at least one functional, dynamic and object-orientated language) will not be the same experience as the current depth learning of languages. Since a DSL should be for a specific purpose and have a small syntax or grammar customised to a particular problem or domain it will not be the same as being able to answer trivia such as what the problems with the Date API are in Java and what the Calendar class sets out to address and whether it succeeds or not. Interview questions may have to revolve around applying a new syntax for dealing with a particular problem instead of the usual language pop quiz.

Advocating languages as solutions should also involve advocating changes in employer priorities. If you don’t link the two then threatening someone’s livelihood actually makes it harder to achieve the DSL’ers joyful Babel of languages that matches tool to problem.

Echo One

Sequentially arranged sentences composed of words (and punctuation)

Category Archives: Software

String Templates, or what I learned from Python and doing nothing

Using Gant to build Java Projects

Announcing: Xapper

Google Chrome: what browses what?

Offshoring means your job too

Jude, the Java Documentation Browser

What’s the difference between Simple and Stupid in interview code?

Don’t hate the RDBMS; hate the implementation

Running Virtual Ubuntu

The cruel young men and their DSLs