Blogging, Python

PyCon UK 2022

This was the first in-person PyCon since the start of the pandemic. It had slightly changed format as well, now being mostly single-track (plus workshops) and not having a teacher/youth day. Overall I found the changes welcome. I’m generally a big fan of single track conferences because they simplify the choices and help concentrate the conversation amongst the attendees.

A lot of talks were by developer advocates but some of the most interesting talks came from the core language maintainers talking about the challenges in balancing the needs of different parts of the community while the least interesting were from the company sponsors who generally didn’t seem to have refined their content for a general audience.Typescript

A simple lightning talk about the need to sanitise the input of the builtin int function was illuminating about the challenges of reconciling web needs versus data science. However the general point about whether unlimited number functions are necessary or not was interesting. Similarly the challenge of when to adopt new syntax and keywords in the case of the addition of Exception Groups to the language.

There were a few talks about how the process of maintaining a language and community actually works. How conferences get listed on the Python website, how performance benchmarks are built and used and the desire to have a single place to centre conversation about the language.

There were talks about how to interface to various technologies with Python (Kafka, Telegram) and the inevitable talk about how to improve the performance of your data science code by using NumPy. There were also quite a few talks about Hypothesis; which probably still isn’t used as much as it should be in the community. The library now includes a test generator that can examine code and suggest a test suite for it which is quite a cool feature and certainly would be a boon for those who dislike test-first approaches (whether it has the same quality benefits is another question).

The other talk that had a big impact on my was the introduction to using PyScript in static websites. Python is going to start targeting WebAssembly as a platform which will help standardise the various projects trying to bring a fully functional interpreter to the browser and provide a place to pool efforts on improvements.

PyScript aims to allow Python to also script DOM elements and be an alternative scripting language in the browser. That’s fun but not massively compelling in my view (having done many compile to Javascript languages in the past I think it’s easier to just do Javascript (with an honourable exception for Typescript). Being able to run existing code and models in the browser without change maximises the value and potential of your existing Python codebases.

Cardiff remains a lovely venue for the conference and I think it is worth taking time out from the tracks to enjoy a bit of the city and the nearby Bute Park.

Links

  • Pyjamas, a 24 hour online conference
  • Pyrogram, a Python library for interacting with Telegram
  • HPy, an updated API for writing C extensions in Python
  • Discuss Python, the general Python community forum
  • Trustme, a testing library that allows local certificates as testing fixtures
  • PyScript, Python in the browser via Web Assembly
  • PyPerformance, benchmarks for testing Python implementations
Standard
Programming, Python

403 Forbidden errors with Flask and Zappa

One thing that tripped me up when creating applications with Zappa was an error I encountered with form posting that seems to have also caught out several other developers.

The tldr is that if you are getting a 403 Forbidden error but your application is working locally then you probably have a URL error due to the stage segment that Zappa adds to the URL of the deployed application. You need to make sure you are using url_for and not trying to write an absolute path.

The stage segment

Zappa’s url structure is surprisingly complicated because it allows you to have different versions of the code deployed under different aliases such as dev, staging and production.

When running locally your code doesn’t have the stage prefix so it is natural to use a bare path, something like flask.redirect(‘/’) for example.

If you’re using the standard form sequence of GET – POST – Redirect then everything works fine locally and remotely until the raw redirect occurs remotely and instead of getting a 404 error (which might tip you off to the real problem more quickly) you get a 403 forbidden because you are outside the deployed URL space.

If you bind a DNS name to a particular stage (e.g. app-dev.myapp.com) then the bare path will work again because the stage is hidden behind the CloudFront origin binding.

Always use url_for

The only safe way to handle URLs is for you to delegate all the path management and prefixing to Zappa. Fortunately Flask’s in-built url_for function, in conjunction with the Zappa wrapper can take care of all the grunt work for you. As long as all your urls (both in the template and the handlers) use url_for then the resulting URLs will work locally, on the API Gateway stages and if you bind a DNS name to the stage.

If this is already your development habit then great, this post is irrelevant to you but as I’ve mostly been using Heroku and App Engine for my hobby projects I’d found myself to be in the habit of writing the URLs as strings, as you do when you write the route bindings.

Then when the error occurred I was checking the URL against my code, seeing that they matched and then getting confused about the error because mentally I’d glossed over the stage.

Standard
Gadgets, Programming, Python

Creating the Guardian’s Glassware

For the last two months on and off I’ve been developing the Guardian’s Glassware in conjunction with my colleague Lindsey Dew.

Dealing with secret-pre-alpha hardware has at times being interesting but the actual process of writing services using the Mirror API is actually pretty straight-forward.

Glass applications are divided into native, using the Glass SDK, and service-based Glassware using Mirror. Mirror is built on web-friendly technologies such as HTTP, JSON and OAuth2. It also follows the Google patterns for APIs so things like authentication, discovery and the client libraries are all as you would expect if you’ve used a modern Google API before.

For our project, which was focussed on trying to create a sensible, useful newsfeed to Glass, we went with Mirror. If you want to do things like geolocation or picture and video upload then you’ll want to go native.

For various reasons we had a very narrow initial window for development. Essentially we had to start and finish in May. Our prototyping was done with a sample app from Google (you can use Mirror without an actual device), the Mirror playground and a lot of imagination.

When we actually got our Glass devices it took about a week to get my head round what the usecase was. I was thinking that it was like a very lightweight mobile phone but it is much more pervasive with lots of light contact points. I realised that we could be more aggressive about pushing information out and could go for larger sets of stories (although that was dialled back a bit in the final app to emphasise editorial curated content).

Given the tight, fixed deadline for an unknown product the rest of the application was build using lots of known elements. We used a lot of the standard Glass card templates. We used Python on Google App Engine to simplify the integration service and because we had been building a number of apps on that same stack. The application has a few concerns:

  • performing Google Authentication flow
  • polling the Guardian’s Content API and our internal Notification platform
  • writing content to Mirror
  • handling webhook callbacks from Mirror
  • storing a user’s saved stories

We use Content API all the time and normally we are rendering it into widgets or pages but here we are just transforming JSON into JSON.

The cards are actually rendered by our application and the rendered content is packaged into the JSON along with a text representation. However rendering according to the public Glass stylesheet and the actual device differed, and therefore checking the actual output was important.

The webhooks are probably best handled using deferred tasks so that you are handing off the processing quickly and limiting the concern to just processing the webhook’s payload.

For the most part the application is a mix of Google stock API code and some cron tasks that reads a web API and writes to one.

Keeping the core simple meant it was possible to iterate on things like the content mix and user interactions. The need to verify everything in device served as a limiting factor.

Glass is a super divisive technology, people are very agitated when they see you wearing it. No-one seems to have an indifferent opinion about them.

Google have done a number of really interesting things with Glass that are worth considering from a technology point of view even if you feel concerned about privacy and privilege.

Firstly the miniaturisation is amazing. The Glass hardware is about the size of a highlighter and packs a camera, memory, voice synth, wifi and bluetooth. The screen is amazingly vivid and records and plays video well. It has a web browser that seems really capable of standard HTML rendering.

The vocal recognition and command menus are really interesting and you feel a little bit space age when you fire off a Google query and get the information you’re looking for read back to you in seconds.

Developing with the Mirror API is really interesting because it solves the Android fragmentation issue. My application talks to Mirror, not to the native device. If Google want to change the firmware, wire protocol or security they can without worrying about how it will affect the apps. If they do have to make breaking changes then can use the standard webapi versioning they already use.

Unlike most of the Guardian projects this one has been embargoed before the UK launch but it is great to see it out in the open. Glass might not be the ultimate wearable tech answer; just as the brick phones didn’t directly point to the iPhone. Glass is a bold device and making the Guardian’s journalism available on a new platform has been an interesting test of our development processes and an interesting challenge to the idea of what web-capable devices are (just as the Pixel exposed some flaky thinking about what a touch device is).

What will be interesting from here is how journalists will use Glass. Our project didn’t touch on how you can use Glass to share content from the scene, but the Glass has powerful capabilities to capture pictures and video hands-free and deliver it back to desk editors. There’s already a few trials planned in less stressful feature pieces and it will be interesting to see if people find the interface intuitive and more convenient that firing up their phone.

Standard
Programming, Python

How does the patch decorator in Mock work?

I tend to use Mock more as a stubbing library rather than for mocking. The patch decorator is pretty handy in terms of this as it takes care of all the resetting once your stubbed test has run making it easy to have a test where a dependency returns an empty list, followed by a single-entry list and so on.

However I often forget how exactly it works so I’ve decided to write up my latest remembering of how to do this (via John Hartley’s help and reminders) so I have something to look up next time I forget.

The first thing is that the patch decorator takes a string that represents the fully qualified name of the stub/mock you want to create. In a Django app for example that means you should include the app name at the root. The name also reflects the local name of an imported item. Something I commonly do wrong is to bind to the absolute name, say ‘random.choice’ rather than ‘myapp.mymodule.random.choice’. If you are in the situation where your stub is correct when you call it directly but never happens when you run the code under test I am pretty sure that naming will be at the root of your problems 95% of the time.

For each string argument you have in patch you also need to define a parameter to the test function, this will contain the actual Mock object and is what you use to actually stub the value to what you want it to be for the test. Use names that make sense here, stub_db, fake_file_reader not just mock1, mock2 and so on.

With these relatively few reminders in place you should now be in a position to stub simply with Mock!

Standard
Python, Web Applications

Deploying Python apps to Epio

I recently got my beta access to ep.io, the Python application deployment platform. I had the chance today to have a play around and try out some deployments so I thought I would try and give my view on the experience before. I’ve deployed Python apps to Heroku and Gondor before so those services form my reference points here.

So firstly, there’s a command-line client that you install via pip and you effectively deploy to the platform via a client-command, SSH keys and what looks like git on the server-side. This is more like Gondor than Heroku (which is intimately linked to git). It means you have your choice of source control and if you want to be a Python purist you never need to step outside of Python for everything you are doing.

Applications consist of essentially one configuration file that states where the WSGI application is and what the requirements file is. Compared to Gondor it is a very simple setup but it did feel that it could be even simpler if it made convention-based assumptions such as the requirements file being called requirements.txt, for example.

Leveraging WSGI and configuration this way gives a very flexible platform and I was able to get both Flask and Bottle to work (the former very quickly because it has documentation, the latter via trial and error that might require its own blog-post). I didn’t have time to try Django but I felt pretty confident that I could get whatever framework I wanted working once I understood the basic setup.

Unlike Heroku, Epio provides a fixed framework for executing the apps. It seems you will be running behind NGINX and Gunicorn. Both are good choices and I certainly like them but if you want to play around with different servers like Tornado or CherryPy you may prefer Heroku’s more open deployment model. I did like the way that you can use the configuration file to have NGINX serve static content directly.

Epio naturally has less of an ecosystem than Heroku but has Solr, Postgres and Redis out of the box. All solid choices and covering off the majority of what I would need. I was certainly grateful that I didn’t have to grapple with remote database administration and could prototype apps with just Redis.

Deployment and logging have kind of rough edges. Being able to access logs directly from the application page was a win for me, however when I was struggling to define the WSGI entrypoint correctly it seemed as if the application wasn’t being really compiled until the first request comes in. I would see an entry confirming a new deployment but then nothing until I hit the app. I think there should be some kind of sanity check of what you have uploaded to see whether it will even run.

Right now epio is providing a Python-based cloud deployment platform with a sensible set of supplementary services and low opinion about the source control system to you use. It feels like if this had been around at the start of the year it would have blown me away. However now there is more competition and therefore questions of price and ease of use will matter in terms of  how compelling it is to use the service.

If you do Python web development I would definitely recommend you sign up for beta and give it a go yourself as it seems a very solid prototyping platform. If you are not a Ruby and Git fan then you may well love what is on offer here because it is already very convenient, makes few demands on you and gets your web app public in minutes.

Standard
Programming, Python

Django and JSON stores: a match in heaven

My current project is using CouchDB as its store and Django to implement the web frontend. When you have a JSON store such as CouchDB then Python is a natural complement due to its brilliant parsing of JSON into native data structures and its powerful dictionary data type that makes it really easy to work with maps.

In a previous project using Python and Mongo we used Presentation objects to provide domain logic on top of the raw data maps but this time around I wanted to try and cut out a layer and just work with maps as much as possible (perhaps the influence of the Clojure programming I’ve been doing on the side).

However this still leaves two problems that need addressing. Firstly Django templates generally handle dictionaries like a dream allowing to address them with the standard dot syntax. However both Mongo and Couch use leading underscores to indicate “special” variables and this clashes with the Python convention of having a leading underscore indicate a private member of the class. The most immediate time you encounter this is when you want to use the id of a document in a url and the naive doc._id does not work.

The other problem is the issue of legitimate domain logic. In Wazoku we want to use people’s names if they have supplied them and fallback to their email if they haven’t supplied their name.

The answer to both of these problems (without resorting to an intermediary object) is Django’s filters. The necessary logic can be written in a few lines of Python that simply examines the dictionary and does the necessary transformation to derive the id or user’s name. This is much lighter than the corresponding Presentation solution.

Standard
Programming, Python, Ruby

Truly open classes

Here’s an interesting observation, I needed to write a little script to automate some number calculating for me. I was wondering whether to do it in Ruby or Python. I’m doing a lot of Python at the moment so I felt I ought to give Ruby a little go. Share some of the love.

However the solution I had in mind really didn’t work with Ruby because while Ruby has open classes it has a comparatively fixed idea of attributes. In Python you can set attributes very freely on any object so I have got in the habit of creating something and then enhancing by applying a function. Example? Okay.

def make_captain(actor):
actor.rank = "Captain"
return actor

class Person:
pass

captain = make_captain(Person())

So this little trick doesn’t work, or rather is much more difficult to do in Ruby as Ruby, at is dynamic heart, is a language that believes in object-orientation and that classes should encapsulate rather than being little collections of data. You can use instance_variable_get/set but it lacks the elegance of the Python syntax.

In Ruby it would be easier to define the attributes in the class using the existing metaprogramming constructs and then have a class method to generate the content (effectively encapsulating my script logic).

Now this isn’t a straight “Ruby sux, no Python sux more” post. Between Scala, Clojure and Python I have been doing a lot more in a functional style that depreciates objects as anything more than value carriers. The Ruby vision of a class would give me something with a stronger sense of purpose and encapsulation, something that is hard to benefit from in a script for a particular purpose.

What is going to be interesting this year is trying to identify when the value of a piece of code is in the structure of it’s data-definition (i.e. objects) versus its process (functions). Having had a think about it I should perhaps rewrite my script to use some OO modelling because it may answer similar requirements down the line. However from a strict Lean/Waste point of view I should have gone with the Python solution as Ruby was imposing a restriction on me while providing benefits that I was unlikely to realise.

Standard
Python, Web Applications, Work

Declare for simplicity

I was struggling with an issue today on a template that was blowing up due to missing keys in a template data hash. At first I tried to write some conditional code in the template. That ended up being quite ugly so then I was trying to find a way to condition the template on whether the key was present.

Then it struck me that the issue was really the missing key. As the hash is prepared in Python code the original implementation frugally avoided creating the entry if the underlying data wasn’t present. While this is clever and minimal it actually just pushes the complexity out of the Presenter logic and into the template where it is much worse because you can only interact with the Presentation’s abstraction of the original data.

The problem here is that dynamic languages sometimes allow you to be too clever. In a declared structure system you have to declare all your data and provide sensible defaults. This makes writing the templating solution at lot easier as you never have a missing data problem to deal with (you still have headaches with duff defaults but that is a different post).

So I went back to the presenter and declared an empty list under the key that had been causing me grief. Bingo! My failure went away and the default behaviour of not generating any content for the empty list kicked off, reducing my template back to the unconditional single-line I had originally.

Standard
Python, Web Applications

Juno, a micro web framework for Python

I love microframeworks and I love Python to it therefore follows that I love the minimal web frameworks you can get for Python like CherryPy. However as soon as I heard about Juno, a clone of my favourite microframework Sinatra I had to give it a go.

For me the archetypal application for Sinatra is Git-Wiki and therefore to kick the tires on Juno I decided to try and do the same thing with Bazaar. Hence:  Bzr-Wiki on Juno!

You need Juno, Jinja2 and SQLAlchemy as well the source from Launchpad. Once you have it all then got to the repo directory and type bzr init. You should then be good to fire up wiki.py and muck around with it.

It is slightly ugly and lacks access to revision history, diffs and user tracking but apart from that it is a surprisingly functional wiki. It is also bi-directional in that you can add files to the Bazaar repo and they get reflected in the app.

So what was Juno like to work with? Well overall I thought this was the best Python microframework I have used so far. I really like the idea of decorating methods to avoid having to generate a mapping table. The syntax is terse and comprehensible, the conventions around the framework made sense to me.

By comparison with its progenitor I think I really missed the autoloading/dynamic evaluation that allows you to change code in Sinatra and have it immediately take effect. The function of the Request decorator was initially quite obscure (it binds all HTTP verbs to that method, if you want to map GET to another method you must specify all verbs independently) and I am still not sure it is right. I think the most specific decorator should take precedence. Other than that I think the framework ports a lot of the concepts from Sinatra in a sympathetic way.

The dependency on SQL Alchemy is also really clunky. If you specify that you are not using a database (as is the case here) then it is annoying to have to download a dependency and makes installation on Windows a pain I didn’t even want to try and tackle.

Juno is really promising though and I look forward to it developing. I think it would be a real delight to use it in an environment like Google App Engine.

Standard
Java, Programming, Python, Scripting, Web Applications, Work

JMeter or The Grinder, so which one is better, like?

Or for the benefit of Google: Apache JMeter versus The Grinder. Fight!

A while ago I got paid to put these two tools head to head and I think it’s been long enough that the people who put up the money will have got the benefit now.

I had used The Grinder in a previous job where it had done excellent service finding out what level of peak load our site could handle. It was convenient in that you script it in Jython because Jython was our chosen scripting language.

JMeter on the other hand was a whole new thing to me. It is a kind of graphical programming interface where you drop controls into a tree structure and the framework generates threads that run through the tree generating interactions with the target application via things like HttpClient.

The scripting versus component structure is a pretty major difference between the two and is probably one of the key things that is going to help you choose between two products that are both, to be honest, pretty mature, capable pieces of software.

If you don’t feel comfortable with Python or Jython then Grinder is likely to be out. JMeter is much more friendly to non-programming testers. However this decision isn’t as clear as you might think, JMeter actually includes Javascript functions, access to BeanShell and its own expression language. As we created more complex traffic simulations we found ourselves having extremely complex expressions that often mixed a scripting language with an expression evaluation. In some ways The Grinder is more honest in telling you up front that you are going to have to program some of this stuff yourself.

Another big differentiator is in reporting, if you need to generate reports then JMeter is a much better choice as it has components that collect and collate information and you can dump data to XML for XSLT transformation into pretty much any output format you want. It is also possible to run JMeter headless once you have the test scripts so it is possible to encorporate it into a continuous integration process. Debugging slightly flaky applications is much easier with JMeter because you can easily capture a lot of result information and then just delete the data gathering component once the test flow is working.

A lot of the functionality of both products overlaps: they can both be used a browser proxy and be used to capture scripts as a user “drives” around the site. These captures are likely to form the basis of your test plans unless you have a very clear URL scheme with little session state.

Both use thread pools to generate load and both tend to get blocked on your application before they max out their local machine (although both are capable of using 90% of memory and CPU). JMeter has a few nice options around randomly ramping up the thread activation (recreating a spike in usage more closely). The Grinder tends to give more bang for the buck as its runtime is very simple and minimal.

Both are also capable of using distributed agents and collating the data from these agents back to the coordinator.

In many ways there isn’t a “bad” choice to be made between them. So what might sway your choice one way or another? Well JMeter is an Apache project and that might make a difference for some people as you have all those good things like project governance and a good chance the project is going to continue to be developed and enhanced. JMeter was also the only project to have a test suite last time I checked out both code bases.

The Grinder does have one unbeatable quality in my opinion and that’s flexibility. When you look at the features listed on the websites you might think that JMeter does all this cool stuff with HTTP, SOAP and POP3 and the like and that The Grinder is comparatively light in the feature set.

The opposite is actually true, as The Grinder website points out the The Grinder can test anything with a Java API. There really is no limit to what you can make it do or how you can execute your test plan. In fact most of the things I’ve complained about with The Grinder are fixable from within the test scripts  (what I am really complaining about is that some things like capturing HTML output per run is something that should be available as part of the standard package). On the other hand I felt that creating a new JMeter component was actually quite tricky as you have to deal with both the GUI aspects and the actual functionality you are trying to create in your test component.

If you really want to have total control of your concurrent volume testing then The Grinder is probably the best product for you.

Perhaps the last topic for consideration is The Grinder’s use of Jython. Python is a great language and you get a lot of power for some very concise code. Just as some people are going to find it off-putting others are actually going to find it very compelling.

Okay so yet another weak “horses for courses” post on the InterWeb. Perhaps the easiest summary to end on is that your test teams will probably take a shine to JMeter while your developer teams who are also building scale tests are probably going to like The Grinder; unless they dislike Python or learning a new language.

Standard