Colin's Journal: A place for thoughts about politics, software, and daily life.
Yesterday I decided to look at the performance of SimpleTAL and see whether there were any easy ways of improving it. I took a small (one screenful) template consisting of lots of ordinary text, a repeat command, and a couple of content commands, and timed SimpleTAL expanding it 200 times. The result was around 5 templates/sec.
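A measurement like this can be sketched in a few lines of Python. The helper name templates_per_second and the stand-in fake_expand function are made up for illustration; in a real test the callable would wrap whatever SimpleTAL call expands the template.

```python
import time

def templates_per_second(expand, runs=200):
    """Call expand() `runs` times and return the average expansions per second."""
    start = time.perf_counter()
    for _ in range(runs):
        expand()
    elapsed = time.perf_counter() - start
    return runs / elapsed

def fake_expand():
    # Hypothetical stand-in for a real template expansion: builds a short list.
    return "".join("<li>%d</li>" % i for i in range(100))

rate = templates_per_second(fake_expand)
print("%.1f templates/sec" % rate)
```

The absolute number depends heavily on the machine and the template, so it is only useful for comparing one version of the code against another.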
I had the idea of pre-parsing the template and turning it into a series of events (start tag, data, and end tag). I implemented this fairly quickly, and found that performance improved to around 11 templates/sec. I know, however, that Zope’s TAL engine can go significantly faster than this, so I started looking at it again to work out how I could improve things further.
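The pre-parsing idea can be illustrated with a minimal sketch. It uses the standard library's html.parser rather than whatever parser SimpleTAL actually uses, and the EventRecorder class and preparse function are hypothetical names, not part of SimpleTAL.

```python
from html.parser import HTMLParser

class EventRecorder(HTMLParser):
    """Records a template as a flat list of (kind, payload) events."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", (tag, dict(attrs))))

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

def preparse(template_text):
    """Parse the template once; the returned events can be replayed many times."""
    recorder = EventRecorder()
    recorder.feed(template_text)
    recorder.close()
    return recorder.events

events = preparse('<p tal:content="title">placeholder</p>')
print(events)
```

The point is that the parsing cost is paid once per template rather than once per expansion; each expansion then only walks the stored event list.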
The current SimpleTAL implementation uses OO methodology fairly heavily. This means that for each tag in the template a tag object is created, along with at least one handler object (often more). The tag is then passed to each handler, which modifies it based on the evaluated expressions coming back from the simpleTALES module. The result is that for a given run of the template, even with the HTML/XML parsing done beforehand, there is a significant amount of object creation (expensive), a large number of method calls rather than variable accesses (expensive), and text manipulation/parsing.
The Zope way of getting around this is to parse the template into an intermediate byte code. This byte code is then run by an interpreter to expand the template, with very little in the way of object creation. I’m now refactoring SimpleTAL in a similar way to see how much improvement I can get, and so far it’s looking promising. I’m still a long way from finishing, but I have content and repeat working well enough to run my performance template, and the result is now around 90 templates/sec. That is roughly 18 times the original 5 templates/sec, or a near 95% reduction in the time taken per template! The unfortunate side effect though is that the code is harder to understand because it’s data structure driven instead of object driven, which will make maintaining the code a lot harder.
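To give a feel for the byte code approach, here is a toy interpreter that handles only plain text, content and repeat. It is a sketch under my own assumptions: the opcode names, the tuple layout and the run function are invented for illustration and are not the instruction set used by Zope or by SimpleTAL, and variables are looked up by plain name rather than by evaluating a TALES path.

```python
# Hypothetical opcodes for a toy template byte code.
TEXT, CONTENT, REPEAT, END_REPEAT = range(4)

def run(program, context):
    """Interpret a list of (opcode, argument) tuples and return the output text."""
    context = dict(context)   # copy so the caller's dictionary is not modified
    out = []
    loops = []                # stack of (repeat_pc, variable_name, iterator)
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        if op == TEXT:
            out.append(arg)
        elif op == CONTENT:
            out.append(str(context[arg]))
        elif op == REPEAT:
            var_name, seq_name, end_pc = arg
            iterator = iter(context[seq_name])
            try:
                context[var_name] = next(iterator)
            except StopIteration:
                pc = end_pc   # empty sequence: skip the whole loop body
            else:
                loops.append((pc, var_name, iterator))
        elif op == END_REPEAT:
            repeat_pc, var_name, iterator = loops[-1]
            try:
                context[var_name] = next(iterator)
                pc = repeat_pc  # jump back; the increment below enters the body
            except StopIteration:
                loops.pop()
        pc += 1
    return "".join(out)

# A "compiled" form of: <ul><li tal:repeat="item items" tal:content="item"></li></ul>
program = [
    (TEXT, "<ul>"),
    (REPEAT, ("item", "items", 5)),  # 5 is the index of the matching END_REPEAT
    (TEXT, "<li>"),
    (CONTENT, "item"),
    (TEXT, "</li>"),
    (END_REPEAT, None),
    (TEXT, "</ul>"),
]

result = run(program, {"items": ["a", "b"]})
print(result)
```

Nothing here creates an object per tag: one expansion is a single loop over tuples plus a small loop stack. It also shows the maintenance cost, since the meaning of each tuple lives in the interpreter rather than in a named class.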
Email: colin at owlfish.com