Colin's Journal: A place for thoughts about politics, software, and daily life.
I’m pleased with the performance improvements that are going to be in the next version of SimpleTAL. The performance of the template parsing is unchanged, but the performance of the template expansion is significantly better. Exactly how much better depends on what you are doing, but I’ve seen improvements in the 45-70% range using the simple tests that I have.
The first version of SimpleTAL that I did any benchmarking on was version 1.1. On my “Basic” template test it managed to do 5.38 expansions/sec on my lowly P3-450. Version 2.0 brought a massive improvement, getting 69.32 expansions/sec, with version 3.2 bringing the peak performance to date of 74.88. After 3.2, performance slipped as new features and code to handle edge cases were added.
Today’s development version achieves 101.59 expansions/sec under Python 2.1, and a little more under 2.3. I think that I’ve now reached the upper limit of what I can easily tune the code to achieve. To go faster would require a significant change in approach (for example compiling to Python byte code).
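For anyone wanting to produce this kind of figure for their own templates, here is a minimal sketch of how an expansions/sec number can be measured. It is not the benchmark script behind the numbers above: `expand_once` is a hypothetical stand-in that uses `string.Template` rather than SimpleTAL, and the run length is an arbitrary choice.

```python
import time
from string import Template

# Hypothetical stand-in for a compiled template. A real test would
# expand a SimpleTAL template here instead.
PAGE = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")


def expand_once():
    """Expand the stand-in template once and return the rendered text."""
    return PAGE.substitute(title="Basic", body="Hello world")


def expansions_per_second(expand, min_seconds=0.2):
    """Call expand() repeatedly for at least min_seconds; return calls/sec."""
    count = 0
    elapsed = 0.0
    start = time.perf_counter()
    while elapsed < min_seconds:
        expand()
        count += 1
        elapsed = time.perf_counter() - start
    return count / elapsed


def percent_faster(old_rate, new_rate):
    """Relative improvement of new_rate over old_rate, as a percentage."""
    return (new_rate - old_rate) / old_rate * 100.0


if __name__ == "__main__":
    print("%.2f expansions/sec" % expansions_per_second(expand_once))
    # Figures quoted in the post: the 3.2 peak against the development version.
    print("%.1f%% faster" % percent_faster(74.88, 101.59))
```

Running the loop for a fixed minimum time, rather than a fixed number of iterations, keeps the measurement comparable across fast and slow machines.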
Another topic: I’ve had to make another correction to my RSS feed, this time caused by a bug in my weblog plugin code. The problem was due to my code using the file extension of the current template (rss.xml), rather than the “day” template (day.html) when generating the links to posts. Thanks to Heather for pointing it out!
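The bug is easy to picture with a small sketch. This is not the plugin's actual code: `post_link` and the post name are hypothetical, and only the two template names come from the description above.

```python
import os.path


def post_link(post_name, template_path):
    """Build a link to a post, taking the extension from the given template."""
    extension = os.path.splitext(template_path)[1]
    return post_name + extension


# Hypothetical post name, for illustration only.
# Buggy behaviour: the extension comes from the template being expanded.
buggy = post_link("some-post", "rss.xml")
# Fixed behaviour: the extension comes from the "day" template.
fixed = post_link("some-post", "day.html")

if __name__ == "__main__":
    print(buggy)
    print(fixed)
```

The fix amounts to passing the template that the posts are actually rendered with, not the one currently being expanded.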
When I switched my weblog over to a PubTal-based plugin I also switched the character set used by my RSS feed. Unfortunately the LiveJournal Syndication system doesn’t support ISO-8859-15, and so it hasn’t been syndicating my site for the last three weeks.
I’ve now switched it back to UTF-8 which should hopefully solve the problem for LJ users, and given universal support for UTF-8 in XML it should also work for anyone else aggregating the feed.
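As a rough illustration of why the declared character set matters, the sketch below encodes the same text both ways. The sample text and the `xml_declaration` helper are made up for the example and are not taken from the feed or from PubTal.

```python
# Sample text containing a euro sign, written as an escape so the
# source file itself stays ASCII.
text = "Price: 5 \u20ac"

latin9_bytes = text.encode("iso-8859-15")  # euro sign becomes one byte, 0xA4
utf8_bytes = text.encode("utf-8")          # euro sign becomes three bytes


def xml_declaration(encoding):
    """Return an XML declaration naming the encoding used for the bytes."""
    return '<?xml version="1.0" encoding="%s"?>' % encoding


# An aggregator that assumes the wrong encoding sees different characters.
misread = utf8_bytes.decode("iso-8859-15")

if __name__ == "__main__":
    print(xml_declaration("UTF-8"))
    print(latin9_bytes)
    print(utf8_bytes)
    print(misread == text)
```

The bytes only make sense to a consumer that both reads the declaration and supports the encoding it names, which is why UTF-8 is the safer choice for a feed.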
Profiling Python code is difficult. Python comes with two different profilers, called “profile” and “hotshot”. The hotshot profiler is written in C and is meant to introduce little overhead into the running of your application. Unfortunately it also produces extremely variable output, with the timings recorded for a particular function call differing by as much as 400% between runs. This means you can’t make an optimisation and then check with hotshot to determine whether the effect was good, bad, or indifferent. The older “profile” profiler has a similar problem, but seems to record smaller variations in the times it records for function execution.
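One way to see this kind of variation for yourself is to profile the same function several times and compare the totals. The sketch below uses the pure-Python `profile` module, because hotshot is not available in current Python releases; the `work` function is a made-up workload, not SimpleTAL code.

```python
import io
import profile
import pstats


def work():
    """Hypothetical workload made of many small function calls."""
    def square(n):
        return n * n
    return sum(square(i) for i in range(5000))


def profiled_total_time():
    """Profile one run of work() and return the total time recorded."""
    prof = profile.Profile()
    prof.runcall(work)
    stats = pstats.Stats(prof, stream=io.StringIO())
    return stats.total_tt


def run_to_run_spread(runs=5):
    """Return the recorded timings and the slowest/fastest spread in percent."""
    timings = [profiled_total_time() for _ in range(runs)]
    fastest, slowest = min(timings), max(timings)
    if fastest > 0:
        spread = (slowest - fastest) / fastest * 100.0
    else:
        spread = float("inf")
    return timings, spread


if __name__ == "__main__":
    timings, spread = run_to_run_spread()
    print(timings)
    print("spread: %.1f%%" % spread)
```

If the spread between identical runs is larger than the gain you hope to measure, the profiler cannot tell you whether a change helped.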
Running these profilers was still useful, however: I’ve managed to improve SimpleTAL performance by ~35%. The single biggest performance gain was also the easiest to implement, and involved removing two debug statements. Much to my surprise Python’s logging library introduces a heavy overhead, even when debug is turned off. SimpleTAL used to make a call to a Logger’s debug function for every TAL path evaluated, and despite the significant amount of work performed during the rest of the path evaluation, this debug call took most of the execution time.
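The cost of a disabled debug call can be measured with a small comparison like the one below. This is only a sketch: the two `evaluate_*` functions and the logger name are hypothetical stand-ins for the path evaluation code, and the size of the gap will depend heavily on the Python version (newer releases of the logging module are cheaper here than the one I measured).

```python
import logging
import timeit

log = logging.getLogger("example.path")  # hypothetical logger name
log.setLevel(logging.WARNING)            # debug output is turned off


def evaluate_with_debug(path):
    """Stand-in for path evaluation that logs every call at debug level."""
    log.debug("Evaluating path %s", path)
    return path.split("/")


def evaluate_without_debug(path):
    """The same stand-in with the debug statement removed."""
    return path.split("/")


def compare(number=100000):
    """Time both variants and return (with_debug, without_debug) in seconds."""
    with_debug = timeit.timeit(
        lambda: evaluate_with_debug("a/b/c"), number=number)
    without_debug = timeit.timeit(
        lambda: evaluate_without_debug("a/b/c"), number=number)
    return with_debug, without_debug


if __name__ == "__main__":
    with_debug, without_debug = compare()
    print("with debug call:    %.4fs" % with_debug)
    print("without debug call: %.4fs" % without_debug)
```

Where the debug statement has to stay, guarding it with `log.isEnabledFor(logging.DEBUG)` is the usual alternative to deleting it.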
The other changes I’ve made were also fairly small, and have improved things by another 10% or so. To make any further improvements I suspect that large scale changes to the TAL/TALES interaction would have to be made.
Part of my distant past includes having to take religious studies up to, and including, GCSE level. This compulsion may be explained by the school being sponsored by the church, or it may just have been someone’s whimsy. Regardless of the reason, I had to study religion, composed of two parts: Judaism and Ethics.
In many respects I can probably claim to have failed the Ethics part, because there wasn’t any structure to it. Judaism, while being completely irrelevant to me, did at least include lots of things to remember. Having passed the GCSE exam I promptly forgot almost all that I had learnt on Judaism, apart from a few random facts and half-correct ideas.
Last night I had a chance to revive some of this knowledge by attending a wonderful Passover meal hosted by a friend. The ceremony part was remarkable for its flexibility and lack of prescribed ritual. There was a very strong structure in place, but the actions performed and words spoken within that structure were extremely open. The written guide we were following originated from California, and in places it showed. It also strongly resembled a work in progress, in need of editing and in some parts completion.
The food and wine involved throughout the ceremony were excellent, although extremely filling. As ceremonies go it certainly has much to recommend it over the usual Christian ones that I’ve been exposed to (discounting Christmas and Easter meals, which in my mind are so far removed from religion that they don’t count).
I’ve finally had time to redesign my website. The content hasn’t changed, just the styling of the pages. Getting the CSS to work correctly took far longer than I had hoped, but it is mostly there now. If anyone knows why there are a couple of pixels of white space underneath the side image in Mozilla (on pages with little content), and some more above it in IE (on most pages), I would be glad to know.
I don’t know how long this design will last. I had something else in mind, but as I started prototyping I found myself working towards this more minimalist design instead. If anyone has any problems in their preferred browser please let me know. Testing done so far includes:
Last weekend we were travelling to Arkansas, so I needed to purchase a book to keep me entertained on the flights. I chose an author new to me, Alastair Reynolds, and a book called Chasm City.
The book is far-future science fiction incorporating many tried and tested space travel ideas. The writing was generally good, but the relationships between the characters often felt false. Characters interacted in peculiar ways; conversations lacked emotion and actions seemed driven only by plot requirements. Even at 616 pages the book felt rushed, as though the plot fit but the story didn’t.
At some point I’ll try his earlier book, Revelation Space, which the Amazon reviews rate as the better of the two.
The news that the US will be fingerprinting and photographing all visa-waiver country citizens from September is marginally better than the previous plan, which was to make us obtain a US visa, but it is still disappointing. The introduction of biometric passports next year will make this a moot issue anyway.
The UK passport service is currently saying that they will place digitised photos into the new-style passports, but leave space for one other biometric identifier. If this second identifier turns out to be a fingerprint, which is the most logical choice, then every country that you enter or leave will be able to record this information anyway.
Treating people as criminals isn’t just a policy for people crossing international borders; the UK is also planning to introduce biometric identity cards for the entire population. State interference in everyday lives seems to be the direction we are heading in.
I’ve let nearly two weeks slip by since last updating my weblog. The event most worthy of note during that period occurred over the past weekend, and that was our trip down to Arkansas. The trip itself was a bit of a nightmare on the outbound leg, a combination of bad luck with the weather and the typical bad customer service of American Airlines.
Discounting the travel as a necessary evil, the trip overall was a great deal of fun. I met the part of Shana’s extended family that I had until now missed out on meeting, and was reacquainted with those that I had met before. The weather in Arkansas was a very welcome break from Toronto, with highs somewhere between 70 and 80 Fahrenheit (21-26 C).
I, of course, took my camera with me. The poor Toronto weather has kept photography to a minimum recently, so I was looking forward to this opportunity to take some shots. Unfortunately the combination of rusty skills and time pressure meant that I came back with fewer good pictures than I had hoped for. If I had been alone I could have taken many more, but then the trip wasn’t intended to be a photography outing. This shot of a daffodil was one of the ones that did come out, despite the flowers being well past their best.
Prior to the trip to Arkansas, I had spent a good few days trying to unite the software which creates my website (PubTal) with my weblog program. I’ve done this in the form of a plugin for PubTal, which I’ll probably include in the next version I release. I can now write weblog posts in a text editor, or even OpenOffice.
The low cost of bandwidth, coupled with a competitive North American telephone market, has led to some interesting business models becoming viable.
Take for instance this rather nifty-sounding service, UK 2 ME. The website allows you to enter a US or Canadian phone number, and it will in turn allocate you a national-rate non-geographic UK number (i.e. one in the 0870 range).
There is no charge for the service. Callers pay the normal UK national rate, and are connected through to your North American number, several thousand miles away. I haven’t tried the service yet, so I don’t know what the call quality is like. If it turns out to be good then it’ll be a very useful service, especially for travellers.
The service provider makes money because the cost of the bandwidth and the NA termination is lower than the interconnect fee that it will receive for calls to the UK number. How long that’ll be true for I don’t know, but in the meantime it is an easy-to-offer service with virtually no overhead. There’s no customer service required, no billing to be done, just some automated provisioning to the network.
I hate finding bugs immediately after I’ve released software. It is particularly annoying when the bug is in the install script and not the application. I’ve just uploaded PubTal 3.0.1, now featuring an installer that includes the OpenOffice plugin….
Over this weekend our host had a problem with spamassassin, and it stopped marking anything as spam. I don’t receive a huge amount of spam, somewhere in the range of 30-60 messages a day, but it is more than enough to drive me crazy without filtering.
I set about trying to find a quick and easy filtering solution, and settled on DSPAM based on its reputation for accurate filtering. DSPAM has, unfortunately, got several issues that stop it being the non-intrusive spam filtering solution that I would like.
Firstly, DSPAM is designed to work at the MTA level rather than working with email clients. Configuring MTAs is a pain, so at first I just ran it directly from Evolution with some limited success. The second problem I encountered was its speed, or lack of it. Although the website touts speed as one of DSPAM’s major benefits, I didn’t see much evidence of this, with processing taking nearly one second per mail.
The final show stopper came when I finally tried to integrate it with my MTA (exim). The configuration wasn’t too bad, but once I had it all set up I couldn’t get it to successfully process email because DSPAM would suffer a segmentation fault.
At this point I gave up and tried something else: Bogofilter. It was very easy to compile and install, except for the application of a small patch that is required for it to work with Berkeley DB. Training on my mailboxes of spam and my inbox was extremely fast, and integrating it into Evolution was very simple.
Since doing this our host has got spamassassin working again. I’m still leaving Bogofilter as a second line of defence, and it has already caught some spams that spamassassin let through.
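For anyone scripting something similar, one way a mail client or script can use Bogofilter is to pipe a message to it on standard input and look at the exit status. The sketch below is not how my Evolution setup is wired; `classify` and `interpret_exit_code` are hypothetical helper names, and the exit status meanings are those given in the bogofilter manual page as I understand it.

```python
import subprocess

# Exit statuses as described in the bogofilter manual page:
# 0 = spam, 1 = non-spam (ham), 2 = unsure, anything else = error.
EXIT_MEANINGS = {0: "spam", 1: "ham", 2: "unsure"}


def interpret_exit_code(code):
    """Map a bogofilter exit status to a short label."""
    return EXIT_MEANINGS.get(code, "error")


def classify(message_bytes, command=("bogofilter",)):
    """Pipe one message to bogofilter on stdin and classify by exit status."""
    try:
        result = subprocess.run(
            list(command),
            input=message_bytes,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        return "error"  # bogofilter is not installed on this machine
    return interpret_exit_code(result.returncode)


if __name__ == "__main__":
    sample = b"Subject: test\r\n\r\nHello there\r\n"
    print(classify(sample))
```

Keeping the exit status mapping separate from the subprocess call makes it easy to check the logic without Bogofilter installed.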
Email: colin at owlfish.com