Colin's Journal: A place for thoughts about politics, software, and daily life.
I have finished porting SimpleTAL to Python 3. Release 5.0 of SimpleTAL is for Python 3.1 and provides functionality similar to what SimpleTAL 4.2 offers for Python 2.5. The differences between using 4.2 and 5.0 are documented on the SimpleTAL notes page.
At first the porting process was fairly easy. I started by getting all test cases to run cleanly under Python 2.6 with the -3 flag, and then ran 2to3 to convert the basic syntax. The next step was to run the test cases under Python 3 to highlight issues that required manual changes. The sgmllib module has been removed from the standard library, so I had to remove HTMLStructureCleaner from simpleTALUtils (it was unused within the library itself). The iterator protocol change from “next” to “__next__” meant my iterator-detection code had to be updated.
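As a rough illustration of the iterator change, here is a minimal sketch of how iterator detection might look under Python 3. The names is_iterator and Countdown are hypothetical and are not taken from SimpleTAL itself; the only point is that the check looks for “__next__” rather than “next”.

```python
def is_iterator(obj):
    """Return True if obj follows the Python 3 iterator protocol.

    Hypothetical helper for illustration; not SimpleTAL's actual code.
    Python 2 iterators exposed next(); Python 3 renamed it to __next__().
    """
    return hasattr(obj, "__iter__") and hasattr(obj, "__next__")


class Countdown:
    """Tiny example iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value
```

A plain list is iterable but is not itself an iterator, so is_iterator([1, 2]) is false while is_iterator(iter([1, 2])) is true.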
The changes to character set handling in Python 3 introduced slightly more complex changes for the template handling. In Python 2.x the SimpleTAL library would handle all encoding / decoding itself, but in Python 3 this is not always required as there is now a clean separation between bytes and strings.
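To make the bytes / strings separation concrete, here is a small sketch using only the standard library. The functions render_text and write_encoded are hypothetical stand-ins, not SimpleTAL's API: the idea is simply that text stays as str until it has to be written to a byte stream, at which point it is encoded exactly once.

```python
import io


def render_text(name):
    """Build output as str; no encoding decisions are made here."""
    return "<p>Hello, {}!</p>".format(name)


def write_encoded(stream, text, encoding="utf-8"):
    """Encode at the boundary, where str meets a binary stream."""
    stream.write(text.encode(encoding))


# A binary stream only accepts bytes, so the encode step is explicit.
buffer = io.BytesIO()
write_encoded(buffer, render_text("Zoë"), "utf-8")
raw = buffer.getvalue()
```

When the destination is a text stream such as io.StringIO, the encode step can be skipped altogether, which is the case where the library no longer needs to handle encoding itself.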
One issue that I hit when porting to Python 3 was the use of regular expressions. In order for SimpleTAL to pass through singleton XML elements from the template (i.e. <tag /> rather than <tag></tag>) it needs to carry out a regex check against the raw XML that the SAX library provides. This is done by retrieving the xml.sax.handler.property_xml_string property, which is documented as returning a string. In practice, however, the Python 3 SAX implementation returns bytes, which at first I assumed the regex library would not work with. A little bit of research later, I learned that the regex library can work on bytes as well, provided the pattern itself is also bytes.
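Here is a minimal sketch of a regex check on bytes. The pattern and the is_singleton helper are my own simplified illustration rather than the expression SimpleTAL uses, and they ignore awkward cases such as a “>” inside an attribute value. The key detail is the rb"" prefix: a bytes pattern matches bytes input, while mixing a bytes pattern with a str raises a TypeError.

```python
import re

# Bytes pattern (note the rb prefix): matches raw markup ending in "/>".
# Simplified, hypothetical pattern for illustration only.
SINGLETON_BYTES = re.compile(rb"<[^<>]*/\s*>\s*$")


def is_singleton(raw_element):
    """Return True if the raw element text ends with a self-closing tag.

    Accepts bytes (as the Python 3 SAX parser was observed to return)
    or str, which is encoded first so one bytes pattern serves both.
    """
    if isinstance(raw_element, str):
        raw_element = raw_element.encode("utf-8")
    return SINGLETON_BYTES.search(raw_element) is not None
```

Normalising the input to bytes once keeps a single compiled pattern and avoids maintaining separate str and bytes versions of the same expression.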
One final surprise was the huge performance gain moving from Python 2.6 to 3.1. The SimpleTAL performance tests show a minimum speed increase of 60% (on the METAL test), with some tests clocking in 90% increases. Both HTML and XML basic template expansions are now hitting over 1600 pages/sec on a single 1.7GHz CPU.
I usually wait a while, often months, before going through my photos to pick out the good ones. This has a number of disadvantages, particularly when it comes to remembering the names of places, but it does allow me to forget which photos I thought would be good at the time I took them. This is particularly useful as I otherwise have a tendency to not really look at the photos critically, but rather to skim through them looking for the one that I thought would work.
I’ve started working through 2008’s collection now, and having made it to September I’ve rediscovered a number of nice shots. To my eye, the one attached to this post works really well, but at the time I was discounting it (and a few others like it) as not being ideal because I could not find a way to get the cat to look at the camera. Now that I can see the results I’m glad that the picture has turned out this way. It feels like a much more natural shot than it would otherwise have been.
There are still another 760 or so photos to go through before I’ve cleared the backlog. I really do need to catch up as I’ve already skipped ahead on a number of occasions, making it easy to forget which photos I’ve been through and which still need sorting. I could start marking them to keep track I suppose, but then I’d have a greater excuse to postpone going through them properly.
About a year ago my small camera bag developed a broken zip, which has resulted in me having to use my backpack for carrying the camera around. This made carrying the camera a serious commitment, which resulted in fewer photos being taken. For my birthday I’ve been given a new small camera bag, so I now need to get back into the habit of carrying my camera with me again.
Email: colin at owlfish.com