Friday, April 29, 2005

how well do you know python, part 6

class foo(list):
    def __eq__(self, other):
        raise 'foo.__eq__ called'

>>> help(list.__eq__)
Help on wrapper_descriptor:

    x.__eq__(y) <==> x==y

>>> [].__eq__(foo())
>>> [] == foo()
Traceback (most recent call last):
  File "", line 1, in ?
  File "", line 3, in __eq__
foo.__eq__ called

Help says those two statements should be equivalent. Why aren't they?

Wednesday, April 27, 2005

Any questions?

Closing on a real beta for Spyce 2.0, is there anything you'd like to see better illustrated? I have time to do another demo in the next couple days provided the scope is reasonable.

Tuesday, April 26, 2005

Extending Spyce with spyceModule

One of the ways to extend Spyce is with a "Spyce module." This is a historical term, and a little unfortunate because some people have assumed when I talk about "modules" I automatically mean the Spyce variety rather than the vanilla python variety.

A Spyce module is simply a class that extends spyceModule.spyceModule. That's it.

Spyce modules may be used in a .spy page with

[[.import name="modulename"]]

which instructs the compiler to create code that instantiates an instance of the given spyceModule at the beginning of each request for this page. It also automatically invokes the instance's finish method when the request finishes.

What can you do with Spyce modules? A common reason to write a Spyce module -- perhaps the most common -- is to provide some sort of resource pooling. Here's the spydurus module, which does exactly that:

CONNECTIONS = 3 # max connections to put in the pool

import Queue
from spyceModule import spyceModule

from durus.client_storage import ClientStorage
from durus.connection import Connection

q = Queue.Queue()
for i in range(CONNECTIONS):
    # assumed loop body: pre-fill the pool with client connections
    q.put(Connection(ClientStorage()))

class spydurus(spyceModule):
    def start(self):
        self.conn = None
        try:
            self.conn = q.get(timeout=10)
        except Queue.Empty:
            raise 'timeout while getting durus connection'
        else:
            self.conn.abort() # syncs connection

    def finish(self, err):
        # assumed body: hand the connection back to the pool
        if self.conn is not None:
            q.put(self.conn)
That's it! (Well, minus some convenience methods dealing with self.conn that are elided for concision here.) Note that finish is always called, even if your code raises an exception, even if you redirect to another page, even if another module's finish method raises.
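The pooling pattern itself doesn't depend on Spyce or Durus. Here is a minimal standalone sketch of it, assuming a hypothetical FakeConnection in place of a real Durus connection and a hypothetical handle_request function playing the part of the Spyce request cycle (neither name comes from Spyce). The try/finally is what stands in for the guarantee that finish always runs.

```python
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2

POOL_SIZE = 3


class FakeConnection(object):
    """Hypothetical stand-in for a real Durus Connection."""
    def abort(self):
        pass  # a real connection would sync with the storage server here


pool = queue.Queue()
for _ in range(POOL_SIZE):
    pool.put(FakeConnection())


class PooledResource(object):
    """Mimics the start/finish life cycle of a Spyce module."""
    def start(self):
        self.conn = None
        self.conn = pool.get(timeout=10)   # raises queue.Empty on timeout
        self.conn.abort()

    def finish(self, err=None):
        if self.conn is not None:
            pool.put(self.conn)            # hand the connection back
            self.conn = None


def handle_request(handler):
    """Run handler(conn); the connection goes back to the pool no matter what."""
    module = PooledResource()
    module.start()
    try:
        return handler(module.conn)
    finally:
        module.finish()
```

Even when the handler raises, the pool ends up with the same number of connections it started with.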

Most Spyce modules are also useful as plain python modules -- for instance, the Spyce to-do demo uses spydurus.q to get one of the pooled connections in a non-.spy context.

spydurus doesn't need it, but spyceModule also provides a hook into the spyce server in the form of self._api. You can do just about anything from this. I've seen a module that implements cgi_buffer functionality, a module that provides a Struts-style controller, and more. And of course there's all the standard modules.

(Updated links to point to the site.)

Monday, April 25, 2005

how well do you know python, part 5

I was reminded of this one while working on the Spyce/Durus demo to make it a better example of using Durus in a real application.

What is wrong with the following code?

import time, threading

finished = False

def foo():
    import sys
    global finished
    finished = True

while not finished:

import A

print 'finished'

Update: fixed so there was only one problem.

Thursday, April 21, 2005

Spyce 2.0 prerelease

There won't be an official beta this week after all; Rimon wants to review the code and docs over the weekend before we take that step.

However, as an unofficial beta, Spyce 2.0 is available from subversion:

svn co

The featureset is solid at this point. If you start playing around now, you won't get burned by changes of that sort. You may well find bugs, of course, which I'll fix as soon as I can.

The docs are up here, including a pretty good introduction as to Why Spyce 2.0 Rocks, if I say so myself. (That is, I say the intro is pretty good. But also that Spyce2 rocks, for that matter.) The to-do demo and an enhanced version of the chatbox demo are both included.

Update: forgot a link to the changelog.

Not your father's spyce

You've taken spyce off my "disgusting template engines" list. This is how I'd want to see apps structured, exactly.
    -- Tim Lesher

As a followup to my Durus notes, I've put up a new Spyce + Durus demo showing off the features of Spyce 2.0.

This time I allowed myself more than two files, and split the action logic into a separate module. (To-do lists seem to be the new standard for demoing your web toolkit, and I dare not be less than trendy.)

I tried to comment the code (linked from the demo pages) so it speaks for itself. If you have questions after looking at that and the language constructs page, please feel free to ask in the comments here. I'd like the docs to be in good shape for the 2.0 release.

Update: I've modified the demo to include spydurus, a Durus connection-pooling module for Spyce, and to use ClientStorage instead of FileStorage. The demo was popular enough that single-threading it got to be annoying.

(Updated links to point to the spyce site.)

Wednesday, April 20, 2005

Introduction to Durus

The README that comes with Durus is missing a couple pieces of information that are critical if you actually want to write a program that uses Durus, and the only other documentation appears to be the 2005 Durus pycon presentation, which gives an admirable description of the technical underpinnings but doesn't fill in the blanks of how to use it, either.

Specifically, as far as code samples go, the README gives you this:

Example using FileStorage to open a Connection to a file:

    from durus.file_storage import FileStorage
    from durus.connection import Connection
    connection = Connection(FileStorage("test.durus"))

And this:

    # Assume mymodule defines A as a subclass of Persistent.
    from mymodule import A 
    x = A()
    root = connection.get_root() # connection set as shown above.
    root["sample"] = x           # root is dict-like
    connection.commit()          # Now x is stored.

That's all you get.

Okay, in this situation it's clear how to retrieve x -- look it up by the key we've hardcoded. But what if x is a list of Persistent objects, and I want to retrieve one of those. Do I have to keep track of its index in x? Do I need to generate my own unique IDs? (Note that the linked code has an obvious race condition in a multithreaded environment.)

No, you don't. Durus assigns each Persistent object an attribute called _p_oid. (P object id. Not sure if P stands for Persistent, or Private, or something else entirely.)

    def __new__(klass, *args, **kwargs):
        instance._p_oid = None # <-- this is the oid that get() cares about

    def _p_format_oid(self):
        return format_oid(self._p_oid)

_p_oid is a four-byte binary string, so for passing around as a GET or POST variable in a web application, the format function (which turns it into a string representation of the oid number) is handy.

Now that you're passing oids around, Durus gives you an easy way to retrieve the object it identifies:

    def get(self, oid):
        """(oid:str|int|long) -> Persistent | None
        Return object for `oid`.

        The object may be a ghost.

Note that if you do pass around the formatted id, you'll need to turn it into a Python int (or long) before sending it to get; if you pass the string '123' get will assume it's a valid (binary) oid and not autoconvert it.
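As a rough sketch of that conversion step: the StubConnection below is a hypothetical stand-in that only models a get() keyed by integer oid, and get_by_formatted_oid is a helper name I made up for illustration. Real Durus treats a string argument as a binary oid rather than simply missing, so the stub only approximates the failure mode.

```python
class StubConnection(object):
    """Hypothetical stand-in for a Durus Connection; only get() is modelled."""
    def __init__(self, objects):
        self._objects = objects          # maps integer oid -> object

    def get(self, oid):
        return self._objects.get(oid)    # a str key such as '123' will not match


def get_by_formatted_oid(connection, formatted_oid):
    """Look up an object from the text form of its oid (e.g. a GET variable)."""
    return connection.get(int(formatted_oid))


connection = StubConnection({123: 'some persistent object'})
found = get_by_formatted_oid(connection, '123')   # converted to int first
missed = connection.get('123')                    # raw string: no match in this stub
```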

Now that the code diving is out of the way, I'm enjoying Durus a lot. Next post I'll give a short Spyce demo using Durus for persistence.

Monday, April 18, 2005

Spyce tag compilation example

I had the question,

What do you mean by "compiling" tag libraries?

I mean that Spyce compiles that chatbox.spy into a more-or-less 1.3-legal python module. (The classcode/handlers/exports features aren't in 1.3, but you get the idea.) The compiled result looks like this (output courtesy of the Spyce -c option):

class boxlet(spyceTagPlus):
  classcode=((7,4),(10,83),"def addLine(self):\n    # (use get() in case server restarted)\n    request._api.getServerGlobals().get('chatlines', []).append(request['newline'])",'test-chatbox.spy')


  def syntax(self):
  def begin(self,width='300',lines='5'):
    self._out.writeStatic('\n<div width="')
    line=None# for first export

    for line in pool.setdefault('chatlines',[])[i:]:
      self._out.writeStatic('  <div>')
    self._out.writeStatic('  <div>\n  <f:text name=newline />')

    self._out.writeStatic('\n  <f:submit handler=\'self.addLine\' value="Send" />')

    self._out.writeStatic('\n  <f:submit value="Refresh" />')
    self._out.writeStatic('\n  </div>\n</div>\n\n')
  def export(self):
class spyceTagcollection(spyceTagLibrary):

You can see why not many of these got written. :)

Spyce active tags, version 2

(Updated with a little more explanation as to what's going on.)

The Spyce Active Tag compiler is coming along nicely. Spyce has had active tags since 1.3.0 (current active release is 1.3.13; it's very, very stable by now), but writing a tag library shares a lot of the problems that JSP 1.x tag libraries had; it takes a lot of code to get something done.

Now, I've updated the Spyce compiler to be able to compile tag libraries, and tied it in to the active handler feature as a bonus. Meaning, tags can wrap their control logic together with the view so all the user has to write is a single tag, like "<chat:boxlet />" below. (You could put the controller logic into another module and write handler="foo.addLine" instead of self, but I'm keeping it simple here.)

Here's a simple example that defines and uses a chatbox component. Chat state is stored in the server globals area for simplicity. This code is running (for the next few days at least) over here. Update: original demo is down, but a slightly more sophisticated version (showing two chatboxes on the same page) is up at the spyce 2.0 prerelease site.

(I hope to get a beta of the next Spyce release (1.4? 2.0?) out later this week.)


[[.tagcollection ]]

[[.begin name=boxlet singleton=True ]]
[[.attr name=width default=300 ]]
[[.attr name=lines default=5 ]]

[[\  # this code runs when user hits Send
  def addLine(self):
    # (use get() in case server restarted)
    request._api.getServerGlobals().get('chatlines', []).append(request['newline'])
]]

[[.import names="pool"]] [[-- creates pool alias for request._api.getServerGlobals() --]]

<div width="[[= width ]]">
  [[\
     i = -int(lines)
     line = None # for first export
  ]]
  [[ for line in pool.setdefault('chatlines', [])[i:]:{ ]]
  <div>[[= line ]]</div>
  [[ } ]]
  <f:text name=newline />
  <f:submit handler='self.addLine' value="Send" />
  <f:submit value="Refresh" />

[[.export var=line as=last ]] [[-- sends this variable to the calling scope --]]


[[.taglib as='chat' from='chatbox.spy']]

    <title>Active tag handler test</title>

      <chat:boxlet />
      The last line in chat is: [[= last ]]

(whitenoise added to keep blogger from screwing up my formatting...)

Sunday, April 17, 2005

Open-source WYSIWYG editors: not quite there yet?

I'm looking for a free HTML editor suitable for end-users that are less comfortable entering html in a textarea. The most advanced open-source projects that fit this description seem to be xinha, a fork of the discontinued HTMLArea, and FCKEditor. Both suffer from problems with corrupting the browser history; bug reports are here and here (and here too). I note though that blogger's proprietary editor doesn't have this problem.

Breaking my back button is a serious usability problem. I note that both of these projects have huge amounts of features, far more than Blogger does. This would actually be a drawback to me if I were to deploy one, since I'd have to figure how to turn all the bloat off. Perhaps a project that focused less on features and more on usability could succeed better here.

(Please, if there's one out there that I missed, point me to it.)

Thursday, April 14, 2005

On referential integrity

At my place of employment we have a table that looks like this:

CREATE TABLE permissions (
    id character(40) NOT NULL,
    userid character(40) REFERENCES USERS(id),
    objectid character(40) NOT NULL,
    permission character varying(15) NOT NULL
);

Note that objectid isn't a foreign key to anything. (Before I joined there was no foreign key declared for userid, either.) That is because in this schema, everything gets a unique char(40) ID, and for flexibility the designer wanted to use the same permissions storage for all tables in the database.

(The fix for this, BTW, would involve creating an "objects" table that simply held all the IDs in the system and have that referenced by both the objects tables and any table like this one that wants to be able to reference "any object." I haven't done that yet, but I'll be moving that up on my priority list now.)

About a month ago, after suitable testing, I ran an upgrade script against our live database that went something like this:

CREATE TABLE alerttypes_old AS SELECT * FROM alerttypes;
DROP TABLE alerttypes;
-- create new table
-- insert into new table massaged data from _old

As it happens, alerttypes is one of the tables that we are interested in permissions for. I forgot to delete the appropriate entries corresponding to the old rows, but worse, I forgot to create new ones for the new alerttypes.

What makes this more surprising is that the developer who reviewed the script missed this too. But that is what happens when you don't have proper integrity constraints: sooner or later, you're going to be restoring from backup. Even if you're a smart guy. Even if you test first (on several machines). Even if you have code reviews.

Incidentally, I have seen people leave out FK constraints (or drop the ones I added -- grr!) to accommodate a series of statements that temporarily violate the constraint, but eventually (in theory) leave things in a correct state. The correct course here is to put your related statements in a transaction (a good idea anyway), and tell your database to check constraints when the transaction ends, not before. For postgresql, that looks like this:

acs=# begin work;
-- ...
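For a version you can run without a database server, here is a sketch of the same idea using Python's sqlite3 module as a stand-in for postgresql. The cut-down users/permissions tables and the row values are made up for illustration. SQLite spells this DEFERRABLE INITIALLY DEFERRED on the constraint; in postgresql the constraint must likewise be declared DEFERRABLE, and SET CONSTRAINTS ALL DEFERRED switches the checking to commit time inside a transaction.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode so that
# BEGIN/COMMIT below are issued exactly as written.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE permissions (
        id TEXT PRIMARY KEY,
        userid TEXT REFERENCES users(id) DEFERRABLE INITIALLY DEFERRED
    )""")

# The permission row is inserted before the user it points to. The
# constraint is violated mid-transaction but satisfied by COMMIT time.
conn.execute("BEGIN")
conn.execute("INSERT INTO permissions VALUES ('p1', 'u1')")
conn.execute("INSERT INTO users VALUES ('u1')")
conn.execute("COMMIT")

# A transaction that never repairs the violation is rejected at COMMIT.
conn.execute("BEGIN")
conn.execute("INSERT INTO permissions VALUES ('p2', 'nobody')")
try:
    conn.execute("COMMIT")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
    conn.execute("ROLLBACK")

rows = conn.execute("SELECT id FROM permissions").fetchall()
```

The first transaction goes through even though its statements are in the "wrong" order; the second leaves nothing behind.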

Wednesday, April 13, 2005

how well do you know python, part 4

import os

def foo(s):
    f = lambda a: a + "; print '%s'" % os.getcwd()

foo("print 'asdf'")

What error does this give? Why?

This one is pretty tough. I'll put a simpler illustration that gives the same error in the comments as a hint.

(Updated to fix inadvertent SyntaxError in the exec'd string, pointed out by Ian Bicking. Python bails with the error I intended before reaching that point, though.)

Tuesday, April 12, 2005

plpython intro

My current employer keeps a lot of application metadata as xml in the database, and I needed to make a simple update to some of that. Writing a client program to do this would be immense overkill, as well as not playing nicely with my database auto-upgrade script. The update could be easily handled with a regular expression substitution, but although postgresql has a decent regular expression implementation built-in, it has no function to do replacement with them. [Update: regexp_replace was added in version 8.1, six months after this post.]

An underused feature of PostgreSQL is its ability to define functions in just about any language you like. Java, Perl, Python, TCL, among others, as well as PL/PGSQL. Most developers, unfortunately, are a bit leery of defining custom functions in the database. Whether this is because of the learning curve (which turns out to be quite shallow), or because they are too used to inferior databases that don't allow such things, I couldn't say.

Anyway, creating a function that did what I needed was trivial.

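-- pattern, original, replacement, flags
-- (signature line assumed from the regsub call in the second function)
CREATE FUNCTION regsub(text, text, text, text) RETURNS text AS '
    import re
    flags = 0
    for flagchar in args[3]:
        flags |= getattr(re, flagchar)
    p = re.compile(args[0], flags)
    return re.sub(p, args[2], args[1])
' LANGUAGE 'plpythonu';

-- pattern, original, replacement
-- (signature line assumed: a three-argument overload with no flags)
CREATE FUNCTION regsub(text, text, text) RETURNS text AS '
    select regsub($1, $2, $3, '''');
' LANGUAGE 'sql';

Some things to note: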
  • Postgresql 8.0 supports giving function variables names in the function definition statement, but the plpython handler doesn't know about this. (Anyone want to fix this? I just don't have the time.) So I'm stuck with the old-style args list.
  • Callers can't access the re namespace to grab the constants out as ints, so flags are passed as a concatenated string ('IS' = IGNORECASE + DOTALL, etc.) and turned into an appropriate int with getattr.
  • Speaking of that, this is the only time I have ever used the string iterator in python. I figured that after getting burned by that so many times, I might as well get some use out of it for once!
  • Postgresql 8.0 added the "dollar-quoting" feature to mitigate quoting headaches when defining functions. Unfortunately our production server will be running 7.4 for the foreseeable future, so I stuck with escaping my quotes in the second function the old-fashioned way.
  • The second function isn't plpython: only SQL and PL/PGSQL functions can call other database functions directly; everything else is restricted to their handler's namespace. Although plpython gives you a way to dip back into the database, the overhead of doing this isn't negligible, and for something simple like this it's easier to just declare an SQL function anyway.
  • If you don't have plpython installed in your master template, chances are you'll need to run createlang from the commandline (or CREATE LANGUAGE from psql) before trying this at home.
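To try the flag handling outside the database, here is the same logic as an ordinary Python function. This is a sketch: the named parameters and the sample strings are mine, not part of the plpython version, which only sees an args list.

```python
import re


def regsub(pattern, original, replacement, flagchars=''):
    """Same logic as the plpython body, with named arguments."""
    flags = 0
    for flagchar in flagchars:          # iterating a string yields one char at a time
        flags |= getattr(re, flagchar)  # 'I' -> re.I, 'S' -> re.S, ...
    return re.sub(re.compile(pattern, flags), replacement, original)


# Without flags the pattern is case-sensitive and '.' does not match a newline.
plain = regsub('b.d', 'a B\nD e', 'X')

# 'IS' turns on IGNORECASE and DOTALL, so 'B\nD' now matches.
flagged = regsub('b.d', 'a B\nD e', 'X', 'IS')
```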

Monday, April 11, 2005

how well do you know python, part 3

Here's one to file in the "cryptic documentation" category.
>>> import inspect
>>> help(inspect.getargspec)
Help on function getargspec in module inspect:

    Get the names and default values of a function's arguments.
    A tuple of four things is returned: (args, varargs, varkw, defaults).
    'args' is a list of the argument names (it may contain nested lists).
    'varargs' and 'varkw' are the names of the * and ** arguments or None.
    'defaults' is an n-tuple of the default values of the last n arguments.
What conditions would you guess might cause getargspec's args list to contain nested lists?

Saturday, April 09, 2005

Spyce + Tiles = ?

Tonight I added Tiles-like functionality to the Spyce trunk.

Now, this doesn't mean I cloned Tiles in Spyce. I took the usual Spyce approach of adding functionality found in the JSP world without adding all the complexity. (Actually, in this case it's more influenced by the OpenACS master and slave tags, but how many people would that mean anything to? :)

Specifically I added a way to create consistent site layout templates in the reverse-include style familiar to users of all modern web frameworks. (No, ASP.NET isn't modern by this definition. But ASP.NET 2.0 will be. They call it "Master pages," and googling for that is an excellent way to find a lot of people talking about how this is the best thing since sliced bread. This entry is long enough without me reiterating the reasons why this is a Good Thing.)

Instead of including header and footer in your content page, your content page declares that it belongs to a parent page, which defines all your common markup and a placeholder for the child. Simple example:


<spy:parent title="child title" />

Child content


    <title>[[= child['title'] ]]</title>

  <body>[[= child['_body'] ]]</body>
which results in the final html of
    <title>child title</title>

  <body>Child content</body>

This is both recursive and dynamic.

Recursive, in that one parent template can itself be the child of another. This is useful wherever you have shared content or navigation on a group of pages; you can make them children of parent1 which is a child of your master parent, so site-wide changes still only need to be made in one place, and changes affecting only this group are also only made in one place.
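The nesting is easy to picture in plain Python. This is only an illustration of the reverse-include idea, not how Spyce implements it: the render helper and the two template strings are hypothetical, and ordinary % formatting stands in for the child['title'] and child['_body'] lookups.

```python
def render(template, child):
    """Fill a parent template from a dict shaped like the child mapping."""
    return template % child


master = "<html><title>%(title)s</title><body>%(_body)s</body></html>"
section = "<div id='nav'>section links</div>%(_body)s"

page = {'title': 'child title', '_body': 'Child content'}

# First wrap the page in its section parent, then wrap that in the master.
in_section = dict(page, _body=render(section, page))
html = render(master, in_section)
```

A change to master affects every page on the site, while a change to section affects only the pages in that group.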

Dynamic, in that all the arguments to spy:parent are evaluated at runtime, even the src argument. Here's an example that doesn't do much except demonstrate this:

[[\
  import os
  cwd = os.getcwd()
  s = '/parenttable.spy'
]]

<spy:parent src="=s" foo="=cwd" />

This results in the template located at /parenttable.spy being used instead of the default, and an argument named foo being passed with the result of the getcwd evaluation. (Changing which parent template to use at runtime is primarily useful for localizing different languages; passing other parameters that are evaluated at runtime is something you will use frequently even if you only care about English.)

In my opinion it's a phenomenal example of how robust Rimon's Spyce framework is that the diff for this changeset is only about 70 lines. Spyce is one of those rare projects that's a real pleasure to work on because the original author found just the right balance between thinking for future extensibility and overengineering. Tough line to walk, and Spyce does it well.

Friday, April 08, 2005

Solving a laptop performance problem

My main vice these days is Warcraft 3. (Not WoW -- I don't have nearly enough time for that.) It's been frustrating, though, since it could get extremely choppy. Maybe 5 fps choppy. Dropping all the settings to lowest didn't help, which really puzzled me: although a GeForce FX Go5200 isn't the most powerful 3d card around by a long shot, a 2.8 GHz P4 should have been able to render this stuff in software.

I didn't do anything about this at first, but I got good enough at the game that I started losing games because of the choppiness. So a couple nights ago I went on a killing spree with Task Manager to see if a background task was causing the problem. I didn't see any likely candidates, and sure enough, it didn't help. I did notice my Inspiron 5160 runs rather hot, though, and I wondered if it could be underclocking the CPU and/or GPU to cool off. This program verified this theory: my cpu clock oscillated every few seconds between 2.8 and 1.8 GHz.

Now, even the lowest setting of 1.8 GHz is plenty for wc3. Apparently Blizzard did something dumb (a friend who knows more about windows programming than I suggested it might actually be a win32 API problem) and sets its timer based on the maximum clockspeed, and doesn't adjust when it drops down. Turning speedstep off in my BIOS, which sets the cpu clock to its lowest setting permanently, fixed my warcraft problem. I'd be pretty ticked if I still had to run, say, Eclipse, but 1.8 GHz is also plenty for Emacs. So I'm happy for now.

Tuesday, April 05, 2005

how well do you know python, part 2

An easy one today:
import os, threading

def foo(s):
    print s

for dir in os.listdir(os.getcwd()):
    threading.Thread(target=lambda: foo(dir)).start()
What bug may cause this to print something other than the contents of the current directory?

corrupting postgresql

Carnage Blender ran out of space on its root partition Saturday. This is also the partition that has the postgresql data on it (WAL on a separate disk). Bad news: data from a few tables just disappeared in total violation of referential constraints. (Perhaps MVCC marked the old row as invalid for an UPDATE, but couldn't create the new row. Just guessing.)

This post seems to indicate that this was fixed for 8.0, although it may have gotten into one of the later 7.x releases (CB is still running the ancient-by-postgresql-standards 7.4.2 release, just over a year old) as well. Changelogs don't mention it specifically though.

I've rearranged disk usage so this won't happen again, but looking at the list of bugfixes through 7.4.7 I guess I should schedule some downtime to upgrade. Now that 8.0 has its first point release I should probably just bite the bullet and dump/reload to that. Unfortunately, Carnage Blender runs two multi-gig databases now, each with several circular dependencies that make manual fiddling with the dumps necessary before restoring.

Monday, April 04, 2005

jobs via RSS -- logical use of RSS, really. Example: Maybe if I were younger or something I would have thought of that too. My grandchildren will mock me for still using email.

On the other hand, I did create an rss feed for the python job board several months ago. I guess it didn't occur to me that anyone would want to know about non-python jobs. :)

C# partial classes

A couple days ago I was talking with a former co-instructor at Northface University about the new (in the sense of "been in the language spec for years but since Visual Studio 2003 didn't support it everyone's waiting for Studio 2005 before using it") C# feature called partial classes or partial types.

If you're into dynamic languages, maybe you're like me and you think of mixins when you hear the phrase "partial classes." Wouldn't be the first time Microsoft gave old technology a new name and called it innovation, right?

Unfortunately for C# developers, partial classes have nothing to do with refactoring common behavior into a single reusable code fragment. All it is is a way to break a class up into multiple files for the benefit of code generation tools like Studio 2005's GUI designer. This is something that could easily be done with annotations/decorators/attributes (Java/Python/C# terms for pretty-close-to-the-same-thing). In fact, this is the approach the NetBeans Java IDE has been taking for years with their Swing designer (called "guarded [blocks|areas|code|chunks]"); it works fine.
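For contrast, this is the kind of thing I mean by a mixin, as a minimal Python sketch. The class names are made up for illustration: the mixin holds behavior that any class defining __lt__ can reuse, which is exactly what partial classes don't give you.

```python
class ComparableMixin(object):
    """Reusable behaviour: derives the other comparisons from __lt__."""
    def __gt__(self, other):
        return other < self

    def __le__(self, other):
        return not other < self

    def __ge__(self, other):
        return not self < other


class Version(ComparableMixin):
    """Only defines __lt__; the rest comes from the mixin."""
    def __init__(self, major, minor):
        self.key = (major, minor)

    def __lt__(self, other):
        return self.key < other.key


newer = Version(1, 2) > Version(1, 1)
same_or_newer = Version(1, 2) >= Version(1, 2)
```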

My friend says that he could see this being useful to other, third party code generators, but I have a hard time seeing that carrying enough weight with Microsoft to actually spawn a new language feature, even if it were the only way to accomplish IDE support for code generation, which it isn't.

Some of the more rabid .NET fanboys have suggested that this will be cool because it lets you physically split a class among different developers. This is a lousy idea on several levels. First, if your classes are that large you probably need to rethink your design. Second, if you really do have an ironclad reason for this monster class, remember that one of the main goals of OO design is that someone who only wants to consume your class only needs to know the public API. People advocating splitting classes up into multiple human-edited files are forgetting that the other side of that coin is, the implementors of this class do need to know everything about it, or you'll be in a situation far worse than ordinary conflict merging.

Maybe I'm missing something. It really does baffle me that the C# team would go to the trouble (and it is trouble) to add a feature with only one real use case, and a flimsy one at that. Most of C# design I can see good reasons behind, if you take as a starting point "Like Java, only with the benefit of several years of hindsight. Oh yeah, and Sun doesn't own it." Some of the decisions are arguable but this one makes me scratch my head.

Saturday, April 02, 2005

Laszlo pycon slides

Unfortunately, not everyone who presented at pycon got notes to the archive. One of the ones I was particularly hoping to see was Oliver Steele's talk on OpenLaszlo, partly because I love Jython (contributed 8 or so patches about a year ago, some of which have recently been applied now that Samuele isn't the only one with CVS access) but mostly because Laszlo is pretty damn cool. "Ajax" has all the press recently but I think the Laszlo approach has a lot more going for it, especially if IE 7 stays in the dark ages.

Fortunately, Oliver just recently put up a pdf of his OpenLaszlo slides. Unfortunately there's a lot of blank slides, and rather more diagrams than code. (I think they were limited to 30 minutes, though. Ouch!) What I got out of it was,

  • Porting the ActionScript assembler from Jython to Java roughly doubled the LOC for 10x the speed (seems like they use/used Jython 2.0 though, ancient even by Jython standards?)
  • Jython's ease of development was a boon during prototyping, but
  • Don't be too quick to optimize by re-implementing in Java (or, for CPython developers, C/C++) because "you don't know when you're done prototyping"