
Introduction to Durus

The README that comes with Durus is missing a couple of pieces of information that are critical if you actually want to write a program that uses it. The only other documentation appears to be the 2005 Durus PyCon presentation, which gives an admirable description of the technical underpinnings but doesn't fill in the blanks of how to use it, either.

Specifically, as far as code samples go, the README gives you this:

Example using FileStorage to open a Connection to a file:

    from durus.file_storage import FileStorage
    from durus.connection import Connection
    connection = Connection(FileStorage("test.durus"))

And this:

    # Assume mymodule defines A as a subclass of Persistent.
    from mymodule import A 
    x = A()
    root = connection.get_root() # connection set as shown above.
    root["sample"] = x           # root is dict-like
    connection.commit()          # Now x is stored.

That's all you get.

Okay, in this situation it's clear how to retrieve x -- look it up by the key we've hardcoded. But what if x is a list of Persistent objects, and I want to retrieve one of those? Do I have to keep track of its index in x? Do I need to generate my own unique IDs? (Note that the linked code has an obvious race condition in a multithreaded environment.)

No, you don't. Durus assigns each Persistent object an attribute called _p_oid. (P object id. Not sure if P stands for Persistent, or Private, or something else entirely.)

[in persistent.py]
    def __new__(klass, *args, **kwargs):
        [...]
        instance._p_oid = None # <-- this is the oid that get() cares about

    def _p_format_oid(self):
        return format_oid(self._p_oid)

_p_oid is a four-byte binary string, so for passing it around as a GET or POST variable in a web application, the _p_format_oid method (which turns it into a string representation of the oid number) is handy.
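To make the formatting step concrete, here is a minimal sketch in plain Python 3 that does not import Durus. It assumes the oid is an unsigned integer packed big-endian into a binary string; that byte order is an assumption on my part, and format_oid_sketch is a hypothetical stand-in, not the library's own format_oid.

```python
def format_oid_sketch(oid):
    """Render a binary oid as a decimal string.

    Hypothetical stand-in for illustration only. Assumes the oid is a
    big-endian unsigned integer; works for a binary string of any width.
    """
    return str(int.from_bytes(oid, "big"))


# A made-up four-byte oid, used only to show the shape of the conversion.
sample_oid = b"\x00\x00\x00\x7b"
formatted = format_oid_sketch(sample_oid)  # "123"
```

The point is only that the printable form is a plain decimal number, which is why it survives a trip through a URL or form field where the raw bytes would not.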

Now that you're passing oids around, Durus gives you an easy way to retrieve the object an oid identifies:

[in connection.py]
    def get(self, oid):
        """(oid:str|int|long) -> Persistent | None
        Return object for `oid`.

        The object may be a ghost.
        """

Note that if you do pass around the formatted id, you'll need to turn it into a Python int (or long) before sending it to get. If you pass the string '123', get will assume it's a valid (binary) oid and will not autoconvert it.
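The conversion step might look something like the sketch below. oid_from_param is a hypothetical helper of my own, not part of Durus, and the connection.get call appears only in a comment so the snippet runs without a database.

```python
def oid_from_param(raw):
    """Convert a formatted oid from a GET/POST variable into an int.

    Hypothetical helper for illustration. Returns None for anything that
    is not a non-negative integer, so the caller can reject bad input
    instead of handing get() a string it would treat as a binary oid.
    """
    try:
        oid = int(raw)
    except (TypeError, ValueError):
        return None
    return oid if oid >= 0 else None


oid = oid_from_param("123")
# With a real connection (set up as in the README sample) you would then do:
#     obj = connection.get(oid)
```

Checking for None before calling get keeps malformed request values from ever reaching the storage layer.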

Now that the code diving is out of the way, I'm enjoying Durus a lot. Next post I'll give a short Spyce demo using Durus for persistence.

Comments

Anonymous said…
Nope, I must have my blinkers on because I don't see it.

After putting 'x' in root how do I get it back out of the dictionary in subsequent sessions? And how do I know, after insert, what its _p_oid is?
Jonathan Ellis said…
OK, here's a sample that puts it together a little more:

    from durus.file_storage import FileStorage
    from durus.connection import Connection
    from durus.persistent import Persistent
    conn = Connection(FileStorage("test.durus"))

    ## first session ##
    class A(Persistent): pass # in real code you'd flesh it out some of course
    x = A()
    x.someattr = 'foo'
    conn.get_root()['sample'] = x
    conn.commit()
    print(x._p_oid)           # the raw binary oid
    print(x._p_format_oid())  # the printable form of the same oid

    ## second session ##
    print(conn.get_root()['sample'])
    # or, say you have a Spyce page that passed the formatted id in a GET var
    print(conn.get(int(request['id'])))
Anonymous said…
I think it's easier to use the dictionary access than mucking around with oids. It will make your life easier, particularly if you encapsulate your durus-touching code in functions/methods that abstract away the durus-specific stuff.

One other note: in your "first session" you connect to your durus storage using FileStorage. If you want to use durus in a multi-threaded or multi-process setting (like most webservers), you'll want to use ClientStorage. And, right before you get anything, you'll want to call conn.abort() in order to synchronize your durus cache. Otherwise, you may well trigger an exception.

Of course, using ClientStorage means starting up a durus server. You can do it programmatically, using durus' own internal API, but I find it easier just to do os.system('durus -s --file test.durus')

---Peter Herndon
tpherndonATgmail.com
Jonathan Ellis said…
Interesting, an API to autostart the server is another thing the docs didn't mention -- I deliberately used FileStorage so I didn't have to spend code checking to see if the server were already running, etc.
Jonathan Ellis said…
I'm guessing StorageServer.serve is what you're referring to, and no, it's not really an improvement over os.system-ing it. I'll stick to FileStorage for examples.
Anonymous said…
Using _p_oid as application object identifiers is not really recommended. They are basically a DB implementation detail. If you want an ID, give your object one (e.g. an 'id' attribute). Normally you would store things in dicts, PersistentDicts or BTrees for easy retrieval.
Jonathan Ellis said…
Using Durus at all is an implementation detail; it's not like there is anything else out there that is API-compatible with it, except maybe ZODB. As long as I'm doing that, I might as well make use of what it provides, IMO.
Anonymous said…
One thing to note: the _p_oids are not assigned to objects until they are saved with a transaction. This probably does not matter in an application, but I think it is good to know.

If you ever wanted to move part of a durus database into a different durus database, the _p_oids would change--just something else to be aware of.

Don't use a Durus Connection from multiple threads.

Durus-2.0 will be released later this week. The only change is that the new license will be GPL-compatible.
Jonathan Ellis said…
Thanks for pointing that out.
mario ruggier said…
About not using a Durus Connection from multiple threads, would this imply that:

a) either you'd need to create a Connection per thread (so you have as many memory caches as threads) ?

b) or you'd need to proxy wrap a Connection to make it thread-safe, for example such as in this ThreadedProxy wrapper recipe: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/159143 ?
