Tuesday, October 31, 2006

Another serving of SqlSoup

Earlier this year I wrote an introduction to SqlSoup, the SQLAlchemy extension that leverages SQLAlchemy's excellent introspection, mapping, and sql construction to provide a database interface that is both simple and powerful.

Here's what SqlSoup has added since then (continuing with the books/loans/users example tables from pyweboff). Full SqlSoup documentation is on the SQLAlchemy wiki.

Set operations

The introduction covered updating and deleting rows that had been mapped to Python objects. You can also perform updates and deletes directly to the database.

>>> db.loans.insert(book_id=book_id, user_name=user.name)
MappedLoans(book_id=2,user_name='Bhargan Basepair',loan_date=None)
>>> db.flush()
>>> db.loans.delete(db.loans.c.book_id==2)

>>> db.loans.update(db.loans.c.book_id==2, book_id=1)
>>> db.loans.select_by(db.loans.c.book_id==1)
[MappedLoans(book_id=1,user_name='Joe Student',loan_date=datetime.datetime(2006, 7, 12, 0, 0))]


Occasionally, you will want to pull out a lot of data from related tables all at once. In this situation, it is far more efficient to have the database perform the necessary join. (Here we do not have "a lot of data," but hopefully the concept is still clear.) SQLAlchemy is smart enough to recognize that loans has a foreign key to users, and uses that as the join condition automatically.

>>> join1 = db.join(db.users, db.loans, isouter=True)
>>> join1.select_by(name='Joe Student')
[MappedJoin(name='Joe Student',email='student@example.edu',password='student',classname=None,admin=0,book_id=1,user_name='Joe Student',loan_date=datetime.datetime(2006, 7, 12, 0, 0))]

If you're unfortunate enough to be using MySQL with the default MyISAM storage engine, you'll have to specify the join condition manually, since MyISAM does not store foreign keys. Here's the same join again, with the join condition explicitly specified:

>>> db.join(db.users, db.loans, db.users.c.name==db.loans.c.user_name, isouter=True)
<class 'sqlalchemy.ext.sqlsoup.MappedJoin'>

You can compose arbitrarily complex joins by combining Join objects with tables or other joins. Here we combine our first join with the books table:

>>> join2 = db.join(join1, db.books)
>>> join2.select()
[MappedJoin(name='Joe Student',email='student@example.edu',password='student',classname=None,admin=0,book_id=1,user_name='Joe Student',loan_date=datetime.datetime(2006, 7, 12, 0, 0),id=1,title='Mustards I Have Known',published_year='1989',authors='Jones')]

If you join tables that have an identical column name, wrap your join with "with_labels", to disambiguate columns with their table name:

>>> db.with_labels(join1).c.keys()
['users_name', 'users_email', 'users_password', 'users_classname', 'users_admin', 'loans_book_id', 'loans_user_name', 'loans_loan_date']

Advanced mapping

SqlSoup can map any SQLAlchemy Selectable with the map method. Let's map a Select object that uses an aggregate function; we'll use the SQLAlchemy Table that SqlSoup introspected as the basis. (Since we're not mapping to a simple table or join, we need to tell SQLAlchemy how to find the "primary key," which just needs to be unique within the select, and not necessarily correspond to a "real" PK in the database.)

>>> from sqlalchemy import select, func
>>> b = db.books._table
>>> s = select([b.c.published_year, func.count('*').label('n')], from_obj=[b], group_by=[b.c.published_year])
>>> s = s.alias('years_with_count')
>>> years_with_count = db.map(s, primary_key=[s.c.published_year])
>>> years_with_count.select_by(published_year='1989')

Obviously if we just wanted to get a list of counts associated with book years once, raw SQL is going to be less work. The advantage of mapping a Select is reusability, both standalone and in Joins. (And if you go to full SQLAlchemy, you can perform mappings like this directly to your object models.)

Thursday, October 26, 2006

Ruby isn't going to fracture, and "enterprise" is not synonymous with "static"

I don't follow Ruby development too closely (most of the info on it is still in Japanese, after all), but the US RubyConf was held recently so there's been an unusual number of English posts on Ruby, among them David Pollack's The Impending Ruby Fracture.

David's article seems to consist of these points:

  1. Matz is uninterested in adding static bondage & discipline features to Ruby (true, as far as I know)
  2. "Enterprise" users won't be satisfied without said features (more on this below)
  3. There are a lot of Ruby runtimes out there right now (the most interesting part of the article)
  4. Therefore some Enterprise will co-opt one of the runtimes to fork Ruby and add the B&D features (wtf?)

Summarized this way it looks faintly ridiculous, and yet nobody over on the programming reddit has called this out. Maybe I'm taking excessive liberties with David's article, but I don't think I am.

The possibility of forking is part of what makes open source wonderful. The actual cost of a fork is astronomically high; almost nobody has made it work. For every X.org there are dozens of failures and probably far more where the would-be forkers realized that however bad the situation was, actually forking would be worse.

Now, in the absence of strong leadership, what you can have happening to a language is de facto forking, like what you have with Lisp -- the Common Lisp standard is old, so the various Lisp implementations have gone their separate ways to various degrees and portability between them is pretty dicey. But the Ruby community seems to be pretty content with the job Matz is doing so I don't see this happening.

As a motivation to assume the huge costs of forking, David submits... "interfaces or some other optional typing mechanism?" Excuse me. Even though some intelligent language designers have flirted with ideas along those lines, that's not something that's going to get refugees from Java to rally around your banner for a fork.

I also have to take issue with David's characterization of this as features that "appeal to enterprise customers." While it may be true that B&D languages are currently popular with large corporations, other large corporations recognize the advantages of dynamic languages. Corporations aren't stupid; they're just very conservative, for the most part. In 10 years you'll see more Python and Ruby in the enterprise, just as Java and C# are replacing COBOL and C++ now.