Skip to main content

Another serving of SqlSoup

Earlier this year I wrote an introduction to SqlSoup, the SQLAlchemy extension that leverages SQLAlchemy's excellent introspection, mapping, and sql construction to provide a database interface that is both simple and powerful.

Here's what SqlSoup has added since then (continuing with the books/loans/users example tables from pyweboff). Full SqlSoup documentation is on the SQLAlchemy wiki.

Set operations

The introduction covered updating and deleting rows that had been mapped to Python objects. You can also perform updates and deletes directly to the database.

>>> db.loans.insert(book_id=book_id, user_name=user.name)
MappedLoans(book_id=2,user_name='Bhargan Basepair',loan_date=None)
>>> db.flush()
>>> db.loans.delete(db.loans.c.book_id==2)

>>> db.loans.update(db.loans.c.book_id==2, book_id=1)
>>> db.loans.select_by(db.loans.c.book_id==1)
[MappedLoans(book_id=1,user_name='Joe Student',loan_date=datetime.datetime(2006, 7, 12, 0, 0))]

Joins

Occasionally, you will want to pull out a lot of data from related tables all at once. In this situation, it is far more efficient to have the database perform the necessary join. (Here we do not have "a lot of data," but hopefully the concept is still clear.) SQLAlchemy is smart enough to recognize that loans has a foreign key to users, and uses that as the join condition automatically.

>>> join1 = db.join(db.users, db.loans, isouter=True)
>>> join1.select_by(name='Joe Student')
[MappedJoin(name='Joe Student',email='student@example.edu',password='student',classname=None,admin=0,book_id=1,user_name='Joe Student',loan_date=datetime.datetime(2006, 7, 12, 0, 0))]

If you're unfortunate enough to be using MySQL with the default MyISAM storage engine, you'll have to specify the join condition manually, since MyISAM does not store foreign keys. Here's the same join again, with the join condition explicitly specified:

>>> db.join(db.users, db.loans, db.users.c.name==db.loans.c.user_name, isouter=True)
<class 'sqlalchemy.ext.sqlsoup.MappedJoin'>

You can compose arbitrarily complex joins by combining Join objects with tables or other joins. Here we combine our first join with the books table:

>>> join2 = db.join(join1, db.books)
>>> join2.select()
[MappedJoin(name='Joe Student',email='student@example.edu',password='student',classname=None,admin=0,book_id=1,user_name='Joe Student',loan_date=datetime.datetime(2006, 7, 12, 0, 0),id=1,title='Mustards I Have Known',published_year='1989',authors='Jones')]

If you join tables that have an identical column name, wrap your join with "with_labels", to disambiguate columns with their table name:

>>> db.with_labels(join1).c.keys()
['users_name', 'users_email', 'users_password', 'users_classname', 'users_admin', 'loans_book_id', 'loans_user_name', 'loans_loan_date']

Advanced mapping

SqlSoup can map any SQLAlchemy Selectable with the map method. Let's map a Select object that uses an aggregate function; we'll use the SQLAlchemy Table that SqlSoup introspected as the basis. (Since we're not mapping to a simple table or join, we need to tell SQLAlchemy how to find the "primary key," which just needs to be unique within the select, and not necessarily correspond to a "real" PK in the database.)

>>> from sqlalchemy import select, func
>>> b = db.books._table
>>> s = select([b.c.published_year, func.count('*').label('n')], from_obj=[b], group_by=[b.c.published_year])
>>> s = s.alias('years_with_count')
>>> years_with_count = db.map(s, primary_key=[s.c.published_year])
>>> years_with_count.select_by(published_year='1989')
[MappedBooks(published_year='1989',n=1)]

Obviously if we just wanted to get a list of counts associated with book years once, raw SQL is going to be less work. The advantage of mapping a Select is reusability, both standalone and in Joins. (And if you go to full SQLAlchemy, you can perform mappings like this directly to your object models.)

Comments

Popular posts from this blog

The Missing Piece in AI Coding: Automated Context Discovery

I recently switched tasks from writing the ColBERT Live! library and related benchmarking tools to authoring BM25 search for Cassandra . I was able to implement the former almost entirely with "coding in English" via Aider . That is: I gave the LLM tasks, in English, and it generated diffs for me that Aider applied to my source files. This made me easily 5x more productive vs writing code by hand, even with AI autocomplete like Copilot. It felt amazing! (Take a minute to check out this short thread on a real-life session with Aider , if you've never tried it.) Coming back to Cassandra, by contrast, felt like swimming through molasses. Doing everything by hand is tedious when you know that an LLM could do it faster if you could just structure the problem correctly for it. It felt like writing assembly without a compiler -- a useful skill in narrow situations, but mostly not a good use of human intelligence today. The key difference in these two sce...

A week of Windows Subsystem for Linux

I first experimented with WSL2 as a daily development environment two years ago. Things were still pretty rough around the edges, especially with JetBrains' IDEs, and I ended up buying a dedicated Linux workstation so I wouldn't have to deal with the pain.  Unfortunately, the Linux box developed a heat management problem, and simultaneously I found myself needing a beefier GPU than it had for working on multi-vector encoding , so I decided to give WSL2 another try. Here's some of the highlights and lowlights. TLDR, it's working well enough that I'm probably going to continue using it as my primary development machine going forward. The Good NVIDIA CUDA drivers just work. I was blown away that I ran conda install cuda -c nvidia and it worked the first try. No farting around with Linux kernel header versions or arcane errors from nvidia-smi. It just worked, including with PyTorch. JetBrains products work a lot better now in remote development mod...

Why PHP sucks

(July 8 2005) Apparently I got linked by some PHP sites, and while there were a few well-reasoned comments here I mostly just got people who only knew PHP reacting like I told them their firstborn was ugly. These people tended to give variants on one or more themes: All environments have warts, so PHP is no worse than anything else in this respect I can work around PHP's problems, ergo they are not really problems You aren't experienced enough in PHP to judge it yet As to the first, it is true that PHP is not alone in having warts. However, the lack of qualitative difference does not mean that the quantitative difference is insignificant. Similarly, problems can be worked around, but languages/environments designed by people with more foresight and, to put it bluntly, clue, simply don't make the kind of really boneheaded architecture mistakes that you can't help but run into on a daily baisis in PHP. Finally, as I noted in my original introduction, with PHP, ...