
On referential integrity

At my place of employment we have a table that looks like this:

CREATE TABLE permissions (
    id character(40) NOT NULL,
    userid character(40) REFERENCES USERS(id),
    objectid character(40) NOT NULL,
    permission character varying(15) NOT NULL
);
Note that objectid isn't a foreign key to anything. (Before I joined there was no foreign key declared for userid, either.) That is because in this schema, everything gets a unique char(40) ID, and for flexibility the designer wanted to use the same permissions storage for all tables in the database.

(The fix for this, BTW, would involve creating an "objects" table that simply holds all the IDs in the system, and having it referenced both by the individual object tables and by any table like this one that wants to be able to reference "any object." I haven't done that yet, but I'll be moving that up on my priority list now.)
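A minimal sketch of what that could look like in PostgreSQL, retrofitted onto existing tables. It is an illustration, not our actual migration: the constraint names are made up, and it assumes each object table (alerttypes is used as the example) keeps its ID in a column named id.

```sql
-- Hypothetical registry holding every ID in the system.
CREATE TABLE objects (
    id character(40) PRIMARY KEY
);

-- Backfill the registry from each existing object table
-- (assumes the ID column is called "id"; repeat per table).
INSERT INTO objects (id) SELECT id FROM alerttypes;

-- Each object table now points at the registry...
ALTER TABLE alerttypes
    ADD CONSTRAINT alerttypes_objects_fk
    FOREIGN KEY (id) REFERENCES objects(id);

-- ...and so does the "any object" column in permissions.
ALTER TABLE permissions
    ADD CONSTRAINT permissions_objects_fk
    FOREIGN KEY (objectid) REFERENCES objects(id);
```

The last statement will fail if permissions already contains objectid values that are missing from objects, which is a cheap way to find out how much orphaned data has accumulated. Once it succeeds, the database rejects any permissions row that points at an ID it has never heard of.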

About a month ago, after suitable testing, I ran an upgrade script against our live database that went something like this:

CREATE TABLE alerttypes_old AS SELECT * FROM alerttypes;
DROP TABLE alerttypes;
-- create new table
-- insert into new table massaged data from _old

As it happens, alerttypes is one of the tables that we are interested in permissions for. I forgot to delete the appropriate entries corresponding to the old rows, but worse, I forgot to create new ones for the new alerttypes.
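For what it's worth, both kinds of damage can be spotted after the fact with a couple of queries, as long as the _old copy is still around. This is only a sketch: it assumes alerttypes and alerttypes_old both keep their ID in a column named id.

```sql
-- Stale permissions: rows that pointed at an old alert type
-- whose ID no longer exists in the new table.
SELECT p.*
FROM permissions p
JOIN alerttypes_old o ON o.id = p.objectid
LEFT JOIN alerttypes n ON n.id = p.objectid
WHERE n.id IS NULL;

-- New alert types that nobody has any permission on.
SELECT a.id
FROM alerttypes a
WHERE NOT EXISTS (
    SELECT 1 FROM permissions p WHERE p.objectid = a.id
);
```

Of course, queries like these only help if you think to run them, which is exactly what a constraint saves you from.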

What makes this more surprising is that the developer who reviewed the script missed this too. But that is what happens when you don't have proper integrity constraints: sooner or later, you're going to be restoring from backup. Even if you're a smart guy. Even if you test first (on several machines). Even if you have code reviews.

Incidentally, I have seen people leave out FK constraints (or drop the ones I added -- grr!) to accommodate a series of statements that temporarily violates the constraint but eventually (in theory) leaves things in a correct state. The correct course here is to put your related statements in a transaction (a good idea anyway), and tell your database to check constraints when the transaction ends, not before. For PostgreSQL, that looks like this:

acs=# begin work;
BEGIN
acs=# SET CONSTRAINTS ALL DEFERRED;
SET CONSTRAINTS
-- ...
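One caveat: in PostgreSQL, SET CONSTRAINTS ALL DEFERRED only affects constraints that were declared DEFERRABLE, and foreign keys are not deferrable unless you say so. A self-contained sketch, using two hypothetical tables (parent and child) rather than anything from our schema:

```sql
CREATE TABLE parent (
    id integer PRIMARY KEY
);

CREATE TABLE child (
    id integer PRIMARY KEY,
    -- Checked per statement by default, but may be deferred on request.
    parent_id integer NOT NULL
        REFERENCES parent(id) DEFERRABLE INITIALLY IMMEDIATE
);

BEGIN;
SET CONSTRAINTS ALL DEFERRED;
-- Temporarily violates the foreign key: parent 10 does not exist yet.
INSERT INTO child (id, parent_id) VALUES (1, 10);
INSERT INTO parent (id) VALUES (10);
COMMIT;  -- the foreign key is checked here, and now holds
```

Without the SET CONSTRAINTS line, the first INSERT would fail immediately. If the transaction reached COMMIT with the parent row still missing, the commit itself would fail and everything would roll back. For a constraint that already exists, recent PostgreSQL versions can change deferrability with ALTER TABLE ... ALTER CONSTRAINT.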
