Wednesday, January 10, 2007

Fun with three-valued logic

I thought I was pretty used to SQL's three-valued logic by now, but this still caused me a minute of scratching my head:

# select count(*) from _t;
 count
-------
  1306
(1 row)

# select count(*) from _t2;
 count
-------
 19497
(1 row)
Both _t and _t2 are temporary tables of a single column I created with SELECT DISTINCT.
# select count(*) from _t where userhash in (select userhash from _t2);
 count
-------
   982
(1 row)

# select count(*) from _t where userhash not in (select userhash from _t2);
 count
-------
     0
(1 row)

Hmm, 982 + 0 != 1306...

Turns out there was a null in _t2; X in {set containing null} evaluates to null, not false, and negating null still gives null. (The rule of thumb is, any operation on null is still null.)

.................

I'm giving a tutorial on Advanced Databases with SQLAlchemy at PyCon in February. Feel free to let me know if there is anything you'd like me to cover specifically.

2 comments:

Anonymous said...

Thank you!!! you just solved my long standing django bug with select_related and null relations!

http://code.djangoproject.com/ticket/3149

Jonathan Ellis said...

Cool, glad I could help. :)