Posts

Showing posts from 2024

The Missing Piece in AI Coding: Automated Context Discovery

I recently switched tasks from writing the ColBERT Live! library and related benchmarking tools to authoring BM25 search for Cassandra. I was able to implement the former almost entirely by "coding in English" via Aider. That is: I gave the LLM tasks, in English, and it generated diffs that Aider applied to my source files. This made me easily 5x more productive versus writing code by hand, even with AI autocomplete like Copilot. It felt amazing! (Take a minute to check out this short thread on a real-life session with Aider, if you've never tried it.) Coming back to Cassandra, by contrast, felt like swimming through molasses. Doing everything by hand is tedious when you know that an LLM could do it faster if you could just structure the problem correctly for it. It felt like writing assembly without a compiler -- a useful skill in narrow situations, but mostly not a good use of human intelligence today. The key difference in these two sce...

dlib is not compatible with numpy 2.x

I spent way too long trying to figure out this problem with dlib while using the Python face_recognition library that wraps it, and since I couldn't find anyone giving the correct diagnosis and solution online, I'm posting it as a public service to the next person who hits it. Here's the error I was getting: RuntimeError: Error while calling cudaMallocHost(&data, new_size*sizeof(float)) in file /home/jonathan/Projects/dlib/dlib/cuda/gpu_data.cpp:211. code: 2, reason: out of memory Eventually I gave up and switched from the GPU model ("cnn") to the CPU one ("hog"). Then I started getting errors about RuntimeError: Unsupported image type, must be 8bit gray or RGB image. The errors persisted after adding PIL code to convert to RGB. This one was easier to track down on Google: it happens when you have numpy 2.x installed, which is not compatible with dlib. Seems like something along the way should give a warning about that! At any rate, with numpy dow...
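Since dlib itself gives no warning, the incompatibility can be caught early with a small guard before importing dlib. A minimal sketch -- the helper name is my own, not part of dlib or face_recognition:

```python
import numpy as np

def numpy_is_dlib_compatible(version: str) -> bool:
    """dlib builds against the numpy 1.x ABI; with numpy 2.x installed it fails
    with the cryptic 'Unsupported image type, must be 8bit gray or RGB image'."""
    return int(version.split(".")[0]) < 2

# Check the installed numpy before importing dlib / face_recognition:
if not numpy_is_dlib_compatible(np.__version__):
    print("Downgrade numpy first, e.g.: pip install 'numpy<2'")
```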

Adding ColPali to ColBERT Live!

I recently added support for ColPali image search to the ColBERT Live! library. This post is going to skip over the introduction to ColPali and how it works; please check out Antaripa Saha's excellent article for that. TLDR, ColPali allows you to natively compare text queries with image-based documents, with accuracy that bests the previous state of the art. ("Natively" means there's no extract-to-text pipeline involved.) Adding ColPali to ColBERT Live! The "Col" in ColPali refers to performing maxsim-based "late interaction" relevance as seen in ColBERT. Since ColBERT Live! already abstracts away the details of computing embedding vectors, it is straightforward to add support for ColPali / ColQwen by implementing an appropriate Model subclass. However, ColBERT Live!'s default parameters were initially tuned for text search. To be able to give appropriate guidance for image search, I ran a grid search on the ViDoRe benchmark that ...
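The maxsim late-interaction scoring mentioned above is simple to state: for each query token embedding, take its best similarity against all document token embeddings, and sum those maxima. A minimal numpy sketch (the function name is my own, not ColBERT Live!'s API):

```python
import numpy as np

def maxsim(query_embs: np.ndarray, doc_embs: np.ndarray) -> float:
    """Late-interaction relevance score.

    query_embs: (q_tokens, dim), doc_embs: (d_tokens, dim);
    rows are assumed L2-normalized so dot product = cosine similarity."""
    sims = query_embs @ doc_embs.T        # (q_tokens, d_tokens) token-pair similarities
    return float(sims.max(axis=1).sum())  # best doc token per query token, summed
```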

Why JVector 3 is the most advanced vector index on the planet

Transcript of airhacks.fm episode 316 Adam Bien: Hey, Jonathan, how is JVector 4 doing? Jonathan Ellis: JVector 4? AB: Yeah, because JVector 3 is completed, I think, now. JE: JVector 4. Well, shoot man. If you want a sneak preview, we may have some news to talk about with GPU acceleration for JVector 4, but that's super, super early and I can't promise any specifics yet. 0:00:28 JVector 3 features and improvements AB: It was a joke actually. So, I know that JVector 3 is completed. And so what's the major features or what happened between JVector 2 and 3? JE: JVector 2 was a fairly straightforward adaptation of Microsoft Research's DiskANN search indexing to Java and to Cassandra. And so that means that you have a two pass search where you have a core index that works on a graph, whose comparisons are done with quantized vectors that are kept in memory. And then you refine the results of that search by using full resolution vectors from disk. And so JVector 3 has been, ho...
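The two-pass design described here -- a coarse search over quantized in-memory vectors, refined with full-resolution vectors from disk -- can be sketched in a few lines. This is a simplified illustration, not JVector's actual API: the first pass here scans all quantized vectors, whereas JVector walks a DiskANN-style graph.

```python
import numpy as np

def two_pass_search(query, quantized_vecs, full_vecs, k, overquery=4):
    """Pass 1: approximate scores against quantized vectors (all in memory).
    Pass 2: rescore the top candidates with full-resolution vectors."""
    approx = quantized_vecs @ query
    candidates = np.argsort(-approx)[: k * overquery]   # over-fetch candidates
    exact = full_vecs[candidates] @ query               # refine with exact vectors
    return candidates[np.argsort(-exact)[:k]]
```

The overquery factor trades a little extra exact-scoring work for recall lost to quantization error in the first pass.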

A week of Windows Subsystem for Linux

I first experimented with WSL2 as a daily development environment two years ago. Things were still pretty rough around the edges, especially with JetBrains' IDEs, and I ended up buying a dedicated Linux workstation so I wouldn't have to deal with the pain. Unfortunately, the Linux box developed a heat management problem, and simultaneously I found myself needing a beefier GPU than it had for working on multi-vector encoding, so I decided to give WSL2 another try. Here are some of the highlights and lowlights. TLDR, it's working well enough that I'm probably going to continue using it as my primary development machine going forward. The Good NVIDIA CUDA drivers just work. I was blown away that I ran conda install cuda -c nvidia and it worked on the first try. No farting around with Linux kernel header versions or arcane errors from nvidia-smi. It just worked, including with PyTorch. JetBrains products work a lot better now in remote development mod...