
NagleQueue

Here's a piece of code for the one or two other developers writing intensely network-dependent code in Python. The idea is, instead of a protocol that looks like

[command][arg]
[command][arg]
...

You save overhead (in your protocol and in the network's, if you're doing one connection per command) by batching things:

[command][arg][arg]...
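To make the saving concrete, here is a hypothetical sketch of the two framings with a 2-byte length prefix per message (the `PUT` command and the `frame` helper are mine, purely for illustration, not part of any real protocol):

```python
import struct

def frame(payload: bytes) -> bytes:
    # hypothetical framing: 2-byte big-endian length prefix per message
    return struct.pack('>H', len(payload)) + payload

# one message per command: every arg pays the framing and command overhead
unbatched = b''.join(frame(b'PUT ' + arg) for arg in [b'a', b'b', b'c'])

# batched: one header and one command name amortized over all the args
batched = frame(b'PUT ' + b' '.join([b'a', b'b', b'c']))

print(len(unbatched), len(batched))  # 21 11
```

The gap widens with more commands per batch, and it widens further still if each unbatched message also costs a TCP connection.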

Pretty obvious stuff, no doubt. But I think the following class makes it rather elegant.

import queue
import threading
import time

class NagleQueue:
    """
    When an item is put, NagleQueue waits for other puts for
    aggregate_time seconds (may be fractional) and then calls
    aggregate() with a list whose maximum length is max_items.

    TODO: currently, NagleQueue always waits the entire aggregate_time,
    even if max_items items are added sooner.

    NagleQueue starts a thread to handle the aggregate() call;
    this is not a Daemon thread, so you must call stop() before your
    application exits.  NagleQueue will process any remaining items,
    then quit.

    It is an unchecked error to put additional items after calling stop().
    (They may or may not get processed.)
    """
    def __init__(self, aggregate_time, max_items):
        self.q = queue.Queue()
        self.running = True
        self.aggregate_time = aggregate_time
        self.max_items = max_items
        threading.Thread(target=self._process, name='naglequeue').start()

    def put(self, item):
        self.q.put(item)

    def stop(self):
        self.running = False

    def _process(self):
        while True:
            try:
                item = self.q.get(timeout=1)
            except queue.Empty:
                if not self.running:
                    break
                else:
                    continue
            time.sleep(self.aggregate_time)

            L = []
            while True:
                L.append(item)
                if len(L) >= self.max_items:
                    break
                try:
                    item = self.q.get_nowait()
                except queue.Empty:
                    break
            self.aggregate(L)

    def aggregate(self, items):
        """
        combines list of items into a single request
        """
        raise NotImplementedError('must implement aggregate method')
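For completeness, here is a minimal usage sketch: a hypothetical `CommandBatcher` subclass that implements aggregate(). The class above is repeated so the snippet runs standalone, with one change of mine: stop() joins the worker thread so the caller can wait for the remaining items deterministically.

```python
import queue
import threading
import time

class NagleQueue:
    def __init__(self, aggregate_time, max_items):
        self.q = queue.Queue()
        self.running = True
        self.aggregate_time = aggregate_time
        self.max_items = max_items
        self.thread = threading.Thread(target=self._process, name='naglequeue')
        self.thread.start()

    def put(self, item):
        self.q.put(item)

    def stop(self):
        self.running = False
        self.thread.join()  # my addition: block until remaining items drain

    def _process(self):
        while True:
            try:
                item = self.q.get(timeout=1)
            except queue.Empty:
                if not self.running:
                    break
                continue
            # wait for other puts to accumulate before aggregating
            time.sleep(self.aggregate_time)
            batch = []
            while True:
                batch.append(item)
                if len(batch) >= self.max_items:
                    break
                try:
                    item = self.q.get_nowait()
                except queue.Empty:
                    break
            self.aggregate(batch)

    def aggregate(self, items):
        raise NotImplementedError('must implement aggregate method')

class CommandBatcher(NagleQueue):
    """Hypothetical subclass: combines queued commands into one request."""
    def __init__(self, *args, **kwargs):
        self.requests = []  # stands in for writing to a real connection
        super().__init__(*args, **kwargs)

    def aggregate(self, items):
        # one "wire" request per batch instead of one per command
        self.requests.append(' '.join(items))

b = CommandBatcher(aggregate_time=0.2, max_items=10)
for cmd in ['SET a', 'SET b', 'DEL c']:
    b.put(cmd)
b.stop()
print(b.requests)  # all three commands aggregated into a single request
```

All three puts land within the 0.2-second aggregation window, so aggregate() fires once with the whole batch rather than three times.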

Comments

Florian said…
Saw that you actually didn't implement the aggregation at all :D, my bad.
