TheBuc's comments

TheBuc · 2026-02-17T16:07:44

Hi HN,

I built boolean-query-parser in about half a day because I was tired of hardcoding filtering logic for a project. I didn’t set out to write the "ultimate" parser; I just needed something lightweight to handle queries like (error AND /critical/i) OR NOT "debug" without pulling in heavy dependencies.

I open-sourced it, put it on PyPI, and honestly... I forgot about it.

A few days ago, I checked the stats: 3,000 organic downloads without a single tweet or blog post. It turns out that having a zero-dependency, pure-Python logic engine is a more common need than I thought.

Why it’s resonating (I think):

Zero Overhead: No Lark, no Pyparsing. Just a clean recursive descent parser in a single file.

Universal Logic: I use "log filtering" as an example because it's intuitive, but it's really a generic engine for user-defined segments, search bars, or alerting rules.

Wasm Playground: Since it’s so lean, I put together a Playground using WebAssembly. It loads instantly and lets you watch the AST being generated in real time as you type a query.
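
To make the "recursive descent in a single file" idea concrete, here is a minimal sketch of how such a parser can be put together. It is not the library's actual code or API: the names tokenize, Parser, and evaluate, the tuple-based AST, the uppercase-only keywords, and substring matching for bare terms are all assumptions made for illustration.

```python
import re

# Hypothetical token grammar: parentheses, "quoted terms", /regex/flags, bare words.
TOKEN_RE = re.compile(r'''
    \s*(?:
        (?P<lparen>\()
      | (?P<rparen>\))
      | "(?P<quoted>[^"]*)"
      | /(?P<regex>(?:\\.|[^/\\])+)/(?P<flags>[a-z]*)
      | (?P<word>[^\s()"]+)
    )''', re.VERBOSE)
KEYWORDS = {"AND", "OR", "NOT"}  # assumption: keywords are uppercase only
FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def tokenize(query):
    """Turn the query string into a list of (kind, value) pairs."""
    tokens, pos, query = [], 0, query.rstrip()
    while pos < len(query):
        m = TOKEN_RE.match(query, pos)
        if m is None:
            raise ValueError(f"cannot tokenize at position {pos}")
        pos = m.end()
        if m.group("lparen"):
            tokens.append(("LPAREN", None))
        elif m.group("rparen"):
            tokens.append(("RPAREN", None))
        elif m.group("quoted") is not None:
            tokens.append(("TERM", m.group("quoted")))
        elif m.group("regex") is not None:
            flags = 0
            for ch in m.group("flags"):
                if ch not in FLAGS:
                    raise ValueError(f"unknown regex flag {ch!r}")
                flags |= FLAGS[ch]
            tokens.append(("REGEX", re.compile(m.group("regex"), flags)))
        else:
            word = m.group("word")
            tokens.append((word, None) if word in KEYWORDS else ("TERM", word))
    return tokens


class Parser:
    """One method per precedence level: OR < AND < NOT < atom."""

    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("EOF", None)

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse(self):
        node = self.parse_or()
        if self.peek()[0] != "EOF":
            raise ValueError(f"unexpected token {self.peek()[0]}")
        return node

    def parse_or(self):
        node = self.parse_and()
        while self.peek()[0] == "OR":
            self.advance()
            node = ("or", node, self.parse_and())
        return node

    def parse_and(self):
        node = self.parse_not()
        while self.peek()[0] == "AND":
            self.advance()
            node = ("and", node, self.parse_not())
        return node

    def parse_not(self):
        if self.peek()[0] == "NOT":
            self.advance()
            return ("not", self.parse_not())
        return self.parse_atom()

    def parse_atom(self):
        kind, value = self.advance()
        if kind == "LPAREN":
            node = self.parse_or()
            if self.advance()[0] != "RPAREN":
                raise ValueError("expected ')'")
            return node
        if kind in ("TERM", "REGEX"):
            return (kind.lower(), value)
        raise ValueError(f"unexpected token {kind}")


def evaluate(node, text):
    """Walk the AST against one line of text (bare terms match as substrings)."""
    kind = node[0]
    if kind == "term":
        return node[1] in text
    if kind == "regex":
        return node[1].search(text) is not None
    if kind == "not":
        return not evaluate(node[1], text)
    left, right = evaluate(node[1], text), evaluate(node[2], text)
    return (left and right) if kind == "and" else (left or right)


ast = Parser(tokenize('(error AND /critical/i) OR NOT "debug"')).parse()
print(evaluate(ast, "error: CRITICAL failure"))  # True
print(evaluate(ast, "debug: cache warm"))        # False
```

Each precedence level gets its own method, which is why this style stays readable without a parser generator. The trade-off is that nesting depth is bounded by the interpreter's recursion limit.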

The Pivot to C: Seeing that people are actually using this in production has motivated me to take it seriously. Python is great for the "half-day hack," but for high-throughput applications, it hits a ceiling.

I’m now working on a full rewrite in C, packaged as a CPython extension. The goals are:

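Pure Efficiency: Zero-allocation matching using string views (pointer-plus-length slices into the input buffer), so evaluating a query never copies the text being matched.

True Parallelism: Releasing the GIL while matching, so multiple threads can process massive datasets across all cores.

Robustness: Moving from recursion to an explicit stack, so even the most insanely nested queries can’t blow the call stack or hit a recursion limit.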
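
For the recursion-to-stack goal, one well-known option is the shunting-yard algorithm: convert the query to postfix with an explicit operator stack, then evaluate the postfix form with a value stack. The sketch below shows the idea in Python rather than C, and it is only an illustration under stated assumptions. It is not the planned C design; to_rpn and evaluate_rpn are hypothetical names, and it handles only bare words and parentheses (no quoted terms or regex literals).

```python
import re

# Higher number binds tighter: NOT > AND > OR.
PRECEDENCE = {"NOT": 3, "AND": 2, "OR": 1}


def to_rpn(query):
    """Shunting-yard: infix boolean query -> postfix token list, no recursion."""
    output, ops = [], []
    for tok in re.findall(r"\(|\)|[^\s()]+", query):
        if tok == "(":
            ops.append(tok)
        elif tok == ")":
            # Flush operators back to the matching "(".
            while ops and ops[-1] != "(":
                output.append(ops.pop())
            if not ops:
                raise ValueError("unbalanced ')'")
            ops.pop()
        elif tok in PRECEDENCE:
            if tok != "NOT":
                # Binary operators are left-associative: emit anything that
                # binds at least as tightly before pushing this one.
                while (ops and ops[-1] != "("
                       and PRECEDENCE[ops[-1]] >= PRECEDENCE[tok]):
                    output.append(ops.pop())
            # NOT is a prefix operator, so it is pushed without popping.
            ops.append(tok)
        else:
            output.append(tok)
    while ops:
        op = ops.pop()
        if op == "(":
            raise ValueError("unbalanced '('")
        output.append(op)
    return output


def evaluate_rpn(rpn, text):
    """Evaluate a postfix query against text using an explicit value stack."""
    stack = []
    for tok in rpn:
        if tok == "NOT":
            stack.append(not stack.pop())
        elif tok in ("AND", "OR"):
            right, left = stack.pop(), stack.pop()
            stack.append((left and right) if tok == "AND" else (left or right))
        else:
            stack.append(tok in text)  # assumption: terms match as substrings
    if len(stack) != 1:
        raise ValueError("malformed query")
    return stack[0]


print(to_rpn("(error AND critical) OR NOT debug"))
# ['error', 'critical', 'AND', 'debug', 'NOT', 'OR']

# 50,000 levels of nesting would overflow a recursive parser's call stack;
# here the depth only grows a heap-allocated list.
deep = "(" * 50_000 + "error" + ")" * 50_000
print(evaluate_rpn(to_rpn(deep), "error: disk full"))  # True
```

The same two-stack structure carries over to C with fixed or growable arrays, and the postfix form can be compiled once and reused for every line being matched.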

I’d love to hear your thoughts on the current AST structure or any advice from those who have transitioned a Python tool to a C extension!

Repo: https://github.com/Piergiuseppe/boolean-query-parser

Playground: https://piergiuseppe.github.io/boolean-query-parser/