OP here.
I built this because I recently caught myself almost pasting a block of logs containing AWS keys into Claude.
The Problem: I need the reasoning capabilities of cloud models (GPT/Claude/Gemini), but I can't trust myself not to accidentally leak PII or secrets.
The Solution: A Chrome extension that acts as a local middleware. It intercepts the prompt and runs a local BERT model (via a Python FastAPI backend) to scrub names, emails, and keys before the request leaves the browser.
A few notes up front (to set expectations clearly):
Everything runs 100% locally.
Regex detection happens in the extension itself.
Advanced detection (NER) uses a small transformer model running on localhost via FastAPI.
No data is ever sent to a remote server; the only network call is to the FastAPI process on localhost.
You can verify this in the code + DevTools network panel.
This is an early prototype.
There will be rough edges. I’m looking for feedback on UX, detection quality, and whether the local-agent approach makes sense.
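To make the regex pass concrete, here is a rough sketch of the idea. The extension does this step in JavaScript; this version is in Python only to match the backend. The two patterns and the `scrub_with_regex` name are illustrative assumptions, not the rules the extension actually ships with.

```python
import re

# Hypothetical patterns for illustration; the real extension's rules may differ.
PATTERNS = {
    # AWS access key IDs: a 4-letter prefix followed by 16 uppercase letters/digits.
    "AWS_ACCESS_KEY_ID": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    # Deliberately simple email pattern, not a full RFC 5322 matcher.
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def scrub_with_regex(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

A pass like this is cheap enough to run on every prompt, which is presumably why it can live in the extension while the heavier NER step sits behind the local server.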
Tech Stack:
Manifest V3 Chrome Extension
Python FastAPI (Localhost)
HuggingFace dslim/bert-base-NER
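For the NER step, a minimal sketch of how detected entities could be turned into redactions. It assumes the entity format returned by the Hugging Face `transformers` token-classification pipeline with `aggregation_strategy="simple"` (dicts with `entity_group`, `score`, `start`, `end`). The `redact_entities` helper and the `min_score` threshold are hypothetical and may not match what the backend does.

```python
def redact_entities(text: str, entities: list[dict], min_score: float = 0.5) -> str:
    """Replace each detected entity span with its label, e.g. [PER] or [LOC]."""
    # Work from the last span to the first so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        if ent["score"] < min_score:
            continue  # skip low-confidence detections
        text = text[: ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text


# With the model loaded, the entities would come from something like:
#   from transformers import pipeline
#   ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")
#   entities = ner(prompt_text)
```

Keeping the redaction logic separate from the model call means the same function would work unchanged if inference later moves to ONNX/WASM, as long as the spans carry character offsets.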
Roadmap / Request for Feedback:
Right now, the Python backend adds some friction. I received feedback on Reddit yesterday suggesting I port the inference to Transformers.js to run entirely in-browser via WASM.
I decided to ship v1 with the Python backend for stability, but I'm actively looking into the ONNX/WASM route for v2 to remove the local server dependency. If anyone has experience running NER models via Transformers.js in a Service Worker, I’d love to hear how the performance compares with native Python.
Repo is MIT licensed.
Very open to ideas, suggestions, or alternative approaches.
I do something similar locally: I manually specify everything I want scrubbed or replaced, and Keyboard Maestro runs a script whenever I trigger a paste operation mapped to `hyperkey + v`. The upside is that the paste is instant. The latency introduced by even a small amount of inference is enough friction to make you want to ditch the process entirely.
Another plus of the non-extension solution is that it's application agnostic.
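A rough sketch of what a fixed-list scrubber like that can look like, assuming a plain mapping of literal strings to placeholders. The `REPLACEMENTS` entries and the `scrub_literals` name are made up for illustration, and the Keyboard Maestro and clipboard wiring is not shown.

```python
# Hypothetical terms to scrub; in practice this would be your own list.
REPLACEMENTS = {
    "Acme Corp": "[COMPANY]",
    "jane.doe": "[USER]",
}


def scrub_literals(text: str, replacements: dict[str, str] = REPLACEMENTS) -> str:
    """Replace each listed term with its placeholder using plain string matching."""
    # Handle longer terms first so they win over any shorter term they contain.
    for term in sorted(replacements, key=len, reverse=True):
        text = text.replace(term, replacements[term])
    return text
```

Plain string replacement involves no model at all, which is what keeps the paste effectively instant.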