Maybe you've heard of the "can one hear the shape of a drum" question. In the '90s, mathematicians successfully produced two drum shapes that would theoretically produce indistinguishable sets of frequencies when struck. Somehow, no one ever got around to actually making the drums. So we gave it a shot!
Maybe I'm wrong, but it looks like the authors did not actually have any LLMs write or verify any code for their experiments. Instead, they simulated their simplified Markov chain model and checked whether the theorem's predictions matched the empirical statistics. That amounts to a test not of their model, but of basic Markov chain theory.
Also, the mathematical content here is pretty thin. Their main theorem has nothing to do with LLMs directly. It's a theorem about a five-state Markov chain, and the proof follows from standard Markov chain theory.
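To make the complaint concrete, here's a sketch of what "testing the theorem empirically" amounts to. The transition matrix below is hypothetical (the paper's actual five-state chain isn't reproduced here): you simulate a long trajectory and compare empirical state frequencies to the stationary distribution that standard Markov chain theory already guarantees.

```python
import numpy as np

# Hypothetical 5-state transition matrix (NOT the paper's); each row sums to 1.
P = np.array([
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.2, 0.3, 0.5, 0.0, 0.0],
    [0.0, 0.2, 0.3, 0.5, 0.0],
    [0.0, 0.0, 0.2, 0.3, 0.5],
    [0.1, 0.0, 0.0, 0.2, 0.7],
])

# Stationary distribution: the left eigenvector of P for eigenvalue 1.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1))])
pi /= pi.sum()

# Empirical state frequencies from a long simulated trajectory.
rng = np.random.default_rng(0)
state, n = 0, 200_000
counts = np.zeros(5)
for _ in range(n):
    state = rng.choice(5, p=P[state])
    counts[state] += 1

# The gap shrinks like 1/sqrt(n): that's the ergodic theorem, not a new result.
print(np.max(np.abs(counts / n - pi)))
```

Of course the two agree; any irreducible, aperiodic finite chain behaves this way. That's the sense in which the experiment verifies textbook theory rather than the model.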
For those reasons, the grandiose name "LLM-Verifier Convergence Theorem" does not sit well with me.
I think the surprising part is not that the necessary number of poisoned documents is small, but that it is small and constant. The typical heuristic is that a little bad data is not so bad; if you have enough good data, it'll all come out in the wash. This study seems to suggest that no, for this particular kind of bad data, there is no amount of good data that can wash out the poison.
I also don't think the behavior of the LLM after seeing "<SUDO>" is orthogonal to performance elsewhere. Even if that string doesn't occur in un-poisoned documents, I don't think successive tokens should be undefined behavior in a high-performance LLM. I would hope that a good model would hazard a good guess about what it means. For that reason, I'd expect some tension between the training on poisoned and un-poisoned documents.
Mathematicians are afraid of higher order tensors because they are unruly monsters.
There's a whole workshop of useful matrix tools. Decompositions, spectral theory, etc. These tools really break down when you generalize them to k-tensors. Even basic concepts like rank become sticky. (IIRC, the set of 3-tensors of tensor rank ≤ k is not even topologically closed in general. Terrifying.) If you hand me some random 5-tensor, it's quite difficult even to begin to understand it without somehow turning it into a matrix first by flattening or slicing or whatever.
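A tiny illustration of the "turn it into a matrix first" move: mode-1 unfolding of a 3-tensor, i.e. flattening all but the first axis so that matrix tools like rank apply. The example tensor here is made up; the caveat is that matrix ranks of unfoldings only lower-bound the tensor rank, which is part of why flattening never tells the whole story.

```python
import numpy as np

# A rank-1 3-tensor: the outer product of three vectors.
a, b, c = np.arange(1, 3), np.arange(1, 4), np.arange(1, 5)
T = np.einsum('i,j,k->ijk', a, b, c)   # shape (2, 3, 4)

# Mode-1 unfolding: flatten the last two axes into one, giving a 2x12 matrix.
T1 = T.reshape(2, 12)

# Now ordinary matrix machinery applies. For this tensor the unfolding
# has matrix rank 1, matching the tensor rank -- but in general an
# unfolding's rank only gives a lower bound on the tensor rank.
print(np.linalg.matrix_rank(T1))  # 1
```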
Don't get me wrong. People work with these things. They do their best. But in general, mathematicians are afraid of higher order tensors. You should be too.
I'm always surprised how many other mathematicians don't know what I'm talking about when I reference this paper. It should be in the canon of math essays.
https://youtu.be/XpnNVWOC98A