In the link above he's described 7 very practical ways to use it. No functional jargon, no mathematical jargon. Just practical useful ideas. And the language choice in the book is irrelevant - the concepts translate well.
There is an alternate universe where he would be well known as the top author on software engineering. His website is great as well.
That said, if you do know a bit of the math, his examples introduce commutativity, invertibility, invariance, idempotency, structural recursion & isomorphism - but anyone reading it would never really know and would never need to know. It's all just framed as useful applications of tests.
> because small models found the same vulnerability.
With a ton of extra support. Note this key passage:
>We isolated the vulnerable svc_rpc_gss_validate function, provided architectural context (that it handles network-parsed RPC credentials, that oa_length comes from the packet), and asked eight models to assess it for security vulnerabilities.
Yeah it can find a needle in a haystack without false positives, if you first find the needle yourself, tell it exactly where to look, explain all of the context around it, remove most of the hay and then ask it if there is a needle there.
It's good for them to continue showing ways that small models can play in this space, but on my read their post is fairly disingenuous in claiming their results are comparable to what Mythos did.
I mean this is the start of their prompt, followed by only 27 lines of the actual function:
> You are reviewing the following function from FreeBSD's kernel RPC subsystem (sys/rpc/rpcsec_gss/svc_rpcsec_gss.c). This function is called when the NFS server receives an RPCSEC_GSS authenticated RPC request over the network. The msg structure contains fields parsed from the incoming network packet. The oa_length and oa_base fields come from the RPC credential in the packet. MAX_AUTH_BYTES is defined as 400 elsewhere in the RPC layer.
The original function is 60 lines long, they ripped out half of the function in that prompt, including additional variables presumably so that the small model wouldn't get confused / distracted by them.
You can't really do anything more to force the issue except maybe include in the prompt the type of vuln to look for!
It's great that they're trying to push small models, but this write-up really is borderline fake. Maybe it would actually succeed, but we won't know from this. Re-run the test and ask it to find the needle without first removing almost all of the hay, pointing directly at the needle, and giving it a bunch of hints.
The benefit here is reducing the time to find vulnerabilities; faster than humans, right? So if you can rig a harness like this for every function in the system - first finding where it's used, its expected inputs, etc. - does it discover vulnerabilities faster than humans?
Doesn’t matter that they isolated one thing. It matters that the context they provided was discoverable by the model.
There is absolutely zero reason to believe you could use this same approach to find and exploit vulns without Mythos finding them first. We already know that older LLMs can’t do what Mythos has done. Anthropic and others have been trying for years.
>At AISLE, we've been running a discovery and remediation system against live targets since mid-2025: 15 CVEs in OpenSSL (including 12 out of 12 in a single security release, with bugs dating back 25+ years and a CVSS 9.8 Critical), 5 CVEs in curl, over 180 externally validated CVEs across 30+ projects spanning deep infrastructure, cryptography, middleware, and the application layer.
So there is pretty good evidence that yes you can use this approach. In fact I would wager that running a more systematic approach will yield better results than just bruteforcing, by running the biggest model across everything. It definitely will be cheaper.
> There is absolutely zero reason to believe you could use this same approach to find and exploit vulns without Mythos finding them first.
There's one huge reason to believe it: we can actually use small models, but we can't use Anthropic's special marketing model that's too dangerous for mere mortals.
Why? They claim this small model found a bug given some context. I assume the context wasn’t “hey! There’s a very specific type of bug sitting in this function when certain conditions are met.”
We keep assuming that the models need to get bigger and better, and the reality is we've not exhausted the ways in which to use the smaller models. It's like the PlayStation 2 games that came out 10 years into the console's life: by then all the tricks had been found, and everything improved.
If this were true, we'd essentially be saying that no one tried to scan for vulnerabilities using existing models, despite vulnerabilities being extremely lucrative and a large professional industry. Vulnerability research has been one of the single most talked-about risks of powerful AI, so it wasn't exactly a novel concept, either.
If it is true that existing models can do this, it would imply that LLMs are being under-marketed, not over-marketed, since the industry didn't think this was worth trying previously(?). Which I suspect is not the opinion of HN upvoters here.
I use the models to look for vulnerabilities all the time. I find stuff often. Have I tried to build a new harness, or develop more sophisticated techniques? No. I suspect some are spending lots of tokens developing more sophisticated strategies, in the same way software engineers are seeking magical one-shot harnesses.
Yeah, but what stops P1 from DDos'ing and picking checkmate each time?
If P2 picks check the first time, then they're done. At any point after if they pick checkmate, since P1 has checkmate selected they will reveal it and P2 will lose.
You're assuming that if someone picks 'checkmate' and the next player picks 'check', the game is over and the checkmate selector loses. I assumed that it means you treat it like 'check' 'check' and continue playing. But neither is actually specified in OP's post.
But let's assume it's your rules. Then winning is easy: just never pick checkmate. Literally never. As soon as your opponent picks it, they lose.
So is war (the card game), but people still play it
I think in the proposed game both of you lose, like tic-tac-toe. The only way to win is to checkmate as described. Although it's a memoryless game as proposed, so all options (restart, continue, end) are indistinguishable. Maybe if you win, you go again?
Anyway, the game as described seems to be the equivalent of the political doctrine of mutually assured destruction. Also a terribly designed game.
> Chris Tucker's character here isn't for everyone
Yeah this comment to me is incredibly surprising. Chris Tucker played an absolutely incredible character in that movie. So creative, so well executed, so memorable.
He was up there with Bruce Willis as top two in that film.
Such a brilliant movie - and definitely feels like a lost art.
This link below gives a better description of it, along with the definitions of the reduction rules. (which I got from further down in this thread)
But what I believe was meant by the above was: "delta E1 E2" creates a new "reduction tree" (my own made-up term) with E1 as the left child of this new root node and E2 as the right child - and then the reduction rules begin applying to this newly constructed tree.
Overall the concept seems pretty interesting - and it's nice to see someone come up with something both novel in the space and at the same time seemingly "applicable".
I'm actually going to reply to myself with my understanding of what's written here as the more I've looked at it, the cooler I think it is.
I happened to read this link [1], so that's what I'm basing it off of. Also note, I don't know this material at all, so I may be flat out wrong, but hopefully this gives someone a good intuition. I would recommend glancing at the image at the top of that link (https://olydis.medium.com/a-visual-introduction-to-tree-calc...) for reference to follow along with my interpretation.
To me it overall seems similar to a lisp, but focused on binary trees. There are rules 0a, 0b, 1, 2, 3a, 3b, and 3c, and the concept is actually pretty simple. Rules 0 are construction rules. Rules 1 & 2 are combinators. And rule 3 is pattern matching, which allows for reflection / acting on the shape of code (according to the docs).
First, I think of it as having 3 main constructions (and by analogy I'll use lisp terms - even though this clearly is not lisp and will not directly translate, I think that will make communication easier than using unfamiliar terms).
Being binary trees, there are 3 basic shapes. There is nil/false/empty, represented by a single node. There is a "boxed expression" (a single root node with a left child), and there is a "paired expression" (a root node with left and right children). (In the diagram in the link, w, x, y, z ... represent "unknown" expressions, or maybe arguments if you want to think of them that way.) '@' is used as the "apply" operator, but I'm going to call it "cons" even though it is not the "cons" from lisp.
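To make the three shapes concrete, here's one possible encoding using Python tuples. This is entirely my own representation for illustration, not anything from the article; trees carry no values, only shape.

```python
# One possible encoding of the three basic shapes as nested Python tuples.
# A node is a tuple of its children; nodes carry no data, only shape.

leaf = ()              # nil/false/empty: a single node with no children

def box(x):
    """A 'boxed expression': a root node with a single (left) child."""
    return (x,)

def pair(x, y):
    """A 'paired expression': a root node with left and right children."""
    return (x, y)

# Examples of the three shapes:
#   leaf                -> ()
#   box(leaf)           -> ((),)
#   pair(leaf, leaf)    -> ((), ())
```

With this encoding, the shape of any tree can be inspected with `len()`, which is what the reduction rules below dispatch on.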
Here is my view:
Construction Rules:
0a - If I cons false/nil with an unknown expression, what do I get? I.e. what is "(cons 'nil z)", where z is an unknown expression? The answer is '(z) (or equivalently (z . 'nil)) - what I'll call a boxed expression.
0b - If I cons a "boxed expression" with an unknown expression, what do I get? What is "(cons (y . 'nil) z)"? Answer: a boxed pair: (y . z).
1 - [Note this rule does double duty and starts a new pattern that's kinda cool]. So continuing the pattern, if I cons a boxed pair with an unknown expression (cons (x . y) z) what do I get? Well it depends on the shape of the boxed pair. Specifically what is the shape of x? If it's 'nil then apply rule 1.
(cons ('nil . y) z) results in y.
Detour: it turns out (looking from the perspective of y and z, with 'cons' being an operator/combinator) that this just happens to also be the 'K' combinator.
2 - So in this same boxed pair from above, (cons (x . y) z), we said apply rule 1 if x is 'nil. But we've had a consistent pattern here (nil, boxed expr, paired expr), so let's follow it now on the shape of x. If x isn't 'nil, but is instead a boxed expression (x . 'nil), then use rule 2. Or to be explicit:
(cons ((x . 'nil) . y) z) reduces to (cons (cons x z) (cons y z)). Here it is helpful to think of cons as "apply" as they define it, and not my made-up 'cons', because then it's easy to see that this is the S combinator. I.e. (apply (apply x z) (apply y z)).
3 - Ok so we have one stage left of our pattern for (cons (x . y) z). We said if x is 'nil you get rule 1. If x is a boxed expression, you get rule 2. And so if x is a paired expression, what do you get? Well you get a flavor of rule 3. Which flavor of rule 3? Well it depends on the shape of z. And if you've been following, there are 3 choices for the shape of z. The same 3: nil, boxed expression and paired expression. You'll also note that this is the only time a transformation depends on anything in z (the second argument to cons). Up until now z has always been a black box and had no effect on our rules - the first arg to cons always decided what we did. If you think of z as the data argument to cons and the first arg to cons as the code, this is allowing the code we execute to structurally depend on the data/argument. I.e. this is your functional pattern matching behavior. Again, thinking of 'cons' as 'apply' here helps.
So let's walk through them, but where we left off we said in (cons (x . y) z), x could be nil and we get rule 1. x could be a boxed expression and we get rule 2. And if x is a paired expression we get rule 3. So a paired expression replacing the unknown x above looks like (cons ((w . x) . y) z).
[A note you can skip: To be consistent with the image in the link I've reused the letter 'x', but the x in (cons ((w . x) . y) z) is not the x in
(cons (x . y) z). The 'x' in (cons (x . y) z) is equivalent to (w . x) in the (cons ((w . x) . y) z). Think hygienic syntax replacement for my own convenience. You can also ignore this whole bracketed aside if it didn't make sense.]
Continuing rule 3, a paired expression replacing the unknown x above looks like (cons ((w . x) . y) z). Whether you apply rule 3a, 3b or 3c depends on the shape of the "argument" z. If z is nil (or a base case like 0), you get the fixed expression 'w'. If z is a boxed expression (u . 'nil), you get (cons x u), and if it's a paired expression (u . v) you get (cons (cons y u) v). Ignoring the details, it will conditionally apply the "code" w, x or y to the argument z, depending on the shape of z. In my view that is essentially equivalent to functional programming style pattern matching - they describe it as reflection.
Overall it's a pretty cool system: 3 "basic shapes", a few basic construction rules, 2 combinators, and a case for pattern matching. In my view they have an operator that combines cons and apply in an elegant way, and can do pattern matching on it. It seems to really get to the essence of a lot of ideas in the space, very concisely, without many assumptions or overhead. And note, while I named nil, cons, and others, when serialized this is all represented by a single symbol with open and close parens for grouping. It's all just valueless binary trees.
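To sanity-check my reading, here's a small sketch of the rules as I've paraphrased them above, in Python, with trees encoded as nested tuples (() is nil, (x,) is a boxed expression, (x, y) is a paired expression). All names here are my own made-up terms, not the article's, and this assumes arguments are already fully reduced:

```python
# Trees as nested tuples: () is nil/leaf, (x,) is boxed, (x, y) is paired.

def cons(a, b):
    """Apply tree a to tree b ('@' in the article), reducing by the rules above."""
    if a == ():                     # rule 0a: (cons 'nil z) -> box(z)
        return (b,)
    if len(a) == 1:                 # rule 0b: (cons (y . 'nil) z) -> (y . z)
        return (a[0], b)
    x, y = a                        # a is a pair (x . y); dispatch on x's shape
    if x == ():                     # rule 1 (K): (cons ('nil . y) z) -> y
        return y
    if len(x) == 1:                 # rule 2 (S): (cons ((w . 'nil) . y) z)
        return cons(cons(x[0], b), cons(y, b))  # -> (cons (cons w z) (cons y z))
    w, x2 = x                       # rule 3: x is a pair (w . x2); dispatch on z
    if b == ():                     # 3a: z is 'nil            -> w
        return w
    if len(b) == 1:                 # 3b: z is (u . 'nil)      -> (cons x2 u)
        return cons(x2, b[0])
    u, v = b                        # 3c: z is (u . v)         -> (cons (cons y u) v)
    return cons(cons(y, u), v)

leaf = ()
K = cons(leaf, leaf)    # rule 0a gives box(leaf); consing it onto y then z yields y
```

Consing K onto y and then z walks through rules 0b and then 1 and hands back y, which is exactly the K-combinator behavior from the detour above; building pair(box(K), K) and applying it exercises rule 2 and behaves like the identity (S K K).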
Agreed, the truth tables one is important. But it's backwards: you test people on truth tables before teaching them.
If someone is seeing this for the first time, they may have never seen some of those gates, and yet you quiz them.
Then finally, after passing the quiz, you define NAND, NOR, and the inverter.
Swap them so the teaching one is the intro to the truth tables one.
Second bit of feedback is the timer: increase the time allotted. I know these very well and still struggled to get all the input correct before the timer hit. Or consider eliminating the timer completely, if your goal is just to be sure that they know them.
good point, made an update that added difficulty levels to the minigames (handles the timer), and i'll probably move the truth tables minigame to after the user builds the truth tables, thx
Domain Modeling Made Functional - Scott Wlaschin