Thanks for reading! The claim isn't that LLMs can't pick a topology, they obviously do, every time they generate code. The claim is that the spec doesn't determine which topology is correct, because multiple valid topologies satisfy the same behavioral spec.
"Enough context" is doing a lot of work in your argument. What is that context? At some point it includes architectural intent, performance constraints, team conventions, future extensibility assumptions ... i.e., exactly the stuff that isn't in the spec and that constitutes the engineering judgment we're discussing. If you fold all of that into "context," then sure, but now you've just moved the goalposts from "spec" to "spec plus everything else the developer knows," which is approaching a 1:1 map from spec to code.
On inferring how components relate from training data: that gives you statistical priors over topologies, which is good to have, sure, but it is not the same as a justified topology selection for a specific system.
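To make the point concrete, here's a hypothetical toy spec (dedupe a list, preserving order) satisfied by two structurally different implementations. Every name here is invented for illustration; the spec alone doesn't tell you which topology is "correct":

```python
# Behavioral spec: return the input items with duplicates removed, order preserved.
# Two different "topologies" satisfy it equally well.

# Topology A: one flat function with local state.
def dedupe_flat(items):
    seen, out = set(), []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

# Topology B: the same behavior factored across an object boundary,
# a stateful filter component driven by a generic comprehension.
class SeenFilter:
    def __init__(self):
        self._seen = set()

    def admit(self, x):
        if x in self._seen:
            return False
        self._seen.add(x)
        return True

def dedupe_composed(items):
    f = SeenFilter()
    return [x for x in items if f.admit(x)]

spec_input = [3, 1, 3, 2, 1]
assert dedupe_flat(spec_input) == dedupe_composed(spec_input) == [3, 1, 2]
```

Any black-box test written from the spec passes both; choosing between them is exactly the judgment call (extensibility, conventions, performance) that lives outside the spec.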
The key insight here is that danger doesn't require sentience or even intelligence, just behavior. And not even emergent behavior. This is plain evolutionary pressure applied to mutation-equivalent recombination of personality files.
Which makes me wonder: if the threat model is evolutionary rather than adversarial, should the defense model be immunological rather than containment-based?
So, instead of antivirus software we end up writing WBC-equivalent API consumption monitors?
The no-LLM declaration was made for social prestige, or maybe the self-deception of self-sufficiency, like "I don't need an LLM." And when it was time to do the actual work, the dependency kicked in like a drug.
A lesson for all of us with LLMs in our workflow.
Is this written in the linked article? Or is the info from other places online? Because I didn't see this.
The article seems to say that this choice was given just for reviewing (how you will review, not how you will get reviewed), and that the consequence of getting caught, their paper being rejected, was a punishment, not the original trade-off or motivation for choosing option A.
It fits, though, quite funnily: they did not want an LLM near their own papers because they could not have imagined injecting prompts to get a good review, and that's the same lack of awareness (I guess you could say "skill issue") that made them not look for prompt injections in the first place.
If I wanted to extend the joke further: injecting prompts into your own PDF to get good reviews from reviewers using LLMs is actually work. Skill and work. And if they had that, they wouldn't be in this soup.
I'm sorry if I am the only one laughing, but I am.
Couldn't game the system the accurate way so ... got caught gaming it the lazy way!
I do feel sorry for them, I do, they must have worked hard on their papers, but this is funny. Thanks.
Yeah, but will you promise to do it by hand and then use a calculator?
Or will you have every intention of keeping the promise, but it would seem such a chore by now (because the calculator is such a part of your workflow) that you would minimize the sanctity of your promise in your mind?
If yes, that's dependency, not usual use.
(I just learned that choosing no-LLM also meant no-LLM on their own papers, so I am less generous with motivations now. Wasn't dependency, just plain old self-interest. Thanks for your point.)
The other commenter explained that the policy was applied as reciprocal, "The declaration of no-LLM was done so you are not judged yourself by an LLM."
Basically, they didn't want LLM near their own paper's review.
Although, a bunch of LLM researchers basically saying with their actions "Don't judge me with an LLM" is particularly ironic. Doubly so when caught using the LLM for the task they, themselves, want to opt out of.
Causes mostly add up: molecular kinetic energies aggregate to temperature, collisions to pressure, small imperfections to measurement errors, etc.
So the normal distribution, via the CLT, is the attractor state for the unexceptional world.
BUT in the exceptional world, causes multiply or cascade: earthquake magnitudes, network connectivity, etc. So you get log-normal or fat-tailed distributions.
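A quick simulation makes the two regimes visible (parameters here are made up for illustration): summing many small independent causes gives a roughly symmetric, normal-ish result, while multiplying them gives a fat right tail.

```python
import random
import statistics

random.seed(0)
N = 10_000  # simulated "measurements"
K = 50      # small causes per measurement

# Additive causes: sums of many small independent effects -> roughly normal (CLT).
sums = [sum(random.uniform(0, 1) for _ in range(K)) for _ in range(N)]

# Multiplicative causes: products of many small factors -> log-normal, fat right tail.
prods = []
for _ in range(N):
    p = 1.0
    for _ in range(K):
        p *= random.uniform(0.5, 1.5)
    prods.append(p)

# Quick skew check: for a symmetric, normal-ish distribution the mean sits near
# the median; for a fat-tailed one the mean is dragged far above the median.
sum_gap = statistics.mean(sums) - statistics.median(sums)
prod_ratio = statistics.mean(prods) / statistics.median(prods)
print(f"additive mean - median: {sum_gap:.3f}")           # close to 0
print(f"multiplicative mean / median: {prod_ratio:.2f}")  # well above 1
```

Same ingredients, different composition rule, very different tails.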
The cognitive dissonance comes from the tension between the-spec-as-management-artifact and the-spec-as-engineering-artifact. The author is right that advocates are selling the first, but the second is the only one that works.
For a manager, the spec exists in order to create a delegation ticket, something you assign to someone and are done with.
But for a builder, it exists as a thinking tool that evolves with the code to sharpen the understanding/thinking.
I also think some builders are being fooled into thinking like managers because it's easy, but they figure it out pretty quickly.
Also, however much you manage the project, eventually you do need to actually "build". You can't deliver on hype alone. Or maybe you can, but only for some news cycles or VC meetings. The user will eventually need the promised product.
Unless Microsoft does something, this impression of LinkedIn Speak is here to stay.
LinkedIn Speak: Unless Microsoft takes action, this LinkedIn-optimized communication style is the new industry standard. It's all about leveraging synergy and personal branding to stay ahead of the curve. #ThoughtLeadership #Networking #FutureOfWork
Apart from rediscovering all the problems with distributed systems, I think LM teams will also rediscover their own version of the mythical man-month, and very quickly too.
There were three core insights: adding people to a late project makes it later, communication cost grows as n^2, and time isn't fungible.
For agents, maybe the first insight won't hold, and adding a new agent won't necessarily increase dev time, but the second will be worse: communication cost will grow faster than n^2 because of LLM drift and orchestration overhead.
The third doesn't translate cleanly, but I'll try: time isn't fungible for us, and assumptions and context, however fragmented, aren't fungible for agents in a team. If they hallucinate at the wrong time, even a little, it could be the equivalent of a human developer doing a side project on company time.
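For the n^2 claim, Brooks's pairwise channel count is easy to sketch (a toy calculation of my own, not from the article):

```python
# Brooks's communication channels: n people (or agents) talking pairwise
# gives n * (n - 1) / 2 channels, i.e. quadratic growth in team size.
def channels(n):
    return n * (n - 1) // 2

for n in (3, 5, 10, 20):
    print(n, channels(n))
# Doubling the team from 10 to 20 roughly quadruples the channels (45 -> 190).
```

And that's the floor: any per-channel drift or re-synchronization cost for agents multiplies on top of it.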
An agent should write an article on it and post it on moltbook: "The Inevitable Agent Drift"