
Being marked an enemy of the state for disagreeing with the state to me sounds like thoughtcrime, plain and simple. How much more Orwellian can you get?

I remember neither that happening in 1984, nor is that a description of what is happening to Anthropic. Or is this an Animal Farm reference instead?

I remember Winston having a private conversation about political beliefs, and then being literally tortured into submission. And I remember Anthropic refusing a government order (albeit a stupid government order), and then being labeled a "supply chain risk." You can twist reality however you'd like though.


You don’t remember the concept of thought crime in 1984? Or you don’t recall how thought crime gets you branded an enemy of the state? The former was a term literally introduced in 1984 and the thought police is tasked with locating and eliminating thought crime. Throughout the book there are news reports of the thought criminals caught and arrested who are now enemies of the state. The book ends with him being tortured until he completely succumbs to the thought control and is then murdered.

If you can’t see the allegory in that story to an administration that actively goes after those it labels as enemies because they dare to voice their own opinion or oppose their political goals in any way, either you’re not cut out for literary analysis and trying to apply metaphors in literature to the real world or you aren’t seeing the real world for what it is.


Didn't label them an "enemy." Didn't accuse them of crime. And the decree was due to Anthropic's actions--not their thoughts.

Ok, just labeling them a supply chain risk while also claiming they’re critical to national security, for insisting the government stick to the uses of the model it agreed to in the contract instead of expanding them.

> Their true objective is unmistakable: to seize veto power over the operational decisions of the United States military. That is unacceptable.

Yup, definitely not an enemy.

> Instead, @AnthropicAI and its CEO @DarioAmodei, have chosen duplicity

Don’t you call your friends duplicitous?

> Anthropic’s stance is fundamentally incompatible with American principles.

Oh boy. Doubleplus ungood.

> I am directing the Department of War to designate Anthropic a Supply-Chain Risk to National Security. Effective immediately, no contractor, supplier, or partner that does business with the United States military may conduct any commercial activity with Anthropic

Oh yeah, totally not an enemy. Just no one can do business with them. Doubleplusungood behavior.

They’re both a danger to US troops with their behavior and also critical to the supply chain of said troops. Very important to understand and accept that doublethink.


This doesn't require the slightest bit of doublethink. Their technology is fantastic and would be an important military tool if Anthropic allowed it to be used as such. Their choice to disallow it makes them a supply chain risk, but the existence of the technology makes them important. This isn't hard.

There's no need to read it that literally, we're not making Borges' map here. 1984 is both about the visceral horror of the authoritarian state and the existential horror of being unable to fight an opponent who controls the very language you speak and the concept of truth. The former grounds the latter, turning an interesting philosophical treatise that might otherwise not land with readers into an approachable work of fiction.

They got labeled a "supply chain risk" in order to prevent the government from contracting with them. They didn't disappear or arrest or even charge Dario. He's a billionaire with more freedom and opportunity than Orwell could have even imagined.

I would love to hear your perspective of how the label "supply chain risk" and its definition aren't in accordance with the concept of being branded an enemy of the state. I'll reproduce the definition below:

> “Supply chain risk” means the risk that an adversary may sabotage, maliciously introduce unwanted function, or otherwise subvert the design, integrity, manufacturing, production, distribution, installation, operation, or maintenance of a covered system so as to surveil, deny, disrupt, or otherwise degrade the function, use, or operation of such system (see 10 U.S.C. 3252). (https://www.acquisition.gov/dfars/subpart-239.73-requirement...)

There's a little bit of leeway here, but this definition means either the company is an adversary (or an extension of one, e.g. Huawei/the CCP) or is under threat of being compromised by an adversary.

So which is Anthropic? Well, neither: the government's court filings and public comments in the media claim that Anthropic has an "adversarial posture". They want to simultaneously get away with bucketing Anthropic under the statute for adversaries, but without calling Anthropic an adversary directly in a court of law. They want to apply the statute without needing to follow the actual definition of an adversary.

From a CNBC interview:

> We can't have a company that has a different policy preference that is baked into the model through its constitution, its soul, its policy preferences, pollute the supply chain so our warfighters are getting ineffective weapons, ineffective body armor, ineffective protection. That's really where the supply chain risk designation came from. (https://www.cnbc.com/2026/03/12/anthropic-claude-emil-michae...)

That's why the judge rightly called this situation Orwellian: we're looking at linguistic sleight of hand designed to let the government turn a simple contract dispute into a company-threatening classification, one that could uproot them entirely from any company that does business with the most powerful entity in the United States. Because Anthropic doesn't want to do the government's bidding, which they are free to refuse as a matter of freedom of speech, they are being threatened with a punishment that goes beyond just not being able to contract directly with the government. And that's not fair.

I would also love to understand why you keep going back to the literal events of the book. You don't need to be locked in a room and forced to claim that 2+2=5 for your situation to be Orwellian.


> I remember Winston having a private conversation about political beliefs, and then being literally tortured into submission.

I remember Winston being forced to accept that 2+2=5 and believing it.

> In the end the Party would announce that two and two made five, and you would have to believe it. It was inevitable that they should make that claim sooner or later: the logic of their position demanded it. Not merely the validity of experience, but the very existence of external reality, was tacitly denied by their philosophy. The heresy of heresies was common sense. And what was terrifying was not that they would kill you for thinking otherwise, but that they might be right. For, after all, how do we know that two and two make four? Or that the force of gravity works? Or that the past is unchangeable? If both the past and the external world exist only in the mind, and if the mind itself is controllable—what then?

* https://www.goodreads.com/quotes/321469-in-the-end-the-party...

* https://en.wikipedia.org/wiki/2_%2B_2_%3D_5#George_Orwell

> And I remember Anthropic refusing a government order (albeit a stupid government order), and then being labeled a "supply chain risk." You can twist reality however you'd like though.

I remember when American companies could do domestic business, or not, with whomever they wished without having to worry about being punished by the government for their choices.

If a government orders a pacifist to pick up a gun, is that allowed? If a government orders a pacifist to manufacture a gun, is that allowed? (There's a spectrum of 'complicity'.)


> I remember when American companies could do domestic business, or not, with whomever they wished without having to worry about being punished by the government for their choices.

No you don't, because that time has never existed.

> If a government orders a pacifist to pick up a gun, is that allowed? If a government orders a pacifist to manufacture a gun, is that allowed? (There's a spectrum of 'complicity'.)

Yes. It's called the draft. It's called wartime manufacturing decrees. These all existed at the time of Orwell, and he never alluded to them being thoughtcrimes. Compelling people to act against their beliefs is common and distinct from thoughtcrime. And if you cannot see that, then I don't even know how to talk to you. Government has always controlled your outer life. Orwell introduced thoughtcrime as the next step in totalitarianism, as the erasure of inner life.

edit: I asked Opus to analyze this thread, and I agree with it.

> That said, Orwell would probably also note that the people arguing against you aren't entirely wrong to be alarmed — they're just reaching for the wrong literary reference and overstating the analogy. Government retaliation against companies for political speech is concerning on its own terms without needing to be dressed up as dystopian fiction. The 1984 framing actually weakens the critique by making it easy to dismiss as hyperbolic.

> He'd probably tell everyone in the thread to say what they mean in plain language and stop hiding behind his book.


And so can you.

Sure, but also you might be on a city bus for... half an hour? It's not pleasant to have someone blast noise but it's nothing like a multi-hour flight. Why bother?

The bundling might feel necessary from Atari's side because OpenTTD would compete with Atari's re-release on platforms like Steam and GoG (unlike on OpenTTD's website, where you're already at the end of the funnel for OpenTTD specifically and therefore Atari doesn't feel like they're losing a sale).

> Today, we’re releasing a research preview of GPT‑5.3‑Codex‑Spark, a smaller version of GPT‑5.3‑Codex, and our first model designed for real-time coding.

from https://openai.com/index/introducing-gpt-5-3-codex-spark/, emphasis mine


You're right. It's funny because I kind of noticed that, but with all of these subtle model issues, I'm so used to being thrown off by the smallest thing that I've had to learn to 'trust the data', aka the charts, model standings, performance, etc. In this case, I was under the assumption 'it was the same model'; clearly it's not.

Which is a bummer because it would be nice to try a true side-by-side analysis.


> It's funny because I kind of noticed that

It's less funny when you consider that you were very confident about it, yet now it seems you haven't even bothered to run the model yourself, as you'd notice how different the quality of the responses is, not just the speed.

Kind of makes me ignore everything else you wrote too, because why would that be correct when you surely haven't validated that before writing it, and you got the basics wrong?


What a snide and insulting comment - and plainly wrong.

I literally stated 'I noticed that' - implying I'm using the model.

I'm 'running the model' literally as I write this, I use it every day.

What I was 'wrong' about was the rather fine point that '5.3 Codex Spark' is a different model than '5.3 Codex'.

I 'thought that I noticed something, but dismissed it' because I generally value the facts more than my intuition. It just so happened that I had that one fact wrong - 'Spark' is technically a different model, so it's not just 'a faster model', it will 'behave differently', which lends credence to the individual I was responding to.


Even if interpretability of specific models or features within them is an open area of research, the mechanics of how LLMs work to produce results are observable and well-understood, and methods to understand their fundamental limitations are pretty solid these days as well.

Is there anything to be gained from following a line of reasoning that basically says LLMs are incomprehensible, full stop?


>Even if interpretability of specific models or features within them is an open area of research, the mechanics of how LLMs work to produce results are observable and well-understood, and methods to understand their fundamental limitations are pretty solid these days as well.

If you train a transformer on (only) lots and lots of addition pairs, e.g. '38393 + 79628 = 118021', and nothing else, the transformer will, during training, discover an algorithm for addition and employ it in service of predicting the next token, which in this instance would be the sum of two numbers.

We know this because of tedious interpretability research, the very limited problem space and the fact we knew exactly what to look for.
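For a sense of what that setup looks like, here's a minimal sketch (Python; everything in it is hypothetical and not taken from any specific paper) of generating the kind of addition-only corpus such a toy transformer gets trained on; the model and training loop themselves are omitted:

    import random

    # Hypothetical sketch: build an addition-only corpus like "38393 + 79628 = 118021".
    # The claim above is that a small transformer trained on nothing but strings like
    # these will, during training, discover an internal addition algorithm.

    def make_example(max_digits=5):
        a = random.randint(0, 10**max_digits - 1)
        b = random.randint(0, 10**max_digits - 1)
        return f"{a} + {b} = {a + b}"

    # Character-level "tokenizer": every digit, space, '+' and '=' is one token.
    VOCAB = sorted(set("0123456789 +="))
    TOKEN_ID = {ch: i for i, ch in enumerate(VOCAB)}

    def encode(example):
        return [TOKEN_ID[ch] for ch in example]

    corpus = [make_example() for _ in range(100_000)]
    print(corpus[0], encode(corpus[0]))
    # A decoder-only transformer trained to predict the next character on this corpus
    # is what the interpretability work then probes for an addition circuit.

Nothing in that corpus tells the model how to add; whatever algorithm it ends up using is whatever gradient descent happened to find, which is exactly why the interpretability work was tedious.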

Alright, let's leave addition aside (SOTA LLMs are after all trained on much more) and think about another question. Any other question at all. How about something like:

"Take a capital letter J and a right parenthesis, ). Take the parenthesis, rotate it counterclockwise 90 degrees, and put it on top of the J. What everyday object does that resemble?"

What algorithm does GPT or Gemini or whatever employ to answer this and similar questions correctly? It's certainly not the one it learnt for addition. Do you know? No. Do the creators at OpenAI or Google know? Not at all. Can you or they find out right now? Also no.

Let's revisit your statement.

"the mechanics of how LLMs work to produce results are observable and well-understood".

Observable, I'll give you that, but how on earth can you look at the above and sincerely call that 'well-understood'?


It's pattern matching, likely from typography texts and descriptions of umbrellas. My understanding is that the model can attempt some permutations in its thinking and eventually a permutation's tokens catch enough attention to attempt to solve, and that once it is attending to "everyday object", "arc", and "hook", it will reply with "umbrella".

Why am I confident that it's not actually doing spatial reasoning? At least in the case of Claude Opus 4.6, it also confidently replies "umbrella" even when you tell it to put the parenthesis under the J, with a handy diagram clearly proving itself wrong: https://claude.ai/share/497ad081-c73f-44d7-96db-cec33e6c0ae3 . Here's me specifically asking for the three key points above: https://claude.ai/share/b529f15b-0dfe-4662-9f18-97363f7971d1

I feel like I have a pretty good intuition of what's happening here based on my understanding of the underlying mathematical mechanics.

Edit: I poked at it a little longer and I was able to get some more specific matches to source material binding the concept of umbrellas being drawn using the letter J: https://claude.ai/share/f8bb90c3-b1a6-4d82-a8ba-2b8da769241e


>It's pattern matching, likely from typography texts and descriptions of umbrellas.

"Pattern matching" is not an explanation of anything, nor does it answer the question I posed. You basically hand waved the problem away in conveniently vague and non-descriptive phrase. Do you think you could publish that in a paper for ext ?

>Why am I confident that it's not actually doing spatial reasoning? At least in the case of Claude Opus 4.6, it also confidently replies "umbrella" even when you tell it to put the parenthesis under the J, with a handy diagram clearly proving itself wrong

I don't know what to tell you, but the J with the parenthesis upside down still resembles an umbrella. To think that a machine would recognize it's just a flipped umbrella and a human wouldn't is amazing, but here we are. It's doubly baffling because Claude quite clearly explains it in your transcript.

>I feel like I have a pretty good intuition of what's happening here based on my understanding of the underlying mathematical mechanics.

Yes I realize that. I'm telling you that you're wrong.


>Do you think you could publish that in a paper, for example?

You seem to think it's not 'just' tensor arithmetic.

Have you read any of the seminal papers on neural networks, say?

It's [complex] pattern matching as the parent said.

If you want models to draw composite shapes based on letter forms and typography then you need to train them (or at least fine-tune them) to do that.

I still get opposite (antonym) confusion occasionally in responses to inferences where I expect the training data is relatively lacking.

That said, you claim the parent is wrong. How would you describe LLM models, or generative "AI" models in the confines of a forum post, that demonstrates their error? Happy for you to make reference to academic papers that can aid understanding your position.


>You seem to think it's not 'just' tensor arithmetic.

If I asked you to explain how a car works and you responded with a lecture on metallic bonding in steel, you wouldn’t be saying anything false, but you also wouldn’t be explaining how a car works. You’d be describing an implementation substrate, not a mechanism at the level the question lives at.

Likewise, “it’s tensor arithmetic” is a statement about what the computer physically does, not what computation the model has learned (or how that computation is organized) that makes it behave as it does. It sheds essentially zero light on why the system answers addition correctly, fails on antonyms, hallucinates, generalizes, or forms internal abstractions.

So no: “tensor arithmetic” is not an explanation of LLM behavior in any useful sense. It’s the equivalent of saying “cars move because atoms.”

>It's [complex] pattern matching as the parent said

“Pattern matching”, whether you add [complex] to it or not is not an explanation. It gestures vaguely at “something statistical” without specifying what is matched to what, where, and by what mechanism. If you wrote “it’s complex pattern matching” in the Methods section of a paper, you’d be laughed out of review. It’s a god-of-the-gaps phrase: whenever we don’t know or understand the mechanism, we say “pattern matching” and move on, but make no mistake, it's utterly meaningless and you've managed to say absolutely nothing at all.

And note what this conveniently ignores: modern interpretability work has repeatedly shown that next-token prediction can produce structured internal state that is not well-described as “pattern matching strings”.

- Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task (https://openreview.net/forum?id=DeG07_TcZvT) and Emergent World Models and Latent Variable Estimation in Chess-Playing Language Models (https://openreview.net/forum?id=PPTrmvEnpW&referrer=%5Bthe%2...

Transformers trained on Othello or Chess games (same next token prediction) were demonstrated to have developed internal representations of the rules of the game. When a model predicted the next move in Othello, it wasn't just "pattern matching strings", it had constructed an internal map of the board state you could alter and probe. For Chess, it had even found a way to estimate a player's skill to better predict the next move.
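To make "probe" concrete: the papers train small probe classifiers (linear ones in the follow-up work) that read a board square's state straight out of the network's hidden activations. A minimal sketch of the method, with random placeholder data standing in for real activations (assumes numpy and scikit-learn; none of this is the papers' actual code):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical sketch of a linear probe in the style of the Othello-GPT work.
    # In the real experiments, `hidden` holds the transformer's residual-stream
    # activations at each move, and `square_state` is the ground-truth contents
    # of one board square (empty / mine / yours) computed from the game rules.
    rng = np.random.default_rng(0)
    n_positions, d_model = 5000, 256
    hidden = rng.normal(size=(n_positions, d_model))     # placeholder activations
    square_state = rng.integers(0, 3, size=n_positions)  # placeholder labels

    # If a simple linear readout can recover the board state from activations,
    # the model is representing that state somewhere internally.
    probe = LogisticRegression(max_iter=1000)
    probe.fit(hidden[:4000], square_state[:4000])
    print("probe accuracy:", probe.score(hidden[4000:], square_state[4000:]))
    # With random placeholders this stays near chance (~0.33); with real
    # activations the papers report far higher accuracy, and intervening on the
    # probed direction changes the model's predicted moves.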

There are other interpretability papers even more interesting than those. Read them, and perhaps you'll understand how little we know.

On the Biology of a Large Language Model - https://transformer-circuits.pub/2025/attribution-graphs/bio...

Emergent Introspective Awareness in Large Language Models - https://transformer-circuits.pub/2025/introspection/index.ht...

>That said, you claim the parent is wrong. How would you describe LLM models, or generative "AI" models in the confines of a forum post, that demonstrates their error? Happy for you to make reference to academic papers that can aid understanding your position.

Nobody understands LLMs anywhere near enough to propose a complete theory that explains all their behaviors and failure modes. The people who think they do are the ones who understand them the least.

What we can say:

- LLMs are trained via next-token prediction and, in doing so, are incentivized to discover algorithms, heuristics, and internal world models that compress training data efficiently.

- These learned algorithms are not hand-coded; they are discovered during training in high-dimensional weight space and because of this, they are largely unknown to us.

- Interpretability research shows these models learn task-specific circuits and representations, some interpretable, many not.

- We do not have a unified theory of what algorithms a given model has learned for most tasks, nor do we fully understand how these algorithms compose or interfere.


I made this metaphor from my understanding of your comment.

Imagine we put a kid who doesn't know how to read or write, and knows nothing about what letters mean, in a huge library of books. That kid stays in the library for X amount of time, which will be enough to look over all of them.

What this will do is that somehow, though not in the way we do it, this kid manages to create patterns from the books.

After that X amount of time, we ask this kid a question: "What is the capital of Germany?"

That kid will just have its own kind of map/pattern to say "Berlin". Or the kid might say "Berlin is the capital of Germany" or "The capital of Germany is Berlin." The issue here is that we do not have an understanding of how this kid came up with the answer, or what kind of "understanding" or "mapping" is being used to reach this answer.

The other part that basically shows we do not fully understand how LLMs work is this: ask a very complex question to an AI, like "explain the mechanics of quantum theory to me like I am 8 years old".

1- Every time, it will create a different answer. The main point is the same, but the letters/words etc. will be different, like the example I gave above. There are unlimited types of answers the AI can give you.

2- Can anyone on Earth - a human without technology access, even with an unlimited amount of books/papers to check whatever info he needs - tell us the exact sentence/words the LLM will use? No.

Then we do not have a full understanding of LLMs.
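As a concrete aside on point 1: the run-to-run variation in wording largely comes from sampling; the model outputs a probability distribution over next tokens and the serving stack draws from it. A toy sketch with made-up numbers, no real model involved:

    import numpy as np

    # Toy sketch (made-up numbers, not a real model): with temperature > 0 the
    # next token is drawn at random from the softmax distribution, so repeated
    # runs produce different wording even though the model itself is unchanged.
    vocab = ["Berlin", "The", "capital", "of", "Germany", "is"]
    logits = np.array([2.8, 2.1, 2.0, 1.8, 1.7, 1.6])  # hypothetical scores

    def sample_token(logits, temperature, rng):
        probs = np.exp(logits / temperature)
        probs /= probs.sum()
        return rng.choice(len(probs), p=probs)

    rng = np.random.default_rng()
    for run in range(3):
        picks = [vocab[sample_token(logits, 1.0, rng)] for _ in range(5)]
        print("run", run, ":", picks)  # typically differs from run to run
    # At temperature 0 (greedy decoding) the output becomes repeatable, but that
    # is a serving choice, not a sign that we understand what the model computed.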

You can create a linear regression model and give it data on 100 people, where all 100 of them are blue-eyed. Then give it a 101st person and ask it to predict their eye color. You already know the exact answer. It will be 100%.


I think what you two are going back and forth on is the heated debate in AI research regarding Emergent Abilities. Specifically, whether models actually develop "sudden" new powers as they scale, or if those jumps are just a mirage caused by how we measure them.


I don't have much more to add to the sibling comment other than the fact that the transcript reads

> When you rotate ")" counterclockwise 90°, it becomes a wide, upward-opening arc — like ⌣.

but I'm pretty sure that's what you get if you rotate it clockwise.


> I feel like I have a pretty good intuition of what's happening here based on my understanding of the underlying mathematical mechanics.

You should write a paper and release it and basically get rich.


From Gemini: When you take those two shapes and combine them, the resulting image looks like an umbrella.


The concept “understand” is rooted in utility. It means “I have built a much simpler model which produces usefully accurate predictions, of the thing or behaviour I seek to ‘understand’”. This utility is “explanatory power”. The model may be in your head, may be math, may be an algorithm or narrative, it may be a methodology with a history of utility. “Greater understanding” is associated with models that are simpler, more essential, more accurate, more useful, cheaper, more decomposed, more composable, more easily communicated or replicated, or more widely applicable.

“Pattern matching”, “next token prediction”, “tensor math” and “gradient descent” or the understanding and application of these by specialists, are not useful models of what LLMs do, any more than “have sex, feed and talk to the resulting artifact for 18 years” is a useful model of human physiology or psychology.

My understanding, and I'm not a specialist, is there are huge and consequential utility gaps in our models of LLMs. So much so, it is reasonable to say we don't yet understand how they work.


You can't keep pushing the AI hype train if you consider it just a new type of software / fancy statistical database.


Yes, there is - benefit of the doubt.


My guess is that the submitter is automated. It's not the first time their post title has been truncated by the text limit without their editing it.


A DS set to Auto mode will boot to the cartridge (and you can reflash the firmware to skip the health and safety screen). From there the OS is replaced with whatever is on the cart. A flashcart with the right shell will boot right into whatever app you want (and you can soft reset the console with a key combination to switch apps).

3DSes require a little more work and have a longer boot chain, but it's been thoroughly broken all the way to the bootstrapping process so you can use whichever firmware version and whatever patches you like with enough effort.


Once a DS has been flashed (skips the health and safety screen) it also disables signature verification for DS download play, so you can beam homebrews directly to your DS' home screen with a wifi card. But this is an awkward process that most people don't actually do with their original DSes, as it requires putting tinfoil over a toothpick and jamming it into a hole next to the battery to close the flash write jumper. I think the DS' crypto has also been defeated, but I can't find any documentation of arbitrary download play on unflashed DSes. There also seem to be no .nds signing keys in the leaks, from what I can tell.


Thanks for this! I wish there were more cross-comparisons like this out there of what it is actually like to use some of these frameworks, the note on Django being a little less magic than Rails makes me genuinely interested in it.


if you want "less magic than rails" check out ecto, i would say it has less magic than django


It's not just brain atrophy, I think. I think part of it is that we're actively making a tradeoff to focus on learning how to use the model rather than learning how to use our own brains and work with each other.

This would be fine if not for one thing: the meta-skill of learning to use the LLM depreciates too. Today's LLM is gonna go away someday, the way you have to use it will change. You will be on a forever treadmill, always learning the vagaries of using the new shiny model (and paying for the privilege!)

I'm not going to make myself dependent, let myself atrophy, run on a treadmill forever, for something I happen to rent and can't keep. If I wanted a cheap high that I didn't mind being dependent on, there's more fun ones out there.


> let myself atrophy, run on a treadmill forever, for something

You're lucky to afford the luxury not to atrophy.

It's been almost 4 years since my last software job interview and I know the drills about preparing for one.

Long before LLMs, my skills were already naturally atrophying in my day job.

I remember the good old days of J2ME, of writing everything from scratch. Or writing some graph editor for university, or some speculative Huffman coding algorithm.

That kept me sharp.

But today I feel like I'm living in that Netflix series about people being in Hell and the Devil tricking them into thinking they're in Heaven while tormenting them: how on planet Earth do I keep sharp with java, streams, virtual threads, rxjava, tuning the jvm, react, kafka, kafka streams, aws, k8s, helm, jenkins pipelines, CI-CD, ECR, istio issues, in-house service discovery, hierarchical multi-regions, metrics and monitoring, autoscaling, spot instances and multi-arch images, multi-az, reliable and scalable yet as cheap as possible, yet as cloud native as possible, hazelcast and distributed systems, low level postgresql performance tuning, apache iceberg, trino, various in-house frameworks and idioms over all of this? Oh, and let's not forget the business domain, coding standards, code reviews, mentorships and organizing technical events. Also, it's 2026 so nobody hires QA or scrum masters anymore, so take on those hats as well.

So LLMs it is, the new reality.


This is a very good point. Years ago working in a LAMP stack, the term LAMP could fully describe your software engineering, database setup and infrastructure. I shudder to think of the acronyms for today's tech stacks.


And yet many the same people who lament the tooling bloat of today will, in a heartbeat, make lame jokes about PHP. Most of them aren't even old enough to have ever done anything serious with it, or seen it in action beyond Wordpress or some spaghetti-code one-pager they had to refactor at their first job. Then they show up on HN with a vibe-coded side project or blog post about how they achieved a 15x performance boost by inventing server-side rendering.


Highly relevant username!


I try :)


Ya I agree it's totally crazy.... but, do most app deployments need even half that stuff? I feel like most apps at most companies can just build an app and deploy it using some modern paas-like thing.


> I feel like most apps at most companies can just build an app and deploy it using some modern paas-like thing.

Most companies (in the global, not SV sense) would be well served by an app that runs in a Docker container in a VPS somewhere and has PostgreSQL and maybe Garage, RabbitMQ and Redis if you wanna get fancy, behind Apache2/Nginx/Caddy.

But obviously that’s not Serious Business™ and won’t give you zero downtime and high availability.

Though tbh most mid-size companies would also be okay with Docker Swarm or Nomad and the same software clustered and running behind HAProxy.

But that wouldn’t pad your CV so yeah.


> Most companies (in the global, not SV sense) would be well served by an app that runs in a Docker container in a VPS somewhere and has PostgreSQL and maybe Garage, RabbitMQ and Redis if you wanna get fancy, behind Apache2/Nginx/Caddy.

That’s still too much complication. Most companies would be well served by a native .EXE file they could just run on their PC. How did we get to the point where applications by default came with all of this shit?


When I was in primary school, the librarian used a computer this way, and it worked fine. However, she had to back it up daily or weekly onto a stack of floppy disks, and if she wanted to serve the students from the other computer on the other side of the room, she had to restore the backup on there, and remember which computer had the latest data, and only use that one. When doing a stock–take (scanning every book on the shelves to identify lost books), she had to bring that specific computer around the room in a cart. Such inconveniences are not insurmountable, but they're nice to get rid of. You don't need to back up a cloud service and it's available everywhere, even on smaller devices like your phone.

There's an intermediate level of convenience. The school did have an IT staff (of one person) and a server and a network. It would be possible to run the library database locally in the school but remotely from the library terminals. It would then require the knowledge of the IT person to administer, but for the librarian it would be just as convenient as a cloud solution.


I think the 'more than one user' alternative to a 'single EXE on a single computer' isn't the multilayered pie of things that KronisLV mentioned, but a PHP script[0] on an apache server[0] you access via a web browser. You don't even need a dedicated DB server as SQLite will do perfectly fine.

[0] or similarly easy to get running equivalent


> but a PHP script[0] on an apache server[0] you access via a web browser

I've seen plenty of those as well - nobody knows exactly how things are setup, sometimes dependencies are quite outdated and people are afraid to touch the cPanel config (or however it's setup). Not that you can't do good engineering with enough discipline, it's just that Docker (or most methods of containerization) limits the blast range when things inevitably go wrong and at least try to give you some reproducibility.

At the same time, I think that PHP can be delightfully simple and I do use Apache2 myself (mod_php was actually okay, but PHP-FPM also isn't insanely hard to set up), it's just that most of my software lives in little Docker containers with a common base and a set of common tools, so they're decoupled from the updates and config of the underlying OS. I've moved the containers (well, data+images) across servers with no issues when needed, and also reinstalled OSes and spun everything right back up.

Kubernetes is where dragons be, though.


> That’s still too much complication. Most companies would be well served by a native .EXE file they could just run on their PC

I doubt that.

As software has grown to solving simple personal computing problems (write a document, create a spreadsheet) to solving organizational problems (sharing and communication within and without the organization), it has necessarily spread beyond the .exe file and local storage.

That doesn't give a pass to overly complex applications doing a simple thing - that's a real issue - but to think most modern company problems could be solved with just a local executable program seems off.


It can be like that, but then IT and users complain about having to update this .exe on each computer when you add new functionality or fix some errors. When you solve all major pain points with a simple app, "updating the app" becomes top pain point, almost by definition.


> How did we get to the point where applications by default came with all of this shit?

Because when you give your clients instructions on how to setup the environment, they will ignore some of them and then they install OracleJDK while you have tested everything under OpenJDK and you have no idea why the application is performing so much worse in their environment: https://blog.kronis.dev/blog/oracle-jdk-and-openjdk-compatib...

It's not always trivial to package your entire runtime environment unless you wanna push VM images (which is in many ways worse than Docker), so Docker is like the sweet spot for the real world that we live in - a bit more foolproof, the configuration can be ONE docker-compose.yml file, it lets you manage resource limits without having to think about cgroups, as well as storage and exposed ports, custom hosts records and all the other stuff the human factor in the process inevitably fucks up.

And in my experience, shipping a self-contained image that someone can just run with docker compose up is infinitely easier than trying to get a bunch of Ansible playbooks in place.

If your app can be packaged as an AppImage or Flatpak, or even a fully self contained .deb then great... unless someone also wants to run it on Windows or vice versa or any other environment that you didn't anticipate, or it has more dependencies than would be "normal" to include in a single bundle, in which case Docker still works at least somewhat.

Software packaging and dependency management sucks, unless we all want to move over to statically compiled executables (which I'm all for). Desktop GUI software is another can of worms entirely, too.


When I come into a new project and I find all this... "stuff" in use, often what I later find is actually happening with a lot of it is:

- nobody remembers why they're using it

- a lot of it is pinned to old versions or the original configuration because the overhead of maintaining so much tooling is too much for the team and not worth the risk of breaking something

- new team members have a hard time getting the "complete picture" of how the software is built and how it deploys and where to look if something goes wrong.


That was on NBC.


Businesses too. For two years it's been "throw everything into AI." But now that shit is getting real, are they really feeling so coy about letting AI run ahead of their engineering team's ability to manage it? How long will it be until we start seeing outages that just don't get resolved because the engineers have lost the plot?


From what I am seeing, no one is feeling coy simply because of the cost savings that management is able to show the higher-ups and shareholders. At that level, there's very little understanding of anything technical and outages or bugs will simply get a "we've asked our technical resources to work on it". But every one understands that spending $50 when you were spending $100 is a great achievement. That's if you stop and not think about any downsides. Said management will then take the bonuses and disappear before the explosions start with their resume glowing about all the cost savings and team leadership achievements. I've experienced this first hand very recently.


Of all the looming tipping points whereby humans could destroy the fabric of their existence, this one has to be the stupidest. And therefore the most likely.


There really ought to be a class of professionals like forensic accountants who can show up in a corrupted organization and do a post mortem on their management of technical debt


How long until “the LLM did it” is just as effective as “AWS is down, not my fault”?


Never because the only reason that works with Amazon is that everyone is down at the exact same time.


Everyone will suffer from slop code at the same time.


Yeah but that's very different from an AWS outage. Everyone's website being down for a day every year or 2 is something that it's very hard to take advantage of as a competitor. That's not true for software that is just terrible all the time.


This to me is the point.. LLMs can't be responsible for things. It sits with a human.


Why can LLMs not be responsible for things? (genuine question - I'm not certain myself).


because it doesn't have any skin in the game and can't be punished, and can't be rewarded for succeeding. Its reputation, career, and dignity are nonexistent.


On the contrary - the LLM has had its own version of "skin in the game" through the whole of its training. Reinforcement learning is nothing but that. Why is that less real than putting a person in prison? Is it because of the LLM itself, or because you don't trust the people selling it to you?


Are you claiming that LLMs are... sentient? Bold claim, Taylor.


This doesn't seem to have stopped anyone before.


Stopped anyone from doing what? Assigning responsibility to someone with nothing to lose, no dignity or pride, and immune from financial or social injury?


If you’re just a gladhander for an algorithm, what are you really needed for?


> It's not just brain atrophy, I think. I think part of it is that we're actively making a tradeoff to focus on learning how to use the model rather than learning how to use our own brains and work with each other.

I agree with the sentiment but I would have framed it differently. The LLM is a tool, just like code completion or a code generator. Right now we focus mainly on how to use a tool, the coding agent, to achieve a goal. This takes place at a strategic level. Prior to the inception of LLMs, we focused mainly on how to write code to achieve a goal. This took place at a tactical level, and required making decisions and paying attention to a multitude of details. With LLMs our focus shifts to a higher-level abstraction. Also, operational concerns change. When writing and maintaining code yourself, you focus on architectures that help you simplify some classes of changes. When using LLMs, your focus shifts to building context and helping the model effectively implement its changes. The two goals seem related, but are radically different.

I think a fairer description is that with LLMs we stop exercising some skills that are only required or relevant if you are writing your code yourself. It's like driving with an automatic transmission vs manual transmission.


Previous tools have been deterministic and understandable. I write code with emacs and can at any point look at the source and tell you why it did what it did. But I could produce the same program with vi or vscode or whatever, at the cost of some frustration. But they all ultimately transform keystrokes to a text file in largely the same way, and the compiler I'm targeting changes that to asm and thence to binary in a predictable and visible way.

An LLM is always going to be a black box that is neither predictable nor visible (the unpredictability is necessary for how the tool functions; the invisibility is not but seems too late to fix now). So teams start cargo culting ways to deal with specific LLMs' idiosyncrasies and your domain knowledge becomes about a specific product that someone else has control over. It's like learning a specific office suite or whatever.


> An LLM is always going to be a black box that is neither predictable nor visible (the unpredictability is necessary for how the tool functions; the invisibility is not but seems too late to fix now)

So basically, like a co-worker.

That's why I keep insisting that anthropomorphising LLMs is to be embraced, not avoided, because it gives much better high-level, first-order intuition as to where they belong in a larger computing system, and where they shouldn't be put.


> So basically, like a co-worker.

Arguably, though I don't particularly need another co-worker. Also co-workers are not tools (except sometimes in the derogatory sense).


Sort of except it seems the more the co-worker does the job it atrophies my ability to understand.. So soon we'll all be that annoyingly ignorant manager saying, "I don't know, I want the button to be bigger". Yay?


Only if we're lucky and the LLMs cease being replaced with improved models.

Claude has already shown us people who openly say "I don't code and yet I managed this"; right now the command line UI will scare off a lot of people, and people using the LLMs still benefit from technical knowledge and product design skills, if the tools don't improve we keep that advantage…

…but how long will it be before the annoyingly ignorant customer skips the expensive annoyingly ignorant manager along with all us expensive developers, and has one of the models write them bespoke solution for less than the cost of off-the-shelf shrink-wrapped DVDs from a discount store?

Hopefully that extra stuff is further away than it seems, hopefully in a decade there will be an LLM version of this list: https://en.wikipedia.org/wiki/List_of_predictions_for_autono...

But I don't trust to hope. It has forsaken these lands.


> using the LLMs still benefit from technical knowledge and product design skills, if the tools don't improve we keep that advantage…

I don't think we will, because many of us are already asking LLMs for help/advice on these, so we're already close to the point where LLMs will be able to use these capabilities directly, instead of just for helping us drive the process.


Indeed, but the output of LLMs today for these kinds of tasks is akin to a junior product designer, a junior project manager, a junior software architect etc.

For those of us who are merely amateur at any given task, LLMs raising us to "junior" is absolutely an improvement. But just as it's possible to be a better coder than an LLM, if you're a good PM or QA or UI/UX designer, you're not obsolete yet.


> and can at any point look at the source and tell you why it did what it did

Even years later? Most people can’t unless there are good comments and design. Which AI can replicate, so if we need to do that anyway, how is AI especially worse than a human looking back at code written poorly years ago?


I mean, Emacs's oldest source files are like 40 years old at this point, and yes they are in fact legible? I'm not sure what you're asking -- you absolutely can (and if you use it long enough, will) read the source code of your text editor.


Well especially the lisp parts!


The little experience I have with LLMs confidently shows that they are much better at navigating and modifying a well-structured code base. And they struggle, sometimes to a point where they can't progress at all, if tasked to work on bad code. I mean, the kind of bad you always get after multiple rounds of unsupervised vibe coding.


> I happen to rent and can't keep

This is my fear - what happens if the AI companies can't find a path to profitability and shut down?


Don't threaten us with a good time.


That’s not a good time, I love these things. I’ve been able to indulge myself so much. Possibly good for job security but would suck in every other way.


This is why local models are so important. Even if the non-local ones shut down, and even if you can't run local ones on your own hardware, there will still be inference providers willing to serve your requests.


Recently I was thinking about how some (expensive) consumer electronics like the Mac Studio can run pretty powerful open source models with pretty efficient power consumption, which could easily run on private renewable energy, and which are on most (all?) fronts much more powerful than the original ChatGPT, especially if connected to a good knowledge base. Meaning that, aside from very extreme scenarios, I think it is safe to say there will always be a way not to go back to how we used to code, as long as we can provide the right hardware and energy. Of course, personally I think we will never need to go to such extreme ends... despite knowing people who seem to seriously think developed countries will simply run out of electricity one day, which, while I reckon there might be tensions, seems like a laughable idea IMHO.


> the meta-skill of learning to use the LLM depreciates too. Today's LLM is gonna go away someday, the way you have to use it will change. You will be on a forever treadmill, always learning the vagaries of using the new shiny model (and paying for the privilege!)

I haven’t found this to be true at all, at least so far.

As models improve I find that I can start dropping old tricks and techniques that were necessary to keep old models in line. Prompts get shorter with each new model improvement.

It’s not really a cycle where you’re re-learning all the time or the information becomes outdated. The same prompt structure techniques are usually portable across LLMs.


Interesting, I’ve experienced the opposite in certain contexts. CC is so hastily shipped that new versions often imbalance existing workflows. E.g. people were raving about the new user prompt tools that CC used to get more context, but they messed up my simple git slash commands.


I think you have to be aware of how you use any tool but I don’t think this is a forever treadmill. It’s pretty clear to me since early on that the goal is for you the user to not have to craft the perfect prompt. At least for my workflow it’s pretty darn close to that for me.


If it ever gets there, then anyone can use it and there's no "skill" to be learned at all.

Either it will continue to be this very flawed non-deterministic tool that requires a lot of effort to get useful code out of it, or it will be so good it'll just work.

That's why I'm not gonna heavily invest my time into it.


Good for you. Others like myself find the tools incredibly useful. I am able to knock out code at a higher cadence and it’s meeting a standard of quality our team finds acceptable.


Looking forward for those 10x improvements to finally show up somewhere. Any day now!

Jokes aside, I never said it's not useful, but most definitely it's not even close to all this hype.


> very flawed non-deterministic tool that requires a lot of effort to get useful code out of it

We are all different but I think most of us with open minds are the flaw in your statement.


I have deliberately moderated my use of AI in large part for this reason. For a solid two years now I've been constantly seeing claims of "this model/IDE/Agent/approach/etc is the future of writing code! It makes me 50x more productive, and will do the same for you!" And inevitably those have all fallen by the wayside and been replaced by some new shiny thing. As someone who doesn't get intrinsic joy out of chasing the latest tech fad I usually move along and wait to see if whatever is being hyped really starts to take over the world.

This isn't to say LLMs won't change software development forever, I think they will. But I doubt anyone has any idea what kind of tools and approaches everyone will be using 5 or 10 years from now, except that I really doubt it will be whatever is being hyped up at this exact moment.


HN is where I keep hearing the “50× more productive” claims the most. I’ve been reading 2024 annual reports and 2025 quarterlies to see whether any of this shows up on the other side of the hype.

So far, the only company making loud, concrete claims backed by audited financials is Klarna and once you dig in, their improved profitability lines up far more cleanly with layoffs, hiring freezes, business simplification, and a cyclical rebound than with Gen-AI magically multiplying output. AI helped support a smaller org that eliminated more complicated financial products that have edge cases, but it didn’t create a step-change in productivity.

If Gen-AI were making tech workers even 10× more productive at scale, you’d expect to see it reflected in revenue per employee, margins, or operating leverage across the sector.

We’re just not seeing that yet.


I have friends who make such 50x productivity claims. They are correct if we define productivity as creating untested apps and games and their features that will never ship --- or be purchased, even if they were to ship. Thus, “productivity” has become just another point of contention.


100% agree. There are far more half-baked, incomplete "products" and projects out there now that it is easier to generate code. Generously, that doesn't necessarily equate to productivity.

I agree with the fact that the last 10% of a project is the hardest part, and that's the part that Gen-AI sucks at (hell, maybe the last 30%).


> If Gen-AI were making tech workers even 10× more productive at scale, you’d expect to see it reflected in revenue per employee, margins, or operating leverage across the sector.

If we’re even just talking a 2x multiplier, it should show up in some externally verifiable numbers.


I agree, and we might be seeing this but there is so much noise, so many other factors, and we're in the midst of capital re-asserting control after a temporary loss of leverage which might also be part of a productivity boost (people are scared so they are working harder).

The issue is that I'm not a professional financial analyst and I can't spend all day on comps so I can't tell through the noise yet if we're seeing even 2x related to AI.

But, if we're seeing 10x, I'd be finding it in the financials. Hell, a blind squirrel would, and it's simply not there.


Yes, I think there are many issues in a big company that could hide a 2x productivity increase for a little while. But I'd expect it to be very visible in small companies and projects. Looking at things like the number of games released on Steam, new products launched on new-product sites, or issues fixed in popular open source repos, you'd expect a 2x bump to be visible.


In my experience all technology has been like this though. We are on the treadmill of learning the new thing with or without LLMs. That's what makes tech work so fun and rewarding (for me anyway).


I assume you're living in a city. You're already renting out a lot of things to others (security, electricity, water, food, shelter, transportation), what is different with white collar work?


>the city gets destroyed

vs.

>a company goes bankrupt or pivots

I can see a few differences.


My apartment has been here for years and will be here for many more. I don't love paying rent on it but it certainly does get maintained without my having to do anything. And the rest of the infrastructure of my life is similarly banal. I ride Muni, eat food from Trader Joe's, and so on. These things are not going away and they don't require me to rewire my brain constantly in order to make use of them. The city infrastructure isn't stealing my ability to do my work, it just fills in some gaps that genuinely cannot be filled when working alone and I can trust it to keep doing that basically forever.


