
Isn't this an apples-to-pears comparison? It's really just saying that a bigger credit card gets you results faster, while actually being worse in terms of GPU utilization and efficiency.

1) The total amount of compute is not the same if you count GPU-hours. If you have 16 GPUs, an even comparison runs them for 4.5 hours to reach the same 72 GPU-hours, not for 8 hours.

2) If we stop at 4.5 hours (and are generous by including the big drop), the loss is about 0.978, which matches roughly 44 hours of the sequential solution, making the sequential solution roughly 1.6x as efficient in GPU-hours (72 vs. 44).

So the real conclusion here is that we are able to run things in parallel at an efficiency loss but at a time win as long as we have access to more hardware. I feel like the blog oversells itself.
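To make the arithmetic above concrete, here is a minimal sketch of the GPU-hour comparison. All figures are the ones quoted in this comment (read off the blog's loss curves), not independently measured:

```python
# GPU-hour budget comparison between the parallel and sequential runs.
# The hours and loss values below come from the comment above, so treat
# this as illustrative arithmetic, not a benchmark.

n_gpus_parallel = 16
wall_hours_parallel = 4.5  # stop the parallel run here for an even budget
gpu_hours_parallel = n_gpus_parallel * wall_hours_parallel  # 72 GPU-hours

# The sequential run reaching the same loss (~0.978) takes ~44 wall-clock
# hours on a single GPU, i.e. 44 GPU-hours.
gpu_hours_sequential = 1 * 44.0

ratio = gpu_hours_parallel / gpu_hours_sequential
speedup = gpu_hours_sequential / wall_hours_parallel

print(f"parallel:   {gpu_hours_parallel:.0f} GPU-hours")
print(f"sequential: {gpu_hours_sequential:.0f} GPU-hours")
print(f"parallel costs {ratio:.2f}x the compute for a {speedup:.1f}x wall-clock win")
```

That is the trade-off in one line: more hardware converts compute overhead into wall-clock time saved.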


I believe it's sadly a necessity for control of the population when other superpowers employ this.

If you are Europe and you have democratic elections, you have an informational power asymmetry relative to states that have mass surveillance and control. You are (as we saw last year with the Romanian election that was swung 60% in 2 weeks over TikTok) susceptible to the influence of other superpowers. Even if you want to keep democratic elections, you need to somehow make sure that the citizens are voting in their interest. If the citizens are at the same time victims of the attention economy, their interest will be whatever foreign superpowers want it to be.

One well-tried solution is to engage and educate the population. However, this takes years, not the weeks that the campaigns take, and it requires immense resources, as people default to convenient attention-economy tools.

Another option is to ban platforms or create country-wide firewalls. That is a lot harder in democratic societies: you ban one app and a new one takes its place. The cat is kind of out of the bag on this one.

The last and easiest option is mass surveillance. Figure out who is getting influenced by what, and start policing which opinions those people are allowed to have and what measures to take against them. It's a massive slippery slope, but I can clearly see that it's the easiest and most cost-effective way to solve this information asymmetry.


As always, the devil is in the details. How will "mass surveillance" be implemented? How will bad opinions be suppressed? How will misguided officials be blocked?

Even the vague outline you've provided has issues. You can't prevent someone from having an opinion. You can't figure out who is "influenced" vs merely "exposed" (and visible intrusion shifts people towards the former).

You should actually consider the downsides and failure modes of implemented mass surveillance, not "it prevents malicious foreign influence better than my other proposals", because it may be worse than said influence (which does not necessarily translate into control; keep in mind that Georgescu only won the primary and would've lost the runoff had it not been annulled). The world under free information is the devil you know.

I always hold that the problem with mass censorship and state overreach is, they are too powerful and people are too selfish and stupid. There's no good solution, but my prediction is that any drastic attempt to prevent foreign interference will backfire and fail at that (liberal leaders can't use authoritarian tools as effectively as authoritarians). Even Democracy, "the worst form of government except for all others that have been tried", is a better countermeasure; all you need, to prevent anti-democratic foreign capture and ultimate failure, is to preserve it.


I think the definition of what is "anti-democratic" is as hard as the initial 3 questions you pose. If you push second-order ideas, for example by using refugees as indirect fuel for anti-democratic sentiment, is that anti-democratic? The Romanian election propaganda in itself was not anti-democratic, the coordination from a foreign state was. This means that the future of this kind of interference could be a more diffuse approach, or an approach where this is done from within Europe.

Any countermeasure you propose will just lead to the interference moving up one level of abstraction, or finding another point of entry.

I do think it's a better idea than mass surveillance, but I believe that the states will see it as harder. It can be that mass surveillance is implemented, and then the states do not know what to do with the data and nothing is achieved.


To what end would you say the surveillance is for?

So you surveil your citizens and precog their opinions... to do what? Make them have state-sponsored opinions? Don't we already have that without the surveillance?

It's trivial to predict how a human will behave without any surveillance at all. Facebook abandoned their Beacon system not because of the backlash, but because they realized all they really needed to predict user behavior was the user's credit card statements, which they could easily buy.

At some point the constitution is the backstop, and unless we amend it, it should hold true.


I don't think that the EU member states have the same data access as companies in the US like Facebook do, and therein lies the problem. There is no good way to gather and connect data like Meta or Palantir can, you can't just sell things to the maximum bidder here. I think that's where the necessity comes from.

Regarding banning platforms, I'd say just ban the attention-driven business model online by forbidding all social media platforms from serving ads entirely.

Ban all third-party advertising. Or taking payment for it, at any rate.

> "control of the population"

Who is doing the controlling in this take? "The Government"? Calling for more government control when some say--at least in the US--too much government is the heart of our current political strife. Unless this argument is for corporate surveillance?

As for elections in the age of social media, why not just pass Blackout laws around the date of the election? One week not sufficient? Make it two.

But instead the answer is mass surveillance? To do what? Arrest & detain people, and let the judicial system incarcerate them for months or years while the process plays out?


I am not for mass surveillance; I am saying it's the cheapest option to achieve the goal without disturbing the individual or causing social unrest. If you have a blackout, businesses stop, people complain, people use VPNs anyway: massive economic costs. Mass surveillance just lets you monitor, flag, and perhaps later exclude people without affecting the rest.

>when some say

Some say very nearly anything you could imagine and many things you couldn’t.


>If you are Europe, and you have democratic elections, you have an informational power asymmetry towards the states that have mass surveillance and control. You are (as we saw last year with the Romanian election that was swung to 60% in 2 weeks over TikTok) susceptible towards influence of other superpowers

When Georgia tried to implement a law to inhibit this type of foreign meddling from all superpowers it was widely branded a "pro russia law", presumably because the west had invested more in astroturfing Georgia.

Which is no different from what the US and Europe were already doing in Romania on an ENORMOUS scale before Russia ran its TikTok campaign. Russia's campaign evidently resonated with the populace far more than what the NED was doing.

Democracy is a bit like freedom of speech: either you support it even when it makes decisions you don't like (e.g. in opposition to Western imperialism), or you hate it. There isn't a middle ground.

If you support the Romanian secret services' decision to cancel the election over a TikTok campaign that was more convincing than the better-funded NED campaigns they permit, you probably just hate democracy.

If you think "pro russia law" is an accurate designation of what Georgia was trying to implement - again, you just hate democracy.


Thank you. Haven't seen this problem framed in quite this way before. I find the point quite persuasive.

But, I don't understand how this step could possibly work:

> start policing on what opinions those people are allowed to have and what measures to take to them

A much more effective counter to this would be to rebalance the information asymmetry by giving citizens the tools to coordinate against state sponsored influence.


It's a good suggestion, but the thing is that the average person does not care and does not want to use your tools. You can make an app that gives you correct news, where you can vote on local political issues, etc. Most people don't give a shit; you as a state are competing against the attention economy (evolving into an affection economy given LLM use).

You are competing against companies that exploit biologically wired mechanisms, like short-burst 3-second information overload together with marketing signaling (consumer neuroscience), to make you make choices and then confabulate each choice to yourself as your own.

Any tool would have to either be made in a landscape where ALL of the attention/affection-economy tools are banned, OR use the same mechanisms.


I agree but I also think that authenticity is one quality that such a tool could offer that other things in the attention economy cannot.

Addictive things are addictive. But people are also capable, given the right circumstances, of going to "touch grass". People are capable of making choices that are good for them, especially if we make those choices easy enough.

I often scroll too much but I also go into nature and meet irl humans. And it's not close to an insurmountable choice.


> A much more effective counter to this would be to rebalance the information asymmetry by giving citizens the tools to coordinate against state sponsored influence.

Which tools, specifically? I know none.


I mean that we are in dire need of such tools!

I also am not aware of any existing tools.


Your last paragraph sounds misguided.

I am with you on this, and you can't win, because as soon as you voice this opinion you get overwhelmed with "you don't have the sauce/prompt" opinions, which hold an inherent fallacy because they assume you are solving the same problems as them.

I work in GPU programming, so there is no way in hell that JavaScript tools and database wrapper tasks can be on equal terms with generating for example Blackwell tcgen05 warp-scheduled kernels.


There's going to be a long tail of domain-specific tasks that aren't well served by current models for the foreseeable future, but there's also no question that the complexity horizon of the SotA models is increasing over time. I've had decent results recently with non-trivial CUDA/MPS code. Is it great, finely tuned code? Probably not, but it delivered on the spec and runs fast enough.


> you can't win, because as soon as you voice this opinion you get overwhelmed with "you dont have the sauce/prompt"

> I've had decent results recently with non-trivial Cuda/MPS code.


Anthropic runs a challenge for optimizing GPU code.

The current leader is Opus 4.5

https://github.com/anthropics/original_performance_takehome


I have done it; it's not GPU code, you are optimizing a toy compiler for a fictional framework. There are some SIMD mechanics, but you can't call it GPU. There are plenty of such real challenges, though: KernelBench, Project Popcorn, FlashInfer, Wafer, Standard Kernel.


In my experience LLMs are useless for GPU compute code, just not enough in the training set.


Yeah, the argument here is that once you say this, people will say "you just don't know how to prompt, I pass the PTX docs together with NSight output and my kernel into my agent, run an evaluation harness, and beat cuBLAS". And then it turns out that they are doing a GEMM on Ampere/Hopper, which is an in-distribution problem for the LLMs.

It's the idea/mindset that since you are working on something the tool covers well in its training distribution, it's a skill issue or mindset problem for everyone else who is not getting value from the tool.


Now please get back to coding GPU stuff so we can train our models on your code. Thank you.


Another thing I've never gotten them to generate is any G-code. Maybe that'll come from the image/3D-generation side indirectly, but I was kind of hoping I could generate some motions, since hand-coding coordinates is very tedious. That would be a productivity boost for me; a very, very niche boost, since I rarely need bespoke G-code, but still.


Oh HELL no. :P G-code is (at least if you're talking about machining) the very definition of something you want to generate analytically, using tried and tested algorithms, with full consideration taken for the specifics of the machine and material involved.

I guess if you just want to use it to wiggle something around using a stepper motor and a spare 3D printer control board, it might be OK though. :)
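For simple "wiggle something around" motion work, the analytic route really is short. Here is a hedged sketch (the function name, feed rate, and segment count are made-up illustration values; real machining needs CAM that accounts for the machine and material):

```python
import math

# Illustrative only: emit G-code for a circle approximated by line segments.
# G21 = millimeter units, G90 = absolute positioning, G0 = rapid move,
# G1 = linear feed move. Values like feed=600 are arbitrary examples.
def circle_gcode(cx, cy, r, segments=36, feed=600):
    lines = [
        "G21 ; mm units",
        "G90 ; absolute positioning",
        f"G0 X{cx + r:.3f} Y{cy:.3f}",  # rapid to start point on the circle
    ]
    for i in range(1, segments + 1):
        a = 2 * math.pi * i / segments
        lines.append(
            f"G1 X{cx + r * math.cos(a):.3f} Y{cy + r * math.sin(a):.3f} F{feed}"
        )
    return "\n".join(lines)

print(circle_gcode(0, 0, 10))
```

Twenty lines of deterministic code like this beats prompting a model for coordinates: the toolpath is exactly what the math says, every time.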


You did something smart and efficient, using the least amount of energy and time needed. +1 for consciousness being a mistake


Lmao using amendments as arguments in 2026


How many years has it been since there were any amendments? Representative government is dead.


I don't think that most research starts with the idea of being a crypto rugpull. Many research labs and startups fail, and that is fine; you don't have to double down and drag a bunch of people into the mud with you because of that, which is what a lot of the examples the author points to did.

In some sense I just feel like this is another way to gamble, which in general is seeing an unprecedented growth with Polymarket and the likes. There is less faith in white-collar skills making you rich, so you just try your luck.


This is the first question I ask, and every time I get the answer of some monolith that supposedly solves something. Imo, this is completely fine for any personal project; I am happy when someone says they made an API to compare weekly shopping prices from the stores around them, or some recipe tool. That makes sense.

However, more often than not, someone is just building a monolithic construction that will never be looked at again. For example, someone found that the HuggingFace dataloader was slow for a certain file size in combination with a certain disk. What does this warrant? A 300,000+ line unreviewed repo to fix the issue. Not a 200-line PR to HuggingFace; no, you need to generate 20% of the existing repo and then slap your thing on top.

For me this is puzzling, because what is this for? Who is this for? People used to build these things for practice, but now it's generated, so it's not for practice, because you put very little effort into it. The only purpose I can see is some type of competence signaling, but here again, if the engineer/manager looking at it knows it is generated, it does not carry the value such signaling would have. Either I am naive and people still look at these repos and go "whoa, this is amazing", or it's some kind of induced ego trip/delusion where the LLM has convinced you that you are the best builder.


I noticed that despite really liking Karpathy and the blog, I am kind of wincing/involuntarily reacting to the LLM-like "It's not X, it's Y" phrases:

> it's not just a website you go to like Google, it's a little spirit/ghost that "lives" on your computer

> it's not just about the image generation itself, it's about the joint capability coming from text generation

There would have been no reaction from me to this 3 years ago, but now this sentence structure is ruined for me.


I used to use a lot of em dashes normally in my writing - they were my go-to replacements for commas and semicolons

But I had to change how I write because people started calling my writing “AI generated”


2026 will be the year of the ;


Please no that's my go to


so you switched to using hyphens instead?


En dashes!


You’re absolutely right!

Jk jk, now that you pointed it out I can’t unsee it.


Yeah, came to read Karpathy's thoughts, but might as well ask an LLM myself...


Very broadly, AI sentence structure and word choice are feeding back into society, changing how humans use language. The Economist recently had a piece on the word usage of British Parliament members; they are adopting words and phrases commonly seen in AI output.

We're embarking on a ginormous planetary experiment here.


> The Economist recently had a piece on word usage of British Parliament members. They are adopting words and phrases commonly seen in AI.

Many of the speeches given by MPs are likely to have been written beforehand, in whole or in part. Wouldn’t the more likely explanation be that they, or their staff, are using LLMs to write their speeches?


I hated these sentences way before LLMs, at least in the context of an explanation.

> it's not just a website you go to like Google, it's a little spirit/ghost that "lives" on your computer

This type of sentence I call rhetorical fat. Strip away the fat and you obtain a boring sentence that repeats what was said in the previous one.

Not all rhetorical fat is equal, and I must admit I find myself eye-rolling at the "little spirit" part more than at the fatness.

I understand the author wants to decorate things and emphasize key elements, and the hate I feel is only caused by the incompatible projection of my ideals onto a text that doesn't belong to me.

> it's not just about the image generation itself, it's about the joint capability coming from text generation.

That's unjustified conceptual stress.

That could be a legitimate answer to a question ("No, no, it's not just about that, it's more about this"), but it's a text. Maybe the text wants you to be focused, maybe the text wants to hype you; this is the shape of the hype without the hype.

"I find image generation is cooler when paired with text generation."


It is not a decoration. Karpathy juxtaposes ChatGPT (which feels like a "better google" to most people) to Claude Code, which, apparently, feels different to him. It's a comparison between the two.

You might find this statement non-informative, but without two parts there's no comparison. That's really the semantics of the statement which Karpathy is trying to express.

ChatGPT-ish "it's not just" is annoying because the first part is usually a strawman, something the reader considers trite. But that's not the case here.


Indeed, I was probably grumpy at the time I wrote the comment. I do find some truth in it still.

You're right! The strawman theory is based.

But I think there's more to it: I dislike the structure of these sentences, which I find a bit sensationalist for nothing (I don't know, maybe I am still grumpy).


Well, language is subject to a "fashion" one-upmanship game: people want to demonstrate their sophistication, often by copying some "cool" patterns, but then over-used patterns become "uncool" clichés.

So it might just be a natural reaction to over-use of a particular pattern. This kind of thing has been driving language evolution for millennia. Besides that, pompous style is often used in "copy" (slogans and ads), which is something most people don't like.


Karpathy should go back to what he does best: educating people about AI on a deep level. Running experiments and sharing how they work, that sort of stuff. It seems lately he is closer to an influencer who reviews AI-based products. Hopefully it is not too late to go back.


I feel this review stuff is more of a side project / pastime for him. Look at nanochat, for example. My impression is that these are the things he still spends most of his energy on.

After all, he's been an "influencer" for a long time, starting from the "Software 2.0" essay.


We need to integrate how Singapore and Japan do oral English into our writing I guess.

Joking aside, as a non-native English speaker who spent quite a bit of time learning to write English "properly", this trend of needing to write baad Engrish to avoid being publicly called out for "written by an LLM" is frustrating...


I cannot unsee this anymore and it ruins the whole internet experience for me


Same here; I had to configure ChatGPT to stop making these statements, and also had to configure a bunch of other stuff to make it bland when answering questions.


The way to make AI not sound like ChatGPT is to use Claude.

I realized that's what bothered me. It's not "oh my god, they used ChatGPT." But "oh my god, they couldn't even be bothered to use Claude."

It'll still sound like AI, but 90% of the cringe is gone.

If you're going to use AI for writing, it's just basic decency to use the one that isn't going to make your audience fly into a fit of rage every ten seconds.

That being said, I feel very self-conscious using em dashes in the current decade ;)


If a reader gets angry simply because the author used ChatGPT instead of Claude, then the reader is an idiot.


I don't think I've ever noticed someone use an em dash until ChatGPT appeared.


That's because people didn't make it a point to performatively notice them. But e.g. macOS and iOS have been auto-inserting them for a long time now. Ditto Word.


I didn't know this, thanks.


https://xkcd.com/3126/

I mostly use them in Telegram because it auto-converts -- into an em dash. They are a pain to type everywhere else, though!


I love em dashes—they basically indicate a more deliberate pause than a … without the tight vibes of a semicolon.


I don't use LLMs for writing, just factual research stuff, and this would happen even on those questions.


Same, I cringe when I read this structure.


It's not text - it's clickbait distilled to grammar.


Not all software is meant to be open source, in production, and working on 100 platforms.

Sometimes the point of the software is to make an app with 2 buttons to help your mom do her grocery shopping more easily.


The text structure screams GPT-5, sadly, so I would not be surprised if not only the text but also the images were wrong.

