
I still feel "weird" trying to reason about GenAI content or looking at GenAI pictures sometimes. Some of it is so off-putting in a my-brain-struggles-to-make-sense-of-it way.

There are plenty of studies, published both before and after COVID, showing that remote work increases productivity, and similar case studies showing the dangers of offshoring. In a perfect world, a business that correctly understands these studies would be rewarded.

Or it's just plain, boring cost cutting because finance is looking down the barrel of a grim YoY outlook. Companies are hurting, and GenAI can't get people to spend more money with them.

We need a fundamental paradigm shift beyond transformers. Throwing more compute or data at it isn't moving the needle.

Just to point out: there's no more data.

LLMs were always going to bottleneck on one of those two: compute demand grows extremely quickly with the amount of data, and data is necessarily limited. Turns out people threw crazy amounts of compute at it, so we hit the other limit.
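Back-of-the-envelope, using the common ~6·N·D training-FLOPs rule of thumb (N = parameters, D = tokens; the figures below are illustrative, not any particular model's):

    # Rough training-compute estimate: FLOPs ~ 6 * parameters * tokens.
    def training_flops(params, tokens):
        return 6 * params * tokens

    print(f"{training_flops(70e9, 15e12):.1e}")    # ~6.3e+24 FLOPs
    # Under roughly compute-optimal scaling, parameter count grows with tokens,
    # so doubling the data roughly quadruples the compute:
    print(f"{training_flops(140e9, 30e12):.1e}")   # ~2.5e+25 FLOPs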


Yeah, I’m constantly reminded of a quote about this: you can’t make another internet. LLMs have already digested the one we have.

Epoch has a pretty good analysis of bottlenecks here:

https://epoch.ai/blog/can-ai-scaling-continue-through-2030

There is plenty of data left; we don’t just train on crawled text data. Power constraints may turn out to be the real bottleneck, but we’re like 4 orders of magnitude away from that.


Synthetic data works.

There's a limit to that, according to https://www.nature.com/articles/s41586-024-07566-y . Basically, if you use an LLM to augment a training dataset, the models trained on it get "dumber" with every subsequent generation, and I am not sure how you can generate synthetic data for a language model without using a language model.

Synthetic data doesn't have to come from an LLM. And that paper only showed that if you train on a random sample from an LLM, the resulting second LLM is a worse model of the distribution that the first LLM was trained on. When people construct synthetic data with LLMs, they typically do not just sample at random, but carefully shape the generation process to match the target task better than the original training distribution.

And you don’t think that’s already happening? Also where is your evidence for this?

> Also where is your evidence for this?

The fact that "scaling laws" didn't scale? Go open your favorite LLM in a hex editor, oftentimes half the larger tensors are just null bytes.


Show me a paper; this makes no sense. Of course scaling laws are scaling.

If something isn't reliable, I don't think it works at all. I'm trying to work, not play a slot machine.

Are all the tools you use 100% reliable?

Because I use things like computers, applications, search engines, and websites that regularly return the wrong result or fail.


I’m not really sure how you envision AI use at your job, but AI can be the extremely imperfect tool it is now and also be extremely useful. What part of AI use feels like a slot machine to you?

damn! with this attitude I’d be left using an abacus…

I'm sure someone is figuring out a new version of the DMCA that prohibits circumventing data collection "in the name of preserving copyright".

If the DRM can't spy on you, you could be a pirate!

Aside from unanswerable questions (has the universe started to fill its container? Is a simulation property nearing "1"?), does this make long-distance space travel feasible again? I thought there was something about the universe expanding too fast to visit places like Alpha Centauri (and preventing visitors from reaching us).

Edit: A big brain fart, ignore the retracted part below. Colonizing the universe is of course impossible in 100My, barring FTL. What the paper I referred to [1] says is that colonizing the Milky Way may take less than that, and if you can do that, spreading to the rest of the observable universe is fairly easy, very relatively speaking.

<retracted> According to some calculations, it should in principle be possible to colonize the entire observable universe in less than a hundred million years. It's much too fast for the expansion to affect except marginally.</retracted>

The relative jump in difficulty from interstellar to intergalactic is much smaller than from interplanetary to interstellar.

Anyway, as others said, mere intragalactic (and intra-Local Group) travel is not affected by expansion in any way whatsoever.

[1] https://www.sciencedirect.com/science/article/abs/pii/S00945..., PDF at https://www.aleph.se/papers/Spamming%20the%20universe.pdf


I found someone saying colonize the Milky Way Galaxy in ~90m years? Is that what you meant?

The observable universe is ~93B LY - unless you're assuming FTL (and MUCH faster than light), I don't see how that's possible?


Time dilation means that you can get anywhere while experiencing an arbitrarily small amount of time. You can cross the galaxy in a second as far as special relativity is concerned. (With the expenditure of insanely vast amounts of energy, ofc.)

To an observer back home you'd look like you're travelling at merely extremely close to the speed of light, but to you the journey would take a second.
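To put a number on it (a quick sketch; the 100,000 ly diameter and the one-second ship time are just illustrative):

    import math

    LY_IN_SECONDS = 365.25 * 24 * 3600   # one light-year, expressed in light-seconds
    galaxy_ly = 100_000                   # rough Milky Way diameter in light-years
    tau = 1.0                             # desired ship (proper) time in seconds

    # Proper time for a constant-speed crossing: tau = d / (gamma * v).
    # For v ~ c, the Lorentz factor needed is roughly d / (c * tau).
    gamma = galaxy_ly * LY_IN_SECONDS / tau
    print(f"gamma ~ {gamma:.2e}, so 1 - v/c ~ {1 / (2 * gamma**2):.0e}")
    # The Earth-frame trip still takes ~100,000 years; only the ship clock reads one second.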


> You can cross the galaxy in a second as far as special relativity is concerned.

Sure, but the rest of the universe will keep on changing. In 90 billion years it’s going to be a very old universe. Galaxies will become consolidated and isolated, fewer young stars will be born. Only the dim light of red dwarf stars will shine among a graveyard of dead stars.


Hey, I didn't say it would be fun :-D

Tau Zero by Poul Anderson explores this, BTW. It's a great little sci-fi novel about a spaceship doomed to accelerate at 1G indefinitely.

It is significant from a colonisation PoV. With sufficient acceleration capability and the ability to survive travelling through the interstellar medium at extreme velocity (rather than getting vaporised by a mote of dust), a single generation of humans could in theory colonise the whole galaxy within their own lifespans. Indeed, some of them could even come back together and meet again after visiting those distant worlds, on an Earth many millions of years older, if their worldlines end up with similar proper times.


Yes, my brain totally froze. Added a correction.

> The relative jump in difficulty from interstellar to intergalactic is much smaller than from interplanetary to interstellar.

Interesting way to put it... This doesn't seem that accurate. With sufficiently advanced technology, much of which we already possess, we could expect to propel a minute spacecraft to a considerable fraction of the speed of light and reach nearby stars, possibly before the end of the century. Reaching the other end of the galaxy is a massive undertaking. The distances go up by orders of magnitude at every step of the way.

Pluto is about 38 AU from Earth. Proxima Centauri is about 2.7 × 10^5 AU away (about 4.24 ly), which is roughly a 7 × 10^3 multiplication. The Milky Way is about 50,000 ly in radius, and the Andromeda Galaxy is about 2.5 × 10^6 ly away. Going from interplanetary distances to interstellar, and thence to intergalactic, involves a factor of roughly 10^4 to 10^6 (give or take) at each step.
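Roughly, in numbers (a quick sanity check of the jumps, using round figures):

    AU_PER_LY = 63_241                     # astronomical units per light-year

    pluto_au = 38                          # Earth-Pluto distance, roughly
    proxima_au = 4.24 * AU_PER_LY          # ~2.7e5 AU
    andromeda_ly = 2.5e6                   # distance to Andromeda, roughly

    print(f"interplanetary -> interstellar: x{proxima_au / pluto_au:.0e}")   # ~7e+03
    print(f"interstellar -> intergalactic:  x{andromeda_ly / 4.24:.0e}")     # ~6e+05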


If you can get to a star 100 light years away, you can get to Andromeda. It doesn't require going faster, just waiting longer.

I feel like waiting longer in some sense may itself represent a substantial increase in difficulty in terms of creating something which remains stable for tens of thousands of years.

On the other hand who knows with zero samples how stable societies are thousands of years beyond our present level of development.


Yes, that's why I said 100 light years rather than 4.3. Maybe it's still too low, but I think there are targets within the Milky Way that would require solving pretty much all the problems of getting to Andromeda.

Imagine doing that, and being greeted with

ALL THESE WORLDS ARE YOURS EXCEPT ANDROMEDA


I guess the question is… we know what our current propulsion technology is capable of… given a million years of further technological development, where will our technology be?

The idea that, given a million years of further technological development, intergalactic travel might actually be feasible, isn’t really that implausible. Far from certain, but far from implausible either.

And that’s the thing: a million years is a technological eternity, yet a rounding error in estimates of the time to colonise the galaxy, the Local Group, or the observable universe.


Any form of propulsion that obeys Newton has hard limits to its space travel potential. Even spitting out single particles at near the speed of light, the most efficient way to generate thrust per unit of expelled mass, still leaves you under the tyranny of the rocket equation, which puts hard physical limits on you.

https://en.wikipedia.org/wiki/Tsiolkovsky_rocket_equation

The classical rocket equation also underestimates the fuel required for any craft that gets up to a significant fraction of c; the relativistic version is even less forgiving.
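To get a feel for it (a sketch comparing the classical and relativistic forms; the 0.5 c target and light-speed exhaust are just illustrative):

    import math

    def mass_ratio_classical(dv, v_ex):
        # Tsiolkovsky: dv = v_ex * ln(m0/m1)  ->  m0/m1 = exp(dv / v_ex)
        return math.exp(dv / v_ex)

    def mass_ratio_relativistic(dv_over_c, v_ex_over_c):
        # Relativistic rocket: dv/c = tanh((v_ex/c) * ln(m0/m1))
        return math.exp(math.atanh(dv_over_c) / v_ex_over_c)

    # Photon-exhaust rocket (v_ex = c) accelerating from rest to 0.5 c:
    print(mass_ratio_classical(0.5, 1.0))      # ~1.65
    print(mass_ratio_relativistic(0.5, 1.0))   # ~1.73 (classical underestimates the fuel needed)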

Currently, we have no evidence that reactionless propulsion is physically possible, and the existence of such a drive would directly contradict conservation of momentum.

"technological development" isn't a magic word or force of reality. "Technological development" is the pay off of immense engineering investment and discovering new phenomenon, but every axis you can possibly put effort into engineering and optimizing has a finite limit at some point, and there are finite new phenomenons to discover.

The entire past hundred-odd years of technological development has basically come down to mastering the electromagnetic force. But we've largely used up the novelty that was there, and there is no second electromagnetic force to discover. In fact, the nuclear force was also discovered and tapped out relatively quickly.

A great example of this is the elements. All evidence points to the outcome that the elements we can build stuff out of right now are the only elements you will ever be able to build anything out of. All artificial elements, even ones that are relatively "stable", have half-lives that preclude building stuff out of them, and there is no evidence that it is possible to modulate the rate at which an unstable atom decays. So no "exotic" elements that could magically power spaceships or anything will exist.

Intergalactic travel of humans is implausible unless you get into pretty radical transhumanism, or assume it's possible to perfectly maintain a biological human forever somehow, including brain functionality.

Brain uploads are another thing that people don't seem to recognize are radically more difficult and close to impossible. "Scanning" a brain is treated as an engineering problem, but it might not be. Every sensor relies on a physical interaction, most of them based on electromagnetic energy. How do you make an electron or photon or something interact in a measurable way with a cell deep inside someone's brain, without that particle interacting with all the identical matter in the way, and without cutting open and taking apart that brain? Well, thanks to the mastery of the electromagnetic force, we have MRIs, which do in fact kind of do that. But even if we had a magic MRI machine with infinite resolution (yet another thing that has fundamental limits), it would only let you look at molecules containing hydrogen, so you wouldn't be able to survey, say, the ion content of brain cells directly. If you are not aware, ion gradients are fairly important in human cell behavior.

Nevermind that scanning and uploading someone's brain, if it were possible, does not transfer the original conscious experience to the computer. A new copy may go on in a digital world but you still die.


Lots of great points here, but I think there's a bit more cause for optimism. For one, generation ships I think are the long-term project for space travel that successfully gets humans somewhere. No easy feat by any means in terms of time, engineering, and risk, but not running up against a wall of physical impossibility.

And nuclear physics is still a wide open frontier. We don't yet have fusion, and there's a lot we don't yet know about quark and gluon plasma and nuclear behavior on astrophysical scales. And if we're talking about technological possibilities against time scales of forever, there are lots of interesting electromagnetic possibilities in the context of superconductivity and metamaterials that we haven't yet exploited, and I'm probably not even beginning to do justice to it in its totality as an open-ended frontier full of fertile ground (e.g. vacuum polarization is a poorly understood area that might turn out to have interestingly exploitable properties).

You did a great job outlining some devastatingly serious physical limits but I think, again against the timeline of forever, you may be perhaps underselling the possibilities of important and newly exploitable properties of electromagnetism and the nuclear force being brought into application.


The distance to Andromeda is only about 20 times the width of our galaxy. And there are dwarf galaxies that are much closer.

>According to some calculations, it should in principle be possible to colonize the entire observable universe in less than a hundred million years

...what? That doesn't seem right, just from a really quick gut check it looks like the observable universe has a radius of 45.7 billion light years [0]. Even if the universe wasn't expanding nobody could get to everything any faster than that number of years right? Maybe you saw something that was talking about the local (Virgo) supercluster, which I think has a radius of around 55 million light years, so that sounds more like something that could be done on that timescale "in theory". But there are millions and millions of superclusters in the observable universe overall.

----

0: https://en.wikipedia.org/wiki/Observable_universe


Oops, yes, I don't know what I was thinking. A total brain fart. The paper I referred to is Sandberg and Armstrong's 2012 "Eternity in Six Hours", and of course they don't claim such a thing. Only that it's possible to start a colonization wave that has plenty of time to spread to everything visible now before they slip outside of our future light cone. The ~100M years refers to the colonization of the Milky Way. Sorry!

[1] https://www.sciencedirect.com/science/article/abs/pii/S00945...


>> According to some calculations, it should in principle be possible to colonize the entire observable universe in less than a hundred million years

> ...what? That doesn't seem right, just from a really quick gut check it looks like the observable universe has a radius of 45.7 billion light years [0].

I guess it depends on whose hundred million years you're talking about: the colonists' or that of those who stay home. I don't know how to do the calculations, but it seems plausible that you could traverse the entire observable universe at near light-speed in 100 million years of ship time.


You need ridiculous speeds for time dilation to really kick in, though. Mathematically, it starts as soon as an object moves. But if a spaceship travels at 90% of light speed (0.9 c), the crew's local time moves at roughly half speed compared to local time on Earth: a year for the astronauts is just over 2 years on Earth.

At 0.995 c, the ship clock runs 10 x slower.

At 0.999 c, 22 x slower. Then if you push the turbo button to 0.9999 c, 71 x slower.

The fastest man-made object to date is the Parker Solar Probe, at roughly 0.00064 c (about 0.06% of light speed).
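Those factors come straight from the Lorentz factor (a quick check; the Parker figure is approximate):

    import math

    def gamma(beta):
        # Lorentz factor: how many times slower the moving clock runs.
        return 1 / math.sqrt(1 - beta**2)

    for beta in (0.9, 0.995, 0.999, 0.9999, 0.00064):   # last one ~ Parker Solar Probe
        print(f"v = {beta} c -> ship clock runs {gamma(beta):.6g}x slower")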


That limitation only counts for visiting other galaxies. Travel within the galaxy is always possible, regardless of the universe’s expansion. And Alpha Centauri is super close, even within our galaxy.

Specifically the Local Group, so the Milky Way + Andromeda and some dwarf galaxies.

Dozens of dwarf galaxies, even! Also, Triangulum is sort of borderline at around 70% of the Milky Way's diameter, although admittedly only 10% of its mass. But Mars is also around 10% of Earth's mass, for a comparison.

The universe was always only expanding between galaxies, not within them.

So wait, individual stars aren't getting further apart? Galaxies aren't getting "bigger"/more diffuse?

Galaxies have enough gravity to counteract the expansion of the universe.

So do we see the expansion cancelled out by the gravity, or do we only see the gravity?

I mean, is it

    change = gravity
or

    change = expansion - gravity
Because this just made me wonder.. is "dark energy" simply the absence of gravity? i.e. just in regions where there is next to no matter/activity?

> do we see the expansion cancelled out by the gravity, or do we only see the gravity?

We see gravity overpowering expansion. Same way you can’t launch yourself into orbit by throwing pennies at a rate of one per second.
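For a rough sense of scale (back-of-envelope, with approximate numbers, and setting aside the question below of whether expansion even applies inside a bound system):

    H0 = 70            # Hubble constant, km/s per Mpc (approximate)
    d_mpc = 0.03       # ~30 kpc, a Milky-Way-sized distance, in Mpc

    hubble_flow = H0 * d_mpc     # naive "expansion speed" across a whole galaxy
    orbital_speed = 220          # km/s, the Sun's orbital speed around the galactic centre

    print(f"naive Hubble flow across the galaxy: ~{hubble_flow:.1f} km/s")
    print(f"typical orbital speed:               ~{orbital_speed} km/s")
    # Even on this naive picture, gravity wins by about two orders of magnitude.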


Imagine the universe as a giant balloon. Inside are little miniature balloon stars floating around, tied with string into balloon galaxies. If we heat the air: the big balloon expands, the clusters of mini-balloons spread out from the other clusters, but the clusters don't get any more diffuse. The string is way way too strong to be overpowered by the separating force from the expansion of the gas over short distances.

I mean, this is tricky to even ask: is there still expansion INSIDE galaxies, BUT it's countered by gravity?

Or is there no expansion within galaxies at all?

i.e. is dark energy or whatever that causes expansion only present in the absence of matter, or is it present everywhere regardless of matter, but because matter also has its own gravity the expansion is not visible/relevant?


The limit to space travel is the Rocket Equation, which says that the fuel you need grows exponentially with the speed you want to reach. Alpha Centauri isn't going anywhere, but it will take millennia of travel even with wildly optimistic assumptions.

Also note that there isn't any "container" to fill up. It could well be infinite. It's just that we will be forever limited to a finite subset, even in theory.


There are theoretical designs using antimatter (pion rockets, or beam core engines) that could reach 0.4c - 0.7c, which puts Alpha Centauri at decades away, not millennia.

https://arxiv.org/pdf/1205.2281 https://ntrs.nasa.gov/api/citations/20200001904/downloads/20...
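A quick travel-time check at those speeds (ignoring acceleration and deceleration phases, so these are lower bounds; a crewed round trip lands comfortably in the "decades" range):

    import math

    dist_ly = 4.37                 # Alpha Centauri A/B, light-years (approximate)
    for beta in (0.4, 0.7):
        t_earth = dist_ly / beta                       # one-way time, Earth frame, years
        t_ship = t_earth * math.sqrt(1 - beta**2)      # one-way time, ship clock, years
        print(f"{beta} c: ~{t_earth:.0f} yr Earth frame, ~{t_ship:.0f} yr ship time")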


Roundtrip Interstellar Travel Using Laser-Pushed Lightsails

https://ia800108.us.archive.org/view_archive.php?archive=/24...

"The third mission uses a three-stage sail for a roundtrip manned exploration of Eridani at 10.8 light years distance."


This is horrible advice. Do everything you can to keep your original teeth; even a partially preserved tooth with a crown is better than a post or dentures. Nothing will perform as well, and the side effects of dentures range from pain to a liquid diet if/when your gums can't support them.

Out of curiosity, are those components standardized/swappable between manufacturers/models, or customized for each individual make/model?

So much of "old school" auto maintenance was having a relatively standardized size/fit for similar components.


Really interesting question!

I have an unusual EV made by a relatively small company of which only a handful got to private customers, so if I want to fix something, I have to reverse-engineer it first. Most of the time, I will find out that the components used in my vehicle were also used in other cars.

Regarding the difference between EVs and ICEVs, only the powertrain components are relevant and between those, some are more exchangeable and some are less so.

As with ICEVs, most manufacturers have "platforms" that are shared between multiple makes/models. Having shared components with other vehicles of the same platform is the rule rather than the exception.

In the cars I have seen, the whole battery often only fits that specific model, sometimes also for other cars within the same platform. The modules that make up the battery are often exchangeable with other cars made by the same company/group. The cells that make up the modules are almost always generic, but very hard to replace. The battery management system is usually specific to the battery.

I don't know about the current state, but for early EVs the motor and inverter (which converts battery DC to AC for the motor) were often made by external suppliers. In particular, EV variants of otherwise ICE-based vehicles like the Fiat 500e, the VW Golf/Jetta, and some French cars all use the same motor and inverter made by Bosch. If an inverter is connected to a different type of motor, it needs to be tuned for it, which is not trivial.

Onboard chargers (OBCs), which convert AC line voltage from AC chargers to battery voltage, are often quite generic and are developed and manufactured by suppliers. They are almost always interchangeable within the same platform, but I haven't yet seen completely unrelated OEMs use the same OBC. The same applies to fast-charging communications equipment, which is often integrated into the OBC.

DC/DC converters (the alternator equivalent) are rarely separate components anymore and often integrated into either the OBC or the inverter.

Voltage-wise, all these components are often surprisingly flexible and can be used with much lower voltages than their maximum rated voltage.

Other components like contactors and connectors are very generic and I haven't yet seen one that only one OEM would use. There are likely exceptions to this. Often, the base components like the OBC or the inverter are almost identical, only using other (also generic) connectors.

While technically all these components could be replaced in the "old school" style, almost all of them require either coding the components to the specific vehicle, or flashing an OEM-specific firmware. While the former is only doable with OEM-specific software (that is far too expensive for both individuals and most independent workshops), I haven't yet seen any example of the latter, at least not for swapping components between unrelated platforms.

As of now, there are almost no "official" aftermarket replacements for these major components. I don't know of any major supplier that will directly sell parts in small quantities, and OEMs likely won't sell replacement parts to you as an individual either. For DIY repairs, finding used parts from wrecked cars and coding them with cracked software, or having it done in an authorized workshop (if even possible), often seems to be the only option so far. Also, everyone will discourage you from working on your EV for "electrical safety" reasons (actually, it's more profitable if they do the work). Working on an EV is quite safe, if done right (which is not hard).

Most of these limitations do not only apply to EVs, but to almost all modern cars. Often, the necessary work of reverse-engineering and cracking software has already been done for ICEVs for tuning purposes.


I don't understand why generative AI gets a pass at constantly being wrong, but an average worker would be fired if they performed the same way. If a manager needed to constantly correct you or double check your work, you'd be out. Why are we lowering the bar for generative AI?

Multiple reasons:

* Gen AI never disagrees with or objects to boss's ideas, even if they are bad or harmful to the company or others. In fact, it always praises them no matter what. Brenda, being a well-intentioned human being, might object to bad or immoral ideas to prevent harm. Since boss's ego is too fragile to accept criticism, he prefers gen AI.

* Boss is usually not qualified, willing, or free to do Brenda's job to the same quality standard as Brenda. This compels him to pay Brenda and treat her with basic decency, which is a nuisance. Gen AI does not demand fair or decent treatment and (at least for now) is cheaper than Brenda. It can work at any time and under conditions Brenda refuses to. So boss prefers gen AI.

* Brenda takes accountability for and pride in her work, making sure it is of high quality and as free of errors as she can manage. This is wasteful: boss only needs output that is good enough to make it someone else's problem, and as fast as possible. This is exactly what gen AI gives him, so boss prefers gen AI.


Your third point is especially poignant. It points out that AI doesn't just take a job here, but it makes everything worse.

I wish I could upvote this comment twice.


You’re absolutely right !

My kneejerk reaction is the sunk cost fallacy (AI is expensive), but I'm pretty sure it's actually because businesses have spent the last couple of decades doing absolutely everything they can to automate as many humans out of the workforce as possible.

Because it's much cheaper.

So now you don't have to pay people to do their actual work, you assign the work to ML ("AI") and then pay the people to check what it generated. That's a very different task, menial and boring, but if it produces more value for the same amount of input money, then it's economical to do so.

And since checking the output is often a lower skilled job, you can even pay the people less, pocketing more as an owner.


If a worker could be right 50% of the time, get paid 1 cent to write a 5000-word essay on a random topic, and do it in less than 30 seconds, then I think managers would be fine hiring that worker at that rate as well.


5000 half-right words is worthless output. That can even lead to negative productivity.

great, now who are you paying to sort the right output from the wrong output?

There's a variety of reasons.

You don't have a human to manage. The relationship is completely one-sided: you can query a generative AI at 3 in the morning on New Year's Eve. This entity has no emotions to manage and no interests of its own.

There's cost.

There's an implicit promise of improvement over time.

There's the domain of expertise being inhumanly wide: you can ask about cookies right now, then about 12th-century France, then about biochemistry.

The fact that an average worker would be fired for performing the same way is actually what the human competes on: they carry responsibility, which is not something AI can offer. If it were the case that, say, Anthropic actually signed contracts stating that they are liable for any mistakes, then humans would be absolutely toast.


I've been trying to open my mind and "give AI a chance" lately. I spent all day yesterday struggling with Claude Code's utter incompetence. It behaves worse than any junior engineer I've ever worked with:

- It says it's done when its code does not even work, sometimes when it does not even compile.

- When asked to fix a bug, it confidently declares victory without actually having fixed the bug.

- It gets into this mode where, when it doesn't know what to do, it just tries random things over and over, each time confidently telling me "Perfect! I found the error!" and then waiting for the inevitable response from me: "No, you didn't. Revert that change".

- Only when you give it explicit, detailed commands, "modify fade_output to be -90," will it actually produce decent results, but by the time I get to that level of detail, I might as well be writing the code myself.

To top it off, unlike the junior engineer, Claude never learns from its mistakes. It makes the same ones over and over and over, even if you include "don't make XYZ mistake" in the prompt. If I were an eng manager, Claude would be on a PIP.


Recently I've used Claude Code to build a couple TUIs that I've wanted for a long time but couldn't justify the time investment to write myself.

My experience is that I think of a new feature I want, I take a minute or so to explain it to Claude, press enter, and go off and do something else. When I come back in a few minutes, the desired feature has been implemented correctly with reasonable design choices. I'm not saying this happens most of the time, I'm saying it happens every time. Claude makes mistakes but corrects them before coming to rest. (Often my taste will differ from Claude's slightly, so I'll ask for some tweaks, but that's it.)

The takeaway I'm suggesting is that not everyone has the same experience when it comes to getting useful results from Claude. Presumably it depends on what you're asking for, how you ask, the size of the codebase, how the context is structured, etc.


It's great for demos, lousy for production code. The different cost of errors in these two use cases explains (almost) everything about the suitability of AI for various coding tasks. If you are the only one who will ever run it, it's a demo. If you expect others to use it, it's not.

As the name indicates, a demo is used for demonstration purposes. A personal tool is not a demo. I've seen a handful of folks assert this definition, and it seems like a very strange idea to me. But whatever.

Implicit in your claim about the cost of errors is the idea that LLMs introduce errors at a higher rate than human developers. This depends on how you're using the LLMs and on how good the developers are. But I would agree that in most cases, a human saying "this is done" carries a lot more weight than an LLM saying it.

Regardless, it is not good analysis to try to do something with an LLM, fail, and conclude that LLMs are stupid. The reality is that LLMs can be impressively and usefully effective with certain tasks in certain contexts, and they can also be very ineffective in certain contexts and are especially not great about being sure whether they've done something correctly.


> But I would agree that in most cases, a human saying "this is done" carries a lot more weight than an LLM saying it.

That's because humans have stakes. If a human tells me something is done and I later find out that it isn't, they damage their credibility with me in the future - and they know that.

You can't hold an LLM accountable.


Learning to use Claude Code (and similar coding agents) effectively takes quite a lot of work.

Did you have it creating and running automated tests as it worked?


> Learning to use Claude Code (and similar coding agents) effectively takes quite a lot of work.

I've tried to put in the work. I can even get it working well for a while. But then all of a sudden it is like the model suffers a massive blow to the head and can't produce anything coherent anymore. Then it is back to the drawing board, trying all over again.

It is exhausting. The promise of what it could be is really tempting fruit, but I am at the point that I can't find the value. The cost of my time to put in the work is not being multiplied in return.

> Did you have it creating and running automated tests as it worked?

Yes. I work in a professional capacity. This is a necessity regardless of who (or what) is producing the product.


> - It says it's done when its code does not even work, sometimes when it does not even compile.

> - When asked to fix a bug, it confidently declares victory without actually having fixed the bug.

You need to give it ways to validate its work. A junior dev will also give you code that doesn't compile or should have fixed a bug but doesn't if they don't actually compile the code and test that the bug is truly fixed.


Believe me, I've tried that, too. Even after giving detailed instructions on how to validate its work, it often fails to do it, or it follows those instructions and still gets it wrong.

Don't get me wrong: Claude seems to be very useful if it's on a well-trodden train track and never has to go off the tracks. But it struggles when its output is incorrect.

The worst behavior is this "try things over and over" behavior, which is also very common among junior developers and is one of the habits I try to break from real humans, too. I've gone so far as to put into the root CLAUDE.md system prompt:

--NEVER-- try fixes that you are not sure will work.

--ALWAYS-- prove that something is expected to work and is the correct fix, before implementing it, and then verify the expected output after applying the fix.

...which is a fundamental thing I'd ask of a real software engineer, too. Problem is, as an LLM, it's just spitting out probabilistic sentences: it is always 100% confident of its next few words. Which makes it a poor investigator.


yOu'Re HoLdInG iT wRoNg

It’s much cheaper than Brenda (superficially, at least). I’m not sure a worker that costs a few dollars a day would be fired, especially given the occasional brilliance they exhibit.

How much does the compute cost for the AI to do Brenda's job? Not total AI spend, but the fraction that replaced Brenda. That's why they'd fire a human but keep using the AI.

Brenda has been kissed on her forehead by the Excel goddess herself. She is irreplaceable.

(More seriously, she also has 20+ years of institutional knowledge about how the company works, none of which has ever been captured anywhere else.)


Brenda's job involves being accountable for the output. In many types of jobs, posting false numbers would render her liable for a dismissal, lawsuit, or even jail.

I'd like to see the cost of a model where the model provider (Anthropic etc) can assume that kind of financial and legal accountability.

To the extent that this output is possible only when Anthropic is not held to the same standard as Brenda, we will need to conclude that the cost savings accrue more from the reduced liability standard than from the technical capabilities of the model.


It's not just compute, it's also the setup costs: how much did you have to pay someone to feed the AI Brenda's decades of company-specific knowledge and all the little special cases of how it does business?

Because it doesn’t have to be as accurate as a human to be a helpful tool.

That is precisely why we have humans in the loop for so many AI applications.

If [AI + human reviewer to correct it] is some multiple more efficient than [human alone], there is still plenty of value.
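As a toy illustration (all numbers made up, just to show the arithmetic):

    # Minutes to produce one accepted unit of work (all numbers invented).
    human_alone = 60                     # human drafts and checks everything themselves

    ai_draft = 2                         # AI produces a draft almost instantly
    review = 15                          # human reviews and corrects the draft
    salvage_rate = 0.7                   # fraction of drafts the reviewer can fix
    # Unsalvageable drafts fall back to the human redoing the work from scratch.
    ai_plus_reviewer = ai_draft + review + (1 - salvage_rate) * human_alone

    print(f"human alone:   {human_alone} min/unit")
    print(f"AI + reviewer: {ai_plus_reviewer:.0f} min/unit")
    # With these made-up numbers the combination wins; with a lower salvage rate
    # or costlier review it can easily lose.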


> Because it doesn’t have to be as accurate as a human to be a helpful tool.

I disagree. If something can't be as accurate as a (good) human, then it's useless to me. I'll just ask the human instead, because I know that the human is going to be worth listening to.


Autopilot in airplanes is a good example to disprove that.

Good in most conditions. Not as good as a human. Which is why we still have skilled pilots flying planes, assisted by autopilot.

We don’t say “it’s not as good as a human, so stuff it.”

We say, “it’s great in most conditions. And humans are trained how to leverage it effectively and trained to fly when it cannot be used.”


That's a downright insane comparison. The whole problem with generative AI is how extremely unreliable it is. You cannot really trust it with anything because irrespective of its average performance, it has absolutely zero guarantees on its worst-case behavior.

Aviation autopilot systems are the complete opposite. They are arguably the most reliable computer-based systems ever created. While they cannot fly a plane alone, pilots can trust them blindly to do specific, known tasks consistently well in over 99.99999% of cases, and provide clear diagnostics in case they cannot.

If gen AI agents were this consistently good at anything, this discussion would not be happening.


The autopilots in aircraft have predictable behaviors based on the data and inputs available to them.

This can still be problematic! If sensors are feeding the autopilot bad data, the autopilot may do the wrong thing for a situation. Likewise, if the pilot(s) do not understand the autopilot's behaviors, they may misuse the autopilot, or take actions that interfere with the autopilot's operation.

Generative AI has unpredictable results. You cannot make confident statements like "if inputs X, Y, and Z are at these values, the system will always produce this set of outputs".

In the very short timeline of reacting to a critical mid-flight situation, confidence in the behavior of the systems is critical. A lot of plane crashes have "the pilot didn't understand what the automation was doing" as a significant contributing factor. We get enough of that from lack of training, differences between aircraft manufacturers, and plain old human fallibility. We don't need to introduce a randomized source of opportunities for the pilots to not understand what the automation is doing.


But now it seems like the argument has shifted.

It started out as, "AI can make more errors than a human. Therefore, it is not useful to humans." Which I disagreed with.

But now it seems like the argument is, "AI is not useful to humans because its output is non-deterministic?" Is that an accurate representation of what you're saying?


My problem with generative AI is that it makes different errors than humans tend to make. And these errors can be harder to predict and detect than the kinds of errors humans tend to make, because fundamentally the error source is the non-determinism.

Remember "garbage in, garbage out"? We expect technology systems to generate expected outputs in response to inputs. With generative AI, you can get a garbage output regardless of the input quality.


Because in one situation we are talking about augmentation, in the other replacement.

> Why are we lowering the bar for generative AI?

Because it doesn't need to sleep or spend time with its family.


Gen AI doesn't just get a pass at being wrong. It gets a pass for everything.

Look at Grok. If a human employee went around sexually harassing their CEO in public and giving themselves a Hitler nickname, they'd be fired immediately and have criminal charges. In the case of Grok, the CEO had to quit the company after being sexually harassed.

We've not lowered the bar for AI, we've removed it entirely.

