
It's also a good lesson for the new AI cycle we're in now. Often inserting ML subsystems into your broader system just makes it go from "deterministically but fixably bad" to "mysteriously and unfixably bad".



I think that’ll define the industry for the coming decades. I used to work in machine translation and it was the same. The older rules-based engines that were carefully crafted by humans worked well on the test suite and if a new case was found, a human could fix it. When machine learning came on the scene, more “impressive” models that were built quicker came out - but when a translation was bad no one knew how to fix it other than retraining and crossing one’s fingers.


As someone who worked in rules-based ML before the recent hype around transformers (and unsupervised learning in general), I can say rules-based approaches were laughably bad. Only now are nondeterministic approaches to ML surpassing human-level performance on tasks, something which would not have been feasible, perhaps not even possible in a finite amount of human development time, via human-created rules.


The thing is that AI is completely unpredictable without human curated results. Stable diffusion made me relent and admit that AI is here now for real, but I no longer think so. It's more like artificial schizophrenia. It does have some results, often plausible seeming results, but it's not real.


Yes, but I think the other lesson might be that those black box machine translations have ended up being more valuable? It sucks when things don't always work, but that is also kind of life and if the AI version worked more often that is usually ok (as long as the occasional failures aren't so catastrophic as to ruin everything)


> Yes, but I think the other lesson might be that those black box machine translations have ended up being more valuable?

The key difference is how tolerant the specific use case is of a probably-correct answer.

The things recent-AI excels at now (generative, translation, etc.) are very tolerant of "usually correct." If a model can do more, and is right most of the time, then it's more valuable.

There are many other types of use cases, though.
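One way to make "tolerant of usually correct" concrete is a back-of-the-envelope expected-value comparison. This is only a sketch: the function name, the 95% accuracy, and the cost figures are made-up assumptions for illustration, not numbers from anyone in this thread.

```python
def expected_value(accuracy: float, benefit: float, failure_cost: float) -> float:
    """Expected payoff per use: gain `benefit` when right, lose `failure_cost` when wrong."""
    return accuracy * benefit - (1.0 - accuracy) * failure_cost


# Hypothetical numbers: identical accuracy, very different cost of a wrong answer.
casual_translation = expected_value(0.95, benefit=1.0, failure_cost=1.0)
safety_critical = expected_value(0.95, benefit=1.0, failure_cost=1000.0)

print(f"casual translation: {casual_translation:+.2f}")
print(f"safety critical:    {safety_critical:+.2f}")
```

With the same accuracy, the first case comes out positive and the second strongly negative, which is the whole point: the failure cost, not the accuracy alone, decides whether "usually correct" is good enough.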


A case in point is the ubiquity of Pleco in the Chinese/English space. It’s a dictionary, not a translator, and pretty much every non-native speaker who learns or needs to speak Chinese uses it. It has no ML features and hasn’t changed much in the past decade (or even two). People love it because it does one specific task extremely well.

On the other hand ML has absolutely revolutionised translation (of longer text), where having a model containing prior knowledge about the world is essential.


Can’t help but read that and think of Tesla’s Autopilot and “Full Self Driving”. For some comparisons they claim to be safer per mile than human drivers … just don’t think too much about the error modes where the occasional stationary object isn’t detected and you plow into it at highway speed.


relevant to the grandparent’s point: I am demoing FSD in my Tesla and what I find really annoying is that the old Autopilot allowed you to select a maximum speed that the car will drive. Well, on “FSD” apparently you have no choice but to hand full longitudinal control over to the model.

I am probably the 0.01% of Tesla drivers who have the computer chime when I exceed the speed limit by some offset. Very regularly, even when FSD is in “chill” mode, the model will speed by +7-9 mph on most roads. (I gotta think that the young 20 somethings who make up Tesla's audience also contributed their poor driving habits to Tesla's training data set) This results in constant beeps, even as the FSD software violates my own criteria for speed warning.

So somehow the FSD feature becomes "more capable" while becoming much less legible to the human controller. I think this is a bad thing generally but it seems to be the fad today.


I have no experience with Tesla and their self-driving features. When you wrote "chill" mode, I assume it means the lowest level of aggressiveness. Did you contact Tesla to complain the car is still too aggressive? There should be a mode that tries to drive exactly the speed limit, where reasonable -- not over or under.


Yes, there is a “chill” mode that refers to maximum allowed acceleration and a “chill mode” that refers to the level of aggressiveness with autopilot. With both turned on the car still exceeds the speed limit by quite a bit. I am sure Tesla is aware.


> For some comparisons they claim to be safer per mile than human drivers

They are lying with statistics: in the more challenging locations and conditions the AI will give up and let the human take over, or the human notices something bad and takes over. So Tesla miles are cherry picked, and their data is not open for a third party to compute real statistics and compare apples to apples.
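The cherry-picking effect can be shown with plain arithmetic (it is an instance of Simpson's paradox). The figures below are invented purely for illustration and are not Tesla data; the condition names and the `overall_rate` helper are hypothetical. In this toy setup the human has the lower crash rate in every condition, yet the driver-assist system looks safer overall because it mostly logs easy miles.

```python
# Hypothetical (miles, crashes) per driving condition. Invented numbers.
data = {
    "assist": {"easy": (900, 9), "hard": (100, 5)},
    "human": {"easy": (500, 4), "hard": (500, 20)},
}


def rate(miles: int, crashes: int) -> float:
    """Crashes per mile for one condition."""
    return crashes / miles


def overall_rate(by_condition: dict) -> float:
    """Crashes per mile pooled across all conditions."""
    total_miles = sum(m for m, _ in by_condition.values())
    total_crashes = sum(c for _, c in by_condition.values())
    return total_crashes / total_miles


for condition in ("easy", "hard"):
    print(condition,
          "assist:", rate(*data["assist"][condition]),
          "human:", rate(*data["human"][condition]))

print("overall assist:", overall_rate(data["assist"]))
print("overall human: ", overall_rate(data["human"]))
```

An apples-to-apples comparison would need the per-condition breakdown, which is exactly what is missing when only the pooled per-mile figure is published.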


Or in some cases, the Tesla slows down, then changes its mind and starts accelerating again to run over child-like obstructions.

Ex: https://www.youtube.com/watch?v=URpTJ1Xpjuk&t=293s


Tesla's driver assist, from the very beginning until now, seems not to possess object/decision permanence.

Here you can see that it detected an obstacle (as evidenced by the info on screen) and made a decision to stop. However, it then failed to detect the object right in front of the car, promptly forgot about both the object and the decision to stop, and happily accelerated over the obstacle. When tackling a more complex intersection it can happily change its mind about the exit lane multiple times, e.g. it will plan to exit on one side of a divider, replan to exit into oncoming traffic, then replan again.
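To illustrate what "decision permanence" could mean in the simplest terms, here is a toy sketch of a latch that keeps a stop decision alive for a few frames after the detection drops out. It is an assumption-laden illustration, not a description of how Tesla's software (or any real planner) works; `StopLatch`, `hold_frames`, and the detection sequence are all hypothetical.

```python
class StopLatch:
    """Keeps a stop decision active for `hold_frames` frames after the last detection."""

    def __init__(self, hold_frames: int):
        self.hold_frames = hold_frames
        self.frames_since_detection = None  # None means nothing has been detected yet

    def update(self, obstacle_detected: bool) -> bool:
        """Feed one frame of detection; return True if the vehicle should stay stopped."""
        if obstacle_detected:
            self.frames_since_detection = 0
        elif self.frames_since_detection is not None:
            self.frames_since_detection += 1
        return (self.frames_since_detection is not None
                and self.frames_since_detection <= self.hold_frames)


# Hypothetical per-frame detections: the obstacle is seen, then lost while still present.
detections = [False, True, True, False, False, False, False]

memoryless = list(detections)  # stops only while the obstacle is currently detected
latch = StopLatch(hold_frames=2)
latched = [latch.update(d) for d in detections]

print("memoryless:", memoryless)
print("latched:   ", latched)
```

The memoryless policy resumes the moment the detection flickers off, which matches the behaviour described above; the latched one holds the stop for two more frames. A fixed hold time is a crude stand-in for real object tracking, but it shows the difference between forgetting and remembering a decision.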


Well, Tesla might be the single worst actor in the entire AI space, but I do somewhat understand your point. The lack of predictable failures is a huge problem with AI; I'm not sure that understandability is, by itself. I will never understand the brain of an Uber driver, for example.


yes, who exactly looked at the 70% accuracy of "live automatic closed captioning" and decided Great! ship it boys!


My guess: They are hoping user feedback will help them to fix the bugs later -- iterate to 99%. Plus, they are probably under unrealistic deadlines to deliver _something_.


But rule-based machine translation, from what I've seen, is just so bad. ChatGPT (and other LLMs) is miles ahead. After seeing what ChatGPT does, I can't even call rule-based machine translation "translation".

*Disclaimer: I'm not an AI researcher, but I did quite a bit of human translation work before.


I think NM translation was broken all along. Not in the neural network part but in choosing the right answer. https://aclanthology.org/2020.coling-main.398.pdf


Since LLMs are loosely based on NM models, it seems research on newer sampling methods like Mirostat might help here.


Perhaps using ML to craft the deterministic rules and then having a human go over them is the sweet spot.


Rules could never work for translation unless the incoming text was formatted in a specific way. Eg, you just couldn't translate a conversation transcript in a pro-drop language like Japanese into English sentence-by-sentence, because the original text just wouldn't have sentences in it. So you need some "intelligence" to know who is saying what.


I've heard AI described as the payday loan (or "high-interest credit card") of technical debt.


I think - I hope, rather - that technically minded people who are advocating for the use of ML understand the shortcomings and hallucinations... but we need to be frank about the fact that the business layer above us (with a few rare exceptions) absolutely does not understand the limitations of AI and views it as a magic box where they type in "Write me a story about a bunny" and get twelve paragraphs of text out. As someone working in a healthcare-adjacent field I've seen the glint in executives' eyes when talking about AI, and it can provide real benefits in data summarization and annotation assistance... but there are limits to what you should trust it with, and if it's something big-i Important then you'll always want to have a human vetting step.


> I hope, rather - that technically minded people who are advocating for the use of ML understand the shortcomings and hallucinations.

The people I see who are most excited about ML are business types who just see it as a black box that makes stock valuation go vroom.

The people that deeply love building things, really enjoy the process of making itself, are profoundly sceptical.

I look at generative AI as sort of like an army of free interns. If your idea of a fun way to make a thing is to dictate orders to a horde of well-meaning but untrained, highly caffeinated interns, then using generative AI to make your thing is probably thrilling. You get to feel like an executive producer who can make a lot of stuff happen by simply prompting someone/something to do your bidding.

But if you actually care about the grit and texture of actual creation, then that workflow isn't exactly appealing.


They wouldn’t think this way if stock investors weren’t so often such naive lemmings ready to jump off yet another cliff with each other.


We get it, you're skeptical of the current hype bubble. But that's one helluva no true Scotsman you've got going on there. Because a true builder, one that deeply loves building things, wouldn't want to use text to create an image. Anyone who does is a business type or an executive producer. A true builder wouldn't think about what they want to do in such a nasty thing as words. Creation comes from the soul, which we all know machines, and business people, don't have.

Using English, instead of C, to get a computer to do something doesn't turn you into a bureaucrat any more than using Python or Javascript instead does.

Only a person that truly loves building things, far deeper than you'll ever know, someone that's never programmed in a compiled language, would get that.


Getting drunk off that AI kool-aid aren't ya


the othering of creators because they use a different paintbrush was bothering me.


I can relate, AI is a tool, and if I want to write my code by LEGOing a bunch of AI-generated functions together, I should be able to.


please go other yourself somewhere else


Hit a nerve, it seems. Apologies.


> Using English, instead of C, to get a computer to do something doesn't turn you into a bureaucrat any more than using Python or Javascript instead does.

If one uses English in as precise a way as one crafts code, sure.

Most people do not (cannot?) use English that precisely.

There's little technical difference between using English and using code to create...

... but there is a huge difference on the other side of the keyboard, as lots of people know English, including people who aren't used to fully thinking through a problem and tackling all the corner cases.


> Most people do not (cannot?) use English that precisely.

No one can, which is why any place human interaction needs anything anywhere close to the determinacy of code, normal natural language is abandoned for domain-specific constructed languages built from pieces of natural language with meanings crafted especially for the particular domain as the interface language between the people (and often formalized domain-specific human-to-human communication protocols with specs as detailed as you’d see from the IETF).


I gotta say, I love how you use English to perfectly demonstrate how imprecise English is without pre-understood context to disambiguate meaning.


Using English has been tried many times in the history of computing: COBOL and SQL, to name just a few.

Still needed domain experts back then, and, IMHO, in years/decades to come


Or you can draw pretty pictures in LabVIEW lol


Was it intentional to reply with another no true Scotsman in turn here?


Yeah, I was also reading their response and was confused. "Creation comes from the soul, which we all know machines, and business people, don't have" ... "far deeper than you'll ever know", I mean, come on.


If you have to ask, then you missed it


I’m not optimistic on that point: the executive class is very openly salivating at the prospect of mass layoffs, and that means a lot of technical staff aren’t quick to inject some reality – if Gartner is saying it’s rainbows and unicorns, saying they’re exaggerating can be taken as volunteering to be laid off first even if you’re right.


Yeah but what comes after the mass layoffs? Getting hired to clean up the mess that AI eventually creates? Depending on the business it could end up becoming more expensive than if they had never adopted GenAI at all. Think about how many companies hopped on the Big Data Bandwagon when they had nothing even coming close to what "Big Data" actually meant. That wasn't as catastrophic as what AI would do but it still was throwing money in the wrong direction.


I’m sure we’re going to see plenty of that but from the perspective of a person who isn’t rich enough to laugh off unemployment, how does that help? If speaking up got you fired, you won’t get your old job back or compensation for the stress of looking in a bad market. If you stick around, you’re under more pressure to bail out the business from the added stress of those bad calls and you’re far more likely to see retribution than thanks for having disagreed with your CEO: it takes a very rare person to appreciate criticism and the people who don’t aren’t going to get in the situation of making such a huge bet on a fad to begin with – they’d have been more careful to find something it’s actually good for.


> technically minded people who are advocating for the use of ML understand the shortcomings and hallucinations

Really, my impression is the opposite. They are driven by doing cool tech things and building fresh product, while getting rid of "antiquated, old" product. Very little thought is given to the long-term impact of their work. Criticism of the use cases is often hand-waved away because you are messing with their bread and butter.


> but we need to be frank about the fact that the business layer above us (with a few rare exceptions) absolutely does not understand the limitations of AI and views it as a magic box where they type in

I think we also need to be aware that this business layer above us often sees __computers__ as a magic box where they type in. There's definitely a large spectrum of how magical this seems to that layer, but the issue remains that there are subtleties that are often important but difficult to explain without detailed technical knowledge. I think there's a lot of good ML can do (being an ML researcher myself), but I often find it ham-fisted into projects simply to say that the project has ML. I think the clearest flag to any engineer that this layer above them has limited domain knowledge is how much importance they place on KPIs/metrics. Are they targets or are they guides? Because I can assure you, all metrics are flawed -- but some metrics are less flawed than others (and benchmark hacking is unfortunately the norm in ML research[0]).

[0] There's just too much happening so fast and too many papers to reasonably review in a timely manner. It's a competitive environment, where gatekeepers are competitors, and where everyone is absolutely crunched for time and pressured to feel like they need to move even faster. You bet reviews get lazy. The problems aren't "posting preprints on twitter" or "LLMs giving summaries"; it's that the traditional peer review system (especially in conference settings) scales poorly and is significantly affected by hype. Unfortunately I think this ends up railroading us in research directions and makes it significantly challenging for graduate students to publish without being connected to big labs (aka, requiring big compute) (tuning is another common way to escape compute constraints, but that falls under "railroading"). There are still some pretty big and fundamental questions that need to be chipped away at but are difficult to publish given the environment. /rant


This is why hallucinations will never be fixed in language models. That's just how they work.


mysteriously with a helping of random too!



