Recovering from upsets is the big thing. Maintaining flight level, speed, and heading while upside down isn’t acceptable.
Levels of safety are another consideration: car autopilots don’t use multiple levels of redundancy on everything because cars can stop without falling out of the sky.
That's still massively simpler than making a self-driving car.
It's trivially easy to fly a plane in straight level flight, to the extent that you don't actually need any automation at all to do it. You simply trim the aircraft to fly in the attitude you want and over a reasonable timescale it will do just that.
> It's trivially easy to fly a plane in straight level flight, to the extent that you don't actually need any automation at all to do it. You simply trim the aircraft to fly in the attitude
That seemingly shifts the difficulty from the autopilot to the airframe. But it isn’t actually good enough: it doesn’t keep an aircraft flying when it’s missing a large chunk of wing, for example. https://taskandpurpose.com/tech-tactics/1983-negev-mid-air-c...
Instead, you’re talking about the happy path. If we accept the happy path as enough, there are weekend-project equivalents of self-driving cars built with minimal effort; being production worthy is about more than being occasionally useful.
Autopilot is difficult because you need to do several things well or people will definitely die. Self driving cars are far more forgiving of occasional mistakes, but again it’s the "or people die" bits that make it difficult. Tesla isn’t actually ahead of the game; they’re just willing to take more risks with their customers’ and the general public’s lives.
> Self driving cars are far more forgiving of occasional mistakes
I would say not, no.
It's almost impossible to crash a plane. There's nothing to hit except the ground, and you stay away from that unless you really really mean to get close.
It's very easy to crash a car, and if you do that most of the time you'll kill people outside the car, often quite a lot of them.
There are no production aircraft fitted with autopilots that can correct for breaking a wing off.
Autopilots have contributed to a significant number of crashes and that’s with a very safety conscious industry.
In a hypothetical Tesla-style "let’s take more risk" approach, a buggy autopilot can surprisingly quickly get into a situation at cruising altitude that isn’t recoverable before hitting the ground. Asking what the worst possible thing an autopilot could do in this situation is, is eye-opening here.
> There are no production aircraft fitted with autopilots that can correct for breaking a wing off.
Granted, that specific case depends on the aircraft being a lifting body etc., so it obviously doesn’t extend to commercial aviation. But my point was that losing aerodynamic stability on its own isn’t a reason to give up.
> Autopilots have contributed to a significant number of crashes and that’s with a very safety conscious industry.
"Contributed to", in the sense that the pilots decided to just blindly trust the autopilot and let it make a developing situation worse rather than, oh I don't know, maybe FLYING THE DAMN PLANE.
> buggy autopilots can surprisingly quickly get into a situation at cruising altitude which isn’t recoverable before hitting the ground
If you allow the autopilot to fly the plane into the ground, yes. If you're paying attention you ought to be able to recover just about anything, if most of the plane is still working. The vast majority of incidents where aircraft have departed controlled flight and crashed are because the pilots lost sight of the important thing - FLYING THE DAMN PLANE.
> But my point was that losing aerodynamic stability on its own isn’t a reason to give up.
It's got nothing to do with aerodynamic stability. If you adjust the steering and suspension in a car correctly, it'll drive in a perfectly straight line with no user input for a surprisingly long way. With modern electronic power steering and throttle-by-wire systems it's actually remarkably easy to turn an off-the-shelf car (even something cheap, secondhand, and quite old like a 2010s Vauxhall Corsa) into a simple line-following robot like we used to build at uni in the 80s and 90s in robotics class. Sure, you need a disused aerodrome to play with it, but it'll work.
There is the far greater problem that self-driving cars have to cope with a far more rapidly changing environment than an aircraft. A self-flying plane would be far easier to get right than a self-driving car.
A human driver can't just react, painfully slowly, in the way that current "self-driving" cars do; they have to anticipate and be "reacting" before the problem even starts. You do it yourself, even if you don't realise it. You hang back from that car because you know they're going to - there, right across two lanes, not so much as a glance in their mirror, what did I tell you? - they're going to do something boneheaded. That car's just pulled in, the passenger in the back is about to open their door right into your - nicely done, you moved out to the line and missed them by 50cm at least.
Self-driving cars can't do that, and probably never will. Self-flying aircraft won't need to do that.
And an autopilot is a surprisingly simple device that responds in simple and predictable ways to sensor inputs.
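To make that "simple and predictable" claim concrete, here is a minimal sketch of the kind of control loop a basic altitude-hold mode runs: a PID controller mapping one sensor reading (altitude error) to one actuator command. The gains and the toy "plant" model are made up for illustration; a real autopilot is tuned against real aircraft dynamics and has many more modes and interlocks.

```python
# Sketch of an altitude-hold loop: one sensor in, one command out.
# Gains and the toy plant below are illustrative, not real-world values.

class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error, dt):
        # Accumulate error (I term) and estimate its rate of change (D term).
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

altitude_hold = PID(kp=0.1, ki=0.001, kd=0.05)
target_ft, altitude_ft = 10_000.0, 9_800.0
for _ in range(300):
    cmd = altitude_hold.update(target_ft - altitude_ft, dt=1.0)
    altitude_ft += cmd  # toy plant: climb rate proportional to command
```

The predictability is the point: the same error always produces the same correction, which is exactly why it handles the happy path well and a missing wing not at all.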
> "Contributed to", in the sense that the pilots decided to just blindly trust the autopilot and let it make a developing situation worse rather than, oh I don't know, maybe FLYING THE DAMN PLANE.
Excuses don’t save lives. You can’t trust pilots or drivers to always make the correct decision instantly. Any system designed in such a manner will get people killed.
> If you allow the autopilot to fly the plane into the ground, yes.
Things can be unrecoverable a full minute before impact. There are some seriously harrowing NTSB reports, and that’s just what’s already happened; the possible failure modes are practically endless.
> Excuses don’t save lives. You can’t trust pilots or drivers to always make the correct decision instantly. Any system designed in such a manner will get people killed.
Okay, so what's your answer? Stick yet another computer in to go wrong and fly the plane into the ground when it gets the wrong idea about a situation? Add yet more sensors to the car to prevent the driver steering away from an obstacle because it thinks they're not using their indicators yet?
> Things can be unrecoverable a full minute before impact.
Can you find an example of one that isn't down to gross mechanical failure, or just plain Operator Idiocy?
> Okay, so what's your answer? Stick yet another computer in to go wrong and fly the plane into the ground when it gets the wrong idea about a situation? Add yet more sensors to the car to prevent the driver steering away from an obstacle because it thinks they're not using their indicators yet?
I’m not condemning the airline industry here; the safety-conscious approach has done a good job over time, especially in terms of redundancy. A major area of improvement is the way autopilots communicate with pilots, but that’s a hard process.
The car industry isn’t doing nearly as well in terms of redundancy etc., so there are many obvious areas of improvement through solid engineering without changing anything fundamental. That said, communication is again lacking.
> Can you find an example of one that isn't down to gross mechanical failure, or just plain Operator Idiocy?
Operator Idiocy isn’t some clearly defined line. Missing a moderate fuel leak can look like idiocy after the fact, but it’s an easy mistake to make. That’s exactly the kind of thing autopilots could catch, not just from fuel sensors but from how the flight characteristics change as the aircraft gets lighter; yet aircraft have happily flown into trouble over the ocean.
Simultaneously, if you hire human translators, you are likely to get machine translations. Maybe not often or overtly, but the translation industry has not been healthy for a while.
The industry is sick because everyone is looking for the lowest prices, but translators don't like machine translation either. They don't want to just review the output, because actually doing the translation gives them a better understanding of the text they're working on.
I think Google has already shown that, in the long run, people accept ads and prefer them to paying a subscription fee. If that weren’t true, YouTube Premium would have a double-digit percentage of YouTube users and Kagi Search would be huge.
Right, but it is widely acknowledged that, despite that acceptance (we lack other options), this process eventually degrades the quality of the tool, as successive waves of product managers decide "just a little bit more advertising".
The problem that providers like YouTube have with the "pay to remove ads" model is that the people with enough disposable income that they're willing to pay $14/month to remove ads are the same demographic that advertisers are willing to pay the most to show ads to. It's the same reason why, if you watch TV during the middle of the day, the ads are all for medicine (paid for by your insurance), personal injury attorneys (paid for by the person you're suing), and cash advances for structured settlements (i.e. if you already have a settlement paying $500/mo for 30 years but you'd rather have $20,000 now) rather than for anything you'd actually go out and buy.
What will coca cola pay me to sign a contract where I drink nothing but coca cola for this year under penalty of imprisonment? Think I can crack six figs?
It is not a choice between ads or subscriptions. The choice is between ads, adblockers, or subscriptions. Hardly anyone will pay for a subscription when they have a free way of blocking the ads. It is wild that an AI company is banking on ad funding when the second major use of the tech will be to block ads entirely, even in the physical world once AR tech is good enough. Now that is a use for the AI chip on my next PC that I can get behind.
The difference here is the qualitative difference that has existed between Google Search results and other competitors. Switching away from Google Search is a high friction move for most people. I'm not sure the same goes for AI chat.
That may be true, but you can’t compare average GenAI with the best humans because there are many reasons the human output is low quality: budget, timelines, oversights, not having the best artists, etc. Very few games use the best human artists for everything.
Same with programming. The best humans write better code than Codex, but the awful government portals and enterprise apps you’re using today were also written by humans.
Model capability improvements are very uneven. Changes between one model and the next tend to benefit certain areas substantially without moving the needle on others. You see this across all frontier labs’ model releases. Also the version numbering is BS (remember GPT-4.5 followed by GPT-4.1?).
Note that this is not relevant for reasoning models, since they will think about the problem in whatever order they want before outputting the answer. Since they can “refer” back to their thinking when outputting the final answer, the output order is less relevant to the correctness. The relative robustness is likely why OpenAI is trying to force reasoning onto everyone.
This is misleading, if not wrong. A thinking model doesn’t fundamentally work any differently from a non-thinking model. It is still next-token prediction, with the same position independence, and it still suffers from the same context-poisoning issues. It’s just that the “thinking” step injects an instruction to take a moment and consider the situation before acting, as a core system behavior.
But specialized instructions to weigh alternatives still work better, as the model ends up thinking about how to think, thinking, and then making a choice.
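As a toy illustration of the "injected instruction" framing: the message roles, tag format, and wording below are my own invention, not any vendor's actual system prompt, but they show the shape of the idea, the same autoregressive decoder, with an extra instruction prepended so scratchpad tokens come out before the visible answer.

```python
# Toy sketch: "reasoning" as an injected instruction on top of ordinary
# next-token prediction. Tag format and wording are invented for illustration.

def build_messages(user_prompt: str, thinking: bool = True) -> list[dict]:
    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    if thinking:
        # The injected behavior: emit a scratchpad first, answer after it.
        messages.append({
            "role": "system",
            "content": ("Before answering, reason step by step inside "
                        "<think>...</think>. Only text after </think> is "
                        "shown to the user."),
        })
    messages.append({"role": "user", "content": user_prompt})
    return messages
```

Nothing about the decoding changes; the model is simply conditioned to spend tokens deliberating before committing to an answer, which is why the same context-poisoning failure modes still apply.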
I think you are being misleading as well. Thinking models do recursively generate the final “best” prompt to get the most accurate output. Unless you are genuinely giving new useful information in the prompt, it is kind of useless to structure the prompt one way or another, because reasoning models can generate intermediate steps that give the best output. The evidence on this is clear: benchmarks reveal that thinking models are way more performant.
You're both kind of right.
The order is less important for reasoning models, but if you carefully read thinking traces you'll find that the final answer is sometimes not the same as the last intermediate result. On slightly more challenging problems LLMs flip-flop quite a bit, and ordering the output cleverly can uplift the result. That might stop being true for newer or future models, but I iterated quite a bit on this for Sonnet 4.
This article spent a lot of words to say very little. Specifically, it doesn’t really say why working towards AGI doesn’t bring advancements to “practical” applications, or why the gazillion AI startups out there won’t either. Instead, we need Trump to step up?
More and more I feel like these policy articles about AI are an endless stream of slop written by people who aren’t familiar with current AI and have never worked on it.
That’s an interesting point. It’s not hard to imagine that LLMs are much more intelligent in areas where humans hit architectural limitations. Processing tokens seems to be a struggle for humans (look at how few animals do it overall, too), but since so much of the human brain is dedicated to movement planning, it makes sense that we still have an edge there.
For the past few years I’ve been hearing crazy stories of workarounds and scripts to deal with all these new features in Windows. Isn’t that kind of thing what was supposedly preventing people from using Linux? Replacing utilman.exe with cmd.exe is not something a normal user would ever do.
I was thinking the same thing. Never thought I'd see a world where Arch has an installer (and, jokes aside, many Linux distros have very straightforward GUI installers) while people have to... "hit Shift+F10 to get a terminal, then enter start ms-cxh:localonly" to install Windows with a local account. Jeez.