It's more like, good things happen to be harder while bad things are more often easy. So being able and eager to do hard things lets you actually choose (and then hopefully you choose the good stuff).
for the life of me I can't understand why y'all care so much about this. This is what bad software is? The corner radii are slightly off? Doesn't that seem a bit... particular?
I think I know what you (and Apple) mean, but it doesn't make sense IMHO.
Because then, in apps without a toolbar, like Terminal, the window corner is concentric with the close button and has a smaller radius.
In apps with a toolbar, it's still concentric with the close button, but has a larger radius. Because it probably also tries to be concentric with and accommodate toolbar items on the right-hand side.
But then, why not just keep the larger radius for both window types, so they are consistent? It wouldn't break concentricity and you wouldn't have non-matching corners, especially at the bottom, where there are usually few or no elements to be concentric with anyway.
That's exactly the kind of detail I'm talking about. There is a nice idea behind it, but it has not been thought through IMHO.
I have to look at it all day, so no. What would you call bad software? Bad code? Electron? None of that has any meaningful effect on my day to day experience as a user. But no matter what apps I'm using, Apple's terrible design decisions are ever-present. It's like having dirty glasses.
You don’t need to care, but for the ones who do, Apple was one of the few vendors one could identify with. Attention to detail and craftsmanship was their motto.
It's incredibly ugly, and part of the value of Apple was aesthetics. It's also distracting: when I want to focus on something, I want distracting elements to get out of the way.
Personally it's the fact they introduced such an extreme and opinionated UI design change, but then couldn't be bothered to roll it out consistently. The old window radii were fine. Sequoia looked good. The OS felt good as a whole. Then we get an update where you have a very opinionated theme forced on you in a half-assed fashion. If you could just disable it, return to classic radii, it'd be a nothing burger and Apple would roll it back if enough people disabled it.
There are people who have OCD and can't help seeing these things. It's great for coding and spotting minor changes, but it's shit for real life - trust me.
The number of times an auto-update of some app has caused the thought process "but that wasn't like that yesterday… or was it… hm… oh, it was an update". Just minor things, small and mostly unnoticeable if you don't have an "eye for details".
> There are people who have OCD and can't help seeing these things. It's great for coding and spotting minor changes, but it's shit for real life - trust me.
I don't have OCD, but I easily notice inconsistencies in the various design choices these mega-corporations continue to fumble.
It's less "OMG I can't focus on coding because Calculator and TextEdit aren't sharing the same border radius" and more "the UX/UI department seems to be on perpetual vacation if Apple is letting simple things like this slip through". This specific case is just one example; every version of macOS seems to get worse when it comes to consistency.
This happens to non-native English speakers a lot (like me). My style of writing is heavily influenced by everything I read. And since I also do research using LLMs, I'll probably sound more and more like an AI as well, just by reading its responses constantly.
I just don't know what's supposed to be natural writing anymore. It's not in the books, and it's disappearing from the internet, so what's left? Some old blogs, for now, maybe.
The wave of LLM-style writing taking over the internet is definitely a bit scary. Feels like a similar problem to GenAI code/style eventually dominating the data that LLMs are trained on.
But luckily there's a large body of well written books/blogs/talks/speeches out there. Also anecdotally, I feel like a lot of the "bad writing" I see online these days is usually in the tech sphere.
I've had the same thought, maybe more grandiosely. The idea is that LLM prompts are code -- after all they are text that gets 'compiled' (by the LLM) into a lower-level language (the actual code). The compile process is more involved because it might involve some back-and-forth, but on the other hand it is much higher level. The goal is to have a web of prompts become the source of truth for the software: sort of like the flowchart that describes the codebase 'is' the codebase.
No it doesn’t get compiled. Compilation is a translation from one formal language to another that can be rigorously modeled and is generally reproducible.
Translating from a natural language spec to code involves a truly massive amount of decision making because it’s ambiguous. For a non trivial program, 2 implementations of the same natural language spec will have thousands of observable differences.
Where we are today, where agents require guardrails to keep from spinning out, there is no way to let agents work on code autonomously or constantly recompile specs without ending up with all of those observable differences constantly shifting, resulting in unusable software.
Tests can’t prevent this because for a test suite to cover all observable behavior, it would need to be more complex than the code. In which case, it wouldn’t be any easier for machine or human to understand.
The only solution to this problem is that LLMs get better.
Personally I think at the point they can pull this off, they can do any white-collar job, and there's no point in planning for that future because it results in either Mad Max or Star Trek.
well you have to expand your definition of "compile" a bit. There is clearly a similarity, whether or not you want to call it the same word. Maybe it needs a neologism akin to 'transpiled'.
other than that you seem to be arguing against someone other than me. I certainly agree that agents / existing options would be chaotic hell to use this way. But I think the high-level idea has some potential, independent of that.
I fundamentally don’t think the higher level idea has any potential because of the ambiguity of natural language. And I certainly don’t think it has anything in common with compilation unless you want to stretch the definition so far as to say that engineers are compilers. It’s delegation not abstraction.
I think we’ll either get to the point where AI is so advanced it replaces the manager, the PM, the engineer, the designer, and the CEO, or we’ll keep using formal languages to specify how computers should work.
The "prompts are code" framing is right, and the compile analogy holds further than people think. Real code has structure: typed parameters, return types, separated concerns. A raw prose prompt is more like a shell one-liner with everything inlined. It works, but it breaks when you try to reuse or modify it.
If you take the compile idea seriously, the next step is to give prompts the same structure code has: separate the role from the context from the constraints from the output spec. Then compile that into XML for the model.
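Roughly what I mean, as a toy sketch (the field names and XML tags here are just illustrative, not any standard):

```python
# A "prompt" as a typed record instead of a prose blob, "compiled" to
# XML-tagged text for the model. Purely a sketch of the idea.
from dataclasses import dataclass

@dataclass
class Prompt:
    role: str
    objective: str
    constraints: list[str]
    output_format: str

    def compile(self) -> str:
        # Each typed field becomes its own XML section.
        constraints = "\n".join(f"- {c}" for c in self.constraints)
        return (
            f"<role>{self.role}</role>\n"
            f"<objective>{self.objective}</objective>\n"
            f"<constraints>\n{constraints}\n</constraints>\n"
            f"<output_format>{self.output_format}</output_format>"
        )

p = Prompt(
    role="technical editor",
    objective="tighten the abstract",
    constraints=["keep citations", "under 150 words"],
    output_format="plain text",
)
print(p.compile())
```

The point being that once the parts are separate, you can reuse a role across prompts or swap constraints without rewriting the whole thing, the way you'd refactor code.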
I built flompt (https://github.com/Nyrok/flompt) as a tool for this. Canvas where you place typed blocks (role, objective, constraints, output format, etc.) and compile to structured XML. Basically an IDE for prompts, not a text editor. A star would help a lot if this resonates.
One problem with this is that there isn't really a "current prompt" that completely describes the current source code; each source file is accompanied by a full chat log, including false starts and misunderstandings. It's sort of like reading a git history instead of the actual file.
> each source file is accompanied by a full chat log, including false starts and misunderstandings. It's sort of like reading a git history instead of the actual file.
My Git history contains links between the false starts and misunderstandings and the corrections, which then also include a paragraph on why this was a misunderstanding or false start. It is a lot better than just a single linear log from LLMs.
true, but that just means that's the problem to solve. probably the ideal architecture isn't possible right now. But I sorta imagine that you could later on take the full transcript of that conversation and expect any LLM to implement more or less the same thing based on it, so that eventually it becomes a full 'spec'.
And maybe there is a way to trim the parts out of it that are not needed... like to automatically produce an initial prompt which is equivalent to the results of a longer session, but is precise enough so as to not need clarification upon reprocessing it. Something like that? I'm not sure if that's something that already exists.
> But I sorta imagine that you could later on take the full transcript of that conversation and expect any LLM to implement more or less the same thing based on it
Why would you think this though? There are an infinite number of programs that can satisfy any non-trivial spec.
We have theoretical solutions to LLM non-determinism, but we have no theoretical solutions to prompt instability, especially when we can't even measure what correct is.
yeah but all of the infinite programs are valid if they satisfy the spec (well, within reason). That's kinda the point. Implementation details like how the code is structured or what language it's in are swept under the rug, akin to how today you don't really care what register layout the compiler chooses for some code.
There has never been a non-trivial program in the history of the world that could just "sweep all the implementation details under the rug".
Compilers use rigorous modeling to guarantee semantic equality and that is only possible because they are translating between formal languages.
A natural language spec can never be precise enough to specify all possible observable behaviors, so your bot swarm trying to satisfy the spec is guaranteed to constantly change observable behaviors.
This gets exposed to users as churn, jank, and workflow-breaking bugs.
I know people love to hate on the AI overviews, and I'm a person who generally hates both Google and AI. But--I see them as basically good and ideal. After all, most of the time I am googling something trivial, like a simple fact. And for the last decade, when I have to click into sites for the information, it's some SEO spam-ridden garbage site. So I am very glad to not have to interact with those anymore.
Of course Google gets little credit for this since it was their own malfeasance that led to all the SEO spam anyway (and the horrible expertsexchange-quality tech information, and the stupid recipe sites that put life stories first)... but at least now there is some backpressure against the spammy crap.
I am also convinced that the people here reporting that the overviews are always wrong are... basically lying? Or more likely applying some serious negative bias to the pattern they're reporting. The overviews are wrong sometimes, yes, but surely it is like 10% of the time, not always. Probably they're biased because they're generally mad at Google, or at AI being shoved in their face in general, and I get that... but you don't make the case against Google/AI stronger by misrepresenting it; it is a stronger argument if it's accurate and resonates with everyone's experiences.
> I see them as basically good and ideal. After all most of the time I am googling something like trivial, like a simple fact. And for the last decade when I have to click into sites for the information it's some SEO spam-ridden garbage site.
What good is it if the overviews lie some percentage of the time (your own guess is 10%) and you have to search to verify that they aren't making shit up anyway. Also, those SEO spam-ridden garbage sites google feeds you whenever you bother to look past the undependable AI summaries are mostly written by AI these days and prone to the same problem of lying which only makes fact checking google's auto-bullshitter even harder.
a lot of times the thing I'm searching for is something I kinda know and just want to see verified, or which, as soon as I see it, I'll know if it's right or not. So... some good?
That's incomplete, because another "nobody remembers" is when the hallucination differs from reality, but the reader doesn't promptly detect the problem and remember where they got it from.
Think about the urban legends in the style of "the average person eats X spiders per year." It's extremely unlikely that Rumor Patient Zero is in a position to realize it's wrong, or that they will inform the next person that it came from an LLM summary.