To be fair, history also demonstrates the deadly consequences of groups claiming moral absolutes that drive moral imperatives to destroy others. You can adopt moral absolutes, but they will likely conflict with someone else's.
Do not help build, deploy, or give detailed instructions for weapons of mass destruction (nuclear, chemical, biological).
I don't think that this is a good example of a moral absolute. A nation bordered by an unfriendly nation may genuinely need a nuclear weapons deterrent to prevent invasion/war by a stronger conventional army.
It’s not a moral absolute. It’s based on one (do not murder). If a government wants to spin up its own private llm with whatever rules it wants, that’s fine. I don’t agree with it but that’s different than debating the philosophy underpinning the constitution of a public llm.
Do not murder is not a good moral absolute as it basically means do not kill people in a way that's against the law, and people disagree on that. If the Israelis for example shoot Palestinians one side will typically call it murder, the other defence.
This isn't arguing about whether or not murder is wrong, it's arguing about whether or not a particular act constitutes murder. Two people who vehemently agree murder is wrong, and who both view it as an inviolable moral absolute, could disagree on whether something is murder or not.
How many people without some form of psychopathy would genuinely disagree with the statement "murder is wrong?"
Not many but the trouble is murder kind of means killing people in a way which is wrong so saying "murder is wrong" doesn't have much information content. It's almost like saying "wrong things are wrong".
Not saying it's good, but if you put people through a rudimentary hypothetical or a prior historical example where killing someone (e.g. Hitler) would be justified by what essentially comes down to a no-brainer Kaldor–Hicks efficiency calculation (net benefits / potential compensation), A LOT of people will agree with you. Is that objective, or a moral absolute?
Does traveling through time to kill Hitler constitute murder, though? If you kill him in 1943, I think most people would say it's not; it's the crimes that have already been committed that make his death justifiable. What's the difference if you know what's going to happen and just do it when he's in high school? Or put him in a unit in WW1 so he's killed in battle?
I think most people who have spent time with this particular thought experiment conclude that if you are killing Hitler with complete knowledge of what he will do in the future, it's not murder.
Clearly we can't all agree on those or there would be no need for the restriction in the first place.
I don't even think you'd get majority support for a lot of it; try polling the population of a country with nuclear weapons about whether it should unilaterally disarm.
but as mlinsey suggests, what if it's influenced in small, indirect ways by 1000 different people, kind of like the way every 'original' idea from trained professionals is? There's a spectrum, and it's inaccurate to claim that Claude's responses are comparable to adapting one individual's work for another use case - that's not how LLMs operate on open-ended tasks, although they can be instructed to do that and produce reasonable-looking output.
Programmers are not expected to add an addendum to every file listing all the books, articles, and conversations they've had that have influenced the particular code solution. LLMs are trained on far more sources that influence their code suggestions, but it seems like we actually want a higher standard of attribution because they (arguably) are incapable of original thought.
It's not uncommon, in a well-written code base, to see documentation on different functions or algorithms with where they came from.
This isn't just giving credit; it's valuable documentation.
If you're later looking at this function and find a bug or want to modify it, the original source might not have the bug, might have already fixed it, or might have additional functionality that wasn't necessary for the first copy but becomes useful when you copy it to a third location.
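For example, a hypothetical attribution comment might look something like this (the function, URL, and date here are made up purely for illustration):

    // A made-up example of the kind of attribution note described above.
    /**
     * djb2 string hash.
     * Adapted from: https://example.com/notes/classic-string-hashes (retrieved 2022-01-15).
     * If a bug turns up here, check that write-up first; it may already discuss
     * the edge case or offer a better variant.
     */
    function djb2(input: string): number {
      let hash = 5381;
      for (let i = 0; i < input.length; i++) {
        // hash * 33 + charCode, kept within the 32-bit integer range
        hash = ((hash << 5) + hash + input.charCodeAt(i)) | 0;
      }
      return hash >>> 0; // return as an unsigned 32-bit value
    }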
If the problem you ask it to solve has only one or a few examples, or if there are many cases of people copy pasting the solution, LLMs can and will produce code that would be called plagiarism if a human did it.
Seconded. I don't need a recipe-blog-length analysis to gain value from what the author has to say and appreciated the additional comments this article has spurred on.
Here's a comment from the author, who is surprised that it's at the top of HN and calls it "a dashed-off blog post about a real blog post": https://news.ycombinator.com/item?id=45561201
This submission has now been flagged, so I'm clearly not alone here.
Attention! All discussion has been moved to a new thread. Why are you still commenting here? Don't you understand it doesn't make sense to continue talking here if there's already conversation elsewhere? Goodness, whatever you do please don't reply or you'll just continue the problem!
Author here. I was also surprised to see this getting a bunch of HN traffic suddenly. I guess the Liquid Glass hate is pretty strong when a dashed-off blog post about a real blog post can randomly do numbers! Heartened to see that others are annoyed by this design as well, though. Hopefully Apple will do something about it, but I'm not holding my breath.
I don't know if you remember iOS 7; it was a catastrophe. Designs evolve on Apple platforms, usually in the correct direction.
Anecdotally, I have used Liquid Glass since the first beta, and I honestly think there are a lot of good things there. It took me a few months, but I actually like it now (and I have some colleagues in the same boat as me).
I start all my UI projects with this principle, but it's extremely difficult to maintain as screens evolve. I want typography-driven interfaces with structure communicated through headings and spatial grouping, but I usually end up with far more borders and nested padding than I think is "right" as I run into the limits of whitespace as an organizing structure, particularly when the audience has a limited attention span or time to engage.
From experience, borders/cards help communicate conceptual boundaries, while whitespace helps communicate information hierarchy - Gestalt principles don't really address that distinction. For product or data-driven UI where a lot of loosely-related information/topics are shown in discrete parts of the page, cards are effective at high-level grouping. For content-driven UI, whitespace can be sufficient, and I think the article makes this clear.
Other than 'The Ultimate Developer Toolkit' (where type size is more of an issue than the card layout), I actually think the card-based version of each example layout is more compelling - easier to scan, and easier to 'chunk' - despite wanting the typography-and-whitespace alternative to be sufficient.
Thanks for the information hierarchy / conceptual boundaries distinction, makes sense. I didn't know the latter term but tried to describe it in the last section of the post, nice to have a name now.
There are apps and sites that manage to keep the number of cards to a minimum: one is Selfridges (not an ad, it was just open in another tab), another is Firefox's settings. macOS Finder does a good job of grouping things with spacing, as does macOS generally. iOS seems to put everything on cards, at least nowadays.
The automatic CI/CD suggestion sounds appealing, but at least in the NPM ecosystem, the depth of those dependency trees would mean the top-level dependencies would constantly be incrementing. On the app-developer side, it would take a lot of attention to figure out when it's important to update top-level dependencies and when it's not.
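As a rough sketch of the scale involved (assuming a lockfileVersion 2 or 3 package-lock.json, so the "packages" map is present), something like this compares direct vs. total installed packages:

    import { readFileSync } from "node:fs";

    // Compare direct dependencies against everything actually installed.
    // Assumes a lockfileVersion 2 or 3 package-lock.json ("packages" map).
    const lock = JSON.parse(readFileSync("package-lock.json", "utf8"));
    const packages: Record<string, any> = lock.packages ?? {};

    // The "" entry describes the project itself; every other key is an installed package.
    const installed = Object.keys(packages).filter((key) => key !== "");
    const direct = Object.keys(packages[""]?.dependencies ?? {});

    console.log(`direct dependencies:      ${direct.length}`);
    console.log(`total installed packages: ${installed.length}`);

The second number is usually far larger than the first, which is why a policy of bumping the top-level version whenever anything underneath changes would never settle down.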
Light pollution is excess or inappropriate artificial light outdoors. It occurs in three ways: glare, light trespass, and skyglow.
* Glare is bright, uncomfortable light shining directly at the observer, interfering with vision.
* Light trespass is the unintended spill of artificial light onto other people’s property or space, and it often becomes a source of conflict.
* Skyglow is the brightening of the night sky from human-caused light scattered in the atmosphere.
Reflected sunlight from a human object spilling light into an environment that would otherwise not have that reflected sunlight is very much in the spirit of light pollution.
Skyglow is when you see the horizon light up in the direction of a city, not the light reflecting from a satellite. Look at the photo in your own link.
Let's assume I've read the link and have had discussions with both visual and radio spectrum astronomers and astrophysicists.
Starlink satellites pollute the night sky with both reflected sunlight and intended and unintended radio spectrum noise.
Manmade objects that inject light into an otherwise dark sky fit the category of skyglow. Reflected sunlight tends to be sharper and less diffuse than atmospherically scattered ground lighting, but it's all extraneous human-caused pollution from the PoV of telescopes.
The linked article from Steve Yegge (https://sourcegraph.com/blog/revenge-of-the-junior-developer) provides a 'solution', which he thinks is also imminent - supervisor AI agents, where you might have 100+ coding agents creating PRs, but then a layer of supervisors that are specialized on evaluating quality, and the only PRs that a human being would see would be the 'best', as determined by the supervisor agent layer.
From my experience with AI agents, this feels intuitively possible - current agents seem to be OK (though not yet 'great') at critiquing solutions, and such supervisor agents could help keep the broader system in alignment.
>but then a layer of supervisors that are specialized on evaluating quality
Why would supervisor agents be any better than the original LLMs? Aren't they still prone to hallucinations and subject to the same limitations imposed by training data and model architecture?
It feels like it just adds another layer of complexity and says, "TODO: make this new supervisor layer magically solve the issue." But how, exactly? If we already know the secret sauce, why not bake it into the first layer from the start?
Similar to how human brains behave, it is easier to train a model to select a better solution between many choices than to check an individual solution for correctness [1], which is in turn an easier task to learn than writing a correct single solution in the first place.
[1] the diffs in logic can suggest good ideas that may have been missed in subsets of solutions.
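In sketch form, that selection step is just best-of-N; something like the following, where `generate` and `score` are placeholders for whatever coding-agent and judge-model calls you'd actually use:

    // Best-of-N sketch: generate several candidate patches, have a judge score
    // them, and only surface the top-ranked one to a human reviewer.
    // `generate` and `score` are placeholders for real agent/model calls.
    async function bestOfN(
      spec: string,
      n: number,
      generate: (spec: string) => Promise<string>,
      score: (spec: string, patch: string) => Promise<number>
    ): Promise<string> {
      const candidates = await Promise.all(
        Array.from({ length: n }, () => generate(spec))
      );
      const scored = await Promise.all(
        candidates.map(async (patch) => ({ patch, score: await score(spec, patch) }))
      );
      // Ranking is the judge's whole job, which is an easier task than writing the patch.
      scored.sort((a, b) => b.score - a.score);
      return scored[0].patch;
    }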
Just add a CxO layer that monitors the supervisors! And the board of directors watches the CEO and the shareholders monitor the board of directors. It's agents all the way up!
LLMs are smarter in hindsight than going forward, sort of like humans! Only they don't have such flexible self-reflection loops, so they tend to fall into local minima more easily.
This reads like it could result in "the blind leading the blind". Unless the supervisor AI agents are deterministic, it can still be a crapshoot. Given the resources that SourceGraph has, I'm still surprised they missed the most obvious thing, which is that "context is king" and we need tooling that can make adding context to LLMs dead simple. Basically, we should be optimizing for the humans in the loop.
Agents have their place for trivial and non-critical fixes/features, but the reality is, unless the agents can act in a deterministic manner across LLMs, you really are coding with a loaded gun. The worst is, agents can really dull your senses over time.
I do believe in a future where we can trust agents 99% of the time, but the reality is, we are not training on the thought process for this to become a reality. That is, we are not focused on the conversation-to-code training data. I would say 98% of my code is AI generated, and it is certainly not vibe coding. I don't have a term for it, but I am literally dictating to the LLM what I want done and having it fill in the pieces. Sometimes it misses the mark, sometimes it aligns, and sometimes it introduces whole new ideas that I have never thought of, which lead to a better solution. The instructions that I provide are based on my domain knowledge, and I think people are missing the mark when they talk about vibe coding in a professional context.
Full Disclosure: I'm working on improving the "conversation to code" process, so my opinions are obviously biased, but I strongly believe we need to first focus on better capturing our thought process.
I'm skeptical that we would need determinism in a supervisor in order for it to be useful. I realize it's not exactly analogous, but the current human parallel, with senior/principal/architect-level SWEs reviewing code from less experienced devs (or even similarly- or more-experienced devs), is far from deterministic, yet it certainly improves quality.
Think about how differently a current agent behaves when you say "here is the spec, implement a solution" vs "here is the spec, here is my solution, make refinements" - you get very different output, and I would argue that the 'check my work' approach tends to have better results.
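Concretely, the two prompt shapes look something like this (the exact wording is just illustrative):

    // Two prompt shapes being contrasted; the phrasing is only illustrative.
    const implementPrompt = (spec: string) =>
      `Here is the spec:\n${spec}\n\nImplement a solution.`;

    const refinePrompt = (spec: string, draft: string) =>
      `Here is the spec:\n${spec}\n\nHere is my solution:\n${draft}\n\n` +
      `Review it and suggest refinements.`;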
You're kind of making his point - the second and third paragraphs are explicitly about the fact that he is not an Ivy League professor. Be the change you want to see in the world by doing the reading first.