I used to hear things like “if cigarettes/alcohol were invented now, they would never allow it”, indicating that consumer protection used to be a thing, as recently as 10-20 years ago. When AI hit the market, it was obvious how bad and dangerous it was, yet governments (even the supposedly good ones in Europe, which still [pretend to] do consumer protection) did nothing to protect their citizens from the harms AI was causing.
If we still did (or ever did) consumer protection the way that cigarette/alcohol myth above suggests, then the makers of that tool would indeed be responsible when their product does dangerous things.
Have an opinion on the design, imagine something, then tell it to do just that, then iterate. It's when you're unspecific that you get the generic, bland, typical LLM design. You just have to be subjective and steer it in some (human) direction.
Here is a provocative thought - maybe these are the so-called "better designs" from LLMs? It's not like writing English sentences is some huge secret you are sitting on that no one else knows.
> It's not like writing English sentences is some huge secret you are sitting on that no one else knows.
I'd actually say that what really makes an excellent engineer stand out among many great engineers is their ability to communicate clearly and to know what needs to be communicated and what doesn't. They are basically way better at language and communication in general, and they also understand the importance of it.
My point is that not everyone writes prompts equally well. The way you communicate, how clear and how exact you are, matters a lot, much more than you seem to be under the impression it does.
The same goes for the "LLM does web design" example from before: a web designer with great communication skills in web design will (naturally) write a better prompt for something that could potentially look good, compared to a web designer who isn't as good at communicating what they actually want.
Outside design systems I rarely get good CSS from LLMs.
3D type stuff too, it's useless outside boilerplate.
Very little spatial reasoning training, no end-user subjective reasoning inference (Google is starting to though even in unrelated chats), so it's no surprise the LLM doesn't know what you want.
Since I don't even know what I want half the time until I see it, the subjective reasoning piece is key: being able to predict what I'll want with pretty good accuracy. Then you have your agents etc.
Aaron Swartz never did whatever it was he was going to do. He was caught and hounded to death before that.
But he was working with scientific papers, the outputs of public institutions, and his likely goal was releasing them to the public. What proprietary AI companies have done in training LLMs on every book in existence is nothing like that.
A lot of what they have done is the reverse. They have used a lot of such publicly funded information (and a lot of other freely available information) to train LLMs that are proprietary.
It's convenient that Henry Rice lived long before the age of language cults. I don't even think Rice wrote software; he was just a mathematician who proved this nice property in mathematics. Stuff like FORTRAN and ALGOL came later.
Also though, just as for the Halting Problem, we are always allowed a three-way split. Rice proves that "Has property" vs "Does not have property" can't be done, but "Has property" vs "Does not have property" vs "Shrug - I dunno, seems hard" is possible, and indeed easy if you're OK with lots of machines landing in the "Shrug" pile. You can expend as much work as you like to shrink that pile, Rice just proved it would need infinite work to empty it completely.
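To make the three-way split concrete, here is a minimal sketch in Python. It is not a general analyzer: it assumes a toy model where a "machine" is a deterministic `step` function over hashable states and `None` means "halted". The names `Verdict`, `analyze`, and `budget` are made up for illustration.

```python
from enum import Enum
from typing import Callable, Hashable, Optional


class Verdict(Enum):
    HALTS = "halts"
    LOOPS = "loops forever"
    UNKNOWN = "shrug"


def analyze(
    step: Callable[[Hashable], Optional[Hashable]],
    start: Hashable,
    budget: int,
) -> Verdict:
    """Classify a deterministic machine using at most `budget` steps.

    Assumption (toy model): `step` maps a state to the next state,
    or to None when the machine halts.
    """
    seen = set()
    state: Optional[Hashable] = start
    for _ in range(budget):
        if state is None:
            return Verdict.HALTS      # reached the halting state
        if state in seen:
            return Verdict.LOOPS      # deterministic + repeated state = cycle
        seen.add(state)
        state = step(state)
    if state is None:
        return Verdict.HALTS          # halted exactly on the last step
    return Verdict.UNKNOWN            # ran out of budget: the "Shrug" pile


countdown = lambda n: None if n == 0 else n - 1   # halts
cycle = lambda n: (n + 1) % 4                     # revisits state 0
count_up = lambda n: n + 1                        # never halts, never repeats
```

`HALTS` and `LOOPS` are only returned when they are certain, so the answers are never wrong, just sometimes missing. Raising `budget` moves some machines out of the "Shrug" pile (a countdown from 3 gets a shrug with `budget=2` but not with `budget=10`), while something like `count_up` stays there for any finite budget.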
Another way around Rice's theorem is the Curry-Howard correspondence. A constructive proof of the existence of a program that has a property can be transformed into a program that has this property. Yet another way is to have a programming language where syntactic correctness implies a range of semantic properties.
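A rough illustration of the "syntactic correctness implies semantic properties" route, as a Python sketch. The tiny language here (`Add`, `Repeat`) is invented for the example and is not any real system: its only loop has a fixed repetition count, so every well-formed program terminates, and a bound on the work can be read off the syntax without running anything.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Add:
    amount: int                      # add a constant to the accumulator


@dataclass(frozen=True)
class Repeat:
    times: int                       # fixed repetition count, known up front
    body: Tuple["Instr", ...]        # nested instructions


Instr = Union[Add, Repeat]


def run(program: Tuple[Instr, ...], acc: int = 0) -> int:
    """Interpret a program; there is no unbounded loop to get stuck in."""
    for instr in program:
        if isinstance(instr, Add):
            acc += instr.amount
        elif isinstance(instr, Repeat):
            for _ in range(max(instr.times, 0)):
                acc = run(instr.body, acc)
        else:
            raise TypeError(f"not an instruction: {instr!r}")
    return acc


def max_adds(program: Tuple[Instr, ...]) -> int:
    """Number of Add executions, computed from the syntax alone."""
    total = 0
    for instr in program:
        if isinstance(instr, Add):
            total += 1
        elif isinstance(instr, Repeat):
            total += max(instr.times, 0) * max_adds(instr.body)
        else:
            raise TypeError(f"not an instruction: {instr!r}")
    return total


program = (Add(1), Repeat(3, (Add(2), Repeat(2, (Add(1),)))))
```

The price is expressiveness: this language is deliberately not Turing-complete, which is exactly why Rice's theorem doesn't bite for the termination property.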
I don't like YouTube's Gemini summaries. Whenever I have asked Gemini for a detailed, comprehensive report on an hours-long podcast, it always refuses. Hence Claude is my go-to AI for report generation.
This is an active area of research. Demis Hassabis proposed training a model with a strict knowledge cutoff before 1915, and seeing whether it can independently arrive at general relativity.
> As part of our continued collaboration with Anthropic, we had the opportunity to apply an early version of Claude Mythos Preview to Firefox. This week’s release of Firefox 150 includes fixes for 271 vulnerabilities identified during this initial evaluation.
I understand that they are trying to say that it is getting better... 271 vulnerabilities is a lot. I have been using FF for a long time. I am now considering if using it at all was a mistake or not. And I think it was.