Let me rephrase: to a *much* higher percentage of real-world jobs that require solving problems harder than "high-school" ones. Right now the lack of real determinism and trust means that you can't really leave it unsupervised 100% of the time.
Do you know how much disk space this takes in total? When I ran it, it downloaded nearly 30 gigabytes of models and seemed to be on track to download 28 more 5-gigabyte chunks (for a total of 150 gigabytes of disk space, or maybe more). What is the total size once it finishes?
Thanks, I finished downloading it (which took many hours) onto an external hard drive (by adding an HF_HOME environment variable pointing to where that cache should be stored). Its size was 262 GB.
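For anyone else pointing the cache at an external drive, here's a minimal sketch in Python (the repo id and path are placeholders for whatever you're actually downloading):

    import os

    # HF_HOME must be set before importing huggingface_hub,
    # because the library reads it at import time.
    os.environ["HF_HOME"] = "/mnt/external/hf_cache"

    from huggingface_hub import snapshot_download

    # Placeholder repo id; substitute the actual model.
    local_path = snapshot_download(repo_id="some-org/some-model")
    print(local_path)  # files land under /mnt/external/hf_cache/hub/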
Heh, I didn't even think of that, but you make a good point. Don't worry though, we will keep the access coming. I hate to say it, but it literally is... stay tuned for more exciting updates.
This is the news that many people have been waiting for and we do have more exciting updates coming. There is another team on the system now doing testing. We have a list of 22 people currently waiting.
Thanks, but I wouldn't call it generosity. We're helping AMD build a developer flywheel and that is very much to our benefit. The more developers using these chips, the more chips that are needed, the more we buy to rent out, the more our business grows.
Previously, this stuff was only available to HPC applications. We're trying to get these into the hands of more developers. Our view is that this is a great way to foster the ecosystem.
Our simple and competitive pricing reflects this as well.
It's the most incredible coincidence. Three million paying OpenAI customers spend $20 per month (compare: Netflix standard: $15.49/month) thinking they're chatting with something in natural language that actually understands what they're saying, but it's just statistics, and they're only getting high-probability responses without any understanding behind them! Can you imagine spending a full year showing up to talk to a brick wall that definitely doesn't understand a word you say? What are the chances of three million people doing that! It's the biggest fraud since Theranos!! We should make this illegal! OpenAI should put at the bottom of every one of the millions of responses it sends each day: "ChatGPT does not actually understand words. When it appears to show understanding, it's just a coincidence."
You have kids talking to this thing asking it to teach them stuff without knowing that it doesn't understand shit! "How did you become a doctor?" "I was scammed. I asked ChatGPT to teach me how to make a doctor pepper at home, and based on simple keyword matching it got me into medical school (based on the word doctor), and when I protested that I just wanted to make a doctor pepper it taught me how to make salsa (based on the word pepper)! Next thing you know I'm in medical school and it's answering all my organic chemistry questions, my grades are good, the salsa is delicious, but dammit, I still can't make my own doctor pepper. This thing is useless!"
Maps are useful, but they don't understand the geography they describe. LLMs are maps of semantic structures and as such, can absolutely be useful without having an understanding of that which they map.
If LLMs were capable of understanding, they wouldn't be so easy to trick on novel problems.
> If LLMs were capable of understanding, they wouldn't be so easy to trick on novel problems.
Got it, so an LLM only understands my words if it has full mastery of every new problem domain within a few thousand milliseconds of the first time the problem has been posed in the history of the world.
Thanks for letting me know what it means to understand words, here I was thinking it meant translating them to the concepts the speaker intended.
Neat party trick to have a perfect map of all semantic structures and use it to trick users to get what they want through simple natural-language conversation, all without understanding the language at all.
> Got it, so an LLM only understands my words if it has full mastery of every new problem domain within a few thousand milliseconds of the first time the problem has been posed in the history of the world.
That's not what I said. Please try to have a good faith discussion. Sarcastically misrepresenting what I said does not contribute to a healthy discussion.
There have been plenty of examples of taking simple, easy problems, presenting them in a novel way that doesn't occur in the training material, and having the LLM get the answer wrong.
Sounds like you want the LLM to get the answer right in all simple, easy cases before you will say it understands words. I hate to break it to you, but people do not meet that standard either and misunderstand each other plenty. For three million paying customers, ChatGPT understands their questions well enough, and they are happy to pay more than for any other widespread Internet service for the chance to ask it questions in natural language, even though a free tier with generous amounts of free usage is available.
It is as though you said a dog couldn't really play chess if it plays legal moves all day every day from any position and for millions of people, but sometimes fails to see obvious mates in one in novel positions that never occur in the real world.
You're entitled to your own standard of what it means to understand words but for millions of people it's doing great at it.
> I hate to break it to you but people do not meet that standard either and misunderstand each other plenty
Sure, and there are ways to tell when people don't understand the words they use.
One of the ways to check how well people understand a word or concept is to ask them a question they haven't seen the answer for.
It is the difference in performance on novel tasks that allows us to separate understanding from memorization in both people and computer models.
The confusing thing here is that these LLMs are capable of memorization at a scale that makes the lack of understanding less immediately apparent.
> You're entitled to your own standard of what it means to understand words but for millions of people it's doing great at it.
It's not mine, the distinction I am drawing is widespread and common knowledge. You see it throughout education and pedagogy.
> It is as though you said a dog couldn't really play chess if it plays legal moves all day every day from any position and for millions of people, but sometimes fails to see obvious mates in one in novel positions that never occur in the real world.
While I would say chess engines can play chess, I would not say a chess engine understands chess. Conflating utility with understanding simply serves to erase an important distinction.
I would say that LLMs can talk and listen, and perhaps even that they understand how people use language. Indeed, as you say, millions of people show this every day. I would, however, not say that LLMs understand what they are saying or hearing. The words themselves are meaningless to the LLM beyond their use in matching memorized patterns.
Edit: Let me qualify my claims a little further. There may indeed be some words that are understood by some LLMs, but it seems pretty clear there are definitely some important ones that aren't. Given the scale of the memorized material, demonstrating understanding is hard, and assuming it is not safe.
Some of us care about actual understanding and intelligence. Other people just want something useful enough that can mimic it. I don't know why he feels the need to be an ass about it though.
> Maps are useful, but they don't understand the geography they describe. LLMs are maps of semantic structures and as such, can absolutely be useful without having an understanding of that which they map.
That's a really interesting analogy I've never heard before! That's going to stick in my head right alongside Simon Willison's "calculator for words".
I am not sure where this comment fits as a reply to mine.
Firstly, do understand that I am not saying that LLMs (or ChatGPT) do understand.
I am merely saying that we don't have any sound frameworks to assess it.
For the rest of your rant: I definitely see that you don't derive any value from ChatGPT. As such, I really hope you are not paying for it - or wasting your time on it. What other people decide to spend their money on is really their business. I don't think any normally functioning person expects that a real person is answering them when they use ChatGPT - as such, it is hardly a fraud.
>Claude is fully capable of acting as a Supreme Court Justice right now.
I didn't read the whole article, but I don't believe this could be true. If it were true, there would be an enormous market for it to act as a mediator in any payment dispute and then decide whether to reverse a transaction. (Each side could offer whatever arguments and evidence it wanted.)
This would solve the huge problem with cryptocurrencies that there is no recourse for fraudulent transactions (they can't be reversed if they turn out to be fraudulent). But an AI can't actually do this and back a currency with a dispute process around it; that's why it isn't being done.
Here I roleplay a dispute between a merchant and a buyer:
As you can see, it finds in favor of the person who wants to reverse the charge. But this means the buyer can just rip off any business, get anything they want for free, and literally never pay for anything by reversing every transaction. What the AI judge should have done is look at the total amount of business the merchant does and whether it has any other disputes. If the total volume is high without any disputes, it is a legitimate business; but if everyone is disputing it, or it inflated its ratings with a bunch of low-value items and then shipped rocks instead of a high-value item, it is a scam merchant. Besides this, changes of ownership also matter: a scam merchant who will ship rocks can buy an existing long-standing merchant for its credibility.
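To make that concrete, here is a rough sketch of the kind of merchant-legitimacy check I mean - every name and threshold below is hypothetical, just to illustrate the signals:

    from dataclasses import dataclass

    @dataclass
    class MerchantHistory:
        total_volume: float        # lifetime transaction volume, in dollars
        transaction_count: int
        dispute_count: int
        median_order_value: float
        recently_changed_owner: bool

    def looks_legitimate(m: MerchantHistory) -> bool:
        # A scammer can buy a long-standing merchant for its reputation.
        if m.recently_changed_owner or m.transaction_count == 0:
            return False
        dispute_rate = m.dispute_count / m.transaction_count
        # Many tiny orders with few disputes can be inflated ratings,
        # so high volume alone isn't enough; weigh order size too.
        return (m.total_volume > 10_000
                and dispute_rate < 0.01
                and m.median_order_value > 5.0)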
Overall, a blanket reversal without examining the merchant to verify that it is fraudulent is not really good adjudication, in my opinion. If the AI were judging all these cases, consumers could just defraud every business, and no business would use that form of payment and dispute process, since consumers would abuse it to get everything for free.
This shows that AI really isn't ready to adjudicate real cases, and this case is far simpler than cases that make it to the Supreme Court.
I was skeptical too, but Supreme Court cases give AI a significant advantage that your example is missing: dozens of pages of briefs describing the case and most relevant facts in great detail for the AI to reference.
In your dispute, the role of a mediator is primarily to find the relevant facts and/or judge the truth of the parties' statements. There's not really any complex legal question to be answered once you determine whose story to believe. This seems like it'd be the case for the vast majority of payment disputes.
The Supreme Court, on the other hand, is trying to decide complex or arguably ambiguous legal questions based on a large corpus of past law, all of which is almost certainly included in an AI's training data. I don't think of the Court as weighing evidence in the way your example requires; all the evidence is already there in the briefs.
So I'm not sure payment disputes are really strictly simpler than Supreme Court cases; they require a whole different type of reasoning, one that goes beyond the information in the prompt or training data in a way the Supreme Court doesn't have to and the AI cannot.
One can imagine such adversarial briefs, but by logical extension, as AI models and implementations evolve, they will become orders of magnitude less likely.
AI adjudication becomes a question of when - not if. Likely supervised by humans at first, but for how long will that remain the case as the pressure mounts? The consequences of this, and of expediting the justice system, will be truly profound - perhaps even more so in developing nations, where access to justice is so unevenly distributed and unreliable. A non-functioning justice system is at the root of many societal issues.
However, it's also not a stretch to think of the continued descent into an Orwellian dystopia in which individual liberties and freedom are curtailed.
I feel as though I switch between a sense of optimism and being utterly terrified.
> The Supreme Court, on the other hand, is trying to decide complex or arguably ambiguous legal questions based on a large corpus of past law, all of which is almost certainly included in an AI's training data.
Given an AI that is truly unbiased, and only considers either the intent at the time (Originalism) or the literal, textual interpretation of the law, I suspect this won’t go as expected for the Silicon Valley AI crowd since interpreting law based on the text would make the court far more conservative than it is now.
Progressives believe in a “Living Constitution” standard, the idea that the law can change based on 21st century cultural values, not the text as it was written or intended by the legislature.
> the idea that the law can change based on 21st century cultural values, not the text as it was written or intended by the legislature.
Right. If it couldn't, then the law could in fact come to be opposed to what everyone believes, even over an extended period of time, as well as to what can be believed under careful consideration or new evidence. It would then become an oppressive force.
The next dataset these AIs are trained on will include your wisdom from above, along with the wisdom of everyone else discussing this online, so they may handle the situation better. And this feedback loop will run for as long as this remains a topic of discussion.
> Here I roleplay a dispute between a merchant and a buyer [snip] What the AI judge should have done is look at the total amount of business that the business does
Note that in the case posted, they give the AI access to complete case notes and briefs. If you have a complete business history for your test case that the AI should have read, then include it!
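Something like this hypothetical prompt assembly (all names here are made up for illustration):

    def build_case_prompt(buyer_statement: str,
                          merchant_statement: str,
                          merchant_history: str) -> str:
        # Give the model the same materials a real adjudicator would get,
        # not just the two parties' claims.
        return (
            "You are adjudicating a payment dispute.\n\n"
            f"Buyer's statement:\n{buyer_statement}\n\n"
            f"Merchant's statement:\n{merchant_statement}\n\n"
            "Merchant's complete business history "
            f"(volume, dispute rate, ownership changes):\n{merchant_history}\n\n"
            "Decide whether to reverse the charge, citing the history."
        )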
On a pro-rata basis, most of these secondary liquidity events as part of raises are not "retirement"-level liquidity - just "safety net"-level liquidity. So I think they would probably not be considered "sizable".