
Can't argue with that. Your earlier point is the key: "e.g. the answer you want isn't already included in Wikipedia." Anything specialized enough not to be covered by Wikipedia or similar resources -- or where, in your specific example, the topic was only recently added -- is not a good subject for ChatGPT. Not yet, anyway.

Now, pretend you're taking your first linear algebra course, and you don't quite understand the whole determinant thing. Go ask it for help with that, and you will have a very different experience.

In my own case, what opened my eyes was asking it for some insights into computing the Cramér-Rao bound in communications theory. I needed to come up to speed in that area a while back, but I'm missing some prereqs, so textbook chapters on the topic aren't as helpful as an interactive conversation with an in-person tutor would be. I was blown away at how effective GPT-4o was at answering follow-up questions and imparting actionable insights.
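
(For anyone unfamiliar, the scalar form of the bound for an unbiased estimator, loosely stated:

    \operatorname{Var}(\hat{\theta}) \ge \frac{1}{I(\theta)},
    \qquad
    I(\theta) = \mathbb{E}\left[ \left( \frac{\partial}{\partial \theta} \log f(X; \theta) \right)^{2} \right]

where I(θ) is the Fisher information that the observation X carries about the parameter θ.)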




A problem, though, is that it is not binary. There is a whole spectrum of nonsense, and if you are not a specialist it is not easy to gauge the accuracy of the reply. Sometimes, by chance, you end up asking about something the model happens to know, but very often not. That is the insidious aspect of it. Students might rely on it in their first year because it worked a couple of times, and then learn a lot of nonsense among the truthy-sounding facts LLMs tend to produce.

The main problem is not that they are wrong; it would be simpler if they were just plain wrong. Given that, recommending that students use them as tutors is really not a good idea, unless what you want is overconfidently wrong students (I mean, more than some of them already are). It's not random doomsayers saying this; it's university professors and researchers with advanced knowledge. Exactly the people who should be trusted on this kind of thing, more than AI techbros.


We could probably find a middle ground for agreement if we said, "Don't use current-gen LLMs as a tutor in fields where the answer can't be checked easily."

So... advanced math? Maybe not such a good idea, at least for independent study where you don't have access to TAs or profs.

I do think there's a lot of value in the ELI5 sense, though. Someone who spends time asking ChatGPT-4 about Galois theory may not come away with the skills to actually pass a math test. But if they pursue the conversation, they will absolutely come away with a good understanding of the fundamentals, even with minimal prior knowledge.

Programming? Absolutely. You were going to test that code anyway, weren't you? (There's a sketch of what that looks like at the end of this comment.)

Planning and specification stages for a complex, expensive, or long-term project? Not without extreme care.

Generating articles on quantum gravity for Social Text? Hell yeah.
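
A minimal sketch of what "testing it anyway" looks like for the programming case above; parse_duration is a hypothetical stand-in for whatever helper the model actually hands you, and the asserts are the part that matters:

    # Hypothetical stand-in for a model-suggested helper.
    def parse_duration(s: str) -> int:
        """Convert an 'MM:SS' string to total seconds."""
        minutes, seconds = s.split(":")
        return int(minutes) * 60 + int(seconds)

    # The quick checks either pass or expose the bug on the spot.
    assert parse_duration("01:30") == 90
    assert parse_duration("00:05") == 5
    assert parse_duration("10:00") == 600

Either way you were writing those checks; the only question is whether the code survives them.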


No, I don't support this.

A statement I would support is: "Don't use LLMs for anything where correctness or accuracy matters, period, and make sure you carefully check every statement they make against some more reliable source before relying on it. If you use LLMs for any purpose, make sure you have a good understanding of their limitations and some relevant domain experience, and that you are willing to accept that the output may be wrong in a wide variety of ways, from subtle to total."

There are many uses where accuracy may not matter: loose machine translation to get a basic sense of what topic some text is about; good-enough OCR or text-to-speech to make a keyword index for searching; generation of acceptably buggy code to do some basic data formatting for a non-essential purpose; low-fidelity summarization of long texts you don't have time to read; ... (or, more ethically questionably, machine-generating mediocre advertising copy / routine newspaper stories / professional correspondence / school essays / astroturf propaganda on social media / ...)
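
To make the keyword-index case concrete, a rough sketch along these lines tolerates a fair amount of recognition noise (ocr_pages here is a hypothetical mapping from page number to whatever text the OCR produced):

    # Minimal sketch: build a keyword index over noisy OCR output.
    import re
    from collections import defaultdict

    ocr_pages = {  # hypothetical OCR results, typos and all
        1: "Chapter 1: lntroduction to spherical trigonometry",
        2: "Lexell's theorem and related resu1ts",
    }

    index = defaultdict(set)
    for page, text in ocr_pages.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(page)

    print(sorted(index["theorem"]))  # -> [2]

A few misrecognized words just mean a few missed or spurious hits, which is acceptable when the goal is search rather than accuracy.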

But "tutoring naïve students" seems currently like a poor use case. It would be better to spend some time teaching those students to better find and critically examine other information sources, so they can effectively solve their own problems.

Again, it's not only old theorems where LLMs make up nonsense, but also (examples I personally tried) etymologies, native plants, diseases, translations, biographies of moderately well known people, historical events, machines, engineering methods, chemical reactions, software APIs, ...

Other people have complained about LLMs making stuff up about pop culture topics like songs, movies, and sports.

> good understanding of the fundamentals

This does not seem likely in general. But it would be worth doing some formal study.


> Anything specialized enough not to be covered by Wikipedia or similar resources [...] is not a good subject for ChatGPT.

Things don't have to be incredibly obscure to make ChatGPT completely flub them (while authoritatively pretending it knows all the answers); they just have to be slightly beyond the most basic details of a common subject discussed at about the undergraduate level. Lexell's theorem, to take my previous example, is discussed in a wide variety of sources over the past 2.5 centuries, including books and papers by several of the most famous mathematicians in history, canonical undergraduate-level spherical trigonometry textbooks from the mid 20th century, and several easy-to-find papers from the past couple of decades, including historical and mathematical surveys of the topic. It just doesn't happen to show up in the training data of reddit comments and github commit messages or whatever, because it isn't covered in intro college courses, so nobody is asking for homework help about it.

If you stick to asking single questions like "what is Pythagoras's theorem" or "what is the most common element in the Earth's atmosphere" or "who was the 4th president of the USA" or "what is the word for 'dog' in French", you are fine. But as soon as you start asking questions that require knowledge beyond copy/pasting sections of introductory textbooks, ChatGPT starts making (often significant) errors.

As a different kind of example, I have asked ChatGPT to translate straightforward sentences and gotten back a translation with exactly the opposite of the meaning intended by the original (as verified by asking a native speaker).

The limits of its knowledge and response style make ChatGPT mostly worthless to me. If something I want to know can be copy/pasted from obvious introductory sources, I can already find it trivially and quickly. And I can't really trust it even for basic routine stuff, because it doesn't link to reliable sources, which makes its claims unnecessarily difficult to verify. Even published work by professionals often contains factual errors, but when you read it you can judge the author's name and reputation, look at any cited sources, compare claims from one source to another, and so on. But if ChatGPT tells you something, you have no idea whether it read it on a conspiracist blog, found it in the canonical survey paper about the topic, or just made it up.

> Go ask it for help [understanding determinants], and you will have a very different experience.

It's going to give you the right basic explanation (more or less copy/pasted from some well-written textbook or website), but if you start asking follow-up questions that get more technically involved, you are likely to hit serious errors within not too many hops. Those errors reveal that it doesn't actually understand what a determinant is; it only knows how to selectively regurgitate/paraphrase from its training corpus (and it routinely picks the wrong source to paraphrase, or mashes up two unrelated topics).

You can get the same accurate basic explanation by doing a quick search for "determinant" in a few introductory linear algebra textbooks, without really that much more trouble; the overhead of finding sources is small compared to the effort required to read and think about them.
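
(For what it's worth, the 2x2 case any of those textbooks will state up front:

    \det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc,

the signed area of the parallelogram spanned by the columns. That is exactly the kind of basic fact that is trivial to check against a source, no chatbot required.)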



