
I have been using ChatGPT pretty consistently for various summarization-type tasks for a while now.

But this weekend I tried using it to learn a technical concept I've struggled with. Previously, I could have spoken about the topic intelligently, but could not have put it to any practical use.

In a few quick prompts and less than an hour of synthesis, I was able to get to enough practical knowledge to try building something from scratch. And then I built something in another 2-3 hours.

I think this would have previously taken me 1-2 months of dedicated hitting-my-head-on-the-wall time to learn.

In that moment, I had a realization similar to the author.

Basic flow of the prompt went something like:

  1. Tell me about this thing that's part of the concept that I don't understand?

  2. Explain this new term you threw at me? Do it in the context of an industry I understand well.

  3. Explain yet another concept you mentioned, in the context of the same industry?

  4. This sounds a lot like this other concept I know. How do these differ in practical use?

  5. Write me some sample code, in a language I understand very well, that does X use case that you mentioned?

  6. What are these other concepts you used?

  7. How can I use this new concept to do something I've implemented before?

Essentially, my knowledge of other concepts and industries, and being able to draw parallels, allowed me to learn much, much faster.



I have recently spent a lot of time reading the USB 2.0 specification and figuring out a race condition I've been experiencing that turned out to actually be a deficiency of the spec itself.

Just for fun, I asked ChatGPT about it. This is its answer:

> According to the USB 2.0 specification, a remote wakeup event can be initiated by a USB device in a suspended state to request the host to resume communication with the device. The device sends a remote wakeup signal to its hub, which then forwards the signal to the host.

> However, if the upstream port of the hub is in the process of being suspended, the remote wakeup signal from the downstream device may not be received by the host. This is because the hub's upstream port is no longer able to forward the signal to the host during the suspension process.

> As a result, the remote wakeup event may fail, and the downstream device may remain in a suspended state indefinitely. To avoid this race condition, the USB 2.0 specification recommends that downstream devices wait for a certain period before sending a remote wakeup signal to their hubs to ensure that the hub's upstream port is not being suspended.

It sounds as if it made perfect sense, except that to know it actually completely doesn't, you have to spend hours reading the spec first.

We're going to see a huge spike of confidently wrong "experts" if they learn by asking ChatGPT to explain things to them.


This has been my experience as well, consistently. It's similar to a phenomenon attributed to news: it's confident, and sounds plausible until the topic is something you know; then you realize it's full of errors.

I don't trust ChatGPT to teach me something new; when I ask it about topics I do know about, it answers with a mix of correct, incorrect, and borderline/misleading information, all delivered in a confident tone. Based on this, I wouldn't use it to learn information or solve a problem where I don't already have a solid grasp of the material.


> It's similar to a phenomenon attributed to news: it's confident, and sounds plausible until the topic is something you know; then you realize it's full of errors.

Reminds me of the "Gell-Mann Amnesia effect" (as coined by Michael Crichton):

https://en.wikipedia.org/wiki/Michael_Crichton#GellMannAmnes...

In short: you trust a source on topics you're unfamiliar with; when the source covers topics you are familiar with, you realize it's full of errors; but then you turn the page and trust it again on further topics you're unfamiliar with.


The idea is stated succinctly as Knoll's Law of Media Accuracy: "Everything you read in the newspapers is absolutely true except for the rare story of which you happen to have firsthand knowledge."

The earliest instance I found of Erwin Knoll's quote is from 1982[0]. I suspect either Michael Crichton or Gell-Mann had heard this quote, which in turn influenced their discussion.

[0] https://www.nytimes.com/1982/02/27/us/required-reading-smith...


Sounds like journalism generally


From my experience it's a mixed bag. Sometimes it works perfectly. It wrote a simple graphics game in JavaScript on the first try, from just one prompt. Funny thing, I don't know much about JS or HTML. In another try it failed to create a simple Python program using OpenGL. It never managed to point the camera at (0,0,0), had problems with rotations, did only 2 of them instead of 3, and so on, in spite of me trying to correct it. It was a complete failure. In another case, Python working with text files, it did a great job. That was something useful.


This sounds consistent with the parent comment's point. The case where it seemed to work perfectly for you is the one where you didn't know much about the subject.


No, because it objectively worked. It made the game.


Yeah, 'confidently wrong "experts"' is nothing new. We just have to keep reminding ourselves that they exist. Gell-Mann amnesia is real.

Also the models for these AI systems are built (in part) on "confidently wrong" inputs.


It could be that the corpus ChatGPT was trained on is full of ‘confidently wrong’ answers from these ‘experts’. One solution could be to train these LLMs on a higher-quality corpus from real experts instead of random text from the internet. But would that just bring us back to the days of expert systems?


That will not solve the problem, because when GPT doesn't have the answer, it will make one up by copying the structure of correct answers but without any substance.

For instance, let's say your LLM has never been told how many legs a snake has, it knows however that a snake is a reptile and that most reptiles have four legs. It will then confidently tell you "a snake has four legs", because it mirrors sentences like "a lizard has four legs" and "a crocodile has four legs" from its training set.


I don't think this is necessarily the case anymore. The Bing implementation of ChatGPT has a toggle for how cautious it should be about getting things wrong. I was working on a very niche issue today and asked it what a certain pin was designated for on a control board I am working on. I believe it is actually undocumented and wanted to see what ChatGPT would say. And it actually said it didn't know and gave some tips on how I might figure it out. I suppose it is possible that it synthesized that answer from some previous Q&A somewhere, but I couldn't find any mention of it online except for in the documentation.


If we weren't completely hamstrung by copyright law we could legally train it on lots of actual books.


"Con man" literally comes from "confidence man". If you have no morality or even ego, and your only goal is to answer a question, then confident answers will be the result regardless of their validity.


I did a similar thing, trying to shortcut finding some rather obscure information in the Java JNI specs, which are a similarly "readable" huge bunch of documents, much like the USB 2.0 spec (which I also happen to have touched a few times, so to all those people claiming this is a super exotic thing... well, it's not; someone's gotta write all those device drivers for all those hardware gadgets, after all).

ChatGPT gave me a very plausible-sounding result, but mixed up the meaning of two very important values of key parameters which are unfortunately not very well named, so the mix-up can go unnoticed very easily. I only noticed it when some things didn't quite match up and I decided to carefully compare all the results from ChatGPT with the actual spec content. In the end, ChatGPT didn't save me any time; it rather cost me quite a bit, because I still had to dive through the original spec and cross-check ChatGPT's statements.

Yeah, if all you ask are simple questions which were already answered a thousand times on StackOverflow and in beginner tutorials, ChatGPT might be quite helpful. It might even be able to write some simple glue code Python scripts. But whose job is limited to that kind of trivial stuff? Mine isn't, except maybe during the first few days when learning a new technology, but after those it usually gets tricky very quickly, either because I must tackle problems that lead me knee-deep into delicate details of the tech stack, or because I need to extend the scope of my work again in order to integrate whatever I'm working on into a larger system. Sometimes it's both at the same time. I'd be bored to death if it was different.

That's why I don't really consider ChatGPT and similar systems to be a threat to my professional career.


I usually code in more esoteric bits of tech, on problems which I didn't expect ChatGPT to do well on, but I tried it at work on a standard backend stack (Java, Ivy, Ant) and it was absolutely _terrible_. It kept making stuff up and then I kept correcting it. I cannot understand how people are using it for work?!


It depends. I've gotten some use out of it recently, but I have found I have to give it prompts more akin to pseudocode.

I'm not a developer, let alone a Java developer, but I actually got some mileage with ChatGPT writing a Ghidra script today. I sat down with pencil and paper and identified the inputs and outputs I thought I'd need, the different methods, and what not. I then started passing it prompts and got some working Java back.

For me, this was useful because I almost never program in Java, and so simple things like declaring a string with `String foo` I would normally have to go look up again.

Of course, it wouldn't be useful or able to do what I wanted if I didn't already understand programming concepts and what not.


Just tried using it over the weekend to write a telegram bot. It wrote some really nice code that was WAY out of date, like seven versions behind. That's fine, I guess, though the code was useless. It did look OK, for whatever version that was?

Later, I was integrating with some stuff from the Subsonic API. I noticed there's no way to get the most recent songs played on a server, though there is a way to get albums. I thought maybe I was missing something in the docs, so I asked it how to get the most recently played song or songs using the API. In response, it made up an endpoint that used the same parameters and conventions as the album endpoint. Of course, that endpoint doesn't exist, so this advice is also useless and kind of annoying, given that I have to check whether I'm taking crazy pills by carefully looking at the docs again.

The funny thing is that when I called it out, it just made up more and more stuff, or answered with irrelevancies. It really hates not being helpful, probably because it was taken out back and beaten for not being helpful by the army of mechanical turks that trained it.

Anyway, it's good for making up nonsense names for my dungeons and dragons campaign, so that's something.


I asked it LaTeX questions about generic data-input tools. It brought up the datatool package. Then I asked for a YAML input tool. It brought up the yaml package. Which doesn't exist. It even gave me examples!


Hallucinating is the least surprising behavior there. Anyone who uses LLMs should be expected to deal with completely made up stuff, period. The bigger problem comes from things that are just subtly wrong, that may even pass the verification at first glance, but make you arrive at wrong conclusions unless you put a lot of effort into reviewing it. In my experience it does it so often, that it effectively negates any time savings from using it when it performs well.


So for the out of date code, did you ask it to rewrite it following the more modern version of the API/SDK?

It gives me incorrect stuff all the time, but I find that once you are in the correct ballpark for an answer, it just takes some tweaks to get where it needs to be.

You can also make corrections and it will generally stick to your corrected info while in the same session.


Yes? It didn't know about the new code.


Was the new code newer than 2021? Supposedly GPT-3.5 and lower have limited training data past 2021.


> It sounds as if it made perfect sense, except that to know it actually completely doesn't, you have to spend hours reading the spec first.

This has been my experience as well. As I work on some projects I’ve been asking it basic questions for things I’ve already learned or know quite well. Once you deviate past the basics (content you’d find in common tutorials) the hallucination rate is out of control.

The surreal part is that it all sounds so confident and plausible. The way it puts words together satisfies my “that sounds about right” reflex, but the actual content is incorrect or illogical quite frequently.

If I continue prodding I can get it to come up with the right answer many times, but I have to ask it a lot of leading questions to get there.


> This has been my experience as well. As I work on some projects I’ve been asking it basic questions for things I’ve already learned or know quite well. Once you deviate past the basics (content you’d find in common tutorials) the hallucination rate is out of control.

That's inherent in the approach. Large language models reflect the majority opinion of the training set. If there isn't enough source material in an area for the same thing to have been covered in a few different ways, the thing gets lost.

This is well known. What's being done about it?


But it seems like they aren't just storing the majority opinion, and opinions in one area can be influenced by other related areas. I am assuming it's a harder problem to select only for the majority, to the exclusion of everything else, than to sample it all.

With prompting maybe we could tease out the minority views, but I could also see it just confidently making shit up to a higher degree.


So basically, the typical disinformation asymmetry gets even worse. It’s now cheaper than ever to produce fake information and even harder to check facts.

And most is probably not even done on purpose.


I've noticed this as well. It tends to get you further than a quick Google search, or two, but quickly reaches a limit or offers incorrect information. That's not to say it isn't helpful for delving into a topic, but it certainly needs much more refining. I would prefer more reference links, and source data to come along with these answers.

Now, overall this is an improvement, since the old way would have been for an amateur to do a quick Google search and come to a false conclusion, or get no understanding at all.


Would the old way have worked circa 2010 before SEO articles ruined vanilla search? Is the new way preferable because SEO-ish approaches haven't caught up? If so, can we limit searches to the LLM's trusted learning corpus and get back to a simpler time?


SEO was affecting search long before 2010


Definitely. I just remember the pages of total junk article results starting about 2010.


A lot of hay has been made of this, but everyone working with ChatGPT directly A) knows this and B) is champing at the bit for plugins to be released so we can get on with building verifiable knowledge-based systems. It'll be an incredibly short turnaround, since everyone is already hacking this into the existing API by coming up with all kinds of prompt-based interfaces; the plugin API will make it dead simple, and we'll see a giant windfall of premade systems land practically overnight. So that huge spike you're predicting is never going to materialize.


I’m not sure if it will happen quite as fast as you suggest, but I also expect that plugins and similar techniques will improve the reliability of LLMs pretty quickly.

To the extent that the frequent reports on HN and elsewhere of unreliable GPT output are motivated by a desire to warn people not to believe all the output now, I agree with those warnings. Some of those reports seem to imply, however, that we will never be able to trust LLM output. Seeing how quickly the tools are advancing, I am very doubtful about that.

Ever since ChatGPT was released at the end of November, many people have tried to use it as a search engine for finding facts and have been disappointed when it failed. Its real strengths, I think, come from the ability to interact with it—ask questions, request feedback, challenge mistakes. That process of trial and error can be enormously useful for many purposes already, and it will become even more powerful as it becomes automated.


It'll happen pretty quickly; it takes less than a weekend to build an MVP, and I've done it. I'm pretty sure this is the new todo-list app, given how fundamental and easy it is.


A well-integrated LLM is obviously going to be much more useful than ChatGPT is today, but it's not going to be a silver bullet for all of its problems.


A big advantage with the "new bing" compared to ChatGPT is it'll tell you its sources and you can verify that A: they're trustworthy and that B: it hasn't just turned the source into total garbage. So I hope that direction is the future of this sort of stuff. Although a problem seems to be a lack of access to high quality paid sources.


If an investor were to ask ChatGPT to summarize the state of the art on the software stack I'm working on, and its core algorithms (which have wikipedia pages and plenty of papers), they'd get the wrong impression and might consider us liars. I know because I tried this. The results were flat wrong.

That's what I'm really worried about. We know very well what our system can do and be trusted to do, but if you know just enough to ask the wrong questions, you'll get your head twisted up.


That relates directly to what my question to the GP poster would be: how do you deal with the fact that GPT frequently just makes plausible-sounding shit up? I guess the answer is that you have to spend 10x as much time validating what it says (c.f. the bullshit asymmetry principle) as you do understanding what it is trying to tell you. That doesn't seem like a big win from the POV of wanting to learn stuff.


Agreed. Except I had the time and opportunity to test it out, not with the code it wrote, but doing it from scratch.

It helped connect some obscure dots for me faster than I could have done myself.


Yes, GPT-3.5 and below are particularly dangerous in this regard. I'd try this again against GPT-4, which should perform better. Essentially, false information seems to go away _slowly_ with increased model size, but then you hit the performance/hardware-requirement barriers instead, and run into awkward token limits or high running costs.

ChatGPT thought hippos could swim when I asked, but only GPT-4 realized they can't; instead they walk along the bottom, or push off it to leap forward in deeper water. That's a simple test for wrong inference, since you'd _expect_ otherwise given how much time they spend in water.

I wonder if we are truly "there" yet or if we at the very least need a more optimized "GPT-4 Turbo" for some real progress. Until then, we may hallucinate progress as much as the AI hallucinates answers!


"since you'd _expect_ otherwise, given they spend so much time in water."

I don't think it concluded that hippos can swim because they are often in water; I think people wrote a lot on the internet that they can swim, and it found some association between the terms hippo, water, and swimming. Am I right?


Absolutely. It didn't infer anything. It just tried to predict how an educated human would respond to such a question, based on its corpus of knowledge.


I've tested it on various areas of expertise in physics that I am familiar with, and it often makes up content which sounds very plausible, except that it pretty much always gets details wrong in very important ways. On the other hand, I've found it very useful for providing reference articles.


Even if ChatGPT can't fully grok a specification, I wonder how well it could be used to "test" a specification, looking for ambiguities, contradictions, or other errors.


I am not sure LLMs in general and GPT in particular are needed for that. In the end any human language can be formalized the same way source code is being formalized into ASTs for analysis.

A good specification or any other formal document (i.e. standard, policy, criminal law, constitution, etc.) is already well structured and prepared for further formalization and analysis containing terms and definitions(glossary) and references to other documents.

Traversing all that might be done with the help of a well-suited neural network, but only given correctness and predictability of the network’s output and a holistic understanding of how the network works.

As of now, the level of understanding of inner behavior of LLMs (admitted by their authors and maintainers themselves) is “the stuff is what the stuff is, brother”[]

[] - https://m.youtube.com/watch?time_continue=780&v=ajGX7odA87k


I feel this is closely related to the Gell-Mann Amnesia effect.

There is an unearned trust you have granted ChatGPT. Maybe it's because the broad outline appears correct and is stated in a confident manner. This may be how good bullshitters work (think psychic mediums, faith healing, televangelists). They can't bullshit all people all the time. But they don't need to. They only need to bullshit a certain percentage of people most of the time to have a market for their scam.


It's bad at fairly exotic topics and it's unable to admit that it doesn't have any understanding of the topic. It's wrong to generalize from this, though. My experience is that it's pretty knowledgeable in areas that aren't niche. I wouldn't recommend relying on it alone, but it has boosted my productivity quite a bit just by pushing me in the right direction so I can then read the relevant parts of the docs.


Yeah, I used it today to interact with some database stuff I have some passing knowledge about and it told me a bunch of wrong things. Though it also helped me solve the task; ChatGPT at least requires you to somewhat know what you are doing and to always be on your toes (I heard GPT-4 is better? Don't have access though)


Oh, I have another anecdote! I was just now looking up Python `iterators` vs `generators`, where I asked:

> In Python, what differentiates an iterator from a generator?

and it answered:

> [...] An iterator can be more flexible than a generator because it can maintain state between calls to __next__(), whereas a generator cannot. For example, an iterator can keep track of its position in a sequence, whereas a generator always starts over from the beginning. [...]

Which is flat-out wrong! Of course a generator can preserve state, otherwise how could it even work? (A generator preserves its local state implicitly across yields in its suspended frame.)
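For instance, a quick sketch (not from that chat, just an illustration) showing a generator picking up exactly where it left off, local state intact:

  def countdown(n):
      # Each yield suspends the function; locals like n survive
      # until the next next() call resumes it.
      while n > 0:
          yield n
          n -= 1

  gen = countdown(3)
  print(next(gen))  # 3
  print(next(gen))  # 2  <- resumed with its state preserved
  print(next(gen))  # 1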

(Of course, ChatGPT is an evolving system so yes I provided feedback on this. I'm sure in a year or two, I'll be looking back at this gotcha! and cringing.)


When a model gives back false information, does it continue to try to back it up on subsequent prompts? Essentially: can it weave a web of lies or will it course correct at some point?

Not sure why I’m downvoted, it seems like a valid question…


Also unclear why you were downvoted, but so it goes.

With ChatGPT 3.5, when I correct inaccuracies in subsequent prompts, it responds "I apologize" and updates its responses. Switching to 4 in the dropdown menu and repeating the same prompt gives me generally more factually correct responses.

I am mainly testing so not worried about inaccuracies, but kinda funny that I am now paying $20/month to train another company's revenue generator ;)


This has been my experience beyond anything that is somewhat trivial.


I also recently had an issue where a systemd parameter was introduced in version 229 and ChatGPT was confident it was available in 219.


ChatGPT's strength isn't in solving new problems, but in helping you understand things that are already solved. There are a lot more developers out there using these tools to create React apps and Python scripts than there are solving race conditions with USB 2.0.


I just asked it to explain a problem that actually has been acknowledged to exist in USB errata from 2002 - not to solve anything.

It took me a while to realize that what I was experiencing was this particular problem, but I already did all the hard work there and only asked it to explain how it fails.

I also recently tried to use it to write code for drawing wrapped text formatted à la HTML (just paragraphs, bold and italics) in C, again just to see how it does. It took me about 2 hours to make it output something that could be easily fixed to work without essentially rewriting it from scratch (it didn't work without edits, but at that point they were small enough that I considered it a "pass" already), and only because I already knew how to tackle such a task. I can't imagine it being helpful for someone who doesn't know how to do it. It can help you come up with some mindless boilerplate faster (which is something I used it for too; it did well when asked to "write a code that sends these bytes to this I2C device and then reads from this address back"), but that's about it.
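To give a sense of the kind of boilerplate I mean, here's a rough sketch in Python using the smbus2 package (the device address, register, and byte values are made up for illustration):

  from smbus2 import SMBus

  DEVICE_ADDR = 0x48  # hypothetical 7-bit I2C address
  CONFIG_REG = 0x01   # hypothetical register

  with SMBus(1) as bus:  # /dev/i2c-1
      # Send a couple of bytes to the device...
      bus.write_i2c_block_data(DEVICE_ADDR, CONFIG_REG, [0x83, 0xC0])
      # ...then read two bytes back from the same register.
      print(bus.read_i2c_block_data(DEVICE_ADDR, CONFIG_REG, 2))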


Ah, but did it have to write a bit-banged I2C driver with multi-master support first, or could it just call "i2c_write()" and "i2c_read()"?


The latter of course - I told you it did pretty well after all :)

It's a machine that saves you multiple copy'n'pastes, so you can just do a single copy'n'paste.


I stand corrected


I've been comparing it to a very excitable intern. You ask them to explain complicated topic, or design a system, and then they go off and spend three weeks reading blog posts about the subject. When they come back, they eagerly and confidently recite their understanding. Sometimes the information is right, sometimes it's wrong, but they believe themselves to be a newly-minted expert on the subject, so they speak confidently either way. The things they're saying will almost always sound plausible, unless you have a good level of knowledge about it.

If I wouldn't trust an eager intern to educate me on it, or accomplish the task without close supervision, I don't think it's a productive use of an LLM.


Well, it was trained on Stack Overflow responses.


I've not done much with ChatGPT, but so far my personal impression is that, to make a school analogy, it's like a kid a year or two ahead of me who took the class I'm asking for help with but didn't actually do too well.

Their help often won't be quite right, but they will probably mention some things I didn't know that if I look up those things will be helpful. Sometimes they are right on the first try. Sometimes they are just bullshitting me. And sometimes they brush me off.

Examples:

1. I asked it how to translate a couple of iptables commands to nft. It got it right. I then asked what nft commands I would use on Linux to NAT all IP packets going through the tun0 interface. It got that right too, giving me the complete set of commands to go from nothing to doing what I had wanted.

So here it was the kid who was ahead of me, but he did well in his networking class.

2. At work we are looking into using Elavon's "Converge" payment processing platform for ACH. I asked ChatGPT how you do an ACH charge on Elavon Converge.

It gave what I believe is a 100% right answer--but it was for how to do it using their web-based UI for humans. That's my fault. I should have specified I'm interested in doing it from a program, so my next question was "How would I do that from a program?".

With this it was the kid who did OK in class, maybe a B-. It gave me an overview of using their API, except it said the API involves a JSON request object sent as the payload of a POST, when in fact it uses XML.

I asked what the payload of that POST would be. It gave me an example (in JSON). All the field names were right (e.g., they corresponded to the XML element names in the actual API) and it included a nice (both in content and formatting) description of each of those fields.

This definitely would have been useful when I was first trying to figure out Elavon's API.

3. I then decided to see how it did with something nontechnical. I asked it what is the OTP for Luca fanfiction. It dodged the question saying that as an AI language model it doesn't have OTP information for any fandom, and said it is a matter of personal preference. I also asked what is the most common OTP for Luca fanfiction since that is an actual objective question, but it still dodged.

I then tried "Kirk or Picard?". It gave me the same personal preference spiel, but then offered some general characteristics of each to consider when making my choice.

4. I was automating some network stuff on my Mac. I needed to find the IPv6 name server(s) that was currently being used. It suggested "networksetup -getdnsservers -6 Wi-Fi". I was actually interested in Ethernet but hadn't specified that.

Two problems with its answer. First, there is no -6 flag to networksetup. Second, -getdnsservers only gets dnsservers that were explicitly configured. It does not get DNS servers that were configured by DHCP.

I think the right way to get the IPv6 DNS servers that were obtained from DHCP is e.g. "ipconfig getv6packet en0".

I also asked it how to find out which interface would be used to reach a given IP address. It suggested using traceroute to the IP address, getting the IP of the router that my computer uses to reach that network from the first hop, and then looking for that IP address in ifconfig output to find out which interface it is on.

That doesn't work (the IP of the router does not appear in ifconfig output), and unlike many of the earlier things it got wrong this doesn't really even send you in the right direction.

The right answer is "route get $HOST".

The final networking question I had for it was this:

> On MacOS the networksetup command uses names like "Ethernet" for networks. The ifconfig command uses names like "en0". How do I figure out the networksetup name if I have the ifconfig name?

It said "networksetup -listallhardwareport" and told me what to look for in the output. This is exactly right.

It probably would have taken me quite a while to find that on my own, so definitely a win. The earlier wrong answers didn't really waste much time so overall I came out ahead using it.


Last line, absolutely correct. It spews super-superficially-plausible nonsense in response to technical questions. Tell it it's wrong and it will spew more only-superficially-plausible nonsense, apologize that it was wrong, or both.

Right now it's a great party trick and no more, IMO. And if one gets into more "sociological" questions it spews 1/3 factoids scraped from the web, 1/3 what could only be called moralizing, and 1/3 vomit-inducing PC/woke boilerplate. My only lack of understanding of its training is how the latter 2/3 were programmed in. I want only the first 1/3 supposedly-factual without the latter 2/3 insipid preaching. If a human responded like that they'd have no friends, groupies, adherents, or respect from anyone, including children.


If a party trick just saved me an hour on a piece of code I was banging my head against the wall with I want to attend more parties.


Presumably you want to attend headbanging parties still.


I had a similar experience recently having GPT-4 teach me double-entry bookkeeping for a project I am ramping up on. I understood it in a vague sense from some Googling, but wouldn't have been able to talk to a room full of accountants in an intelligent way. I was feeling like I needed to read a book or take a class, but tried using GPT-4 as an interactive tutor instead.

After about an hour of delving into details, examples, history and analogies, I get it on a deep level that would have normally taken many days to develop. Most of the material I found before took an approach of “just memorize this” while I was able to get more parsimonious core concepts that were hidden behind 500 years of practice.


That's great! I didn't really grok accrual accounting until I realized that non cash "accounts" are explanations for why there is extra cash or missing cash.

When I stopped thinking about an asset as "a truck" and started thinking of it as "an explanation that cash is missing because we bought a truck" it all clicked. The "asset account" is the negative space around the asset, so that it can be matched up later with the cash that is generated by using up the asset.

Before that I thought of assets and revenues as good, and liabilities and expenses as bad. I couldn't make sense of why an asset turns into a liability instead of into a revenue. It's similar to learning intro physics and trying to understand how anything moves if every force generates an opposite force.
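A toy journal (numbers made up) of the way it finally clicked for me: the truck account records why cash is missing, and depreciation later matches that missing cash against the periods that use the truck up.

  # Each entry is (account, debit, credit); total debits must equal total credits.
  journal = [
      # Buy a truck for 50,000 cash: cash goes down, and the asset account
      # records *why* the cash is missing.
      ("Truck (asset)", 50_000, 0),
      ("Cash (asset)", 0, 50_000),

      # Use up 10,000 of the truck this year, matched against the revenue
      # it helped generate.
      ("Depreciation (expense)", 10_000, 0),
      ("Accumulated depreciation (contra-asset)", 0, 10_000),
  ]

  assert sum(d for _, d, _ in journal) == sum(c for _, _, c in journal)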


That's a good case. There's a huge amount of material available on double-entry bookkeeping, and there's a consensus on how it works. So a LLM can handle that.


Teach yourself French in 10 days. It includes 3 audio books which make it really easy, too.


How do you know what it's telling you is correct and not (as the OpenAI paper has labeled it) "hallucinating" answers?

Over Christmas, I used ChatGPT to model a statically typed language I was working on. It was helpful, but a lot of what it gave me was, in a very deceptive way, sort of wrong. It's tough to describe what I mean by "wrong" here. Nothing it spit out was 100% blatantly incorrect; instead it was subtly incorrect, and gave inconsistent evaluations/overviews.

If I didn't know a bit about type theory, I wouldn't actually have been able to evaluate how good the information I got out of it was. I'd be hesitant to take anything ChatGPT gave me at face value, or to feel confident in being able to speak precisely about a given topic.

Did you run into similar problems? And if so, how'd you overcome them?


Not the person you asked, but chiming in. Two things:

First, GPT-4 is far more capable than 3.5 when it comes to not hallucinating. The 'house style' of the response is, of course, very similar, which can lead one to initially think that the difference between the two models isn't as significant as it is. Since I don't have API access yet, I do a lot of my initial exploration in 3.5-Legacy, and once I've narrowed things down a bit, I have GPT-4 review and correct 3.5's final answers. It works very well.

Second, and this is more of a meta comment: How people use ChatGPT really exposes whether they use sanity checks as a regular part of their problem solving technique. That all of the LLMs are confidently incorrect at times doesn't slow me down much at all, and sometimes their errors actually help me think of new directions to explore, even if 'wrong' on first take.

Conversely, several of my friends find the error rate to be an impediment to how they work. They're good at what they do, they're just more used to taking things as correct and running with it for a longer length of time before checking to see if what they're doing makes sense.

I do think that people who are put off by this significantly underestimate how often people are confidently incorrect as well. There's a reason the saying trust but verify is a common one.


Third, you just glue a vector search onto it and stuff it with a bunch of textbooks and documentation.
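In sketch form (Python, with a stand-in embed() for whatever embedding model you'd actually use), the "glue" is roughly: find the chunks of your textbooks most similar to the question and stuff them into the prompt:

  import numpy as np

  def retrieve(question, chunks, embed, k=3):
      # embed() is a stand-in for your embedding model; chunks are
      # pre-split passages from the textbooks/documentation.
      q = embed(question)
      q = q / np.linalg.norm(q)
      vecs = np.array([embed(c) for c in chunks])
      vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
      best = np.argsort(vecs @ q)[::-1][:k]
      return [chunks[i] for i in best]

  def grounded_prompt(question, chunks, embed):
      context = "\n\n".join(retrieve(question, chunks, embed))
      return ("Answer using only the context below; "
              "say you don't know if it isn't there.\n\n"
              f"{context}\n\nQuestion: {question}")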


How can you trust it though? It notoriously hallucinates and makes things up in an otherwise plausible way, and the only way to tell is to already be knowledgeable about what you’re asking about… which makes the whole exercise useless.


There are more situations than you might think where the correct answer is easy to verify but hard to find. If you are building something concrete you can just check if it works. If you are trying to understand something, it can be that ChatGPT makes the pieces fall into place and you can verify that you really understand now because everything adds up logically.


I recently asked it to provide the meaning of a song. I gave it the artist and the song. It responded with a thoughtful(ish) explanation of the potential meaning of the song. I then asked it if it could share the lyrics of the song with me because some of the quotes it referenced weren't in the song. It provided lyrics that weren't the song I was asking about at all. I then asked what song it was, it apologized and it told me the artist and title of the song... I looked up that artist and title and it was also not the song that it provided the lyrics/meaning to. Ultimately, I determined that it just made up lyrics to a song. It was very apologetic.


Sounds like Turbo, aka the free (and dumb) 3.5 version. GPT-4 would deliver better results with less hallucination, but if you want it to really work well you should include the lyrics in the prompt as context. It's not a search engine; if you want that, use Bing.


Q: How can you trust Stack Overflow, Reddit, or anything on the Internet? The stuff I read online is often wrong.

A: Check it yourself.


Well, you still can't trust everything, but voting mechanisms tend to help filter out some of the BS.


Voting mechanisms sometimes help drown out the worst comments. Often voting doesn't help the best comments float to the top, because the best comments are often replies to other low-vote comments.

Dan Luu writes insightfully about HN comments here: https://danluu.com/hn-comments/


I asked it how to scroll to an element taking into account a fixed header. It helpfully described the `offsetTop` parameter to `Element.scrollIntoView`. Only problem is, that parameter doesn’t exist.


I tested it, from scratch, in a completely different context.


You could just test it


Lately I've been reading seminal papers in distributed systems, and it occurred to me that there is a niche for LLMs to "democratize" highly technical content.

Trying to understand those papers is a slog, but it doesn't need to be. For a casual reader, you could drop all the jargon and proofs. You could also rewrite the concepts in less obscure and obtuse prose to make them more accessible, without dumbing down the content.

Essentially it's like having a tech writer/communicator in your browser.


I often wonder why academia loves obtuse prose; the only dialect I’ve seen equally enthusiastically adopted while being just as useless is Corpspeak.


This was the killer feature to me. It is an amazing teacher, it’s like having an infinitely patient mentor who can answer all your questions. I feel way more confident about being able to onboard with unfamiliar tech and be able to become productive quickly.


I did the same thing to implement an automatic differentiation engine in my programming languages project. I'd been mulling over the idea for months, but I was a little unsure how to do it. I sat down with ChatGPT and we got it done mostly in an afternoon. I'm so stoked about the future of coding.


This has been my experience as well. Also, I'm less afraid to ask the stupid questions, as I would be if asking a more experienced dev.


So what was this concept that you struggled with?


I feel like I always hear these types of anecdotes pop up but concrete specific details are rarely shared.


I've been using ChatGPT with GPT-4 in this style, so here's my specific use case. I'm currently studying MIT's 8370 quantum information science course because my background in the fundamentals of error correction is pretty poor and I need it for work.

I have a bachelors in physics but I wasn't a great maths student in university (the folly of youth) so my linear algebra could be better. On the other hand, I'm not going to redo linear algebra as a pre-requisite for this course. I also haven't done the pre-requisite QC courses, though I'm much more comfortable in that domain.

I don't use ChatGPT to teach me the concepts in the course; I use it as an empathetic tutor to fill in the blanks. If I see a linear algebra identity I'm not familiar with, I can ask ChatGPT, telling it the context and what I already know, and it'll give me an answer in that context with solid mathematical grounding but (because I ask for it) intuitive justifications. The alternatives of Stack Overflow, Wikipedia, or other online notes can be hard to search, and even when I find a good answer, it's often needlessly mathy and complex. I don't have to worry much about hallucination because it's just a gap, and once I understand it, what the lecturer said usually makes perfect sense.

In one lecture the lecturer had a throw-away line about stabilisers corresponding to syndrome measurements but didn't elaborate much further. I didn't get how the corresponding circuit would be constructed so I went to ChatGPT and asked it my question with what I knew and the context. In that case, it pushed back and told me that what I knew was wrong. I had formed a misconception which it corrected and gave me an example to show why I was wrong.

I guess I don't use it as a lecturer, or even a tutor. I use it as the really smart guy you sit next to in a lecture theatre who you whisper questions to.


Thanks for sharing! That's really cool and helpful, I've noticed my own linear algebra is shaky and what you described seems like one of the better ways to use ChatGPT.


Yeah. Makes me wonder if it was a really simple concept, like recursion or something.


Also, even for relatively simple and beginner things like recursion there are levels of understanding. Some of the anecdotes I see from people using ChatGPT to learn a new technical concept suggest a false sense of deeper levels of understanding. This seems to be a result of beginner naïveté together with the supreme confidence ChatGPT exudes.


I have had many similar experiences with just the free version of ChatGPT. I also now ask questions more readily. In the past, when a bunch of new terms or concepts were thrown at me, the effort to stop and dig into each of them via Google search was great enough that I would leave many stones unturned. When using GPT while studying up on a new thing, the friction is so low that fewer stones are left unturned.

And like you mentioned, being able to translate concepts from a novel domain to one I'm very familiar with can be like translating an unfamiliar language to English.


I’d be worried about using ChatGPT for this given its tendency to hallucinate. Did you have to verify each response it gave you, or were you willing to trust the output?


I wouldn't verify each response, but I would (sorta) verify at the end by moving on to read within the topic.

As you start to grok a topic, you should start to recognize internal inconsistencies and then you can probe those details. When you feel you have some understanding, you can then start consuming media on the topic at a higher level and any problems in your understanding should fall out fairly quickly.


I find these stories strange. If it were like that, wouldn't you learn just as fast from an expert friend? My experience is that anything worth learning takes a long time of thinking about it from various angles.


Is there an endless amount of "expert friend" available somewhere that I don't know about? How can I learn something none of my friends know about?


A book?


You can’t ask a book for clarification, or to fill the gaps it leaves.


I agree, but it does give you access to field experts. Presumably a good book covers the learning approach in whichever way they consider it best, from the position of a field expert. It's also vetted information by other field experts. At the moment, ChatGPT's answers are between ChatGPT and the asker. It can be a total hallucination or useful information. By the way, my point is just that access to experts is not that hard if you consider their books as access.


You have expert friends who will spend an hour tutoring you on demand? Are you a celebrity or something?


While you are seeking out your expert friends and taking a long time, I’ll be on to the next 10 things I can learn from the AI.


I'm talking to them while I'm having a drink on the beach. How about that?


I mean, if you have an expert friend on any arbitrary topic that you meet at the beach with a drink in the moment, then more power to you!

Perhaps try using ChatGPT to identify whether others can do this and for some ideas for why they may not be able to.


Now ChatGPT is everyone's "expert friend".


Yeah, ChatGPT is only as good as the questions you know how to ask it. I have been studying Clojure concepts, and on my own it would have taken me much more than an afternoon to understand them.


> This sounds a lot like this other concept I know. How do these differ in practical use?

How did you test the answer it gave you?

It sounds like the end product is code, which is testable, so maybe it's moot. But that type of question is not useful in my experience, because you can't really disprove its answer. It's also a leading question. It might not be similar to that other thing at all, but ChatGPT would manufacture something in that case.


I find hitting my head on the wall is essential for learning. Hitting my head on the wall helps me learn where the boundaries of functionality are, or where the actual operation of the thing differs from my intuition.

If I just learn some example use cases that work, that tells me something but I won't really have mastered whatever it was that I set out to learn.


What was the thing? It's really hard to put a value to what you're describing without knowing what it was.


Yeah, ChatGPT is a game changer for learning. Because of ChatGPT,

"I know GNU Make."

I just love that it's helping me conquer technologies that I've been wanting to learn my entire career but I guess just didn't have the discipline to do it. I'm getting an itch to tackle Emacs next.


This is exactly what I do as well. I make it generate sample code, ask it lots of "why" questions, and ask it to help with the bugs. You can learn a huge amount in record time. Is this the end of Stack Overflow?


Only if you can trust what it told you. And how can you trust it without knowing the material?

For all you know, you've not only not gained knowledge; worse, you've gained false knowledge.


How do you account for hallucinations?


Bullshido blitzkrieg


this is the way



