(I seemed to have made the HN gods upset) I tried asking the question more clearly...

aspenmayer · 2024-05-13T22:10:31

> I think it “understood” the question because it “knew” how to write the Python code to get the right answer.

That’s what makes me suspicious of LLMs: they might just be coincidentally or accidentally answering in a way that you agree with.

Don’t mean to nitpick or be pedantic. I just think the question was really poorly worded and might have a lot of room for confirmation bias in the results.

JustExAWS · 2024-05-13T22:40:52

I reworded the question with the same results in the second example.

But here is another real-world example I dug up out of my chat history. Each iteration of the code worked. I actually ran it a few days ago.

https://chat.openai.com/share/4d02818c-c397-417a-8151-7bfd7d...

aspenmayer · 2024-05-13T23:04:09

> List of US Presidents with their ages at inauguration

That’s what the Python script had at the top. I guess I don’t know why you didn’t ask that in the first place.

Edit: you’re not the same person who originally posted the comment I responded to, and I think I came off a bit too harshly here in text, but don’t mean any offense.

It was a good idea to ask to see the code. It was much more to the point and made clear what question the LLM perceived you to be asking.

The second example about buckets was interesting. I guess LLMs help with coding if you know enough of the problem and what a reasonable answer looks like, but you don’t know what you don’t know. LLMs are useful because you can just ask why things may not work or don’t work, in a given context or in a completely open-ended way. Such problems are often hard for non-experts to explain or articulate, which makes troubleshooting difficult, as you might not even know how to search for solutions.

You might appreciate this link if you’re not familiar with it:

https://buckets.grayhatwarfare.com/

scarface_74 · 2024-05-14T01:19:26

I was demonstrating how bad LLMs are at simple math.

If I had just asked for a list of ages in order, there was probably some training data for it to recite. By asking it to reverse the list, I was forcing the LLM to do math.

I also knew the answer was simple with Python.
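
For illustration, here is a minimal sketch of the kind of script involved, assuming the prompt amounted to listing presidents by age at inauguration in reverse (oldest-first) order. The handful of entries and the name `oldest_first` are illustrative and not taken from the shared chat.

```python
# Illustrative subset only: (name, age when first taking office).
# The real script in the shared chat is not shown here; this is an assumption
# about its general shape.
PRESIDENT_AGES = [
    ("Theodore Roosevelt", 42),
    ("John F. Kennedy", 43),
    ("Bill Clinton", 46),
    ("Barack Obama", 47),
    ("Ronald Reagan", 69),
    ("Donald Trump", 70),
    ("Joe Biden", 78),
]


def oldest_first(ages):
    """Return (name, age) pairs sorted from oldest to youngest."""
    return sorted(ages, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, age in oldest_first(PRESIDENT_AGES):
        print(f"{name}: {age}")
```

The point is that the ordering is computed by `sorted` rather than recalled, so the result does not depend on the model having seen that exact list before.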

On another note, with ChatGPT 4, you can ask it to verify its answers on the internet and to provide sources:

https://chat.openai.com/share/66231d7f-9eb1-4116-9903-f09a42...

JustExAWS · 2024-05-13T23:10:44

I am the same person. I mentioned that in my original reply. That’s what I was trying to imply by this comment:

> (I seemed to have made the HN gods upset)

I could see the Python in the original link when I asked. It shows up as a clickable link. It doesn’t show when you share it. I had to ask it.

aspenmayer · 2024-05-13T23:15:30

You’re also scarface_74? Not that there’s anything wrong with sockpuppets on HN in the absence of vote manipulation or ban evasion that I know of, I just don’t know why you’d use one in this manner, hence my confusion. Karma management?

I saw a blue icon of some kind on the link you shared but didn’t click it.

JustExAWS · 2024-05-13T23:34:32

I said the reason why, twice now:

> I seemed to have made the HN gods upset.

My other account is rate limited for some odd reason. I looked back at my comments and I don’t see anything controversial in what I said.

The blue link is the Python code that was generated. I guess it doesn’t show in the app.

aspenmayer · 2024-05-13T23:40:33

No worries, that was somewhat ambiguous to me also, and confusing. I thought you might be a different person who had edited their comment after receiving downvotes. I mean, it’s reasonable to assume in most cases that different usernames are different people. Sorry to make you repeat yourself!

Maybe email hn@ycombinator.com to ask about your rate limits. I have encountered similar issues myself in the past and have found dang to be very helpful and informative, even when the cause is valid and/or something I did wrong. #1 admin/mod on the internet imo