Except the LLM can solve a general problem (or tell you why it cannot), while your calculator can only do what it's been programmed to do.


Go ask your favorite LLM to write you some code to implement the backend of the S3 API and see how well it does. Heck, just ask it to implement list iteration against some KV object store API and be amazed at the complete garbage that gets emitted.
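
For reference, the part that usually comes back wrong is the pagination. A correct version has roughly this shape (a Python sketch against a hypothetical kv.scan(start, limit) API that returns keys in sorted order; everything here is illustrative):

    def list_objects(kv, prefix, page_size=1000):
        """Yield every key under `prefix`, one scan page at a time."""
        start = prefix
        while True:
            keys = kv.scan(start=start, limit=page_size)  # keys >= start, sorted
            for k in keys:
                if not k.startswith(prefix):
                    return  # walked past the prefix range: done
                yield k
            if len(keys) < page_size:
                return  # store exhausted
            start = keys[-1] + "\x00"  # resume just after the last key seen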

So I told it what I wanted, and it generated an initial solution and then modified it to do some file distribution. Given that it can't actually execute the code, this is an excellent first pass.

https://chatgpt.com/share/673b8c33-2ec8-8010-9f70-b0ed12a524...

ChatGPT can't directly execute code on my machine due to architectural limitations, but I imagine if I went and followed its instructions and told it what went wrong, it would correct it.

And that's just it, right? If I were to program this, I would be iterating. ChatGPT cannot do that because of how it's architected (I don't think it would be hard to add this if you used the API and allowed some kind of tool use). However, if I told someone to go write me an S3 backend without ever executing it, and they came back with this... that would be great.

EDIT: with chunking: https://chatgpt.com/share/673b8c33-2ec8-8010-9f70-b0ed12a524...

IIRC, from another thread on this site, this is essentially how S3 is implemented: a centralized metadata database that hashes out to nodes, each implementing a local storage mechanism (MySQL, I think).
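
Roughly that shape (purely illustrative; the real internals are far more involved than a modulo hash, and the node list here is hypothetical):

    import hashlib

    NODES = ["node-a", "node-b", "node-c"]  # hypothetical storage nodes

    def node_for(bucket, key):
        # The centralized metadata database would record this mapping;
        # here we just hash the (bucket, key) pair out to a node.
        h = hashlib.sha256(f"{bucket}/{key}".encode()).digest()
        return NODES[int.from_bytes(h[:8], "big") % len(NODES)]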


And that's why it's dangerous to evaluate something when you don't understand what's going on. The implementation generated not only saves things directly to disk [1] [2], but it doesn't even implement file uploading correctly, nor does it implement listing of objects (which I guarantee would be incorrect). Additionally, it makes a key mistake: an upload isn't a form but the raw body of the request, so a real S3 client already can't connect. But of course, at first glance it has the appearance of maybe being something passable.
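
To make the form-versus-body point concrete: an S3-style PUT sends the object bytes as the raw request body, not as a multipart form. A minimal stdlib sketch (not real S3 semantics; the port and names are illustrative):

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class Handler(BaseHTTPRequestHandler):
        def do_PUT(self):
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)  # the object IS the request body
            # ... persist `body` under the key in self.path ...
            self.send_response(200)
            self.end_headers()

    HTTPServer(("", 9000), Handler).serve_forever()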

Source: I had to implement R2 from scratch, and nothing generated here would have helped me even as a starting point. And this isn't even getting into complex things like supporting arbitrarily large uploads and encrypting things while also supporting seeked downloads or multipart uploads.

[1] No one would ever do this, for all sorts of reasons, including that attackers could send you /../ to escape bucket and account isolation.
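
If you did insist on a filesystem layout, the bare-minimum guard looks something like this (a sketch only; real implementations avoid mapping keys to filesystem paths at all):

    import os

    def resolve_key(bucket_root, key):
        path = os.path.realpath(os.path.join(bucket_root, key))
        # Reject any key that resolves outside the bucket's directory.
        if not path.startswith(os.path.realpath(bucket_root) + os.sep):
            raise ValueError("key escapes bucket/account isolation")
        return path

    resolve_key("/data/acct1/bucket1", "../../acct2/secret")  # raises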

[2] No one would ever do this because you've got nothing more than a toy S3 server. A real S3 implementation needs to distribute the data to multiple locations so that availability is maintained in the face of isolated hardware and software failures.


> I had to implement R2 from scratch, and nothing generated here would have helped me even as a starting point.

Of course it wouldn't. You're a computer programmer. There's no point for you to use ChatGPT to do what you already know how to do.

> The implementation generated not only saves things directly to disk

There is nothing 'incorrect' about that, given my initial problem statement.

> Additionally, it makes a key mistake: an upload isn't a form but the raw body of the request, so a real S3 client already can't connect.

Again... look at the prompt. I asked it to generate an object storage system, not an S3-compatible one.

It seems you're the one hallucinating.

EDIT: ChatGPT says: In short, the feedback likely stems from the implicit expectation of S3 API standards, and the discrepancy between that and the multipart form approach used in the code.

and

In summary, the expectation of S3 compatibility was a bias, and he should have recognized that the implementation was based on our explicitly discussed requirements, not the implicit ones he might have expected.


> There's no point for you to use ChatGPT to do what you already know how to do.

If it were more intelligent, of course there would be. It would catch mistakes I wouldn't have thought of, it would produce the work more quickly, etc. As it stands, it's literally worse than if I'd assigned a junior engineer to do some of the legwork.

> ChatGPT says: In short, the feedback likely stems from the implicit expectation of S3 API standards, and the discrepancy between that and the multipart form approach used in the code.

> In summary, the expectation of S3 compatibility was a bias, and he should have recognized that the implementation was based on our explicitly discussed requirements, not the implicit ones he might have expected.

Now who's rationalizing? I was pretty clear in saying to implement S3.


> Now who's rationalizing? I was pretty clear in saying to implement S3.

In general, I don't deny that humans fall into common pitfalls, such as not reading the question. As I pointed out, this is a common human failing, a 'hallucination' if you will. Nevertheless, my failing to deliver that to ChatGPT should count against me, a humble human who recognizes my failings, not against ChatGPT. And again, this furthers my point that people hallucinate regularly; we just have a social way to get around it -- what we're doing right now... discussion!


My reply was purely about ChatGPT's response, which I characterized as a rationalization. It clearly was following the S3 template, since it copied many parts of the API, but then failed to call out where it was deviating and why it made those decisions.

Do you have any evidence besides anecdote?

What kind of evidence substantiates creativity?

Things I've used ChatGPT for:

1. Writing songs (couldn't find the generated lyrics online, so I assume they're new)

2. Branding ideas (again, couldn't find the logos online, so I'm assuming they're new)

3. Recipes (with weird ingredients that I've not found put together online)

4. Vacations with lots of constraints (again, all the information is obviously available online, but it put it together for me and gave recommendations tailored to my family in particular).

5. Theoretical physics explorations where I'm too lazy to write out the math (and why should I... ChatGPT will do it for me...)

I think perhaps one reason people here do not have the same results is that I typically use the API directly and modify the system prompt, which drastically changes the utility of ChatGPT. The default prompt is too focused on retrieval and 'truth'. If you want creativity, you have to ask it to be an artist.
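
For example, something like this (a sketch using the OpenAI Python SDK; the model name and both prompts are placeholders):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; pick whatever model you prefer
        messages=[
            # Replacing the default system prompt is what changes the
            # character of the output.
            {"role": "system", "content": "You are a bold, unconventional artist. Prefer novelty over safety."},
            {"role": "user", "content": "Write a verse about object storage."},
        ],
    )
    print(resp.choices[0].message.content)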


No, I think they don't have the results you do because they are trying to do those things well...

The personal insult insinuated here is not appreciated and probably against community guidelines.

For what I needed, those things worked very well.


Anecdotes have equal weight. All of these models frustrate me to no end but I only do things that have never been done before. And it isn't an insult because you have no evidence of quality.

> Anecdotes have equal weight. All of these models frustrate me to no end but I only do things that have never been done before. And it isn't an insult because you have no evidence of quality.

You have not specified what evidence would satisfy you.

And yes, it was an insult to insinuate I would accept sub-par results whereas others would not.

EDIT: ChatGPT seems to have a solid understanding of why your comment comes across as insulting: https://chatgpt.com/share/673b95c9-7a98-8010-9f8a-9abf5374bb...

Maybe this should be taken as one point of evidence of greater ability?


I think you led the result by not providing enough context, like noting that there is no objective way to measure the quality of an LLM generation, either before or after the fact.

EDIT: I asked ChatGPT with more proper context: "It’s not inherently insulting to say that an LLM (Large Language Model) cannot guarantee the best quality, because it’s a factual statement grounded in the nature of how these models work. LLMs rely on patterns in their training data and probabilistic reasoning rather than subjective or objective judgments about 'best quality.'"


I can't criticize how you prompted it because you did not link the transcript :)

Zooming out, you seem to be in the wrong conversation. I said:

> the LLM can solve a general problem (or tell you why it cannot), while your calculator can only do what it's been programmed to do.

You said:

> Do you have any evidence besides anecdote?

I think that -- with both of us now having used ChatGPT to generate a response -- we have good evidence that the model can solve a general problem (or tell you why it cannot), while a calculator can only do the arithmetic for which it's been programmed. If you want to counter, a video of your calculator answering the question we just posed would be nice.



