
Yeah, as usual these models can barely sustain a conversation and fall apart the moment actual instructions are given. A typical prompt they fail to understand:

"what is pistacchio? explain the question, not the answer."

All these toy LLMs: "pistachio is..."

GPT is the only one that consistently understands these instructions: "The question "what is pistachio?" is asking for an explanation or description of the food item..."

This makes these LLMs basically useless for obtaining anything but hallucinated data.




It only makes them useless if you insist on asking them in ways you already know will produce bad results instead of adapting your prompts.

This is a bit like complaining that your compiler refuses to produce the right outputs for code you've already determined is incorrect.


Asking LLMs about things they learned in training mostly results in hallucinations, and in general you can't tell how much they are hallucinating: these models are unable to reflect on their own output, and average output-token probability is a lousy proxy for a confidence score on their results.

On the other hand, no amount of prompt engineering seems to make these LLMs able to do question answering over source documents, which is the only realistic way to retrieve factual information.
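As a rough illustration of what "question answering over source documents" means in practice (the library, model name, and document text below are just assumptions for the sketch, not something from this thread): the idea is to put the source text into the prompt so the answer can be grounded in it rather than in whatever the model memorized during training.

    # Sketch only: grounded Q&A by putting the source document into the prompt.
    # Uses the Hugging Face transformers library; gpt2 is a stand-in model.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    source_doc = (
        "The pistachio is a small tree originating from Central Asia; "
        "its seeds are widely eaten as a snack food."
    )
    prompt = (
        f"Document:\n{source_doc}\n\n"
        "Using only the document above, answer the question: what is pistachio?\n"
        "Answer:"
    )
    # The model is supposed to answer from the supplied document, not from memory.
    print(generator(prompt, max_new_tokens=60)[0]["generated_text"])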

You're welcome to bring examples, though, if you're so confident.


I've had ChatGPT build a fully functioning website, write a DNS server, and fill in significant portions of specs, all without the problems you describe. I'm never going back to doing things from scratch - it's saving me immense amounts of time every single day. The only reasonable conclusion is that the way you're prompting it is counterproductive.


Good thing, then, that I specifically mentioned GPT as being able to follow instructions and that I was specifically talking about the other models.

You're welcome to demonstrate the same ability on other models, though.


You can get useful results out of a whole lot of them as long as you actually prompt them in a way suited to the model. The point I made originally was that if you just feed them an ambiguous question, then sure, you will get extremely variable and mostly useless results out.

And I mentioned ChatGPT because, from the context of your comments here, it was unclear on a first read-through what you meant. Maybe consider that it's possible your prompting is not geared to the models you've tried.

Not least: if you expect a model to know how to follow instructions when most of them have not been through RLHF, you're using them wrong. A lot of them need a prompt shaped as a completion, not a conversation.
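For what it's worth, here's a minimal sketch of that difference, assuming the Hugging Face transformers library and gpt2 as a stand-in for a base (non-RLHF) model; neither is something anyone in this thread actually used.

    # Sketch only: instruction-style vs completion-style prompting of a base model.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    # Conversation/instruction phrasing: a base model often just continues it badly.
    instruction_prompt = "What is pistachio? Explain the question, not the answer."

    # Completion phrasing: lead the model into the shape of output you want.
    completion_prompt = 'The question "what is pistachio?" is asking for'

    for p in (instruction_prompt, completion_prompt):
        print(generator(p, max_new_tokens=40)[0]["generated_text"])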


You're welcome to provide examples to prove your points.


I have nothing to gain from spending time testing models for you, because whatever I pick will just look like cherry-picking to you, and it doesn't matter to me whether or not you agree on the usability of these models. They work for me, and that's all that matters to me. Try a few completions instead of a question. Or don't.





