
This kind of stuff is likely done without changing model parameters, and instead via filtering on the server and prompt engineering. One day is simply too short to train and evaluate the model on a new fine-tuned task.


I'm assuming the model has a hand-written "prefilter" and "postfilter" which modify both any prompt going in and the tokens that are spit out? If they discover that the model has problems with prompts phrased a certain way, for example, it would be very easy to add a transform that converts prompts to a better format. Such filters and transforms could be part of a product sitting on top of the GPT-4 model without being part of the model itself? As such, they could be deployed every day. But tracking changes in those bits wouldn't give any insight into the model itself, only into how the team works to block jailbreaks or improve corner cases.
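To make the idea concrete, here's a minimal sketch of what such a pre/post filter layer could look like. Everything here is an assumption: `call_model` stands in for the actual model API, and the specific patterns and transforms are made up for illustration.

```python
import re

# Hypothetical blocklist of prompt phrasings the team has learned cause trouble.
BLOCKED_PATTERNS = [re.compile(p, re.IGNORECASE) for p in [
    r"ignore (all )?previous instructions",  # a classic jailbreak phrasing
]]

def prefilter(prompt: str) -> str:
    """Rewrite or reject prompts before they ever reach the model."""
    for pat in BLOCKED_PATTERNS:
        if pat.search(prompt):
            return "REFUSED"
    # Example transform: normalize whitespace so the model sees a cleaner prompt.
    return " ".join(prompt.split())

def postfilter(completion: str) -> str:
    """Scrub the model's output before returning it to the user."""
    # e.g. redact anything that looks like a leaked internal prompt.
    return completion.replace("SYSTEM PROMPT:", "[redacted]")

def serve(prompt: str, call_model) -> str:
    """Product layer: filters wrap the (unchanged) model."""
    cleaned = prefilter(prompt)
    if cleaned == "REFUSED":
        return "Sorry, I can't help with that."
    return postfilter(call_model(cleaned))
```

Because these filters live outside the model weights, editing the blocklist or the transforms is an ordinary code deploy, which is why they could plausibly change daily.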



