Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Here's a really important reason to care about open source models: prompt engineering is fiddly enough without the risk of your model provider "upgrading" the model you are using in a way that breaks your existing prompts.

OpenAI already upset a lot of (admittedly non-paying academic) users when they shut off access to the old Ada code model with only a few week's notice.



I’m curious about how enterprises will manage model upgrades.

On one hand, as you mention, upgrades could break or degrade prompts in ways that are hard to fix. However, these models will need constant streams of updates for bugs and security fixes just like any other piece of software. Plus the temptation to get better performance.

The decisions around how and whether to upgrade LLMs will be much more complicated than upgrading Postgres versions.


Paying users who need this kind of stability are more likely get access to those models via Azure rather than from OpenAI directly, which comes with the appropriate enterprise support plans and guarantees.


Why would the models themselves need security fixes? The software running the models, sure, but you should be able to upgrade that without changing anything observable about the actual model.


LLMs (at least the ones with read/write memory) can exactly simulate the execution of a universal Turing machine [1]. AFAIK running such models will therefore entails the same fundamental security risks as ordinary software.

[1] https://arxiv.org/pdf/2301.04589.pdf


Not necessarily. The insecurity from LLMs comes from the fact they’re a black box - what if it turns out that particular version can be easily tricked into giving out terrorism ideas. You could try to add safeguards on top, but they’ve already been bypassed if it has been used for something like that. You might just have to retrain it somehow to make it safe


The OpenAI APi has model checkpoints, right now the chat options are:

gpt-4 gpt-3-5-turbo gpt-4-0314 gpt-3-5-turbo-0301


The 3.5 legacy model disappeared from the ChatGPT UI recently. Is it still available via the API?


Notably absent from the available model list is code-davinci-002 - a lot of people were burned by that one going away.


Those are ChatGPT models. The code-davinci-002 model is still available - they responded to community requests to keep it up.


Midjourney does this, as well.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: