
> In the simplest case, if your prompt contains 10 tokens and you request a single 90 token completion from the davinci engine, your request will use 100 tokens and will cost $0.006.

It's weird that the prompt tokens are also counted toward your token usage. So you're penalized for having an elaborate prompt with lots of examples (as shown in the docs)?




It's not that weird; longer prompts require more compute. This pricing is directly proportional to the total compute required for a query, which scales with the sum of the input and output sequence lengths.
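
As a back-of-the-envelope sketch of that billing model: the $0.06 per 1,000 tokens rate below is inferred from the quoted example ($0.006 for 100 tokens), and query_cost is just a hypothetical helper, not part of the API.

    DAVINCI_RATE_PER_1K = 0.06  # assumed: USD per 1,000 tokens, inferred from the quote

    def query_cost(prompt_tokens, completion_tokens, rate_per_1k=DAVINCI_RATE_PER_1K):
        # Billing counts prompt and completion tokens alike,
        # so cost scales with the sum of input and output lengths.
        return (prompt_tokens + completion_tokens) / 1000 * rate_per_1k

    print(query_cost(10, 90))  # 0.006 -- the example from the docs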


Let's take this example: https://beta.openai.com/examples/default-factual-answering

95% of the token usage in this example consists of the prompt, and I would only get one sentence in return. So if I wanted to generate another sentence, I would have to pay the prompt's 95% share of the cost again and again (rough numbers below)... Isn't there a way to create a template for the prompt so you only pay for the generated sentences?
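
To illustrate how the re-sent prompt dominates: the token counts below are made-up round numbers matching the 95%/5% split, not measured from the linked example, and the per-token rate is the one inferred from the quoted pricing.

    PROMPT_TOKENS = 950      # assumed: fixed few-shot prompt (95% of usage)
    COMPLETION_TOKENS = 50   # assumed: one generated sentence (5% of usage)
    RATE_PER_1K = 0.06       # assumed: USD per 1,000 tokens

    def cost_of(n_sentences):
        # The full prompt is re-billed on every request.
        return n_sentences * (PROMPT_TOKENS + COMPLETION_TOKENS) / 1000 * RATE_PER_1K

    for n in (1, 10, 100):
        prompt_share = n * PROMPT_TOKENS / 1000 * RATE_PER_1K
        print(f"{n:>3} sentences: ${cost_of(n):.2f} total, ${prompt_share:.2f} spent re-sending the prompt")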



