
The Opus model, which seems to perform better than GPT-4, is unfortunately much more expensive than the OpenAI model.

Pricing (input/output per million tokens):

GPT-4 Turbo: $10/$30

Claude 3 Opus: $15/$75
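
For concreteness, a rough Python sketch of what a single call costs at these list prices (the model keys and token counts below are just illustrative, not the providers' API identifiers):

    # Per-million-token list prices (input, output) from the table above.
    PRICES = {
        "gpt-4-turbo": (10.00, 30.00),
        "claude-3-opus": (15.00, 75.00),
    }

    def request_cost(model, input_tokens, output_tokens):
        """Dollar cost of one API call at list prices."""
        price_in, price_out = PRICES[model]
        return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

    # Example: a 10K-token prompt with a 1K-token answer.
    print(request_cost("claude-3-opus", 10_000, 1_000))  # 0.225
    print(request_cost("gpt-4-turbo", 10_000, 1_000))    # 0.13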


There’s a market for that though. If I am running a startup to generate video meeting summaries, the price of the models might matter a lot, because I can only charge so much for this service. On the other hand, if I’m selling a tool to have AI look for discrepancies in mergers and acquisitions contracts, the difference between $1 and $5 is immaterial… I’d be happy to pay 5x more for software that is 10% better because the numbers are so low to begin with.

My point is that there’s plenty of room for high priced but only slightly better models.


That's quite expensive indeed. At the full 200K context, a single call costs at least $3 in input tokens alone. I would hate to get a refusal as the answer at that rate.
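
The arithmetic behind that $3 floor, using the Opus input price from above:

    # 200K input tokens at $15 per million, before any output tokens:
    print(200_000 * 15 / 1_000_000)  # 3.0 dollars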


Cost is relative. How much would it cost for a human to read 200K tokens and give you an answer? Probably much more than $3.


You are not going to take the expensive human out of the loop where the downside risk is high. You are likely to take the human out of the loop only in low-risk, low-cost operations to begin with. For those use cases, these models are quite expensive.


Yeah, but the human tends not to get morally indignant because my question involves killing a process to save resources.


Their smallest model outperforms GPT-4 on code. I'm sceptical that it'll hold up to real-world use, though.


Just a note that the 67.0% HumanEval figure for GPT-4 is from its first release in March 2023. The actual performance of current ChatGPT-4 on similar problems might be better due to OpenAI's internal system prompts, possible fine-tuning, and other tricks.


Yeah, the output pricing I think is really interesting: 150% more expensive input tokens, 250% more expensive output tokens. I wonder what's behind that?

That suggests inference time is more expensive than the memory needed to load it in the first place, I guess?


Either something like that, or just because the model's output is basically the best you can get and they're leveraging their market position.

Probably that and what you mentioned.


This. Price is set by value delivered and what the market will pay for whatever capacity they have; it’s not a cost + X% market.


I'm more curious about the input/output token price discrepancy.

Their pricing suggests that output tokens are more expensive to serve for some technical reason, that they're trying to encourage a specific usage pattern, or something along those lines.


Or that market research showed a higher price for input tokens would drive customers away, while a lower price for output tokens would leave money on the table.


> 150% more expensive input tokens, 250% more expensive output tokens. I wonder what's behind that?

Nitpick: It's 50% and 150% more respectively.
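
For anyone checking: from the $10/$30 and $15/$75 prices upthread, the ratios are

    # Claude 3 Opus vs. GPT-4 Turbo price ratios:
    print(15 / 10 - 1)  # 0.5 -> input tokens cost 50% more
    print(75 / 30 - 1)  # 1.5 -> output tokens cost 150% more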



