
> Our monthly total spend is around $180 with a team of 6, about half technical; our biggest line items are for American models or subscriptions which we probably will be planning to get rid of.)

Please tell us more :). Do you pay per token through Bedrock, OpenRouter, or somewhere else? How many tokens do you use per month, and how many for each task? Which harnesses?




Pay for DeepSeek directly. One developer insists on having his own account and in theory expenses it, but he forgets to turn in $10 expense reports. (Total spend in last two months = about $45.)

Pay for OpenAI Pro directly, but I’m the only guy that uses Codex. $100 a month. My nontechnical partner likes to talk to ChatGPT 5.5 Pro for image-related tasks (think generating interior decorating pics).

The nontechnical staff use a Gemini account on a Google family AI Pro sub. I use Antigravity when working on Android or Google Cloud API codebases.

Everyone gets OpenCode Go. The cost is trivial. $10 a month per person.

Pay for MiMo directly. We use it during Chinese off peak hours though. Total spend so far $25 in last month.

We run a few Qwen models locally and pretty much have them pegged all day. RTX 5090 on a PC and a Mac Studio.

There’s also Grok which is used for Imagine for artistic / graphic design related work. I also use the subscription for a vision model in my oh-my-pi harness.

We’re having discussions about how to pull in GLM-5.2 cost effectively. We compete with third world development shops so we can’t really pass on inference costs, but we can benefit from getting jobs done for customers faster. But ⅔ of our work is either internal or open source projects we can’t bill for.


> Pay for MiMo directly. We use it during Chinese off peak hours though. Total spend so far $25 in last month.

Team size, if you don't mind?

> We're having discussions about how to pull in GLM-5.2 cost effectively

Are you evaluating Alibaba's token plan ($50/mo), which includes the Qwen3, MiniMax M2, Kimi K2, and GLM5 series?


6 people, 3 programmers, 3 non programmers who now use AI as much as anyone else.

I have not yet checked out Alibaba’s plans. We’re still just using OpenCode Go for GLM-5.2 and Qwen-3.7-Max.

I haven’t looked into MiniMax M3 much at all due to cost.


Not the GP, but I use Opus for planning, DeepSeek for the actual coding (implementing the plan), and GPT for review. GPT is inexhaustible on the $20/mo plan, DeepSeek is dirt cheap (maybe $10/mo), and Claude is Claude.

GP is talking about API / token-based prices, that's why I asked.

I don't know, he said "subscriptions" in the line items, but e.g. I use DeepSeek via the API.

Ah maybe you're right.

I can manage this budget with the Chinese models on AWS Bedrock. However, in my experience, they aren't as good as Claude today.



