Yep, we're actively working on getting this down. We can meet SLAs with tuning for the real-time vision workloads, but eliminating this compromise entirely is our next big development task.
For consumers, we want to just pass on the price-to-performance ratio. For enthusiasts and companies, we do see that people want their own models and the ability to use the massive amounts of data they have.
Also curious about this. We have a 30-day content retention policy, and we need access to your fine-tuned model/LoRA if we're deploying it. If there's anything we can change, happy to hear it out.
We usually charge by GPU hour for those finetunes, around $8-10 depending on GPU type and volume! This is similar to Modal, but since the engine is fully ours, you don't wait ~1 min for cold starts. Ideally we'll make onboarding super frictionless and self-serve, but we're onboarding people manually for now.
Haha sorry for the typo! Your F500 use case is exactly who we want to target, especially as they start serving finetunes on their own data. Thanks for the feedback!
Our SLA is actually higher, and we're lower priced. We're also using this as a step toward serving finetuned models for much cheaper than Fireworks/Together, without the horrible cold starts of Modal. We're essentially trying to prove that our engine can hang with the best providers while multiplexing models.