
Which GPUs do you use? Is the pricing ($1 per 1000 requests) independent of inference time and bandwidth? For instance, some of our models finish within 2 seconds while others take ~60 seconds, depending on the input. We have been searching for something like this for a long time, but all the other options were lacking in one way or another.

Hey panabee, pricing is definitely something I want to iterate on. I wanted to start off simple and then move to tiered pricing based on what type of model you're running, whether you need CPU or GPU, and how long execution takes. Let's definitely connect, as I'd love to serve your use case.
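
To make the question concrete, here is a rough sketch of the arithmetic behind it. It only uses the figures quoted above ($1 per 1000 requests, a 2 second model and a ~60 second model) and shows how much a flat per-request price pays per second of inference in each case. The function name is made up for illustration, and the sketch ignores bandwidth and CPU/GPU cost, since the thread gives no numbers for those.

```python
# Flat price quoted in the thread: $1 per 1000 requests.
FLAT_PRICE_PER_REQUEST = 1.0 / 1000


def revenue_per_compute_second(inference_seconds: float) -> float:
    """Dollars collected per second of inference under the flat price.

    Hypothetical helper for illustration; not part of any real pricing API.
    """
    if inference_seconds <= 0:
        raise ValueError("inference_seconds must be positive")
    return FLAT_PRICE_PER_REQUEST / inference_seconds


# The two inference times mentioned in the question.
fast = revenue_per_compute_second(2.0)
slow = revenue_per_compute_second(60.0)

# How many times more the fast model pays per second of compute.
ratio = fast / slow

print(f"2 s model:  ${fast:.6f} per compute-second")
print(f"60 s model: ${slow:.6f} per compute-second")
print(f"ratio: {ratio:.0f}x")
```

Under these assumptions the 60 second model gets about 30 times more compute per dollar than the 2 second one, which is the gap that tiers based on execution time would presumably close.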
