Hacker News

Wow, that is quite decent pricing actually. This makes me want to try deploying something like llamafile/ollama to fly.io + GPU for my personal on-demand llama/LLM whims. (@simonw, I'm looking at you: I feel like if you haven't already done this on fly.io, you're probably thinking about it :) ) It seems private enough; I could throw some basic auth on top of it for me and trusted friends/family so I don't get crazy bills.
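The "basic auth on top" idea could be sketched with nothing but Python's stdlib. This is purely illustrative: the credentials, listen port, and GET-only handling are assumptions, 11434 is ollama's default API port, and a real deployment would use a proper reverse proxy (Caddy, nginx) instead.

```python
# Minimal sketch: an HTTP basic-auth gate in front of a local ollama instance.
# Illustrative only -- credentials and ports are hypothetical.
import base64
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

USERS = {"me": "s3cret"}  # hypothetical user/password pairs


def authorized(header):
    """Return True if an 'Authorization: Basic ...' header matches USERS."""
    if not header or not header.startswith("Basic "):
        return False
    try:
        user, _, pw = base64.b64decode(header[6:]).decode().partition(":")
    except Exception:
        return False
    return USERS.get(user) == pw


class AuthProxy(BaseHTTPRequestHandler):
    upstream = "http://localhost:11434"  # ollama's default port

    def do_GET(self):
        if not authorized(self.headers.get("Authorization")):
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="ollama"')
            self.end_headers()
            return
        # Forward the request upstream and relay the response body.
        with urlopen(Request(self.upstream + self.path)) as resp:
            self.send_response(resp.status)
            self.end_headers()
            self.wfile.write(resp.read())


# To run the gate (commented out in this sketch):
# HTTPServer(("", 8080), AuthProxy).serve_forever()
```

A real ollama client would also need POST forwarding (the generate/chat endpoints are POST), so treat this as the shape of the idea rather than a working proxy.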

Only problem might be that downloading the large model every time it spins up could be wasteful in terms of network/bandwidth charges. I wonder whether persistent storage would be more cost-efficient, or whether it's worth just measuring how much time and bandwidth a download costs on every cold start.
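The persistent-storage route might look something like the following, assuming fly.io volumes and ollama's default model directory (`/root/.ollama` in the official container image); the volume name and size are illustrative:

```shell
# Create a persistent volume so the downloaded model survives cold starts
# instead of being re-fetched each time the machine spins up.
fly volumes create ollama_models --size 50

# Then mount it in fly.toml:
#   [mounts]
#     source = "ollama_models"
#     destination = "/root/.ollama"   # ollama's default model directory
```

With the volume mounted, only the first start pays the download cost; subsequent cold starts read the cached model from disk.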




It's horrible, horrible pricing! Their on-demand price for an A100 is what gets you an H100 SXM in other places.


I am curious which "other places" you are comparing it to.


Also, we'd have to compare apples to apples, right? One can't complain about fly.io GPU pricing and then point to, say, vast.ai or Lambda Labs as "evidence"; they aren't at all the same type of service.


We've got a one-liner for spinning up your own ollama UI. See https://github.com/fly-apps/ollama-open-webui


Oh snap thank you!



