Wow, that is actually quite decent pricing. It makes me want to deploy something like llamafile/ollama to fly.io + GPU for my personal on-demand LLM whims. (@simonw, I'm looking at you: if you haven't already done this on fly.io, I feel like you're probably thinking about it :) ) It seems private enough, and I could throw some basic auth on top for myself and trusted friends/family so I don't run up crazy bills.
Only problem might be that every time it spins up it has to download the large model, which could be wasteful if you get charged for network/bandwidth usage. I wonder whether it would be more cost-efficient to use persistent storage, or to just measure how much time and bandwidth the download actually costs on each cold start.
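The storage-vs-redownload question above is really just a breakeven calculation. Here's a quick sketch; every number in it is a made-up placeholder (fly.io's actual volume and transfer rates will differ, and inbound transfer may not even be billed), but the logic is the same once you plug in real figures:

```python
# Back-of-envelope: re-downloading the model on each cold start vs. keeping
# it on a persistent volume. All figures below are placeholder assumptions,
# NOT actual fly.io rates.

MODEL_GB = 40.0                  # size of the model file (assumption)
COLD_STARTS_PER_MONTH = 60       # how often the machine spins up (assumption)
TRANSFER_COST_PER_GB = 0.02      # $/GB if the download is billed (assumption)
VOLUME_COST_PER_GB_MONTH = 0.15  # $/GB-month for a persistent volume (assumption)

redownload_monthly = MODEL_GB * COLD_STARTS_PER_MONTH * TRANSFER_COST_PER_GB
volume_monthly = MODEL_GB * VOLUME_COST_PER_GB_MONTH

# Persistent storage wins once cold starts per month exceed this:
breakeven_starts = VOLUME_COST_PER_GB_MONTH / TRANSFER_COST_PER_GB

print(f"re-download every cold start: ${redownload_monthly:.2f}/month")
print(f"persistent volume:            ${volume_monthly:.2f}/month")
print(f"breakeven: ~{breakeven_starts:.1f} cold starts/month")
```

And that's before counting the cold-start latency of pulling tens of GB over the network, which probably tips it toward a volume even sooner.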
Also, we'd have to compare apples to apples, right? One can't complain about fly.io's GPU pricing and then point to, say, vast.ai or Lambda Labs as "evidence" that it's overpriced. They aren't the same type of service at all.