It's really a shame GPU slices aren't a thing -- a monthly cost of $1k for "a GPU" is just so far outside of what I could justify. I guess it's not terrible if I can batch-schedule a mega-GPU for an hour a day to catch up on tasks, but then I'm still looking at nearly $50/month.
I don't know exactly what type of cloud offering would satisfy my needs, but what's funny is that attaching an AMD consumer GPU to a Raspberry Pi is probably the most economical approach for a lot of problems.
Maybe something like a system where I could hotplug a full GPU into a system for a reservation of a few minutes at a time and then unplug it and let it go back into a pool?
FWIW, there are a large number of ML-based workflows that I'd like to plug into progscrape.com, but it's been very difficult to find a model that works without breaking the hobby-project bank.
Do you think you can use those machines for confidential workflows for enterprise use? I'm currently struggling to balance running inference workloads on expensive AWS instances, where I can trust that data remains private, against using cheaper platforms.
Of course you cannot use these machines "for confidential workflows for enterprise use"; at least with AWS you know whose computer you're working with. That said, it's genuinely hard to steal your data as long as it stays in memory and you use something like mTLS to get it in and out of memory via E2EE. You can figure out the rest of your security model along the way, but anything sensitive (i.e. confidential) would surely fall well outside this model.
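To be concrete about the mTLS part: the transport side is only a few lines in Python with `requests`. A rough sketch, where the endpoint URL and certificate paths are made-up placeholders:

```python
# Minimal mTLS client sketch. The URL and cert paths are hypothetical;
# the point is that the client cert proves who you are and the pinned CA
# proves who the server is, so the data is E2EE in transit.
import requests

INFERENCE_URL = "https://inference.example.com/v1/generate"  # placeholder

resp = requests.post(
    INFERENCE_URL,
    json={"prompt": "summarize this"},
    cert=("client.crt", "client.key"),  # client cert + key (mutual auth)
    verify="ca.crt",                    # pin the server's CA, not the system store
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```

None of that protects you from the box operator dumping memory, though, which is why the genuinely sensitive stuff falls outside the model.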
I'm currently exploring how custom AI workflows (e.g. text-to-SQL, custom report generation using private data) can help, given the current SOTA. I'm looking to develop tooling over the next 3-6 months, and I'd like to see what we can come up with before dropping $50-100k on hardware.
I threw together a toy project to see if it would help me understand the basic concepts. My takeaway: if you can shape your input into something a dedicated classification model (e.g. YOLO for document layout analysis) can work with, you can farm each class out to the most appropriate model.
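Roughly the shape of the dispatch, as a sketch. The class names, weights file, and handler bodies are all made up for illustration; only the ultralytics calls are real:

```python
# Sketch of "farm each class out": a layout detector tags each region,
# and a dispatch table routes the crop to whichever model suits that class.
from ultralytics import YOLO  # assumes a YOLO model fine-tuned for layout

detector = YOLO("layout.pt")  # hypothetical weights

def ocr_text(crop):        return "..."  # cheap local OCR, e.g. tesseract
def parse_table(crop):     return "..."  # heavier table-structure model
def describe_figure(crop): return "..."  # or punt to a hosted vision model

HANDLERS = {"text": ocr_text, "table": parse_table, "figure": describe_figure}

def process_page(image):
    """image: numpy array of one page; yields (class, result) pairs."""
    results = detector(image)[0]
    for box in results.boxes:
        label = detector.names[int(box.cls)]
        x1, y1, x2, y2 = map(int, box.xyxy[0])
        crop = image[y1:y2, x1:x2]
        if label in HANDLERS:
            yield label, HANDLERS[label](crop)
```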
It turns out that I can run most of the appropriate models on my ancient laptop if I don't mind waiting for the complicated ones to finish. If I do mind, I can just send that part to OpenAI or similar. If your workflow can scale horizontally like my OCR pipeline crap, every box in your shop with RAM >= 16GB might be useful.
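The fan-out itself can be as dumb as a shared queue that every box polls. A sketch with Redis (the host and queue names are made up, and Redis is just one easy way to do it):

```python
# Worker sketch: any box on the LAN runs this, pulling page jobs from a
# shared Redis queue and pushing results back. process_page is a stub
# standing in for the local OCR/layout pipeline.
import json
import redis

r = redis.Redis(host="192.168.1.10")  # hypothetical box hosting the queue

def process_page(job):
    return {"page": job["page"], "text": "..."}  # run the real pipeline here

while True:
    _, raw = r.blpop("ocr:pages")  # block until a job arrives
    result = process_page(json.loads(raw))
    r.rpush("ocr:results", json.dumps(result))
```

The slow boxes just pull fewer jobs, which is the whole trick.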
Apologies if this is all stuff you're familiar with.
I use them a lot and constantly forget to turn mine off, and it just drains my credits. I really need to write a job to turn them off when they're idle for longer than 20 minutes.
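Something like this would probably do it: poll nvidia-smi and shut down after 20 idle minutes. The 5% busy threshold is a guess, and it assumes an NVIDIA box where shutting down actually stops the billing:

```python
# Idle-shutdown sketch: check GPU utilization once a minute and halt the
# machine once it has been (near-)idle for 20 minutes straight.
import subprocess
import time

IDLE_LIMIT = 20 * 60  # seconds of continuous idle before shutdown
POLL_EVERY = 60       # seconds between checks

def gpu_busy():
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return any(int(line) > 5 for line in out.splitlines())  # >5% = busy

idle_since = time.monotonic()
while True:
    if gpu_busy():
        idle_since = time.monotonic()
    elif time.monotonic() - idle_since > IDLE_LIMIT:
        subprocess.run(["sudo", "shutdown", "-h", "now"])
        break
    time.sleep(POLL_EVERY)
```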
Which you are! Whatever happened to the MIG implementation work that y’all were working on? Last I heard it was “cursed” and nearly made someone go insane, which is very normal for NVIDIA hardware :)