
Could you use some sort of RAID array of GPUs to compensate...?



nvidia-smi exposes all cards, so you could run the same workload on multiple cards. That likely won't help with failure modes that are intrinsic to the workload itself or to the compute environment, though. I would speculate that some of the more aggressive failure modes would show up across all the hardware.

Maybe someone could run workloads across CUDA and ZLUDA (Nvidia and other hardware), but really we might just need more reliability to run a file system efficiently and reliably across disparate GPU hardware.
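
For what it's worth, the "same workload on multiple cards" idea can be sketched as a simple majority vote over per-device results. This is only a rough illustration: `redundant_run` and the `workload` callable are hypothetical names, and the device IDs are assumed to be whatever indices nvidia-smi lists. Nothing here talks to a real GPU.

```python
import hashlib
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def digest(result: bytes) -> str:
    """Hash a result so outputs can be compared cheaply."""
    return hashlib.sha256(result).hexdigest()


def redundant_run(
    workload: Callable[[int], bytes],  # hypothetical: runs the job on one device
    device_ids: Sequence[int],         # assumed: indices as listed by nvidia-smi
) -> Tuple[bytes, List[int]]:
    """Run the same workload on every device and majority-vote the output.

    Returns the majority result and the IDs of devices that disagreed.
    """
    if not device_ids:
        raise ValueError("need at least one device")

    results = {dev: workload(dev) for dev in device_ids}
    votes = Counter(digest(r) for r in results.values())
    winner, count = votes.most_common(1)[0]

    # Require a strict majority; otherwise there is no result to trust.
    if count <= len(device_ids) // 2:
        raise RuntimeError("no majority among devices")

    dissenters = [dev for dev, r in results.items() if digest(r) != winner]
    majority = next(r for r in results.values() if digest(r) == winner)
    return majority, dissenters


if __name__ == "__main__":
    # Stand-in workload: device 2 returns a corrupted result.
    def fake_workload(dev: int) -> bytes:
        return b"corrupt" if dev == 2 else b"expected"

    print(redundant_run(fake_workload, [0, 1, 2]))
```

Note that voting only catches faults that differ between cards. If a failure is intrinsic to the workload or environment and every card produces the same wrong answer, it wins the vote unnoticed.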



