
Yeah, you're not wrong, but it's a bit misleading. This does let you run faster, but it does so by letting you use a larger batch size (arguably not best practice, but your mileage may vary). Memory pooling is a bit different in that you can treat the combined cards as a single card from TF/PyTorch.
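To make the distinction concrete, here's a rough sketch (assuming PyTorch on a 2-GPU box; nothing here is specific to any one card) of how the framework normally sees the cards, i.e. as separate devices with separate memory budgets rather than one pooled card:

    import torch

    # Without pooling, each card shows up as its own device with its
    # own memory; nothing is combined automatically.
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"cuda:{i} -> {props.name}, {props.total_memory / 1e9:.1f} GB")

    # A tensor lives on exactly one of those devices; asking a single
    # device for more than one card's worth of memory still OOMs.
    x = torch.zeros(1024, 1024, device="cuda:0")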


But batch size is probably the least of your problems, since you can do data parallelism (send half the batch to each GPU, combine the results on the CPU).
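A minimal sketch of that manual flavor of data parallelism (assuming PyTorch, two GPUs, and a toy model; the names and sizes are just illustrative):

    import copy
    import torch
    import torch.nn as nn

    model = nn.Linear(512, 10)                       # toy model standing in for the real net
    replicas = [copy.deepcopy(model).to(f"cuda:{i}") for i in range(2)]

    batch = torch.randn(64, 512)                     # full batch assembled on the CPU
    halves = batch.chunk(2)                          # half for each GPU

    # Send half the batch to each GPU, run the forward pass there,
    # then bring the outputs back and combine them on the CPU.
    outputs = [replicas[i](halves[i].to(f"cuda:{i}")) for i in range(2)]
    combined = torch.cat([o.cpu() for o in outputs], dim=0)
    print(combined.shape)  # torch.Size([64, 10])

In practice torch.nn.DataParallel or DistributedDataParallel handles the scatter/gather (and gradient sync) for you, so you rarely write this by hand.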

I think a model bigger than a single GPU's memory is the only case where you really wish for NVLink on V100s.
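For that case the usual workaround is naive model parallelism: put different layers on different cards and ship activations between them, which is exactly the GPU-to-GPU traffic NVLink would speed up. A rough sketch (assuming PyTorch and two GPUs; the layer sizes are arbitrary):

    import torch
    import torch.nn as nn

    class SplitModel(nn.Module):
        """Toy model split across two GPUs because it wouldn't fit on one."""
        def __init__(self):
            super().__init__()
            self.part1 = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU()).to("cuda:0")
            self.part2 = nn.Linear(4096, 10).to("cuda:1")

        def forward(self, x):
            h = self.part1(x.to("cuda:0"))
            # Activations cross between the GPUs here; without NVLink this
            # hop goes over PCIe, which is the bandwidth you end up missing.
            return self.part2(h.to("cuda:1"))

    model = SplitModel()
    out = model(torch.randn(8, 4096))
    print(out.device)  # cuda:1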




