Hacker News | pveldandi's submissions
1. Show HN: How We Run 60 Hugging Face Models on 2 GPUs
4 points by pveldandi 9 days ago | 20 comments
2. Benchmark: A100 vs. H100 NVMe Random Read throughput during multi-GPU loading
1 point by pveldandi 61 days ago
3. Show HN: 50+ LLMs on 2 GPUs with 2-Second Swapping? We built AI-Native Runtime (github.com/inferx-net)
5 points by pveldandi 8 months ago
4. Show HN: InferX - AI Lambda-Like Inference Function as a Service
2 points by pveldandi 9 months ago
5. We're running 50 LLMs on 2 GPUs – no cold starts, no overprovisioning
4 points by pveldandi 9 months ago | 1 comment
6. Show HN: InferX – an AI-native OS for running 50 LLMs per GPU with hot swapping
3 points by pveldandi 9 months ago | 2 comments
