Ask HN: Are there any reliable benchmarks for Machine Learning Model Serving?
6 points by KuriousCat on Feb 10, 2024 | 3 comments
Hi, I am looking for benchmarks that compare the performance of machine learning model serving frameworks. Is there a reliable benchmark that gives a good snapshot of the state of the art? Earlier posts such as the following exist, but they don't paint the full picture:
1. https://news.ycombinator.com/item?id=28760158
2. https://github.com/cortexlabs/cortex/tree/v0.15.1



This might be relevant. It’s a consortium with members like Baidu, NVIDIA, Arm, etc.

https://mlcommons.org/benchmarks/inference-datacenter/


Not exactly what you’re looking for, but perhaps you’ll find it useful: llama.cpp benchmarked on all Apple M-series chips, and the comments include comparisons with NVIDIA GPUs.

https://github.com/ggerganov/llama.cpp/discussions/4167


There is no single answer. Each option has strengths and weaknesses that depend on which model formats you can use on your specific hardware, how much VRAM you have, whether you want to use multiple GPUs, whether those GPUs are the exact same model, etc.



