Hacker News | zinccat's submissions
1. Mooncake: A KVCache-Centric Disaggregated Architecture for LLM Serving (github.com/kvcache-ai)
13 points by zinccat 5 days ago | past | discuss
2. Best Papers at CVPR 2024 (fxguide.com)
1 point by zinccat 14 days ago | past
3. Llama3V is suspected to have been stolen from the MiniCPM-Llama3-v2.5 project (github.com/openbmb)
30 points by zinccat 31 days ago | past | 7 comments
4. Generative AI in Search: Let Google do the searching for you (blog.google)
6 points by zinccat 50 days ago | past
5. DeepSeek V2, near GPT4 performance with 30% GPT3.5 cost ($0.14/M input tokens) (deepseek.com)
16 points by zinccat 57 days ago | past | 2 comments
6. SEQUOIA: Exact Llama2-70B on an RTX4090 with half-second per-token latency (infini-ai-lab.github.io)
131 points by zinccat 60 days ago | past | 61 comments

