Hacker News | jsenn's favorites
1. SplitQuantV2: Enhancing Low-Bit Quantization of LLMs Without GPUs (arxiv.org)
34 points by PaulHoule 58 days ago | 9 comments
2. 32k context length text embedding models (voyageai.com)
101 points by fzliu 6 months ago | 32 comments
3. Show HN: Llama 3.2 Interpretability with Sparse Autoencoders (github.com/paulpauls)
579 points by PaulPauls 6 months ago | 99 comments
4. Quantized Llama models with increased speed and a reduced memory footprint (meta.com)
508 points by egnehots 7 months ago | 122 comments
5. Detecting when LLMs are uncertain (thariq.io)
283 points by trq_ 7 months ago | 165 comments
6. Better RAG Results with Reciprocal Rank Fusion and Hybrid Search (assembled.com)
249 points by johnjwang 11 months ago | 57 comments
7. Vector indexing all of Wikipedia on a laptop (foojay.io)
513 points by tjake 11 months ago | 140 comments
8. Unprojecting text with ellipses (2016) (mzucker.github.io)
151 points by nmstoker on May 19, 2024 | 21 comments
9. Making Sense of Acquire-Release Semantics (davekilian.com)
159 points by sph on May 10, 2024 | 69 comments
10. Binary array set (nayuki.io)
63 points by stereoabuse on March 26, 2024 | 24 comments
11. The Era of 1-bit LLMs: ternary parameters for cost-effective computing (arxiv.org)
1040 points by fgfm on Feb 28, 2024 | 447 comments
12. The Continuity of Splines [video] (youtube.com)
192 points by mcorcuera on Dec 16, 2022 | 41 comments
