Eh, sure, the latency is suboptimal. But if you have an LLM in the mix, the LLM's latency will dominate the overall response time. At that point you might not care how performant your index is, and since performance/cost is non-linear, accepting a slower index can translate into very significant savings.
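
Rough back-of-the-envelope sketch (all numbers below are made-up assumptions, just to show the shape of the argument):

    # Illustrative latency budget for an LLM-backed retrieval pipeline.
    # All figures are hypothetical, not measurements.

    index_fast_ms = 5      # assumed latency of a highly tuned index
    index_cheap_ms = 50    # assumed latency of a cheaper, slower index
    llm_ms = 2000          # assumed LLM generation time, which dominates

    total_fast = index_fast_ms + llm_ms    # 2005 ms end-to-end
    total_cheap = index_cheap_ms + llm_ms  # 2050 ms end-to-end

    overhead_pct = 100 * (total_cheap - total_fast) / total_fast
    print(f"fast index:  {total_fast} ms")
    print(f"cheap index: {total_cheap} ms ({overhead_pct:.1f}% slower end-to-end)")

The 10x slower index adds only ~2% to the user-visible response time, invisible next to the LLM, while the price gap between those two index tiers can be far more than 10x.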