> I know multiple startups that use LLMs as their core bread-and-butter intelligence platform instead of tuned but traditional NLP models
It seems like LLMs would be perfect for start-ups that are iterating quickly. As the business, problem, and data mature, though, I would expect those LLMs to be consolidated into simpler models. This makes sense from both a cost and a reliability perspective. I also wonder about the impact of making your core IP a set of prompts beholden to the behavior of someone else's model.
GPU RAM capacity isn't typically correlated with inference rate. Precision/quantization does affect model size, which in turn affects inference rate. That said, I would expect a smaller model to be faster, since there's less data to move per token.
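As a rough illustration of the size/speed relationship (my assumptions, not from the comment above: a dense model whose weights dominate memory, and decode that is memory-bandwidth-bound, so every generated token reads all weights once):

```python
# Sketch: weight memory at different precisions, and a bandwidth-bound
# ceiling on single-stream decode speed. Parameter count and bandwidth
# figures below are hypothetical examples.

def weight_bytes(n_params: float, bits: int) -> float:
    """Memory for the weights alone at a given precision."""
    return n_params * bits / 8

def decode_ceiling_tok_s(n_params: float, bits: int, mem_bw_gb_s: float) -> float:
    """Upper bound on tokens/sec if decode is limited by reading weights."""
    return mem_bw_gb_s * 1e9 / weight_bytes(n_params, bits)

if __name__ == "__main__":
    n = 7e9  # hypothetical 7B-parameter model
    for bits in (16, 8, 4):
        gb = weight_bytes(n, bits) / 1e9
        tok_s = decode_ceiling_tok_s(n, bits, mem_bw_gb_s=900)  # ~A100-class bandwidth
        print(f"{bits:>2}-bit: {gb:5.1f} GB of weights, ceiling ~{tok_s:.0f} tok/s")
```

Under these assumptions, halving the precision halves both the memory footprint and the per-token data movement, which is why quantized (smaller) models tend to decode faster on the same hardware.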
No, the BLOOM paper is wild: 2.5 pages of author names with no spacing between them; I counted 472 authors after deduplication. The PaLM 2 paper has only 181. ;)
I'd be wary of this decision, since you've now elected to work only with the insurance salesperson, whose goal is to spend as little as possible on you. Providers at KP regularly tell you to "advocate for yourself," because the doctors are beholden to the insurers. The regulatory agencies allow KP to self-police its quality metrics, so oversight is limited. As a KP member you waive your right to sue them for malpractice and instead have to go through a KP-designed arbitration process. And if you're in California, there's a cap on malpractice damages ($350k non-death / $500k death), so chances are no lawyer will take a case against them unless it's extremely clear-cut and won't require any domain experts ($$$).
KP may officially be a non-profit, but its CEO rakes in millions, and its physician group is a separate for-profit entity.
There are some providers who make heroic efforts to deliver quality care, but the overall culture seems very bureaucratic, full of big-co checkbox exercises.