
A frontier model + RAG is good when you need cross-discipline abilities and general knowledge; niche models are best when the domain is somewhat self-contained (for instance, if you wanted a model that is amazing at role-playing certain types of characters).
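
A minimal sketch of the frontier-model-plus-RAG pattern. Everything here is illustrative: call_frontier_model is a hypothetical stand-in for a real LLM API, and the word-overlap retriever would be an embedding search in practice:

    # Toy RAG loop: retrieve the most relevant documents, then stuff
    # them into the prompt for a general-purpose frontier model.

    def call_frontier_model(prompt: str) -> str:
        # Hypothetical: replace with a real API call.
        return f"<answer based on: {prompt[:60]}...>"

    def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
        # Crude relevance score: bag-of-words overlap with the query.
        q = set(query.lower().split())
        scored = sorted(corpus, key=lambda d: -len(q & set(d.lower().split())))
        return scored[:k]

    def rag_answer(query: str, corpus: list[str]) -> str:
        context = "\n".join(retrieve(query, corpus))
        prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
        return call_frontier_model(prompt)

    corpus = [
        "The power grid uses 50 Hz in Europe.",
        "Mitochondria are the powerhouse of the cell.",
        "Rust's borrow checker enforces aliasing rules at compile time.",
    ]
    print(rag_answer("What frequency does the European grid use?", corpus))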

The future is model graphs with networked mixtures of experts, where models know about other models and can call them as part of recursive prompts, with some sort of online training to tune the weights of the model graph.
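
A toy sketch of the idea. The experts, the softmax router, and the reward-driven weight update are all made-up placeholders, not a real system:

    import math, random

    # Hypothetical "model graph": a router holds one learnable weight per
    # expert, picks an expert via softmax, and experts may call back into
    # the graph (depth-limited recursion). An online update nudges the
    # chosen edge's weight toward an external reward signal.

    def math_expert(q, ask):
        return f"[math model answers: {q}]"

    def roleplay_expert(q, ask):
        return f"[roleplay model answers: {q}]"

    def general_expert(q, ask):
        # A general model can delegate a sub-question back into the graph.
        sub = ask("simplify: " + q)
        return f"[general model, using sub-answer '{sub}']"

    EXPERTS = {"math": math_expert, "roleplay": roleplay_expert,
               "general": general_expert}
    weights = {name: 0.0 for name in EXPERTS}  # learnable edge weights
    LR = 0.5

    def get_feedback(answer):
        # Hypothetical reward; in practice from users or automated evals.
        return 1.0 if "math" in answer else 0.4

    def route(query, depth=0, max_depth=2):
        if depth >= max_depth:
            return "<recursion depth limit>"
        # Softmax over edge weights gives the routing distribution.
        exps = {n: math.exp(w) for n, w in weights.items()}
        total = sum(exps.values())
        names = list(exps)
        chosen = random.choices(names, [exps[n] / total for n in names])[0]
        answer = EXPERTS[chosen](query, lambda q: route(q, depth + 1))
        reward = get_feedback(answer)
        weights[chosen] += LR * (reward - 0.5)  # online tuning of the graph
        return answer

    print(route("integrate x^2 dx"))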




> The future is model graphs with networked mixtures of experts, where models know about other models and can call them as part of recursive prompts, with some sort of online training to tune the weights of the model graph.

What's the difference between that and combining all of the models into a single model? Aren't you just introducing limitations in communication and training between different parts of that über-model, limitations that may as well be encoded into the single model if they're useful? Are you just partitioning for training performance? That's a big deal, of course, but guessing the right partitioning and communication limitations seems far less straightforward than the usual stupid "throw it all in one big pile and let it work itself out" approach.


The limitation is the amount of model you can fit on your hardware. Also, information from one domain can sometimes introduce incorrect biases into another that are very hard to fix, so training on one domain only can produce much better results.
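
Rough weights-only memory arithmetic for the hardware point, assuming fp16 (2 bytes per parameter) and ignoring activations and KV cache:

    # Back-of-envelope: parameter memory at fp16 (2 bytes/param).
    def weight_gib(params_billion, bytes_per_param=2):
        return params_billion * 1e9 * bytes_per_param / 2**30

    print(f"70B monolith: ~{weight_gib(70):.0f} GiB")   # far beyond one consumer GPU
    print(f"7B niche model: ~{weight_gib(7):.0f} GiB")  # fits on a 24 GiB card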



