Nothing like live unsupervised learning? That's the only «real»-like AI that I can «see» being AI-y. We don't have enough data, and never will have enough, to make supervised-trained models generalizable. We will need something different, because mixture-of-experts and the like won't suffice.
Even if AI ought to be run at home, it probably won't be for most people. In a more technical community, it's natural to think that most people care about "values" when it comes to tech. However, the reality is that people just want what's easiest / cheapest / most fun. It's great when what's easiest/cheapest/most fun aligns with what's best for the individual or society, but those cases are outliers. After spending a few years building a crypto startup, I left with even more conviction in this theory.
> In a more technical community, it's natural to think that most people care about "values" when it comes to tech.
Do people actually think that? I know people sometimes joke about an HN bubble, but I never really took that seriously. People are using stuff like AWS and Azure, which probably wouldn't be the case if "values" were more important than what is most convenient. Relatively few people today set up their own email servers; they use Gmail or whatever instead.
You actually just kind of gave an example of what I meant! Developers are the consumers in this case, and they'll almost always opt to use AWS and Azure over hosting their own infra. It's easier, faster, usually cheaper, and it makes you more marketable.
There's a reason software-as-a-service has beaten out software-as-a-product. People might think they want full ownership and control over what they're buying, want it to be as safe and secure as possible, want it to be self-hosted and not sending information back to the distributor... but that's just not how it plays out for the most part.
OpenAI vs Mistral, Coinbase vs Metamask. I'm not that old, but I'm beginning to notice a trend where the fast, easy, cheap business model wins the most market share, and then slowly works toward providing more values-aligned features as it grows. The company aligned with consumer values from the get-go usually has a tough time keeping up (although it can usually still carve out a good niche from people who are particularly values-focused; Mistral and Metamask aren't doing poorly by any stretch of the imagination).
Because of privacy, convenience, embedding, and security and safety with regard to a government monopoly. Just imagine the paranoia of using government-owned AIs in less free countries. I'm happy the USA is at the forefront of AI development.
Because the ultimate end game for the usefulness of this tech will be something akin to always-on listening that understands all your documents and personal data, which I think we'd all feel better about if it happened locally.
I'm not necessarily saying I agree with building that; I just can't escape the idea that it's where we're heading.
Where is the money for training big expensive open source models going to come from once the investor hype blows over and companies like Mistral actually have to try to make a profit? They currently have negligible revenue despite their $6 billion valuation, that status quo can't be maintained forever.
I guess a parallel could be made to Vue, React, and Svelte.
Who would bother investing so much time, energy, and money into developing an open-source front-end framework for free, or funded by the corporations that use it?
Vue, React, and Svelte don't earn anything directly, I suppose, but then again I'm no expert in this field.
I guess, as some others here have commented, techniques will have to be developed to minimize training time, and governments will need to fund research that benefits the general population, to make it possible to train and host AI models.
But I honestly have no idea ... I'm just a simple dev running my Mistral on a single RTX lol
We'll see, but I think the equivalent of people donating their own time to a traditional open source project would be people donating spare compute for training, and nobody has figured out an effective Folding@home-style distributed approach to ML training yet. Having to train open source models on big, expensive compute clusters, bidding against commercial interests who also want to rent those clusters, could be likened to having to pay everyone working on a traditional open source project full market rate for their time.
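To make the contrast concrete, here's a toy sketch (my own illustration, not anyone's real system): synchronous data-parallel SGD on a one-parameter linear model, with gradients averaged across simulated "volunteer" workers. The model, data, and worker counts are all made up; the point is structural. Every training step requires a gradient exchange among all workers, whereas Folding@home hands out independent work units that can take hours and come back whenever. That per-step synchronization is a big part of why volunteer compute doesn't map cleanly onto model training.

```python
# Toy data-parallel SGD: 4 simulated workers, each holding a data shard,
# averaging gradients every step. Purely illustrative.
import random

random.seed(0)

TRUE_W = 3.0  # the target parameter the model should learn

# Each "volunteer" worker holds its own shard of (x, y) pairs with y = TRUE_W * x.
workers = [
    [(x, TRUE_W * x) for x in (random.uniform(-1, 1) for _ in range(20))]
    for _ in range(4)
]

def local_gradient(shard, w):
    """Mean gradient of the squared error 0.5 * (w*x - y)^2 over one shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

w = 0.0   # shared model parameter, replicated on every worker
lr = 0.5  # learning rate

for step in range(200):
    # Every worker computes a gradient on its local shard...
    grads = [local_gradient(shard, w) for shard in workers]
    # ...then ALL gradients must be collected and averaged before anyone can
    # take the next step. This is the per-iteration communication barrier
    # that Folding@home-style independent work units don't have.
    w -= lr * sum(grads) / len(grads)

print(w)  # converges toward TRUE_W
```

In a real cluster that averaging step is an all-reduce over billions of parameters, tens of thousands of times per run, which is why training happens on machines with very fast interconnects rather than strangers' idle GPUs.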
LLM training isn't even remotely brute force[1]; even with tons of clever tricks to reduce the amount of compute necessary, it's still fundamentally a really hard problem.