For me Go is like the 80% language. I like TypeScript as well, but Go is such a reliable workhorse, I'd say. It's not "sexy", but it's satisfying how it gives you simple building blocks that you can build extremely complex software with.
I think that's because there's less local AI usage now that the big labs offer all kinds of image models, so there's really no rush of people self-hosting Stable Diffusion etc. anymore.
The space moved from consumer to enterprise pretty fast due to models getting bigger.
Today's free models are not really bigger when you account for the use of MoE (with ever increasing sparsity, meaning a smaller fraction of active parameters), and better ways of managing KV caching. You can do useful things with very little RAM/VRAM, it just gets slower and slower the more you try to squeeze it where it doesn't quite belong. But that's not a problem if you're willing to wait for every answer.
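To put rough numbers on the sparsity point, here's a minimal sketch of the arithmetic. The function name and all the figures (64 experts, 4 routed per token, 1.5B parameters per expert, 4B shared) are made up for illustration and don't describe any particular model.

```python
def moe_active_params(total_experts, experts_per_token, expert_params, shared_params):
    """Return (active, total, active / total) parameter counts for a sparse MoE.

    Assumes every expert has the same size and that the shared part
    (attention, embeddings, router) is used for every token.
    """
    total = shared_params + total_experts * expert_params
    active = shared_params + experts_per_token * expert_params
    return active, total, active / total


# Hypothetical model: 100B total parameters, but only 10B touched per token.
active, total, fraction = moe_active_params(
    total_experts=64,
    experts_per_token=4,
    expert_params=1_500_000_000,
    shared_params=4_000_000_000,
)
print(f"{active / 1e9:.0f}B active of {total / 1e9:.0f}B total ({fraction:.0%})")
```

Per-token compute scales with the active count, while the memory needed to hold everything scales with the total, which is why the "slower but still works" tradeoff shows up when the full set doesn't fit.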
yeah, but I mean more like the old setups where you'd just load a model on a 4090 or something, even with MoE it's a lot more complex and takes more VRAM, right? like it just seems not justifiable for most hobbyists
With sparse MoE it's worth running the experts in system RAM since that allows you to transparently use mmap and inactive experts can stay on disk. Of course that's also a slowdown unless you have enough RAM for the full set, but it lets you run much larger models on smaller systems.
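A toy sketch of the mmap idea, assuming a made-up flat file layout where each expert is a fixed-size block of float32 values. Real inference runtimes use their own file formats; the names here (`write_fake_experts`, `read_expert`, `EXPERT_FLOATS`) are hypothetical and only meant to show that reading one expert touches just that slice of the file.

```python
import mmap
import os
import struct
import tempfile

EXPERT_FLOATS = 1024              # hypothetical size of one expert, in float32 values
EXPERT_BYTES = EXPERT_FLOATS * 4  # float32 is 4 bytes


def write_fake_experts(path, num_experts):
    """Write num_experts blocks; expert i is filled with the value float(i)."""
    with open(path, "wb") as f:
        for i in range(num_experts):
            f.write(struct.pack(f"<{EXPERT_FLOATS}f", *([float(i)] * EXPERT_FLOATS)))


def read_expert(mm, index):
    """Copy one expert's weights out of the mapping.

    Only the pages covering this slice need to be resident; the OS can
    leave the other experts on disk until they are actually requested.
    """
    start = index * EXPERT_BYTES
    raw = mm[start:start + EXPERT_BYTES]
    return struct.unpack(f"<{EXPERT_FLOATS}f", raw)


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "experts.bin")
        write_fake_experts(path, num_experts=8)
        with open(path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                # Pretend the router picked experts 2 and 5 for this token.
                return [read_expert(mm, i)[0] for i in (2, 5)]


print(demo())
```

If the whole file fits in RAM, the page cache ends up holding everything and reads are fast; if not, rarely used experts get evicted and paged back in from disk, which is where the slowdown comes from.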
We used to expose the dedicated servers directly (i.e. no CDN at all), and while that was fine latency-wise, the lack of DDoS protection was really the limiting factor. E.g. Hetzner will just blackhole your subnet if you get DDoSed.
It feels rather unviable nowadays to run a business without some CDN/DDoS protection service in front of your website.
yeah, but dealing with DDoS is easier in terms of DMCA unlike with CDNs because it's you hosting it, not the service provider (this is how Cloudflare avoids DMCA when you cache with them iirc)
so if you can just find a good dedicated server provider that won't cut you off, maybe that's a potential solution?
like another guy said, it's the marketing thing. the OpenClaw thing became a sensation in the news etc. and now OpenAI can claim it as theirs. they have infinite money anyway, so might as well buy stuff like that I guess.
I'm also seeing a lot of new rambling in Sonnet 4.6 compared to 4.5: more markdown slop, and more pointing out details in the context that aren't too useful,
which then causes increased token usage because you need to prompt multiple times.