The setting is mostly cosmetic and only affects the Bluesky official app and web interface. People do find this setting helpful for curbing external waves of harassment (less motivated people just won't bother making an account), but the data is public and is available on the AT protocol: https://pdsls.dev/at://robpike.io/app.bsky.feed.post/3matwg6...
So nothing is stopping LLMs from training on that data per se.
That's assuming that AI companies are gathering data in a smart way. The entire MusicBrainz database can be downloaded for free, but AI scrapers are still attempting to scrape it one HTML page at a time, which often causes errors and/or slowdowns for the service.
Yeah, that’s true. I’m just saying that if someone wants to put in a modicum of effort, the AT ecosystem is highly scrapable by design. In fact, apps themselves (like Bluesky) are essentially scrapers.
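To make the "scrapable by design" point concrete, here is a rough sketch of how an `at://` URI like the one linked above maps onto a plain HTTP request. It assumes the standard `com.atproto.repo.getRecord` XRPC method. The helper names, the `pds.example.com` host, and the `abc123` record key are placeholders made up for illustration; a real client would first resolve the account's actual PDS host from its DID document.

```python
from urllib.parse import urlencode


def parse_at_uri(uri: str) -> tuple[str, str, str]:
    """Split at://<repo>/<collection>/<rkey> into its three parts."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected at://repo/collection/rkey, got {uri!r}")
    return parts[0], parts[1], parts[2]


def get_record_url(pds_host: str, at_uri: str) -> str:
    """Build the unauthenticated GET URL for a single public record.

    pds_host is an assumption supplied by the caller (hypothetical here).
    """
    repo, collection, rkey = parse_at_uri(at_uri)
    query = urlencode({"repo": repo, "collection": collection, "rkey": rkey})
    return f"{pds_host.rstrip('/')}/xrpc/com.atproto.repo.getRecord?{query}"


if __name__ == "__main__":
    # Placeholder host and record key, not a real post.
    print(get_record_url("https://pds.example.com",
                         "at://example.com/app.bsky.feed.post/abc123"))
```

No login is involved at any point in that URL, which is the whole point: the "logged-in users only" setting is a request to clients, not an access control on the data.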
1/16 of a CPU is admittedly more terrifying. I remember wayyy back in the days of shared hosting we didn't give less than 1/5th of a CPU; we had all sorts of issues at absolutely anything higher than that.
Pushed for my company to adopt TigerBeetle for its foundational service rewrite recently and couldn't be happier with the experience we're having so far.
Why are we still trying to get C++ to become a viable choice instead of calling it a day and moving on?
I guess it's my old theory that organisations turn into living beings and start living off of mere survival instinct even past the point of serving any purpose to society.