This, along with recent Reddit goings-on, has made me realize a major risk with the current structure of online communication. Take either Reddit or Stack Exchange as examples. They build a platform, and users contribute their time, thought, energy, and knowledge to build a community on that platform. Those companies can then gatekeep and restrict access to all that the community built, when all they did is provide the platform, and store the data. We need to rethink this model.
The thought and knowledge of communities and users need to belong to those communities and users. To people they intentionally and thoughtfully delegate to and trust. We need to decentralize our communications, like how the internet used to be before the arrival of social media and mega forums. We need to revert to small, focused forums, with less anonymous, more persistent communication, run by people we trust. Otherwise, we will continue to see mega companies harvest our data and use it (or not provide it) against our wishes. If we don’t work to mitigate that dynamic, we have nobody to blame for the poor outcomes but ourselves.
This was one of the original promises of Stack Overflow: all the content is Creative Commons licensed so that if they "turned evil" (I believe it was Joel that put it this way) the community could, in a way, create a fork. https://web.archive.org/web/20230203170609/https://stackover...
Unfortunately the dumps themselves are not a legal requirement, just a gentleman's agreement, so realistically exercising this ability was still at the whim of the company.
That kind of shit is poison. It's like there's a weakness in community-run stuff that allows people to come in and co-opt it for their own agenda. I don't know that this is a corporate issue, it's an issue of people not pushing back because they don't want to get accused of anything and letting special interests walk all over them. It's happening everywhere. But I agree, no point in dealing with people who spend their time on this garbage.
But the community is not running it: all the infrastructure is in the hands of a for-profit corporation. Contrast this with the Freenode/Libera split: because not just moderation but also hosting was done by the community, they could continue operations fairly quickly when Freenode turned evil.
So I guess that's the lesson we should learn from it (again): the community doesn't own shit if it does not run the daily operations.
I always wonder why original founders just sell the company and do something else. Why don't they try to control it more and make sure it stays aligned with needs of society more?
Either they can't because of shareholder/equity owners pressure, or they won't, because they really don't care and just said it for PR
I certainly wonder about the "do something else" in the sense of serial entrepreneurs. If I could cash in once I'd be done. If I had enough money to retire on, I would. Run a cat shelter or something.
But the actual answer here is probably a combo of a few things: One, running a company is probably not as much fun as building a company. Much of my career has been "pioneer" roles where nobody else has done the job before. At a certain point, the foundation is laid and the problems to solve are different and often less interesting -- at least to me. It's the build vs. maintain thing.
Two, they started with good and noble intentions. Money got involved. A lot of money got involved. The noble intentions were replaced with reality.
Three, have you met users? As a site grows you have to deal with more and more people and people can be very demanding and not very appreciative. Coupled with the previous factors, I think original founders get burnt out and decide to take the cash and move on. The allure of building anew is too much, the grind of maintenance is too much, and the cash is too good to pass up.
Also four... there's a peak for any site. You often don't know when or how, but you do know that someday your site's maximum value, interest, participation, and all that is going to peak and then decline. Sticking around to fight the good fight may just mean passing up a payday and being left with a declining property nobody wants anymore.
Probably 10-15 years ago now someone I knew built a dev focussed B2B SaaS company that was quite successful. I’m not sure if they ever raised investment, but they were hyper efficient and definitely not following the VC model. Profitable, very small team, and that’s how they liked it. I compared notes with the founder regularly and found it very inspirational. He didn’t want to follow the typical VC model. He wanted to build a long term company that could sustain itself. I just loved everything about what they were doing and how well they were doing it.
And then one day, completely unexpectedly (to me) I read that they’ve been acquired by a private equity firm. I reached out to find out what happened and what changed. His answer was along the lines of “turns out everyone has an exit point after all. Priorities and motivations change and I’ve given this company everything I have to give it. It’s time to explore what’s next”
I think about it a lot. And I’ve witnessed it in various ways numerous times since. People that were hacking on a side project with the idea of “I just want this to be a fun passive income stream” seeing the adoption and love their work gets suddenly thinking “oh my, there’s potential here I didn’t see before! I could build a whole team and company around this… let’s go raise investment!”.
I think we’re just really bad predictors of how achieving the things we want will impact us emotionally. When we reach those milestones we react either more positively or more negatively than we could have understood previously.
And yet you get people like Zuckerberg who'll stick around until the end. It's not like he cares about users or connecting people beyond them being a means to grow the company. Yet he saw through the company from its founding to the gigantic megacorp it is now, and it doesn't seem like he'll ever want to quit. Why didn't he quit? It's not like his users are any more appreciative and he's far from beloved by anybody.
…or they might have determined that they‘d rather spend their time on something else.
Keeping control is a (mostly time) commitment and liability. You have to stay on top of things and actively decide on issues that inadvertently come up.
> Either they can't because of shareholder/equity owners pressure, or they won't, because they really don't care and just said it for PR
That is assuming the worst in people. Have you ever wanted to move onto something new? If you make something cool, it is not your lifelong obligation to oversee it.
They probably had some nice checks written with their name in the "pay to the order of"... but they weren't the ones doing the selling, negotiating the price, or having any say in it more than any other shareholder (weighted by the shares and voting status) would.
What? How do you justify declaring a person as a whole "super toxic" with only a link to an interview citing a link to a blog post including "Things like… hyper-competition… an over focus on aggressive competition… things like zero sum thinking" when:
* the post is about interviewing which mostly is, for better or worse, a competitive zero-sum process.
* the post is more than a decade old (then, more like two now.)
* the post is written by someone else.
I have no opinion on the guy one way or the other, and he may in fact be toxic, but that's hardly compelling evidence.
Because despite claims to the contrary most of these sites/projects aren't created for altruistic reasons, they were created to make money (at some point). Cashing out is typically part of the long term plan.
In the case of Stack Overflow, I think the reason for the data dumps was two-fold: one of the original founders (who left long ago) came across as at least idealistic and wanting to do the right thing. The other was pragmatic and most likely always thinking about the money angle. However, the other founder likely also saw the value of the data dumps from a PR standpoint which was quite valuable as they were initially trying to replace expertsexchange.com that paywalled most of the content. IIRC, they discussed the data dumps in the early days of their podcast.
Now that there's big money to be made from machine learning (both the models and the data they are trained on), they've likely decided 'screw it' on the PR value of the data dumps and would rather get some of that sweet, sweet machine learning money.
The thing is, I sympathize with them not wanting machine learning companies to make money off the site’s content without any benefit to the contributors, moderators, or the site itself. I worry that gating access won’t really change that and just mean that the site owners also benefit at the expense of the community.
It is a well known situation. The best thing is that I don't think it was intentional, contrary to other well-known "offenders".
Experts Exchange was well known for showing up in search results but not providing the answers without paying. Many people hated it and wanted search engines to implement some sort of deny list to filter it out automatically.
So the idea is that in case leadership wants to 'carve out a kingdom' that is not in line with community wishes, the community could take the data dump and create a clone of sorts? And the last snapshot for doing so would now be the last data drop from March?
Assuming that the linked post is accurate and that the "approval from senior leadership" to turn the dump back on does not come...then yes, I would say so. Actually there is already Codidact, although if I recall correctly they explicitly ruled out importing SE data when they started up. https://codidact.org
There were a number of issues that led to the decision not to do a grab and seed of SE into codidact.
There was the "what license is that post actually under? Is it 2.5? 3.0? 4.0?" which made things difficult.
There was the "what are the actual attribution requirements that SE has for sites that use its content?" This is a bit of an issue because it's never really clear what those requirements are and what you need to do. It can also hurt SEO because it's duplicated content. Furthermore, codidact leadership had already had enough dealings with SE lawyers and likely wanted to avoid any others.
Lastly, there was the desire to make a philosophical break with SE. The codidact founders didn't want to have anything to do with SE.
Some sites are doing ok. Others stood up but didn't have sufficient involvement to keep them going.
Having an imported site that is mostly inactive with activity on that same content is even more disappointing than having a mostly empty site. And active mirroring is a time-consuming process that runs into rate limit issues with an API.
I haven't used codidact (sorry, name needs replacing), ok just poked around
* too slow
* needs type ahead find search
* needs a GIST experience
The site looks good, presentation is really clean. Lots to like about it. But the thing that replaces SO is going to have to be a step function in capabilities. That said, just fixing the weird descent into performative rule following and language-lawyering on SO might be that step function.
* "Tipping" or actually giving money to a question answerer would be cool
* Having a question asker being able to put a bounty on question would be cool
On the face of it, I am not getting scalability (in many senses) vibes from codidact.
Personally when the fork happened, I was interested in it... but I'm mostly "meh" about it since it's another Q&A style format that maintains all the advantages and disadvantages of the format that SO provides.
I really hoped that they would have gone for something much more radical in terms of trying to create a way to share knowledge.
It's Stack(Exchange|Overflow) with better governance and a different development cycle time and focus - and I appreciate that... but it's still that same Stack in terms of underlying format.
I would have been more interested if they went to something like how Discourse is different from forums. It's still a forum, but it took the structure of a forum in a new way that solves some of the traditional forum issues.
But getting that community there, and stable, and growing is a very hard problem. It's one of the things that Reddit and SO are fairly successful at doing because of the network effects and the corresponding exoduses from other services when they were growing.
There was a compelling story to tell of "why do you want to switch." There's a story now, but mastodon, Lemmy, codidact and similar haven't really stood out. It feels like "we're the same but better... if you ignore the performance of the site."
"Yea, I could switch from reading reddit to reading Lemmy... but it doesn't have a feed of 200 cat pictures I need each day for my daily dose of eye bleach."
The shtshow is the reason to switch, but there also needs to be new capabilities on the other side. As soon as they break old.reddit.com, I am out.
I feel like https://en.wikipedia.org/wiki/ActivityPub is 10x more complex than it needs to be. The SO replacement should be an application on top of an existing protocol.
We outlined some of our broad goals (intended differentiators) here: https://meta.codidact.com/posts/276296, in case that helps. Codidact is a work in progress. The biggest non-technical difference from SO is how we treat communities and their members: communities have a lot more autonomy, and we treat people decently. No stockholders are driving anti-community business decisions.
On the technical level, while Q&A is central, we also have other post types and other models. That post I linked to is an article in a blog that's part of our Meta community. The Electrical Engineering community has papers, so people can present information outside of the Q&A structure. Code Golf has a sandbox where people can get feedback on draft challenges before posting them. Software Development has a Code Review category. Some of our communities have added their own customizations to the code, like Code Golf's leaderboard for challenge answers. We want to work together with our communities to build what best serves their needs.
We've done some things that look small but might have larger effects. For example, the asker of a question can't mark one answer as "accepted" like on SO, but anybody can mark an answer as "works for me" -- or "outdated", or other annotations that communities can define. Scoring takes controversy into account, because +10/-5 and +5/-0 are very different even if they're both "net 5". With threaded comments, it doesn't matter so much if two people have an extended conversation; it's not in the way. Abilities are granted based on activity and reputation is just a number -- or can be turned off entirely if that's what a community wants. We're trying to make as much stuff configurable as we can, because we can't possibly know what's going to be best for every single community and don't have the hubris to claim we do.
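To make the "+10/-5 and +5/-0 are very different" point concrete, here is a minimal sketch of one well-known way to score posts so that disagreement counts against them: the Wilson score lower bound. This is an illustration only, not necessarily the formula Codidact uses; the function names and the z = 1.96 default are assumptions made for the example.

```python
import math


def net_score(up: int, down: int) -> int:
    """Plain net score: upvotes minus downvotes."""
    return up - down


def wilson_lower_bound(up: int, down: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the upvote fraction.

    Illustrative stand-in for a controversy-aware score (an assumption,
    not any particular site's formula). z = 1.96 corresponds to a 95%
    confidence level.
    """
    n = up + down
    if n == 0:
        return 0.0  # no votes yet, so no evidence either way
    p = up / n  # observed fraction of upvotes
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / (1 + z * z / n)


if __name__ == "__main__":
    for up, down in [(10, 5), (5, 0)]:
        print(up, down, net_score(up, down), round(wilson_lower_bound(up, down), 3))
```

Both posts have a net score of 5, but the unanimous +5/-0 post gets a Wilson lower bound of roughly 0.57 while the contested +10/-5 post gets roughly 0.42, so the two no longer tie.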
We have the usual bootstrapping problem of a new thing. Our communities are small and trying to grow. Because they're small, visitors don't see thousands of questions and high activity, so they don't participate either and wander away, making it harder to build activity. We would love to find people who want to work with us to build communities. We recognize that helping to build a community with us is going to be harder and slower than just asking your question on SO, but if everyone were happy with SO this thread wouldn't be here, so maybe we're an option to a few people reading this?
(I haven't posted much on Hacker News, so I hope I've read the room correctly and that this kind of comment is ok. If not, I apologize and would appreciate correction so I don't repeat mistakes. Thank you.)
A decentralized system will never work because 99% of users do not care at all; the centralized systems are easier to sign up for and use. It's been demonstrated over and over and over again.
Even if the underlying tech is decentralized, the community will settle around one or a few big instances (for example, Gmail and GitHub) which often end up having significant control over the trajectory of the entire ecosystem. If you run your own email server and you get put onto Google's spam list - you're fucked.
I don't know that I agree. I think most people don't care about decentralization but they do care about the effects it brings.
Email is a great example where most people wouldn't be interested in a version of email that only lets you email other @gmail.com users. Having an email address that can contact anyone, a phone number that can ring any other phone number etc instead of being locked into a single corporation network is a clear value add that people care about.
The main issue from my perspective is that we only have a select few large tech companies that operate as monopolies so are effectively able to block out new decentralized protocols from coming to be.
RCS messaging is a great example which I think most people would use over alternatives like WhatsApp and iMessage except that Apple refusing to support it locks a huge fraction of the market out and stops widespread adoption being possible.
I don't think it's a question of preference, or people being uninterested. It's just a boring and repeated story of corporate monopolies intentionally reducing consumer choice.
>Email is a great example where most people wouldn't be interested in a version of email that only lets you email other @gmail.com users. Having an email address that can contact anyone, a phone number that can ring any other phone number etc instead of being locked into a single corporation network is a clear value add that people care about.
That is only because those technologies predate those companies. Normal people don't care that you can't DM a Reddit user on Twitter or that your Instagram posts don't automatically show up on your Facebook page. People are generally fine with centralized corporate platforms as long as it isn't a restriction of a previously free technology and the network effect has done its thing to attract enough people to the platform.
Surely, normal people get exhausted and burned out with how many accounts they need to create on all the platforms, just to stay in touch with everyone in their circle of friends. It was already crazy for me in the 90s days of Instant Messaging, and pidgin was a Godsend to consolidate all those accounts and friends-list into one interface. Surely normal people hate on the sheer volume of apps they need to install on a phone with limited storage.
I know a few people who are on exactly one platform, and don't mind it, but most everyone seems to need tendrils all over the place to keep up with a normal Internet social life.
Now I have all those different (web or not) applications, that never support all my devices. Most applications steal a crazy amount of data and consume bandwidth as if it were free along the way, while assuming they are alone and can consume all the resources of the computer they are running on. And they all have different UI (dark) patterns, and bugs and this and that.
I guess I'm abnormal? The only way I contact any friends/family is text message or email. I've never used Facebook, Instagram, or whatever other platforms people use to "keep up with a normal Internet social life". Yet, I have a plethora of friends and don't feel like I have any problem keeping up with family. Maybe people don't really need these platforms.
I think things like being able to contact anyone are important to people, but decentralisation doesn't necessarily provide that (e.g. if I sign up on a Mastodon instance will I be able to see the messages of everyone on every Mastodon instance, and will they be able to see mine? Will I even know if somebody I care about can see my messages or not?)
I think decentralisation is not a selling point to most people. It's an implementation detail that they're happy to go along with but it's a negative if it makes the experience worse, makes everything more complicated, if they can't talk to the people they know IRL, etc.
> [M]ost people don't care about decentralization but they do care about the effects it brings.
I’ve tried pointing those out as bluntly as possible as an experiment. As in “well, surprise, locked-in crap with impenetrable failure modes locks you in and is impenetrable when it fails, you signed up for that”.
People didn’t appreciate it, as I expected, but they did seem to recognize the truth of it. That is, the response was along the lines of being forced to use the thing to communicate with some person or institution, not of liking it or thinking it’s not at fault.
I don’t know how one would use this to organize an IM revolt (Riot? sorry), but there does seem to be at least some fuel for it even among people who are not outright IT professionals.
> RCS messaging is a great example which I think most people would use over alternatives like WhatsApp and iMessage [...].
RCS might make a slight amount of sense in a culture that still uses SMS / texts in non-negligible amounts, but that’s basically North America and Japan AFAIU? And I prefer that territory shrink, not grow, as I’m very much not thrilled by the idea of handing back over detailed control over IM—even just billing, not content—to phone carriers. Whatever the country, they have extensive history proving they can and will screw you for decades unless you can leave, and it will take everybody leaving for them to stop.
I agree, but to nitpick one thing: RCS isn't properly decentralized. It's controlled by carriers in the GSMA and, with the current way the infrastructure has been deployed, Google. Interoperable on the app level, yes, but not a poster child for decentralization.
> I don't think it's a question of preference, or people being uninterested. It's just a boring and repeated story of corporate monopolies intentionally reducing consumer choice.
not really. nothing at all is stopping one from starting a new social network that's federated. the issue is users have no reason to move.
it's more a question of incentives, and there's basically none to use something that you're not already using unless it's better, heavily advertised or you're simply paid to.
>If you run your own email server and you get put onto Google's spam list - you're fucked.
It's even worse than that- I ran my own email server, and for some reason gmail delayed any emails from their system to outside of their system. That meant that people would send me an email but I wouldn't get it for 20 minutes. These delays don't exist when using big email providers (it stopped being a problem when I switched to Fastmail, for example) but if you're running a small server Google makes it a nightmare.
Sounds like you need to allowlist Google from your greylist. They have a long retry on their side. Once you tell a server to 'go away, come back later' it is really up to the server to decide when or if to retry. Additionally, if they use multiple sending IPs, you can end up greylisting again and again before they try back with a good IP.
You'd either need to allowlist the big providers' sending blocks or just drop greylisting altogether.
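For anyone who hasn't run into greylisting, here is a rough sketch of the mechanism described above and of why multiple sending IPs hurt. It is a toy model, not the behavior of any particular mail server: the `Greylist` class, the 300-second delay, and the documentation-range addresses (192.0.2.0/24) are all made up for the example, and the clock is passed in explicitly to keep it deterministic.

```python
import ipaddress


class Greylist:
    """Toy greylist keyed on the (sending IP, sender, recipient) triplet."""

    def __init__(self, min_delay=300, allow_networks=()):
        self.min_delay = min_delay  # seconds a new triplet must wait
        # Networks that bypass greylisting entirely (hypothetical values).
        self.allow = [ipaddress.ip_network(n) for n in allow_networks]
        self.first_seen = {}  # triplet -> time of first delivery attempt

    def check(self, ip, sender, recipient, now):
        """Return "accept" or "defer" (a temporary 4xx rejection)."""
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in self.allow):
            return "accept"  # allowlisted sender network
        key = (ip, sender, recipient)
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "accept"  # same triplet retried after the delay
        return "defer"  # first sighting, or retried too soon


if __name__ == "__main__":
    strict = Greylist()
    msg = ("alice@example.com", "me@example.org")
    print(strict.check("192.0.2.10", *msg, now=0))    # first attempt
    print(strict.check("192.0.2.11", *msg, now=600))  # retry from another IP
    relaxed = Greylist(allow_networks=["192.0.2.0/24"])
    print(relaxed.check("192.0.2.11", *msg, now=0))   # allowlisted block
```

In this model a retry from a different IP looks like a brand new triplet, so it gets deferred again even though enough time has passed. Allowlisting the provider's whole sending block sidesteps that, which is the fix suggested above.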
I have run my own email server off a Linode for the last decade-plus, and I have never encountered this. Most of the people with whom I correspond (I run my own business from my server) are on Gmail, and I have always received their emails instantly. If you were getting emails only 20 minutes later, I wonder if there was some server misconfiguration on your end, e.g. sending messages into graylisting delays.
I disagree, it worked before and it is the reason the internet even exists.
The core issue is that user generated data is owned by one individual company. There are existing systems that don't have this issue, e.g. Usenet or bittorrent.
We don't need to idiot proof the web. There are enough people to gather some place for a social network even if it's hard to use. The others can and will stay on reddit anyway until the day they have also had enough and learn to use some alternative.
The "value" of Reddit as a website is vastly overrated anyway. There's nothing on Reddit that can't be obtained elsewhere, folks just get stuck in patterns that are familiar and presume it's because options are limited.
The world will continue to spin without Reddit, or if Reddit isn't popular anymore, or if Reddit kicks all of its current users out, and so on.
Reddit’s value these days is as a super-forum. Instead of needing five logins and five accounts for separate hobbies, you can have one login, one account, and if you occasionally want to comment or ask questions about something else, you can do so without having to create another account. Doesn’t hurt that Reddit has a very good network effect.
That's how you use it, but I'd argue that 90% of Reddit's users don't operate the same way, considering they don't even have accounts to begin with.
In fact, I would even argue that Reddit is actively trying to push users who use Reddit this way off of the platform, as they're not as easily monetized. Reddit is notorious for having low conversion rates on its ads and extremely high ad blocking rates.
I'm definitely a push-away user. I'll never pay them money to use their site and I rarely comment in the ten years or so I've had an account. Same goes for HN; I would never pay money to use this site - I would just find a new site to frequent. I understand this is unfortunate for the website operator but I consider operating a web server at this point a labor of love, not a business model.
IMO the internet post-2010ish is inferior to the one before in theory. Early creators were thoughtful, they created protocols foreseeing a lot of these problems. I'm not sure what's gonna happen next but the parallel universe I'd like to be in is one where the internet of the last 15 or so years was an anomaly, or a curve that rises quickly then dies off.
Decentralized software is not the only alternative. A non-profit site would be much better than a publicly owned one. Further, it could be operated by a co-op and democratically run, with its own “laws”.
> A non-profit site would be much better than a publicly owned one. Further, it could be operated by a co-op and democratically run, with its own “laws”.
If it's better why isn't that how these sites are run? Wikipedia for example is an anomaly.
They're hard to capitalize. You need a skilled founder willing to forego profits, or a group of well intentioned devs with management experience, to organize the business side of things. You have to figure out non investment sources of income (charge for services, donations, etc. maybe no ads or at least no ad networks if you're worried about privacy or ethics). It's a lot of work, especially trying to scale that up to where you can actually pay employees salaries and benefits.
It's one thing to have an open source model for software that doesn't necessarily need that level of official organization, but once you start getting into backend infra costs, you can't run that on volunteer labor alone, you eventually have to fundraise for infra costs. If only there were a public or community driven internet... it's too bad the peer to peer models (federation, torrents, etc.) don't work well for real time global communications. Centralized messages are much easier to read and write to.
That doesn't mean such a model can't work or isn't good, it's just much harder, and all for no profit motive.
Realistically I could see a bunch of ex FANGers pooling income into a worker owned coop and starting a tech collective, then maybe handing it over to a 501c3 at some point or just keeping it going as a private company without outside shareholders. But someone has to organize all that, and it's not the traditional strong suit of devs.
Fingers crossed though. Would love to see something like that happen.
The difficulty now is that you'll have to compete against Stack Overflow, whereas Stack Overflow had to compete with the paywalled Experts Exchange. Both for-profit and non-profit alternatives will have a very hard time: unless Stack Overflow really goes off the rails Quora-style, it will probably stick around.
Also I think people underestimate how much effort goes in to moderating Stack Overflow, and how delicate the entire system is. There's already a whole bunch of Open Source Q&A software out there and I'm sure one of them will work fine, that's not really the hard part, and managing servers is also "just" a matter of spending time. Moderating and managing it all is much harder and more time-consuming; there are people whose ability to hold down a job or finish their homework quite literally depends on being able to ask questions on Stack Overflow: there's a lot of incentive to abuse the crap out of it, more so than many other sites.
SO also had pretty serious influencers in the tech community to bootstrap everything.
However I’d guess this is existential for them. I’d guess an AI is checking and generating a huge percentage of homework (and work) and that’ll only increase. The place to attack SO is through the chat / code gen UI and git repos. New languages and frameworks are probably on GitHub in their most hardened state approximately as quickly as snippets land on SO, and ChatGPT’s upvote is a good indicator of quality sans SO.
"It's been demonstrated over and over and over again."
Well, yes. But it's also been demonstrated over and over again the risks of centralized sites. Maybe, just maybe, one of these days that lesson will stick and communities will take a longer term view. It seems like the cycle is happening a bit faster each time now, so maybe folks will get tired of the "damn, time to move to another site, again..." thing.
Or not. Convenience tends to trump lots of other considerations most of the time.
Between gmail and your own email server, there are thousands of medium size email providers that work ok, so email is a bad example, decentralization still works there. As for GitHub, the coders of all people should have known better than to pile into one site and sell their s̶o̶u̶l̶s̶ code to the devil for a little bit of convenience.
It's worse. Even in case users did care about decentralization, such solutions are less reliable, not more. Subjective moderation, rule changes, rug pulls.
>when all they did is provide the platform, and store the data.
Is there not significant innovation and benefit that was designed and implemented in the first place that caused users to contribute their time, thought and energy?
I think the real problem here is when organizations that rely on a crowd-sourced business model decide they just have to be billionaires or solve all the world's problems with their platforms, instead of just staying true to their model. I don't see what's wrong with just running a highly successful business that makes money for its founders and doesn't have to go out and strive every day to be the next Facebook or Google.
Make no mistake. Platforms like Reddit and Stackoverflow are real, serious businesses. But why can't they exist and be a general successful business like your local mom and pop restaurant or toy store or whatever?
I run RadioReference.com and Broadcastify, both which are significant businesses but also rely almost solely on crowd sourced data and content. We're wildly successful - but I've never seen the need to hire 3,000 people, or IPO, or do series raises to expand into solving world peace. Our premium subscription pricing has been the same for 15 years. I completely eliminated advertising on one of the platforms last year. We make a lot of money. We provide a lot of value to our communities, and we carefully innovate and expand to provide value. It's a nice happy life for everyone involved, and I don't have to deal with a VC who will be determined to either make a trillion dollars or torpedo my business.
The core problem making it so difficult for this to ever actually happen is that it is 2023 and I guess you only just today somehow came to this realization as if it were new or unexpected or not something people had been saying for the past 25 years of us watching these online platforms abuse their positions of power and slowly turn the screws on people.
Over the past quarter of a century of people trying to create online walled gardens of hosted content we've seen this happen over and over again, and the examples are so numerous that reddit was itself a replacement for Digg and StackOverflow for Experts Exchange. And yet, somehow, today, you suddenly woke up :(.
The reality is that we live in a dystopian Eternal September where as people finally notice what is going on and leave they are just replaced by new people who don't care or simply didn't use the prior service and are attracted to the new shiny, and another 25 years from now you're going to see people making the same unapologetic "I now realize" statements.
What we need to do is figure out how to actually replicate the feeling you are having in a way that doesn't require you to have spent years on a platform and then watching it die so it can be communicated to people before they bother to use a new platform, and in a way that somehow makes them willing to collectively not experience viral lock-in.
(And we also need to figure out how to make people willing to accept doing that at some cost to themselves, whatever form that might take: people on HN continuously do the thing where they give up freedom for a little temporary convenience and then get angry at others for daring to suggest that something a bit harder to use or with any extra friction would ever be a sane thing for anyone to use :/.)
Back in 2017 I gave a talk at Mozilla Privacy Lab called "That's How You Get Dystopia" where I just documented a ton of examples of abuse of centralized power and the reality is that every few days I just come across more stuff to add to the list... and this talk doesn't even bother with all the numerous services that simply enshittified or shuttered.
I was saying to someone yesterday that "enshittification" is a sub-optimal coinage for something that really shouldn't need a new term, and which focuses attention on symptoms rather than root causes. If you give someone a power of attorney over your assets, you'll likely find that they start behaving less well towards you. Or if you give up agency, others will treat you like less of an agent. But what matters is not their behavior at the end but your decisions before that point.
> all they did is provide the platform, and store the data.
You're seriously underestimating the effort it took to build that platform and how much effort it continues to take to keep it running well. I'm not talking about technical challenges, but social ones. It took a long time for them to get the system and incentives right (and it's still not quite right, IMHO), and it takes continued effort to keep it running well in the form of moderation and stopping abuse (and here it also doesn't quite get things right).
I could bang out a "BufferUnderrun.com" in a few months; many people could. But that's not the hard part.
Once upon a time, most people who wanted or had something to say wrote their own little website and hosted it themselves (be it in a datacenter or a server in their closet). Some even ran forums and got fancy with server-side magic because that's what nerds do. Even the kids who couldn't afford anything had free, basic hosting services to choose from (anyone remember those days?).
The internet was designed as a distributed network and the denizens then were distributed. You only got as centralized as a given ISP or datacenter provider.
Of course, we all know as more and more commoners came onto the internet they didn't want to bother with developing or hosting or maintaining a website or anything. They just wanted to shitpost, for free, with blackjack and hookers.
And so "free" services like Reddit, Facebook, et al. came about to serve that demand. Information became centralized, because who the fuck has time to be responsible? Offload that crap!
The cost of that offloading of responsibility has now come knocking with debt collectors in tow, with interest.
I guess what I'm trying to say is: We don't need to rethink anything. We just need to take some god damn responsibility for ourselves. Responsibility is power, and with power you can tell commercial interests you disagree with to screw off.
The problem is information hoarding. If you imagine going to a pub meeting people regularly, and the pub records everything you ever said, and then one day the pub owner says they're going to charge you for the recordings, you'd laugh. Nobody would pay. If they tried to charge you to enter, you'd go to another pub, you wouldn't lament the "loss of your culture".
In fact, people don't record what they talk about in pubs because the point is the chat experience not the records of previous chats. Data isn't oil and it isn't quite sewage, it's more like quicksand or thickets of weeds growing and tangling around your feet. Like minimalists say 'stuff is bad' but stuff is useful, it's having stuff hidden in cupboards and drawers and a garage full of stuff and wanting a bigger house to hold more stuff and most of the stuff going unused because you can't bring yourself to let go of it, and companies advertising that more and newer stuff will make your life better and solve your problems, which is the biggest problem with stuff. Sufficientism might be a more appropriate name - enough stuff to make your life better and no more.
Enough chat to make your life better isn't "all of it kept forever".
Really? The takeaway isn’t rather that rent-seeking AI models need to figure out a way to reimburse companies and communities who’ve stored up all this capital?
Seems to me SO built and delivered huge, huge amounts of value and it’s now all at risk because multibillion dollar companies are free riding.
Users on SO created value and freely shared it with a community in expectation that the value they created would be freely and collectively shared with everyone. In SO's case this expectation was explicit; the data backup and API was billed as a deliberate choice designed to give users the freedom to migrate and scrape data in case the company went "evil." It was designed specifically to reduce SO's ownership claim over user-generated content.
It's not that SO has a moral right to control and profit from that content. The reality is that SO holding that content at all is a conditionally granted privilege that the community affords the site, and it is a privilege that was always designed to be revocable and the data moveable if SO started abusing its position of power as a host and trying to lock down access.
Some writing/content sites have taken steps to restrict AI access based specifically on community request. That's a very different situation; if a community (particularly a closed or close-knit community) is collectively and (mostly) uniformly trying to avoid an AI scraping the content that they created, then good for them. There are communities online that are in that position. But "how will the company get reimbursed for our valuable asset" should not be part of that conversation. And SO in particular was set up around norms that deliberately allowed this kind of scraping. It's not their asset to protect.
> rent-seeking AI models
I have issues with modern AI economic models too but I don't think that "rent-seeking" is an accurate term to use. A better word would probably be "parasitic"; I understand (and somewhat agree with) the argument that OpenAI is looking to repackage information it didn't create in a way that redirects attention away from the original source of information.
But I'm having a really hard time figuring out how OpenAI is hoarding a scarce asset to extract value by controlling access to that asset. The more obvious rent seeking behavior here is coming from SO, a company trying to restrict access to Creative Commons licensed content created for free by unpaid volunteers, and trying to reclassify that content as their corporate property.
I guess being as charitable as possible, I do worry about the SaaS model of many AIs that are dedicated to content generation, and I worry a little bit about AI models becoming heavily integrated into creative processes and then extracting a kind of monetary "creative tax" from artists/creators while heavily restricting what they are allowed to make. That's at least adjacent to rent seeking, but I'm still not sure it's the term I would use and I'm not convinced it's a scenario that's applicable here.
Good point that rent-seeking is maybe not the correct term now, but it looks increasingly like services will have to lock down content or shut down due to AI models frontrunning them with their own content. In that world, the AI models are in a great rent-seeking position (i.e. only they have the [old] content which was broadly available and now is not, due to their own incentive distortion).
In any case I buy your argument with regard to SO stewardship of this data and certainly my intuitions were that the major contributors are not super thrilled about their content being digested by models and spit out with no attribution, but that is absolutely an assumption on my part.
Would be interested to see a poll of those users on this question!
I do think if we were having this conversation about an explicitly community-owned forum or fanfic hosting service -- ie, a scenario where it's obvious that the community is behind the decision -- my reaction would likely be very different. I'm broadly pretty sympathetic to a forum saying, "we're doing this for us, not for a VC firm."
SO in specific though is an interesting site in that the value proposition of the site was very heavily based on this information being freely available and uncontrolled. I think they're in a position where it's much less appropriate for the site owners to try and clamp down on AI scraping.
If there is a strong movement from the SO community to change that, I'm not aware of it, but who knows, maybe I'm out of the loop.
Off the top of my head, another example of the distinction I'm getting at would be something like Wikipedia -- if the Wikipedia owners started trying to outright block site backups my immediate response would be, "well wait a second, that was not the deal we all made around this site, we signed up to help the Wikimedia Foundation build an Open encyclopedia, even if that means it gets pulled into an AI dataset. We specifically didn't want the Wikimedia Foundation to have the power to decide what usage of this data they would allow or deny."
> but it looks increasingly like services will have to lock down content or shut down due to AI models frontrunning them with their own content.
This feels like it's slightly off to me.
An LLM that was trained on job postings to be able to categorize them isn't trying to do job postings ( https://wfhmap.com/algorithm/ ) but rather be able to do meaningful classification of bulk unstructured data.
An LLM trained on reddit is... weird to talk to, but talking to it doesn't replace asking a proper subreddit with people answering and comments back and forth. Is ChatGPT stealing views from people complaining about their job in /r/antiwork? Going to something in /r/news and sort by controversial and getting some popcorn turns out to be much more interesting than ChatGPT ever will be.
Maybe you can say that ChatGPT with some training on Stack Exchange sites has some utility (and that the content being classified, tagged, and given feedback makes it even more useful), but GitHub Copilot was trained on just GitHub stuff and it's better at code than pretending "try {some broken code} hope that helps" is going to be useful for an LLM.
To me, this feels much more like CEOs that are having difficulty with existing monetization attempting to lock up the data that they have under a questionable pretext to monetize that to companies looking to train models for other things.
The sorting out of what the rights are on the output of models is something that needs to be sorted out - probably by the courts. I am still of the opinion that if something that might be copyrighted is used from any source, then the person doing the copying (who has agency) needs to do a license check themselves on it. I know that there is GPL code on Stack Overflow that looks like it's licensed under CC 4.0 and if you copied the SO answer and put it in a BSD licensed repository, you'd be in violation of the GPL - and that's without touching any LLM.
There are also lots of non copyright things that the data could be used for. I'd like to make an AI-CATegorizer. Train it on a representative number of images from each of the reddit cat subs so that someone can ask it "here's a picture of my cat, what subs can this be posted to" and get back "/r/airplaneears /r/blackcats /r/stealthbombers" - and that's not something that is potentially generating copyrighted content (though it inherently uses it)... pretending that those images were under a CC license, would it need to attribute all of the images that were part of the training data set to respond back with those three subreddits?
There are a couple of different motivations a company could have around blocking API access to prevent AI scraping:
A) scraping itself is too expensive. I suspect that's probably not the case with SO because they blocked backup. Downloading the database from the Internet Archive doesn't cost SO any money.
B) the AI is going to replace the original creators (or more likely, devalue their work and push wages lower) and they'd like to prevent that negative social consequence. This is the charitable interpretation, and I understand writers/programmers/artists being concerned about it, even if I'm slightly more cynical myself about how AI content generation is going to work out once the "shine" has worn off. Note that I'm not saying that this concern is necessarily right or that there aren't positive uses of AI that have nothing to do with replacing jobs; just that it's a concern that a site/community could reasonably have.
C) companies are realizing that there's a lot of VC money in AI right now, and they would very much like to be in the business of selling shovels, and their feeling is that if anyone anywhere is making money off of "their" content then they are morally deserving of some kind of cut no matter what. This is obviously the case for some companies, but is (charitably) probably not the case for all of them.
One test we could use to try and distinguish between B and C is -- if a company is blocking API access, are they then turning around and licensing that data or opening up paid API access, and if they are, is any of that money going to the users that made the content? If SO turns around and makes API access paid and continues to not pay any of the volunteers writing answers, at that point it's much easier to argue that they're trying to sell shovels, not trying to protect users.
This is also part of why I take a cynical view of what Reddit is doing with its API (although Reddit claims they're in camp A more than B). Reddit is probably not doing this to protect its users from theoretical AI displacement because it's still planning to license the data. It's just pricing it so high that only giant companies would be able to afford it.
> Stack Overflow senior leadership is working on a strategy to protect Stack Overflow data from being misused by companies building LLMs. While working on this strategy, we decided to stop the dump until we could put guardrails in place.
> We are working on setting up the infrastructure to do this correctly in the age of LLMs --- where we continue to be open and share the data with our developer community but work to set up a formal framework for large AI companies that want to leverage the data.
> We are looking for ways to gate access to the Dump, APIs, and SEDE, that will allow individuals access to the data while preventing misuse by organizations looking to profit from the work of our community. We are working to design and implement appropriate safeguards and still sorting out the details and timelines. We will provide regular updates on our progress to this group.
---
On reddit, for the A/C test... yea, I'm going to be cynical there that they're looking to sell the information (and it isn't so much trying to protect users). But also that 3rd party clients are not showing ads and may be poorly behaved when provided a free API with what was (at one time) generous rate limiting.
> First, I'd like to say that the intent of what Prashanth is saying is very simple: to return value to the community for the work that you have put in. The money that we raise from charging these huge companies that have billions of dollars on their balance sheet will be used for projects that directly benefit the community.
This is worded very specifically. Is SO planning to give money to users? They don't say anything like that; instead they say that they'll be "spending that money on the platform."
Well what does that actually mean? Every feature that SO builds could be characterized as "for the benefit of the community." It's hard not to read that response as just another way of saying "we're going to profit from this as a company, but don't worry because we use our profits to fund product development."
Heck, Reddit could make exactly the same claim, and in fact the linked Wired article actually makes that comparison:
> "Community platforms that fuel LLMs absolutely should be compensated for their contributions so that companies like us can reinvest back into our communities to continue to make them thrive," Stack Overflow’s Chandrasekar says. "We're very supportive of Reddit’s approach."
It seems more likely that there, too, it was the users doing the work, not SO.
The primary value on SO is generated by the users, and thus the value proposition for enticing new users is also generated by the users. SO is just a forum.
I see where you're coming from calling out AI data miners for rent-seeking, but most social media platforms are also engaging in rent-seeking behavior.
> The thought and knowledge of communities and users need to belong to those communities and users.
I would say: Need to be a public resource, belonging to no-one, i.e. no person, or group, or company should have legitimacy in denying access to it. They should all be considered _trustees_ of such a resource.
> when all they did is provide the platform, and store the data.
To be fair, SE Inc. did a lot more than provide the platform. A lot of development and design work, publication, a bit of the curation work, etc. I don't like how they behave but let's give them what they're due.
I believe we'll have more of these oh s** moments soon when people will finally realise why we need web3. Yes, the whole space was full of scammers and charlatans, but the point of the technology was to create a substrate for networks on the internet.
The idea that these networks and communities need to run on centralised servers is archaic. The technology exists where people should be able to own their own network (followers, subs, following, posts).
Let's be honest though, the primary thing that attracted most powerful people to web3 wasn't decentralization, it was reintroduction of artificial scarcity into digital spaces. Web3 billed itself as empowering users, but it always had an undercurrent of commodification and gatekeeping.
And that's exactly why (ignoring the scammers or pump-and-dump businesses) it saw such heavy investment from VC/tech types. The promise they were interested in wasn't democratization even if that's what they told their users -- what they were interested in was taking a plentiful resource (digital bits) and building a scarce asset that they could use to further entrench exclusivity, status, and monopolistic control over what that asset represented.
Read back over every sales pitch for web3 games. At some point they always devolve into talking about how ordinary users will be able to rent seek: to "license" characters/weapons/gear and passively earn income from other players, or to hoard exclusive tokens/releases in the game and speculate on their future value. Web3 looked at infinite digital spaces and its response was, "infinity is a problem that we need to solve." And it's revealing to look at most web3 branded metaverse attempts and see just how quickly they reintroduced real-world concepts like housing/space scarcity (why on earth would we want a housing market in a digital space with no physical constraints?), and how quickly they leaned into cosmetics and customization as a monetization strategy rather than a user right to free expression.
In general, if a technological "paradigm" is primarily associated with and primarily popular with VC firms, it's probably not being developed with the user in mind.
On the other hand: federation, interoperability, mobile identities, and legal efforts to build a right to data export existed independently of web3 and have shown a lot more promise when it comes to actually increasing user agency.
Again I'm in agreement with everything you are saying here including the part with "other efforts".
I just think we shouldn't throw the baby out with the bathwater. Just because the space got ravaged by zero interest rates, VCs, scammers, charlatans, snake oil salesmen and even worse, it doesn't mean the technology and the premise were wrong.
Having a ledger that is secured by decentralised consensus is not only useful but will be a necessity for the digital first future we are heading towards.
We are reaching the limits of the current paradigm. We see companies like Meta having to be the arbiter of consensus. We've seen platforms showing their true faces and commoditising people's data and networks, which weren't really theirs to begin with.
> Having a ledger that is secured by decentralised consensus is not only useful but will be a necessity for the digital first future we are heading towards.
I think I would disagree with this specifically; at least I would disagree that blockchain ledgers are necessary or helpful in most situations.
I'm not going to say that there are no applications that are useful that would rely on a blockchain, but they're extremely limited and the technology itself seemed to be mostly useful for and mostly oriented towards turning plentiful resources into scarce ones.
It's of course bad for Meta to be the arbiter of consensus. It doesn't necessarily follow that distributed (in some ways still very centralized, at least on a conceptual level) ledgers are a good alternative, particularly now that we're seeing that basing consensus around eventual consistency, fragmentation, etc... seems to be at least more promising than the alternatives, even if it isn't perfect.
I'm very much in support of efforts like 3rd Room, Fediverse, Matrix -- seeing their success has given me very little reason to believe that distributed ledgers would be necessary, and has given me some reason to believe that in many instances they would be actively harmful. 3rd Room for example would (imo) absolutely be a worse project if it was economically/socially built around a distributed ledger. Not only is there no need for 3rd Room to have that kind of coordination around state in VR rooms, it would be actually harmful for 3rd Room to require all of its VR rooms to use a shared ledger/state. Consensus (distributed or not) isn't necessary for what they're trying to build.
web3 failed because it was a rebranding exercise by cryptocurrency holders trying to create new demand for their random numbers.
Centralized servers aren’t archaic, they’re a natural outcome of how social systems work: finding communities is hard; people want to contribute their ideas, not play sysadmin; spammers and AI researchers will create enormous costs for you; etc. If you federate, you will spend more time dealing with those issues than a single focused competitor and you are unlikely to see free contributions which outweigh those costs.
Everything you mentioned is available now on Mastodon, and it’s really interesting to see how that works. Some people love having a small network of their friends, but a lot of people have trouble finding people they want to follow. Instances can have their own rules but dealing with abuse is now a multiparty process and since a lot of instances are run by volunteers that can be slow, unreliable, and inconsistent. Some small servers get hammered by storage and bandwidth demand but there’s no great path to monetization unless you have a ton of users willing to pay more than most people are used to paying for internet services.
In general, these are social problems and there is only so much technology can do to improve them.
I tend to agree, but where we diverge is in the thinking that these problems cannot be solved with technology; I believe the opposite.
The point of web3 is to abstract away things like sysadmin by commoditising consensus. Once a blockchain gains mass & momentum it opens up a whole new world of possibilities to hack/reinvent social media.
You could use multiple different types of social media and still maintain a single identity (auth). This means you could find friends and friendlies everywhere you go.
The key is the consensus layer and the ability to store and read critical metadata. I'll give a personal example:
I've been helping a friend to create a combination of Netflix and DVDs. We package the movie licenses into NFTs (MovieKeys) so when the user signs in with their wallet they can stream all the movies they own.
There are so many possibilities with this but let's focus on the social part. In theory, a social media service could scan their users' wallets for MovieKeys and create a social graph based on that. Heck, you could create entire forums just out of the people who own a certain MovieKey. I won't go further because we've got a lot of things cooking atm.
The general point is, the technology and the UX to make these things possible are just an arm's length away. The entire space got ravaged by scam artists instead of trying to build real magical experiences people actually want.
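To make the wallet-scanning idea a bit more concrete, here is a minimal sketch of building a social graph from shared keys. It assumes the service has already read each user's wallet into a plain mapping of user to key IDs; the function name, user names, and key IDs are all hypothetical, and no real wallet or blockchain API is involved.

```python
from __future__ import annotations

from collections import defaultdict
from itertools import combinations


def build_social_graph(wallets: dict[str, set[str]]) -> dict[str, set[str]]:
    """Link every pair of users who hold at least one key in common.

    `wallets` is a hypothetical, already-fetched mapping of
    user name -> set of key IDs found in that user's wallet.
    """
    # Invert the mapping: key ID -> users who hold that key.
    holders: dict[str, set[str]] = defaultdict(set)
    for user, keys in wallets.items():
        for key in keys:
            holders[key].add(user)

    # Every user gets a node, even if they share no keys with anyone.
    graph: dict[str, set[str]] = {user: set() for user in wallets}
    for users in holders.values():
        for a, b in combinations(sorted(users), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Illustrative data only.
example_wallets = {
    "alice": {"movie-1", "movie-2"},
    "bob": {"movie-2"},
    "carol": {"movie-1"},
    "dave": set(),
}
example_graph = build_social_graph(example_wallets)
```

The `holders` mapping is also what a per-key forum would need: everyone listed under one key ID is a member of that key's forum.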
> I've been helping a friend to create a combination of Netflix and DVDs. We package the movie licenses into NFTs (MovieKeys) so when the user signs in with their wallet they can stream all the movies they own.
How do you plan to deal with piracy? I can't see many rights-holders going for a system which doesn't allow them to revoke keys.
This is where the Netflix side of the equation comes in. You can sell or gift your MovieKey like a DVD but at the end of the day it's a license to stream on our platform.
All content is DRM protected and we'll ban your account from the platform if you pirate or clone your keys ad infinitum.
I'm personally not a fan of this solution but the technology is just not here yet to do this without a centralised control system.
So I get where my friend is coming from, no artist will want to have anything to do with us if people can just pirate the stuff or clone their keys ad infinitum.
> I believe we'll have more of these oh s* moments soon when people will finally realise why we need web3.
Well no, what we need is web0, the original premise of the Internet.
Every protocol was documented in open RFCs, everything is decentralized and everyone is free to use any client and server (or write their own) and everything interoperates. Nobody can own it, there's no "it" to own. That's the only solution to eliminate the otherwise neverending cycle of proprietary platforms followed by their inevitable "oh s* moment".
The world we live in today is very different from the world web0 was created for; it's older than me.
Don't get me wrong, standardised protocols play a very important role in the current world and they will play a bigger role in the future of the internet (Scuttlebutt, IPFS, Matrix..).
It's just not enough, we need a decentralised way to provide consensus on the internet. People won't set up their own servers, and companies that provide these services will always look like a Pareto distribution (FAANG or MAMAA).
In other words, FAANG, or whatever the next incarnation of these companies is, will always be incentivised to get rid of interoperability. Selfish reasons aside, after a certain point interop will directly stand in the way of providing better user experiences.
This is why web3 is such an elegant system. It provides a substrate that directly incentivises interoperability. Auth and payments are taken care of, the only thing remaining is custom features but that gets rapidly commoditised, then the only frontier remaining is interoperability.
A good example of this is the NFT marketplaces: Auth is just connecting your wallet. Payments are taken care of. Then you build cool features but everybody else copies you and you copy them so that's a stalemate. Then you have to be interoperable, like OpenSea going multi chain or Magic Eden supporting Eth.
The key here is, the moment someone else supports interoperability, if you don't, you are put at a large disadvantage. The same kind of dynamics will happen to decentralised social media platforms.
This problem is inherent to client/server software, and there are really only three ways to do it:
1. The server side of client/server is centralized and run by corporations
2. The server side is decentralized, meaning everyone has their own server
3. Abandon the server, clients connect directly to each other without a server intermediating
Option 3 would be ideal, but would require significant technological advances - it'll be a looong time before bandwidth is cheap enough that Kim Kardashian can serve photos and movies to all of her fans direct from her phone. Option 1 is what we have now, and is terrible in a variety of ways.
Option 2 would be hard but is not obviously impossible, so still our best bet - sure, it's not viable now, but it sure seems like it could be, if an iPhone's worth of R&D were put into it. I would honestly be amazed if no one at Amazon is working on such a thing, since no one would benefit more than AWS from a future in which a cloud VM becomes one of the things that most middle-class families rent monthly.
Content-addressing together with P2P and extra paid relays for those who really need it. In terms of "superstars" sharing content, if they share their image which is content-addressed and can be fetched from anyone, it's enough that one peer shares it with three others for it to be reliable enough in practice. Content like that is also usually just relevant because of recency, so large swaths of people try to access it within 24h, after that the news cycle already moved on so won't be fetched much after that.
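A minimal sketch of what content-addressing buys you here, assuming SHA-256 as the address function and an in-memory `Peer` class that is purely hypothetical (real systems such as IPFS use their own ID formats and network protocols). Because the address is the hash of the bytes, a blob can be fetched from anyone who has it and verified locally, so the original publisher does not have to serve every request.

```python
from __future__ import annotations

import hashlib


def content_id(data: bytes) -> str:
    """Address of a blob: the hex SHA-256 digest of its bytes (an assumed scheme)."""
    return hashlib.sha256(data).hexdigest()


class Peer:
    """Hypothetical peer holding a local cache of blobs keyed by content ID."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blobs: dict[str, bytes] = {}

    def publish(self, data: bytes) -> str:
        """Store a blob locally and return the ID others can request it by."""
        cid = content_id(data)
        self.blobs[cid] = data
        return cid

    def fetch(self, cid: str, peers: list[Peer]) -> bytes:
        """Return the blob for `cid`, asking other peers if it is not cached."""
        if cid in self.blobs:
            return self.blobs[cid]
        for peer in peers:
            data = peer.blobs.get(cid)
            # Verify before trusting: the address doubles as the checksum.
            if data is not None and content_id(data) == cid:
                self.blobs[cid] = data  # cache it so this peer can serve it too
                return data
        raise KeyError(f"no reachable peer has valid content for {cid}")


# The publisher shares once; later fetches can be served by other fans.
star, fan_a, fan_b = Peer("star"), Peer("fan_a"), Peer("fan_b")
photo_id = star.publish(b"photo bytes")
fan_a.fetch(photo_id, [star])
fan_b.fetch(photo_id, [fan_a])  # never touches the original publisher
```

Every fetch adds another copy to the network, which is why a short burst of interest tends to make popular content easier to fetch, not harder.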
> We need to revert to small, focused forums, with less anonymous, more persistent communication, run by people we trust.
You're onto something. Team-BHP [1] is run exactly like this, and it seems to be working.
For those wondering, it's a car-enthusiast website based in India. They've been around for 18-odd years, I think.
The moderators all have actual dayjobs.
When signing up you have to write a paragraph about why you're really a petrolhead (or dieselhead because Indians love European turbo-diesels :) ), and there's a human on the other end vetting your sign-up application! Plenty, including me, have been rejected at least once. I got in on my 2nd attempt years later.
As a matter of principle they refuse to do car advertisements.
I don't know how well the site is engineered but it works. Check it out. But I suspect most non-Indians (such as most people on HN) wouldn't find it that useful as it's mostly about the Indian car scene.
I'm more concerned for authors of published works.
Imagine writing a text book with a royalty publishing deal. Your publisher decides they're going to use your book, amongst others, to train an LLM that can answer questions on your subject, and they're not going to pay you anything.
It's a legal gray area and they've got teams of lawyers whereas you do not.
> realize a major risk with the current structure of online communication
I wish this lesson could be learned once and for all.
A long-lived community/repository cannot be built on a proprietary platform owned by some corporation. Full stop, no exception. It can't be done. A corporation will at some point need to maximize profit extraction which will ruin it for everyone. A corporation also won't support a platform forever nor can the entity itself survive forever. A single point of failure can't last forever.
> We need to decentralize our communications
Look at the solutions which have lasted longest. Email & mailing lists, going strong since the 1970s. Completely decentralized, interoperability defined by open standard protocols, anyone can build interoperable clients and servers. Nobody owns it. There's no "it" to own. That is what's needed for long term viability.
I'd argue it's not just forums, but other key parts of the internet. Like Microsoft training its Copilot AI on GitHub code, but not following the licensing of some of the code it straight up copies and suggests.
The risk of centralized systems was discussed long ago. The Cathedral and the Bazaar was published in 1999. None of these ideas are new. Everyone who paid any attention knew it was coming.
Incentives in this structure go both ways though, ideally keeping everything in symbiotic balance. Companies that alienate their users tend to not do well shortly after.
I like the idea of decentralization, but I'd suggest you don't need to go fully decentralized, where every peer has a full copy. Actually I kind of like the Bitcoin approach, in that you have the ability to create a full peer, but most people do not. This would allow some decentralization and reduce risk, but not burden everybody with running a full peer.
How did newsgroups work in the days of yore? Depending on which ISP you used, you may or may not have had all of the posts within a group, if they even had the group. I remember paying for access to a specific provider (can't remember the name) that had the most complete listing of newsgroups and the longest retention of posts. viva a.b.m.a!!
Originally over UUCP (Unix-to-Unix Copy) and done via dial-ups at night (when the rest of the batch transfers were done - email too, with the old bang path). The two servers would exchange all the batched email and news posts that were routed to the other side.
RFC 977 ( https://www.w3.org/Protocols/rfc977/rfc977 ) has an example of how files are copied between the two systems (section 4.6) including fetching and receiving mail.
Note that not all outbound posts are necessarily of interest to the other server. An IHAVE message could come back with either an "I want it" type response or a "not interested" one.
> The IHAVE command informs the server that the client has an article whose id is <messageid>. If the server desires a copy of that article, it will return a response instructing the client to send the entire article. If the server does not want the article (if, for example, the server already has a copy of it), a response indicating that the article is not wanted will be returned.
That's how some of the moderation worked - your server would say "I don't want anything that came by way of X host" or "not interested in that newsgroup."
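As a rough illustration of that exchange, here is a sketch of the receiving side. The response strings are the RFC 977 ones for IHAVE (335, 435, 235, 437); everything else (the `NewsServer` and `Article` classes, the `blocked_hosts` and `refused_group_patterns` policy fields) is made up for this example. Note that IHAVE itself only carries the message-id, so policy based on path or newsgroup can only be applied once the article has actually been sent.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase

SEND_IT = "335 send article to be transferred"
NOT_WANTED = "435 article not wanted - do not send it"
TRANSFERRED = "235 article transferred ok"
REJECTED = "437 article rejected - do not try again"


@dataclass
class Article:
    message_id: str
    path: list        # hosts the article has passed through
    newsgroups: list  # groups it is posted to


@dataclass
class NewsServer:
    known_ids: set = field(default_factory=set)
    blocked_hosts: set = field(default_factory=set)             # hypothetical site policy
    refused_group_patterns: list = field(default_factory=list)  # e.g. ["*.binaries"]

    def ihave(self, message_id: str) -> str:
        """Answer the IHAVE offer: only ask for articles we do not already hold."""
        return NOT_WANTED if message_id in self.known_ids else SEND_IT

    def receive(self, article: Article) -> str:
        """Apply local policy to the transferred article, then store or reject it."""
        if self.blocked_hosts.intersection(article.path):
            return REJECTED
        for group in article.newsgroups:
            if any(fnmatchcase(group, pat) for pat in self.refused_group_patterns):
                return REJECTED
        self.known_ids.add(article.message_id)
        return TRANSFERRED


server = NewsServer(blocked_hosts={"spamhost"}, refused_group_patterns=["*.binaries"])
post = Article("<1@example.com>", ["relay", "origin"], ["comp.lang.c"])
print(server.ihave(post.message_id))   # 335 send article to be transferred
print(server.receive(post))            # 235 article transferred ok
print(server.ihave(post.message_id))   # 435 article not wanted - do not send it
```

Each site running its own version of this policy is what gives every server a slightly different view of the same group.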
One of the amusing things to me (looking back at this), if you're familiar with HTTP response codes, you'll likely get most of the way through the NNTP ones.
200 server ready - posting allowed
400 service discontinued
411 no such news group
500 command not recognized
But how does all of this fit into the federated-or-not conversation? If each ISP could host its own version, that doesn't sound federated. And who was in control of the "main" source-of-truth version?
There was no "main" source of truth for each version and each ISP could have its own set of posts. The bofh.* hierarchy for example had a very limited distribution and if it was found that one of the sites that provided it was leaking it to the general public they'd collectively cut off sites from being able to post or receive posts until they rectify their configuration.
Sure, an ISP could host its own news.answers site and not accept posts from others for that group nor pass messages from it out to others... but that would be a rather lonely place.
Likewise, one ISP may have a different set of posts that it shows for a news group because of moderation actions (we don't accept any posts greater than 1MB in size because of disk space issues and we don't host any newsgroup named *.binaries).
Federation comes from exchanging those posts using a common protocol - NNTP.
That doesn't stop a grumpy admin of a federated instance from doing exactly the same thing.
> run by people we trust
People change, or retire, just like corporation goals change.
Focusing on more independence is not enough. If you want truly unbreakable stuff, the first part of the puzzle is saving the user's handle and identity in a way that can't be removed.
Then finding a way to link that to their content, so that when the place hosting it goes away people can follow them to the new place.
Then just have all of that content be signed by that identity so users can verify that it is really that person.
And I can't believe I'm saying that unironically but blockchain might just be the solution for that.
Something like immutable log of:
* user declaring "I'm jeff@example.com, here are my public keys". Servers then validate via DNS record or some .well-known location entry whether user is allowed to declare they are from @example.com
* user declaring "behold! jeff@example.com stuff is <here>, and <here> and <here> are addresses for various federation systems". Only passes if that request is signed with above privkey of course
* user declaring "behold! My new public key is X and Y. And Z key is revoked!"
* user declaring "behold! I am now george.effluent@company.com!" Re-does the checks but for the new domain, and users previously subscribed to jeff@example.com get served a redirect.
etc.
Then when server admin inevitably goes rogue you can take your posts and subscribers and go somewhere else.
And when the @example.com owner decides "well I'm just gonna redirect stuff to ads", you can just change your handle and direct people to the right place, and the old handle is forever taken.
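A toy version of the log described above, to show the shape of it rather than a real design. It hash-chains the entries (each one records the SHA-256 of the previous entry) and replays them to get the current handle, keys, locations and redirects. Big assumptions: the entry kinds (`declare`, `locate`, `rotate`, `rename`) and field names are invented here, and the signature and DNS/.well-known checks are left out entirely, since they need a real crypto library and a network.

```python
import hashlib
import json


def entry_hash(entry: dict) -> str:
    """Stable hash of one log entry (sorted keys so the encoding is deterministic)."""
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()


def append(log: list, kind: str, **fields) -> None:
    """Add an entry that points at the hash of the entry before it."""
    prev = entry_hash(log[-1]) if log else None
    log.append({"kind": kind, "prev": prev, **fields})


def chain_is_intact(log: list) -> bool:
    """Detect edits to any entry that already has a successor."""
    prev = None
    for entry in log:
        if entry["prev"] != prev:
            return False
        prev = entry_hash(entry)
    return True


def replay(log: list) -> dict:
    """Fold the log into the identity's current state."""
    state = {"handle": None, "keys": set(), "locations": [], "redirects": {}}
    for e in log:
        if e["kind"] == "declare":
            state["handle"], state["keys"] = e["handle"], set(e["keys"])
        elif e["kind"] == "locate":
            state["locations"] = list(e["locations"])
        elif e["kind"] == "rotate":
            state["keys"] = (state["keys"] | set(e["add"])) - set(e["revoke"])
        elif e["kind"] == "rename":
            state["redirects"][state["handle"]] = e["handle"]  # old handle -> new
            state["handle"] = e["handle"]
    return state


log = []
append(log, "declare", handle="jeff@example.com", keys=["K1"])   # placeholder key ids
append(log, "locate", locations=["https://forum.example/jeff"])  # hypothetical URL
append(log, "rotate", add=["K2"], revoke=["K1"])
append(log, "rename", handle="george.effluent@company.com")

state = replay(log)
print(state["handle"], state["redirects"], chain_is_intact(log))
```

In a real system each entry after the first would also have to carry a signature made with a key that was valid at that point in the log, and verifiers would reject entries that fail that check; that is the part that stops a rogue admin or domain owner from rewriting your history.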
The problem is defining "these sorts of things". StackOverflow didn't do anything evil, they created a useful website and people flocked to it voluntarily.