Forums, perhaps. But small group chats (which I suppose are technically "dark") are the bedrock of the current internet writ large and where most of the content that filters up to places like Twitter comes from.
You're absolutely right and this is something I've studied and thought about quite a lot. The dark aspect of group chats does fundamentally separate them from forums, which had the benefit of searchability, permanence and topic longevity.
At least we will benefit from what forums are left in the form of model training data. People give LLMs a lot of shit, but it's possible that one day a language model ends up becoming a go-to oracle for future archeologists studying the present day.
Sometimes it's easy to take for granted how historic the current times are, and how interested people will be in the minutiae and institutional knowledge that few bother to spend any real resources preserving.
Wow, I hadn't made that connection. We should somehow bundle a current state-of-the-art LLM in a time capsule right now, and maybe another one every decade.
If, a thousand years from now, future historians need to study our time, they can just ask the LLM.
That would be an incredible modern analogue to the Arecibo message or Golden Record. Imagine being on the receiving end of such an artifact and not knowing how to operate it and being worried about breaking it.
Also makes you wonder if the future of long-range communication between planets or galaxies would involve LLM-based compression, embeddings, etc.
We definitely need to fix the hallucination problem though, or a receiving civilization might be extremely confused about our nature.