https://wiby.me/ is also excellent. Someone else linked it elsewhere in the thread, but it's worth riding the coattails of the top post for anyone interested.
I tried to access it. It displays a different web page in a frame, which has an invalid certificate (among other things, it is expired and is for the wrong domain name), and then when I bypass the certificate error, I get a 404 error.
I know the friction to add websites is the point, but might I recommend a way to add our own websites without having to promote two others. My rinkydink website qualifies, but all the other small websites I know are all already on the list.
Careful: if you show that noscript/basic (x)html does a good enough job, you will get attacked by big tech shadow-paid hackers (or idiotic ones), to force you to use their grotesquely massive and complex javascripted web engines.
Well, maybe not a static document, but as soon as you have some basic HTML forms doing a good enough job, I would not be surprised if it gets attacked by big tech shadow-paid hackers (or idiotic ones) to push forward their massive and complex javascripted web engines, which they have control over.
What? This can easily be averted by adding a captcha to the form (server-side validation so no JS needed) and/or some sort of rate limiter or firewall, e.g. blocking any IP address that sends too many requests too quickly.
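A minimal sketch of the rate-limiting idea, assuming a sliding window per IP address. The class name, the limits, and the explicit `now` timestamp are hypothetical choices for illustration, not anything a particular server or firewall provides:

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per window_seconds for each client IP."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # ip -> timestamps of the requests accepted inside the current window
        self._hits = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # too many requests too quickly: reject this one
        hits.append(now)
        return True
```

A form handler would call `allow(client_ip, time.monotonic())` before processing a submission and return an error page when it gets `False`. All of this runs server-side, so the form itself still needs no JS.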
This uses OPML blogrolls to crawl blog-to-blog recommendations. I seeded it with the blogs I follow and various planets (https://indieweb.org/planet) and then recursively followed recommendations to build an organic network. Lots of the content is tech-related, indieweb, and smallweb. It's grown to 17 languages and over 4000 RSS/atom feeds.
As an example, the linked blog has a page here [1] and it was discovered by a recommendation by [2].
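Roughly, the recursive part can be sketched like this. It is not the actual crawler, just a breadth-first walk under some assumptions: each blog's blogroll is an OPML document whose `<outline>` elements carry an `xmlUrl` attribute, and `fetch_blogroll` is a hypothetical callback that returns the OPML text for a feed (or `None` if the blog publishes no blogroll):

```python
import xml.etree.ElementTree as ET
from collections import deque
from typing import Callable, List, Optional


def feeds_in_opml(opml_text: str) -> List[str]:
    """Return the xmlUrl of every <outline> element, nested groups included."""
    root = ET.fromstring(opml_text)
    return [o.attrib["xmlUrl"] for o in root.iter("outline") if "xmlUrl" in o.attrib]


def crawl(seeds: List[str], fetch_blogroll: Callable[[str], Optional[str]]) -> List[str]:
    """Follow blog-to-blog recommendations breadth-first, visiting each feed once."""
    queue = deque(dict.fromkeys(seeds))  # de-duplicated seeds, original order kept
    seen = set(queue)
    order = []
    while queue:
        feed = queue.popleft()
        order.append(feed)
        opml = fetch_blogroll(feed)  # hypothetical: None means "no blogroll found"
        if opml is None:
            continue
        for recommended in feeds_in_opml(opml):
            if recommended not in seen:
                seen.add(recommended)
                queue.append(recommended)
    return order
```

The `seen` set is what keeps mutual recommendations (A lists B, B lists A) from looping forever.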
Yes, semi-regularly, I did a fresh update since your message.
To reduce storage size, only the title/link/metadata of latest post from each feed is saved. I run the crawler manually, aiming for weekly, but sometimes less frequently. So this won't catch every post and it lags behind quite a bit. I'm hitting some hosting sites faster than I'd like, especially ones that support custom domain names, so I'm planning on fixing up the rate-limiting strategy before I put it in a daily cron job.
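One possible shape for that rate-limiting fix, as a sketch only: space out requests per host by a minimum interval. The class name and the `key` parameter are hypothetical. Keying by hostname is an assumption, and it would not group custom domains that point at the same hosting site; for that case the `key` function would need to map a URL to something shared, such as the resolved IP address.

```python
from typing import Callable, Dict
from urllib.parse import urlsplit


def hostname_key(url: str) -> str:
    """Default grouping: the URL's hostname (custom domains are NOT merged)."""
    return urlsplit(url).hostname or ""


class PerHostThrottle:
    """Tell the crawler how long to wait before hitting the same host again."""

    def __init__(self, min_interval: float, key: Callable[[str], str] = hostname_key) -> None:
        self.min_interval = min_interval
        self.key = key
        self._next_ok: Dict[str, float] = {}  # host key -> earliest allowed start time

    def delay_for(self, url: str, now: float) -> float:
        host = self.key(url)
        start = max(now, self._next_ok.get(host, now))
        # Reserve the slot so the next request to this host is pushed back further.
        self._next_ok[host] = start + self.min_interval
        return start - now
```

The crawler would sleep for `delay_for(url, time.monotonic())` seconds before each fetch; requests to different hosts are not delayed by each other.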
There's a plan for ArchiveTeam to use the RSS feed as another way of discovering blog posts to archive. I don't think it's generally useful to point your feed reader at it as there's quite a diverse collection of content.
I just tried using google to search for sites I see in ooh.directory, and it's very hard to get them to surface. I can take exact specific phrases from them, like "Scaling a Digital Panel Voltmeter" and without quotes neither Google nor Bing will find the site, and with quotes, only Bing finds the site: (https://zzncx.top/posts/scaling-a-digital-panel-voltmeter/)
Personal blogs with real information just can't be found anymore.
The Reddit thread [1] in which the author introduces Bearblog explains the sorry state of today's Internet a bit. "What's the point of blogging if I can't track users and harvest their email addresses?"
For me, the point of my pointless blogging is to sell dozens of books across the world with my words in them. That way, I feel like I’ve achieved immortality. No joke.
Wonderful collection of how-to’s to run your own server. Thanks for sharing.
Might I suggest (in the interest of privacy) that you give donors the option to use a Silent Payment address instead of a naked BTC address? I noticed you have a Monero address as well, so I assume you care about privacy.
A lot of the good stuff got sucked into walled gardens. People’s personal home pages or tacky MySpace pages were definitely more fun than the current semiprofessional content scroll. Forums like this very one were mostly subsumed into Reddit. Never mind the death of the BBS (not actually the internet, I realise).
I feel like the internet needs a giant directory of indie websites. So you can actually surf around and find them.
The big modern search engines almost have to be intentionally hiding these websites because they're nearly impossible to find without using an alternative engine like wiby.me or search.marginalia.nu.
I was just going to post a comment similar to this. We've swung towards walled gardens of piles of content instead of graphs of individually curated links.
Exactly that "surfing" or "webring" or "stumbleupon" style of actually browsing in a larger context, rather than searching or push-promoting within that pile of content.
The walled gardens are better for many of the internet's main uses.
If I need to find out what vodka to buy I Google with site:reddit.com and pick the post that's obviously written by alcoholics. The small web can't touch that.
I don’t think Google hides small sites as much as people are really good at SEO for Google specifically.
Like my blog has literally 0 SEO and you’ll never find it, but a friend of mine has a blog where he does not post very often, but spends a lot of effort on SEO and it’s very easy to run into his blog.
It's impossible to say this is something they do, but it's worth noting that Google also has an economic incentive to mostly show commercial/ad-ridden results, as leading users to blogs with no adsense on them makes them less money; so it would at least be in their interest to let the search results look like they do.
To fully understand Google you need to look at them not as a service that brings websites to people, but as one that directs people to websites.
Contains the quote above and “The goals of the advertising business model do not always correspond to providing quality search to users.”
So what we observe in the deterioration of Google search was predicted by its creators, who made the deliberate decision to let this happen by accepting advertising.
Google went public in 2004, after that I don't think any amount of founder idealism could have saved it from shareholder pressure. If anything it's remarkable they held out as long as they did.
It’s so unfortunate that no one inside Google is taking any decision to clearly make things worse. It’s simply the structure of their business that is fundamentally wrong, and their founders had correctly identified the problem right at the start.
It's not like it's impossible to combat search engine spam. By far the most effective tool is to just go after affiliate links and ad-heavy websites. Penalize those websites and 99% of the search engine spam vanishes.
Though as noted, this is not in Google's economic interest to do.
ooh.directory is fantastic. I particularly liked the stance that only a few sites are added each week, which allows readers to "digest" them. Sadly, no new sites have been added for nearly three months. I assume this is just an instance of "life happens" (it is a single-person venture, after all), but it would not hurt if there were a dozen similar attempts at handpicking and cataloguing the "good web".
You can read the blogs from anywhere, even a terminal (no JS needed).
No need to sign up or log in to try it out. I haven't officially launched, but if you'd like to start blogging now, I'll be happy to share an invite code.
Seems like a small web deserves a small client. Why use a "big web" client to read the small web. "Big web" clients are funded by advertising or advertising companies.
Bias disclosure: I have used a text-only client for the last 30 years.
It's basically all the sites and feeds I follow daily with the Hey Homepage built-in RSS reader. You can browse the list and click around, or download it as an OPML file.
RSS = Really Social Sites;
OPML = Other People's Meaningful Links
There was a push during Covid on Gemini pages. I did that for a while, but the lack of real formatting and not being able to cross-link articles became a stopper.
You can get to some of them here:
Collaborative Directory of Geminispace: gemini://cdg.thegonz.net/
i've been publishing things as html2 pages, but not interconnected in any way. so each page (or sometimes group of pages) will be dedicated to an exploration of a single subject. i then send those pages to people who i think might be interested in them. that's all, they otherwise don't see the greater internet. of course people are free to add them to link aggregators, etc. and i don't police this practice. i simply don't care for my output to be consumed by the general public, or by llms, or by corporate media, or by anyone who is not my friend or in my immediate circle of friends
Thank you for this. It has inspired me to delete my Reddit account and create an HN account. This gives me hope that the web can survive the social media era.
The site and the list of blogs are open source, and the list is growing steadily by about 10 each day (almost at 15,000 at this point).
Every recent post from sites in Kagi Small Web is indexed and given preference in Kagi Search results.
How it works: https://blog.kagi.com/small-web
edit: The project just had its one thousandth commit!