I’ve been unhappy with the inconsistent quality of web search results for quite some time now. Wanting to do something about this for myself, I started experimenting with the way I save my notes/bookmarks.
In all of my trials, two things seemed to work more than all others and proved to be useful in the long term. One: saving my good search result URLs. Two: saving links to discussions on those URLs from Reddit, HN, Lobsters, etc. because in my opinion, community feedback is a (relatively) better proxy for URL quality.
This, combined with the Apple Notes search feature, became a very simple but effective personal index. For the main topics that interest me, I often found myself searching my index before searching the web. Once this setup started working well for me, I thought it was probably time to add a feature that lets users vote on URLs and to host a version of this online, so others can both contribute to and benefit from it.
With that in mind, I started speaking to more people who were unhappy with the current state of search. It turns out the most common workaround is restricting results to communities (mostly reddit), either by just appending the community name or by using the site: operator (example - bone conduction headphones reddit). This was further confirmation that the so-called “power users” had already moved on to relying on communities as a proxy for search quality.
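For anyone unfamiliar with the workaround, here is a minimal sketch of the two variants (appending the community name versus using the site: operator). The function names and the search endpoint used as a default are assumptions for illustration only.

```python
from urllib.parse import urlencode


def append_community(query: str, community: str) -> str:
    """Variant 1: just append the community name to the query."""
    return f"{query} {community}"


def restrict_to_site(query: str, site: str) -> str:
    """Variant 2: use the site: operator to keep results on one domain."""
    return f"{query} site:{site}"


def search_url(query: str, base: str = "https://www.google.com/search") -> str:
    """Build a search URL. The base endpoint is an assumed example."""
    return f"{base}?{urlencode({'q': query})}"


loose = append_community("bone conduction headphones", "reddit")
strict = restrict_to_site("bone conduction headphones", "reddit.com")
print(loose)
print(strict)
print(search_url(strict))
```

The first variant only biases results toward pages mentioning the community, while the second limits them to that domain.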
And so I built a basic version of Ninfex with all those features: a community-curated index, votable URLs, forum links, and search. It’s an early-stage prototype and I’m slowly populating the index with URLs from my personal wiki.
I look forward to your feedback and I’m open to all kinds of suggestions.
The trick there was that it was originally a "personal bookmarking" service, so there was a selfish reason for individual users to submit + curate links. Search got layered on top later.
I suspect that coupling community + search from the start might result in different outcomes (both positive and negative) compared to, say, just community.
Take reddit, for example: it's all about the community, and the search is abysmal. But it's fun to imagine what reddit would look like if it were mainly about search and the community were a bonus. Surely a balance needs to be struck.
I effectively automated this by tracking people and what they link.
One UI nit:
I clicked on Mathematics and the top submission at the time was: https://ninfex.com/item?id=sVfpxepq4rNQ
The search UI says: "5 days ago."
I thought it sounded odd since I missed the discussion here 5 days ago and clicked through to discover the link leads to an HN discussion from 2014.
So, that "5 days ago" refers to when it was "submitted" to Ninfex instead of any relevant time information on the actual submitted material.
Felt a bit misled, and finding relevant current information over higher-ranked older material is a pain point for me in other search products.
Happens when I am searching for information on a library and the most popular hits refer to information from years ago and the library from multiple major versions ago. Sometimes the answers are incompatible with the current state of the library and its API.
These are different issues, but they feel related.
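One way to make the distinction visible is to store both timestamps and show them together. This is only a sketch under assumptions: the `Item` fields, the `age_label` helper, and the sample dates are hypothetical and not taken from Ninfex.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Item:
    url: str
    submitted_at: datetime            # when the link was added to the index
    published_at: Optional[datetime]  # when the linked material appeared, if known


def age_label(item: Item, now: datetime) -> str:
    """Describe submission age and, when known, the year of the source."""
    days = (now - item.submitted_at).days
    unit = "day" if days == 1 else "days"
    label = f"submitted {days} {unit} ago"
    if item.published_at is not None:
        label += f", originally from {item.published_at.year}"
    return label


# Made-up example data: a 2014 discussion submitted five days before "now".
now = datetime(2021, 3, 20, tzinfo=timezone.utc)
item = Item(
    url="https://example.com/some-discussion",
    submitted_at=datetime(2021, 3, 15, tzinfo=timezone.utc),
    published_at=datetime(2014, 6, 1, tzinfo=timezone.utc),
)
print(age_label(item, now))  # submitted 5 days ago, originally from 2014
```

Keeping `published_at` optional means items with an unknown source date still render, just without the second half of the label.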
You state this like an obvious, well-known fact or a law of nature. I don't think it is any of those.
Furthermore, there are more advanced solutions, like connecting to other accounts, as at least one service did (name slips my mind): paste this proof in your Twitter (or HN, or reddit, or stackoverflow, or whatever) bio, type your Twitter (or whatever) handle here, and we will verify it.
Yes, people can sell their accounts, but there are only so many active 10-year-old Twitter, HN, and reddit accounts in good standing (and once they are used, they are used).
Once it actually gets a little bit hard, people will start to care more about their accounts. Oh, and for a single individual it gets hard fast.
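A rough sketch of the proof-in-bio idea, to make the flow concrete. Everything here is hypothetical: the `new_proof` and `verify_handle` names, the proof format, and the `fetch_bio` callable, which stands in for whatever platform-specific lookup a real service would use.

```python
import secrets
from typing import Callable


def new_proof(prefix: str = "ninfex-proof") -> str:
    """Generate a random, hard-to-guess string for the user to paste in a bio."""
    return f"{prefix}-{secrets.token_hex(8)}"


def verify_handle(handle: str, proof: str, fetch_bio: Callable[[str], str]) -> bool:
    """Return True if the public bio for `handle` contains the issued proof."""
    return proof in fetch_bio(handle)


# Demo with an in-memory stand-in for a real profile lookup.
proof = new_proof()
fake_bios = {
    "alice": f"Tinkerer and reader. {proof}",
    "mallory": "Nothing to see here.",
}


def lookup(handle: str) -> str:
    return fake_bios.get(handle, "")


print(verify_handle("alice", proof, lookup))
print(verify_handle("mallory", proof, lookup))
```

Account age and standing would still have to be checked separately. This only shows that the person controls the linked account.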
If it is supposed to be a bookmarking tool, look at pinboard. That's a proper tool, rooted in the solo experience. Save bookmarks between devices, great. The community aspect comes a distant second.
If this is supposed to replace googling "reddit x", it has to add something to googling "reddit x"! There is a constant flow of people googling "reddit headphones", opening the top couple links from r/headphones, then leaving. To capture some of that traffic you need to improve upon that experience.
I really like the backlinking from reddit/HN/etc. Try expanding on that - make your site a destination for AFTER people visit the reddit thread. Reddit threads lock after 6 months or whatever. If you add comments and the ability to merge similar links, I could see it working as an afterparty site.
One thought we had that was different was to let users create communities but auto populate the communities with links that we've crawled, to solve the link gaming issue. And then let them choose how things would be ranked and curate the links we auto populated.
Happy to see that I am not alone.
How to build the next Google: all good results these days are within communities, and Google search has become useless for most of these. So don't build a search engine: build a "rotten tomatoes for X" where the sources for each X are "the top N subreddits/communities/editorial-sites/forums for X".
Now it's a generic Chinese domain parking page:
Sorry to disappoint you :)
HN discussion: https://news.ycombinator.com/item?id=26429942
Million Short and DDG both proxy Bing results. YaCy result quality tends to be unusable, but YaCy can be useful for intranet searches.
The DuckDuckBot scrapes data from select websites for some Instant Answers and it grabs favicons for other sites.
You can either optimize a search engine for a specific vertical or you can try to build features that generalize across the Web no matter the existing verticals or future ones.
While Google is certainly looking at the problem as an algorithmic one, it very much relies on human input to decide what is important and relevant.
Though you may be right about certain features being similar. For now, everything that anyone finds interesting is acceptable.
That being said, I like the idea. Maybe submission could be made easier with a browser extension? Or you could crawl popular forum feeds and become a sort of forum aggregator as a form of crawling.
On the other hand, how is this different from Reddit? You are basically aggregating links, except instead of providing a forum you are linking to multiple forums. I'm not sure if that is better or just an inconvenience. Of course, maybe just providing a functional search is enough to distinguish you from Reddit.
The description contains a link to the source code.
- Meorca (https://meorca.com/). It's mostly built from user-submitted sites.
- Search My Site (https://searchmysite.net/). Built entirely from user-submitted sites. Optionally allows paying for additional features.
Many web directories, such as Curlie (formerly DMOZ), IndieSeek, and iWebThings, are also built by users. Some web directories also integrate rudimentary search functionality.
What is it that makes this a search engine, though?
Right now the UI is very search-centric, but there aren't many results yet. If you look at Wikipedia in 2001, search was at the bottom of the page, and it was more focused on adding content and browsing. https://www.webdesignmuseum.org/gallery/wikipedia-2001
* An idea would be to show contributors what others are searching for on the home page, so that they can add links to that search. I just searched for 'email' and found one link. Someone could see that and contribute their favorite related pages.
* To recreate Wikipedia, you'll need some feedback mechanism so contributors feel it's worthwhile to contribute. This could be as simple as having a profile that tracks 'karma', 'likes', etc.
What about adding a functionality to Ninfex so it automatically includes links to posts on well-known communities such as HN and Reddit that link to the URL submitted by the user? There's a Chrome extension called Kiwi Conversations that does something like this:
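For the HN side, a lookup like this could be sketched against the public HN Search API hosted by Algolia. Treat the details as assumptions to check against its documentation: the endpoint, the `restrictSearchableAttributes` and `tags` parameters, and the `objectID` / `num_comments` response fields are used here as I understand them, and the helper names and sample response are made up. No network call is made.

```python
from typing import List
from urllib.parse import urlencode

# Assumed endpoint of the HN Search API (verify against its docs).
HN_SEARCH = "https://hn.algolia.com/api/v1/search"


def hn_lookup_url(submitted_url: str) -> str:
    """Build a query that searches HN stories by their linked URL."""
    params = {
        "query": submitted_url,
        "restrictSearchableAttributes": "url",
        "tags": "story",
    }
    return f"{HN_SEARCH}?{urlencode(params)}"


def discussion_links(response: dict) -> List[str]:
    """Turn a parsed JSON response into HN thread links, busiest first."""
    hits = sorted(
        response.get("hits", []),
        key=lambda hit: hit.get("num_comments") or 0,
        reverse=True,
    )
    return [f"https://news.ycombinator.com/item?id={hit['objectID']}" for hit in hits]


# Made-up response with the assumed shape, instead of a live request.
sample = {
    "hits": [
        {"objectID": "111", "num_comments": 3},
        {"objectID": "222", "num_comments": 40},
    ]
}
print(hn_lookup_url("https://example.com/article"))
print(discussion_links(sample))
```

Sorting by comment count is just one choice. Points or recency would work as well, depending on what "best discussion" should mean.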
Anyhow, it seems like it'll end up being abused.
You could probably also do something for site owners, like domain verification and then pulling in the site's RSS feed if they have a history of quality content?
The last one is just spitballing, something I was thinking about while making my test submission.
Edit: fixed url