Hacker News
Show HN: An experimental, people-powered search engine (ninfex.com)
215 points by zeeshanqureshi on May 28, 2021 | 63 comments



Hi HN, I’m the creator of Ninfex.

I’ve been unhappy with the inconsistent quality of web search results for quite some time now. Wanting to do something about this for myself, I started experimenting with the way I save my notes/bookmarks.

In all of my trials, two things seemed to work more than all others and proved to be useful in the long term. One: saving my good search result URLs. Two: saving links to discussions on those URLs from Reddit, HN, Lobsters, etc. because in my opinion, community feedback is a (relatively) better proxy for URL quality.

This combined with the Apple notes search feature became a very simple but effective personal index. For the main topics that interest me, I often found myself searching my index before searching the web. Once this setup started working well for me, I thought it was probably time to add a feature which lets users vote on URLs and host a version of this online, so others can both contribute to and benefit from it.

With that in mind, I started speaking to more people who were unhappy with the current state of search. It turns out the most common workaround people use is restricting results to communities (mostly Reddit), either by just appending the community's name or by using the site: operator (example: "bone conduction headphones reddit"). This was further confirmation that the so-called "power users" had already moved on to relying on communities as a proxy for search result quality.

And so I built a basic version of Ninfex with all those features: a community-curated index, votable URLs, forum links, and search. It’s an early stage prototype and I’m slowly populating the index with URLs from my personal wiki.

I look forward to your feedback and I’m open to all kinds of suggestions.


This looks very much like the old site del.icio.us (which was bought by Yahoo and discarded). Don't bother going there but see [wikipedia][1].

The trick there was that it was originally a "personal bookmarking" service, so there was a selfish reason for individual users to submit + curate links. Search got layered on top later.

[1]: https://en.wikipedia.org/wiki/Delicious_(website)


Just read through the features of del.icio.us on Wikipedia; you are right about the overlaps.

I suspect that coupling community + search from the start might result in different outcomes (both positive and negative), as opposed to, say, just community.

Take Reddit for example: it's all about the community, and the search is abysmal. But it's fun to imagine what would happen if Reddit were mainly about search and the community were a bonus. Surely a balance needs to be struck.


Also quite a lot like the original Yahoo right?


I love the idea! I have a UI suggestion: There is way too much wasted space. Information density is crucial for a search engine. It should be possible to see 20-30 results at a time without scrolling, at least on desktop. Right now only 2.5 results fit on my screen. Use the design of HN as a model!


You are right about info. density. I will try to make it better. Thanks for the feedback.


Howdy, you might find this interesting --

I effectively automated this by tracking people and what they link.

https://insideropinion.com/


Interesting project.

One UI nit:

I clicked on Mathematics and the top submission at the time was: https://ninfex.com/item?id=sVfpxepq4rNQ

The search UI says: "5 days ago."

I thought it sounded odd since I missed the discussion here 5 days ago and clicked through to discover the link leads to an HN discussion from 2014.

So, that "5 days ago" refers to when it was "submitted" to Ninfex instead of any relevant time information on the actual submitted material.

Felt a bit misled, and finding relevant current information over higher-ranked older material is a pain point for me in other search products.

This happens when I am searching for information on a library and the most popular hits refer to information from years ago, about versions several major releases old. Sometimes the answers are incompatible with the current state of the library and its API.

These are different issues, but they feel related.


These are valid issues that I will certainly resolve, thanks for pointing them out.


I like the idea very much. First question that comes to mind though is: how gameable is this system?


Thanks for the feedback. Honestly, I'm not very sure myself and I'll have to wait and see. The votes could be gamed perhaps (multiple accounts etc.), but I could restrict voting capabilities to say, email/phone verified accounts, aged accounts or minimum karma (or a combination of all those). Like I said, we'll have to see how it evolves.
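A sketch of what combining those gates might look like in code (field names and thresholds here are made up for illustration, not anything Ninfex actually implements):

```python
from datetime import datetime, timedelta

def can_vote(account, min_age_days=30, min_karma=50):
    # Hypothetical gating rule: a verified contact method is required,
    # plus either a minimum account age or a minimum karma score.
    aged = datetime.now() - account["created"] >= timedelta(days=min_age_days)
    verified = account["email_verified"] or account["phone_verified"]
    return verified and (aged or account["karma"] >= min_karma)

# Brand-new verified account: not aged, not enough karma -> cannot vote.
new_verified = {"created": datetime.now(), "email_verified": True,
                "phone_verified": False, "karma": 5}

# Old account with lots of karma but no verified contact -> cannot vote.
old_unverified = {"created": datetime.now() - timedelta(days=365),
                  "email_verified": False, "phone_verified": False,
                  "karma": 500}
```

The AND-of-ORs shape means a spammer has to clear multiple hurdles at once, which is the point of combining the checks.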


Aged accounts won't stop anything, and karma probably won't stop anything (after all, a person can buy karma-farmed Reddit accounts), and I'm not going to give a phone number to try the service out. I hope you figure it out, and I wish you success.


> Aged accounts won't stop anything

You state this like an obvious, well-known fact or a law of nature. I don't think it is either.

Furthermore, there are more advanced solutions, like linking to other accounts, as at least one service did (the name slips my mind): paste this proof in your Twitter (or HN, or Reddit, or Stack Overflow, or whatever) bio, type your handle here, and we will verify it.

Yes, people can sell their accounts, but there are only so many active 10-year-old Twitter, HN, or Reddit accounts in good standing (and once they are used, they are used).

Once it actually gets a little bit hard, people will start to care more about their accounts. And for a single individual, it gets hard fast.
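For what it's worth, that proof-in-bio flow is tiny to sketch: issue an unguessable token, have the user paste it into their public bio, then fetch the bio and look for the token. A minimal sketch in Python (the `ninfex-verify` prefix is made up, and the bio-fetching step is left out; in practice you'd request the profile page over HTTP):

```python
import secrets

def issue_challenge(handle):
    # Random, unguessable token tied to the claimed handle.
    return f"ninfex-verify:{handle}:{secrets.token_hex(16)}"

def verify_bio(challenge, bio_text):
    # The user proves control of the account by pasting the
    # challenge token into their public bio.
    return challenge in bio_text

challenge = issue_challenge("mr4123")
```

A bio without the token fails verification, and one containing it passes; the token is single-use, so a sold account can't be re-verified with an old proof.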


Site looks good, but feels unfocused. Asking people to submit and tag a bunch of links doesn't feel like a viable jumping off point, and I don't think people are seriously going to use your custom search on your custom pool of links.

If it is supposed to be a bookmarking tool, look at pinboard. That's a proper tool, rooted in the solo experience. Save bookmarks between devices, great. The community aspect comes a distant second.

If this is supposed to replace googling "reddit x", it has to add something to googling "reddit x"! There is a constant flow of people googling "reddit headphones", opening the top couple links from r/headphones, then leaving. To capture some of that traffic you need to improve upon that experience.

I really like the backlinking from reddit/HN/etc. Try expanding on that - make your site a destination for AFTER people visit the reddit thread. Reddit threads lock after 6 months or whatever. If you add comments and the ability to merge similar links, I could see it working as an afterparty site.


This is awesome. We were doing the same thing for a bit at first, but ultimately moved away from it in favor of focusing on a product more in our wheelhouse. I think this has a lot of potential to be very big. Very much rooting for you!

One thought we had that was different was to let users create communities but auto-populate them with links that we've crawled, to solve the link-gaming issue, and then let users choose how things would be ranked and curate the links we auto-populated.


Thanks for the feedback. Interesting thought!


A year ago, for the same reason, I started working on a project: http://ontol.org/

Happy to see that I am not alone.


That's awesome!


This is a really nice project. Given its crowdsourced nature, an open, FOSS approach would be a good fit. Have you considered open-sourcing it?


Thanks rapnie. Someone else asked about that too. I'm not sure, I haven't thought about it. In your opinion, what would be the pros vs cons of having it either way?


If you don't plan to sell access to your search engine, having it open source means people can help you improve it. Open source is also better to get initial users (who will come from here, Reddit, etc. mostly technical people who care about software being open source).


Oh that clarifies it. Thanks for explaining.


Great idea. Reminds me of a previous HN comment [1]:

    How to build the next Google: all good results these days are within
    communities, and Google search has become useless for most of these
    searches.

    So don't build a search engine: build a "rotten tomatoes
    for X" where the sources for each X are "the top N
    subreddits/communities/editorial-sites/forums for X".
I’d be curious to know how you plan on covering costs.

[1]: https://news.ycombinator.com/item?id=24714546


Do this for products and get that referral $$$. This week I tried to find a good case for my iPhone, and with Google it is currently almost impossible.


Thanks! Interesting comment.


Maybe it is just me, but "people-powered search engine" made me think the search queries are sent to humans who look for an answer for you, like an operator.


Remember ChaCha? It was basically that, but you could also do queries via SMS.

https://web.archive.org/web/20100428182234/http://www.chacha...

Now it's a generic Chinese domain parking page:

https://chacha.com


You or anyone else can send me a query, and I'll respond if and when I feel like it. My email is my last name at gmail.


I read on HN, or maybe Reddit (I don't remember), that there are people on the internet who solve captchas all day and night for you for something like $0.01 each. It could make more sense to have them do this instead. It could actually be useful for people with disabilities, or people who type URLs into Google, etc.


People-powered as in user submissions, votes.

Sorry to disappoint you :)


There are alternatives to Google. I'm not talking only about DuckDuckGo or Million Short [0]. I'm talking about p2p-like search engines like Searx [1] and YaCy [2]. If some of these independent search engines collaborated on an API to allow cross-service search sharing, I think the Google search monopoly would soon be threatened.

[0] https://millionshort.com/

[1] https://en.wikipedia.org/wiki/Searx

[2] https://yacy.net/


I've cataloged a bunch of engines with their own independent indexes: https://seirdy.one/2021/03/10/search-engines-with-own-indexe...

HN discussion: https://news.ycombinator.com/item?id=26429942

Million Short and DDG both proxy Bing results. YaCy result quality tends to be unusable, but YaCy can be useful for intranet searches.


> DuckDuckGo's results are a compilation of "over 400" sources,[45] including Yahoo! Search BOSS, Wolfram Alpha, Bing, Yandex, its own web crawler (the DuckDuckBot) and others

https://en.wikipedia.org/wiki/DuckDuckGo#Search_results


That refers to their Instant Answers. The regular link results ("organic results") are proxied from Bing. You can compare them side by side; the order of the results may vary slightly because Bing varies things a little, but the links are the same.

The DuckDuckBot scrapes data from select websites for some Instant Answers and it grabs favicons for other sites.


Prime DDG nonsense this, they're pretty much Bing all t h e

w a y

d o w n


The foundation of PageRank is counting incoming links to a website. Linking from your website to another website is largely a human decision and PageRank aggregates all these decisions across the Web.

You can either optimize a search engine for a specific vertical or you can try to build features that generalize across the Web no matter the existing verticals or future ones.

While Google is certainly looking at the problem as an algorithmic one, it very much relies on human input to decide what is important and relevant.
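For anyone who hasn't seen it spelled out, the aggregation described above boils down to a short power iteration over the link graph. A toy sketch (hypothetical three-page graph, standard 0.85 damping factor; obviously not how Google computes it at scale):

```python
# Minimal PageRank power iteration over a toy link graph.
# Each key links to the pages in its list.

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if not outgoing:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
        rank = new_rank
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
# "c" collects links from both "a" and "b", so it ends up ranked highest
```

Every human decision to link from one page to another shows up as an edge, and the iteration turns those edges into a global relevance score.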


Hey, it’s DMOZ? I like this user-focused version: not good for search, but good for browsing. An adjunct to Wikipedia. It would probably need a lot of guidelines as to what is acceptable.


DMOZ and other directories of old were (rigid?) predefined, category-based classifications of URLs. I don't intend to replicate that.

Though you may be right about certain features being similar. For now, everything that anyone finds interesting is acceptable.


First thing I thought too... "Someone's rebuilding DMOZ?"


While I love the concept, it seems somewhat flawed. If every URL needs to be manually submitted, it is far easier for spammers than actual users. Especially if you need a handful of forum URLs: is this intended for readers or authors? A reader likely just wants to see a link or post it on their favourite forum; I can't see them going around to a handful. And if the author is expected to do it, then that somewhat encourages spamming forums.

That being said, I like the idea. Maybe submission could be made easier with a browser extension? Or you could crawl popular forum feeds and become a sort of forum aggregator.

On the other hand, how is this different from Reddit? You are basically aggregating links, except instead of providing a forum you are linking to multiple forums. I'm not sure if that is better or just an inconvenience. Of course, maybe just providing a functional search is enough to distinguish you from Reddit.


Thanks for the feedback. The forum URLs aren't mandatory. But they do add value to search results that are either pages or blogs. It is hard to illustrate that value with example search queries. I hope that changes with time and a larger, more diverse index.


I wrote a browser extension that could be used or adapted for this purpose.

https://addons.mozilla.org/en-US/firefox/addon/send-tab-url/

The description contains a link to the source code.


I do agree there would need to be a way to reduce spam, but conversely I think there is a huge opportunity to cut out the parts of the internet that serve no useful purpose - for example, those spam sites that rip off the text of Stack Overflow answers and smother it in ads.


Interesting. Other "people-powered" search engines out there include:

- Meorca (https://meorca.com/). It's mostly built from user-submitted sites.

- Search My Site (https://searchmysite.net/). Built entirely from user-submitted sites. Optionally allows paying for additional features.

Many web directories, such as Curlie (formerly DMOZ), IndieSeek, and iWebThings, are also built by users. Some web directories also integrate rudimentary search functionality.


The idea has been floating in the air for quite a while, and I agree that something like this is really needed. What I'd personally want to see is a network of anon users hand-picking interesting pages, with a search and a news feed on top of that. I'd hand-pick all these users and kick out anyone who clutters my feed with junk pages. Vouching for pages should be as easy as clicking the FB like button (which is blocked by my uBO), ideally without JS. Later, when this thing earns some trust, I'd install a browser extension.

It should be easy to change the anon user IDs while optionally retaining the network I've been subscribed to. Saving the network can be as simple as copying user IDs into my list, maybe with some annotations. It should be easy to share my ID and discover others: e.g. I come across an interesting page on Wikipedia, see that mr4123 liked it, and add that ID to my list. I may want to have multiple user IDs for different types of pages that I wouldn't want to mix together.

I'd pay $5/mo for that, maybe more if I notice the network brings interesting stuff I wouldn't find myself. From a higher level, such a service has sound economic value: instead of a hundred people each spending an hour to find something, the service lets everyone except one save that hour. That's why people come to HN, after all.
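The follow-the-curators feed described above can be sketched in a few lines (toy data shapes, not any real service's schema):

```python
# Each curator ID maps to the pages they've vouched for.
likes = {
    "mr4123": ["https://example.com/a", "https://example.com/b"],
    "anon99": ["https://example.com/b", "https://example.com/c"],
    "spammer": ["https://junk.example/x"],
}

def build_feed(following, likes):
    # Aggregate liked pages across followed users only,
    # most-vouched-for first; unfollowed users never appear.
    counts = {}
    for user in following:
        for url in likes.get(user, []):
            counts[url] = counts.get(url, 0) + 1
    return sorted(counts, key=lambda u: -counts[u])

feed = build_feed(["mr4123", "anon99"], likes)
# example.com/b is vouched for by both followed curators, so it ranks first,
# and the spammer's link never shows up because nobody follows them
```

Kicking a junk-posting curator out of your list is just removing their ID from `following`, which is what makes the moderation model per-user rather than global.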


Reminds me of DMOZ. https://dmoz-odp.org/

What is it that makes this a search engine, though?


I think it’s pretty much the same concept. Blast from the past


Neat idea. You may want to crawl and farm some data from Hacker News, Reddit, lobste.rs, MetaFilter, Stack Overflow, etc. Maybe you could even use Wikipedia reference links.

Right now the UI is very search-centric, but there aren't many results yet. If you look at Wikipedia in 2001, search was at the bottom of the page, and the site was more focused on adding content and browsing. https://www.webdesignmuseum.org/gallery/wikipedia-2001

* An idea would be to show contributors what others are searching for on the home page, so that they can add links for those searches. I just searched for 'email' and found one link. Someone could see that and contribute their favorite related pages.

* To recreate Wikipedia, you'll need some feedback mechanism for contributors to feel it's worthwhile to contribute. This could be as simple as having a profile that tracks 'karma', 'likes', etc.


Good suggestions. I don't think I'll be inclined to crawl or farm though. Thanks for taking the time to check the site out.


How do you prevent click farms from promoting entries? All user-based decision-making systems can easily be rigged by fake users/sock puppets. As long as there's a monetary incentive to promote/demote items with votes, people will pay for it.


Pro tip: make something that is useful without any data. If I can use the tool with just my own data, then I’ll be more encouraged to contribute my own data. At some point you have enough data to produce something more useful for new users.


This is very similar to something I envision but am too lazy/unknowledgeable to implement, so I wish you lots of success.

What about adding functionality to Ninfex so it automatically includes links to posts on well-known communities such as HN and Reddit that link to the URL submitted by the user? There's a Chrome extension called Kiwi Conversations that does something like this:

https://chrome.google.com/webstore/detail/kiwi-conversations...


I have no idea how well it'll take off. I just tested it. It looks like one could be dedicated and fill it up with a list of their site's links. (I tested it - I'm the owner of linux-tips.us, so you can remove my test if you'd like, though I tried to make it a legit submission so as not to sully your site.)

Anyhow, it seems like it'll end up being abused.


Thanks for your feedback. I saw your submission :)


If it gets bigger, you may need to do email verification - at a minimum - to avoid people filling it up with junk submissions.

You could probably also do something for site owners, like domain verification and then pulling in the site's RSS feed if they have a history of quality content?

The last one is just spitballing, something I was thinking about while making my test submission.


This is a very promising idea, thanks for making this! I was just wondering if you were planning on making the site open source, or will this be a commercial, closed-source product, or somewhere in between? Not pressuring you into open-sourcing it - it can be a lot of extra work - just wanted to know your eventual plan.


Thank you for the feedback. I haven't really thought about it yet. Too early to tell if this has legs.


Hmm, if I search for the word "a" it seems to crash for some reason.


Hey. Thanks for reporting this.


similar to https://diff.blog or https://wiby.me ?

Edit: fixed url


The second one is a parked domain. Diff.blog looks interesting though, thanks for sharing. I might be adding it to my daily list of sites to visit.


Sorry, mis-typed: https://wiby.me


Nice. I am building this too. Looking forward to exploring your vision of it.



