Searx – A privacy-respecting, hackable metasearch engine (searx.me)
120 points by crxro on July 25, 2016 | 48 comments

Interesting project. I looked around a bit, and this is apparently a public installation of the open source project searx: https://github.com/asciimoo/searx

It aggregates searches from other search engines. You can specify a specific search engine to search with a !-shortcut, e.g. "!go your search query" to just search Google.

More info: https://asciimoo.github.io/searx/

Onion link for searx.me: http://ulrn6sryqaifefld.onion/

I have been using https://startpage.com which also bills itself as a "Truly private search engine". It uses Google for search results but sends your queries without any associated IP/cookie info. They also don't log your searches.

I love the idea behind their search engine, though I wonder how Google doesn't block them. It looks like they make money by charging for StartMail, which supports 'one-click PGP encryption', which I also love the idea of, but I can't imagine how that's secure. I haven't investigated, but I expect it involves pushing your private key to their servers.

In any case, thanks for the link; I'll give StartPage a go for a week and see if it sticks.

It's also what I use. I've tried to use DDG in the past several times, but I kept coming back to Google many times a day because the results were so bad. I moved to StartPage almost a year ago and have used it exclusively since then (except from time to time when searching images: for that I sometimes go back to Google because they offer better search options and filters like size, color, license, etc.).

Me too, but I added a button (with TamperMonkey) to switch back to Google. Some searches are not good enough on StartPage, but at least I don't default to Gg any more.

I do the same (sort of) with duck duck go. Just add "!g" to the beginning or end of your search and you'll get actual Google.

Would the most truly private one be more like YaCy [1]?

This has been a keen interest of mine the last few months, and I will be releasing my own open source search engine, which is basically a collection of linked Docker containers (Solr, Scrapy, Django, workers).

Certainly the hard road ahead will be relevancy and broad crawling.

[1] - http://yacy.net

I assume you've heard of http://commoncrawl.org/ - hope this helps!

Anyone know why I'd want to use this over DuckDuckGo? Is it more private or something? Their about page doesn't seem to offer anything compelling over DuckDuckGo.

DuckDuckGo is a search engine, like Google/Bing. This is a tool that uses those search engines and then combines the results. It's useful because Google sometimes has way better search results (imho) than DuckDuckGo, and this site gives you a way to search Google while being tracked less.

DuckDuckGo is also mostly just a meta-search engine. They do have their own webcrawler, but most results are still sourced from Yahoo!, Bing and the like.

You can search Google from DDG as well, in fact by using the same style of bang prefixes as you do in Searx.

For me that just redirects to a search on google.com

Well, DDG is owned by Yahoo... Also, it is not completely open source, so you cannot know whether it is really tracking you or not.

You can't know if a public installation of an open source project is actually just running the code that you see either. Hosted open source projects have no security benefits over proprietary services.

And DuckDuckGo is not owned by Yahoo. It has partnerships with Yahoo, Bing and Yandex to use their search databases.

I agree with you. However, you can run your own instance of searx if you don't trust anyone. In case of DDG you don't have this option.

I am not sure that partnership or ownership really changes anything. But I hope I am wrong about it. :)

> I am not sure that partnership or ownership really changes anything.

It changes everything. DDG is a Yahoo customer and, as adrusi wrote, sources search data from them among other providers. Yahoo doesn't control DDG, its privacy measures, data collection, or product direction as they would as a parent company. The relationship is completely different.

I do agree with you about Searx having the optional self-hosting advantage. DDG claim they won't track you, but there's really no way to be certain.

I don't think this is correct. I believe they have a partnership/relationship of some kind. Could this [1] be what makes you think that?

[1] - https://duck.co/help/company/yahoo-partnership

Which is now owned by Verizon.

Ah, brings back memories of Metacrawler (https://en.wikipedia.org/wiki/MetaCrawler), which was once, pre-Google, my first choice of search engine.

I don't see the API limits posted anywhere? Any limits on the number of requests?

And since they're aggregating from other search engines, won't they eventually reach a point where they're running out of API requests?

If API keys are required, how is this truly private? Either you get your own keys and lose privacy, or you use someone else's service, which brings you right back to what you get with any other search engine.

Fair point. This occurred to me as well after doing my initial post (follow up below).

It appears the linked searx.me is just a deployment of the open source searx project, and probably will reach the limit.

It's interesting: if their claim of "while not storing information about its users" is totally true, I guess request limits would be impossible without at least tracking the originating IP?

They could limit requests by IP without storing them explicitly.

They could use something like a count-min sketch to store how many requests a certain IP has made, and clear the sketch every minute, for example.

The limit would not be exact, though.
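A rough sketch of that idea (entirely hypothetical, not code from searx): hash each IP into a few counter buckets, increment them on every request, take the minimum as the count estimate, and wipe everything at the end of each window. No IP address is ever stored, only anonymous counters.

```python
import hashlib
import time

class IPRateLimiter:
    """Approximate per-IP request limiting via a count-min sketch.
    Only hashed counter buckets are kept; no IP is ever stored."""

    def __init__(self, width=1024, depth=4, limit=60, window=60.0):
        self.width, self.depth = width, depth
        self.limit, self.window = limit, window
        self.counters = [[0] * width for _ in range(depth)]
        self.reset_at = time.monotonic() + window

    def _buckets(self, ip):
        # One independent hash per row, derived by salting blake2b
        for row in range(self.depth):
            h = hashlib.blake2b(ip.encode(), salt=row.to_bytes(8, 'big'))
            yield row, int.from_bytes(h.digest()[:8], 'big') % self.width

    def allow(self, ip):
        now = time.monotonic()
        if now >= self.reset_at:  # wipe the whole sketch each window
            self.counters = [[0] * self.width for _ in range(self.depth)]
            self.reset_at = now + self.window
        for row, col in self._buckets(ip):
            self.counters[row][col] += 1
        # The minimum over rows over-estimates the true count at worst,
        # so collisions only make limiting stricter, never looser.
        estimate = min(self.counters[row][col]
                       for row, col in self._buckets(ip))
        return estimate <= self.limit
```

And yes, as said above, the limit would not be exact: two IPs that collide in every row would share a budget, though with a 4x1024 sketch that's vanishingly rare.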

It does not seem to be limiting incoming requests by IP, so IPs do not need to be stored anywhere.

Nice, but in the current environment this will turn into another whack-a-mole game, since obviously Google &c don't want to be used this way.

Somewhat relatedly, I wonder when the right to digitally remember what the browser has seen (and create one's own database and automation and share synthesized results) will become a thing for individuals.

> Nice, but in the current environment this will turn into another whack-a-mole game since obviously Google &c don't want to be used this way.

I've wondered about this (specifically in the context of DDG and Bing, but more generally). You say "another whack-a-mole game" (emphasis mine), so I guess that it's happened before, but I don't know of any occurrences. Has it happened that search engines have blocked this sort of large-scale aggregator/re-director type access?

There are so many metasearch engines now; has anyone considered creating a metasearch of the metasearch engines?

We need to go deeper.

I used to use Turbo Search as the most meta- and all-encompassing search I could get. It hit the major non-meta and meta engines plus a ton of specialized and obscure ones, with the ability to configure or filter all that. Long time ago. I'm not even sure the one in this link is it, as it's long gone.


Turns out Wikipedia has a nice list of search engines. Didn't realize there were so many, but I'm not surprised given the lower barrier to entry.


That sounds like a hyperdimensional snake eating its own hyperdimensional tail.

How easy is it to add new search engines? What if I want to add, for example, an Internet Archive search engine?

Looks pretty straightforward - as an example, here's a provider for the Arch Linux Wiki:
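To give a feel for the shape of one (this is a hypothetical engine against a made-up JSON endpoint, not the actual Arch Wiki provider): an engine module is just a Python file with a request hook that builds the outgoing URL and a response hook that turns the reply into result dicts.

```python
# Hypothetical searx engine module sketch; the request/response hook
# shape follows the searx docs, but the endpoint and fields are made up.
from urllib.parse import urlencode

categories = ['general']   # which searx tab the engine appears under
paging = False

base_url = 'https://example.org/search'  # hypothetical JSON API

def request(query, params):
    # searx calls this to build the outgoing HTTP request
    params['url'] = base_url + '?' + urlencode({'q': query})
    return params

def response(resp):
    # searx calls this with the HTTP response; return a list of results
    results = []
    for item in resp.json().get('results', []):
        results.append({
            'url': item['url'],
            'title': item['title'],
            'content': item.get('snippet', ''),
        })
    return results
```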


There's also a list of possible engines to support on their wiki:


The Internet Archive is on there - I'm sure they'd appreciate a pull request adding it!

Since it scrapes Google, how does it manage to not get your server IP blocked?

It doesn't. The idea is that you'll run your own personal instance, and traffic will be low enough not to set off anyone's defenses. You'll see this discussed if you look at the issues, and in the code you can see that it supports routing through proxies and, on hosts that have them, multiple outgoing IPs.

If you run your own, then it's not private any more. Your server IP will identify your searches, and only your searches.

Perhaps one could tunnel it through Tor, rotating identities regularly. There is the issue of dealing with Google's aggressive reCAPTCHA challenges to connections made through Tor, however.

Historically, I've found Google search borderline impossible to use via Tor because it'd usually give me an error page after solving the captcha. Occasionally you'll find a window where they're not nulling Tor IPs, but it's rarely worth the time.

>> Open result links on new browser tabs

This plugin requires JavaScript, apparently - surprised they can't use target="_blank"?

Ever heard of usability and user experience? Using target="_blank" has been strongly discouraged since at least 2005. All links should normally open in the same tab unless you use your browser options to open them in new tabs. You can click on them with the middle button of your mouse or hold Ctrl while you click, for example.

Sorry, I wasn't clear - they have a plugin in the options that allows you to open all the links in a SERP in a new window/tab, but it says under the plugin that it requires JavaScript, which seemed strange to me!

Inaccessible in Chrome 53.0.2785.21 dev-m (64-bit)

Unsupported protocol

The client and server don't support a common SSL protocol version or cipher suite.


Probably due to the fact that they only offer DHE cipher suites.

from Chrome's dev console:

      www.searx.me/:1 This site requires a DHE-based SSL cipher suite. These are deprecated and will be removed in M52, around July 2016. See https://www.chromestatus.com/feature/5752033759985664 for more details.

Some of you guys might be interested in Qwant (https://www.qwant.com), a European "truly private" search engine.

Disclaimer: I used to work there.

I like it; it'll be my default search engine.

