
Searx – A privacy-respecting, hackable metasearch engine - teddet
https://github.com/asciimoo/searx/tree/master/
======
zeratax
I've just posted about this in the thread about Peekier. I've been hosting my
own instance behind a VPN for quite some time and am really happy with the
search results. It supports a large number of search engines, even YaCy (the
completely decentralized search engine), and it even supports DuckDuckGo
instant answers, though that seems to be very much a work in progress for now.

Surely this isn't as great as YaCy would be in theory (YaCy's results
currently are usually very bad), and it isn't really future proof: if this
gets too many requests, Google and the rest would definitely do something
against it. But for now it's pretty decent.

~~~
lnalx
What are the advantages of running your own instance rather than using an
existing public one?

~~~
teddet
In short, you have no reason to trust public instances. Check the
documentation for details:
[https://asciimoo.github.io/searx/user/own-instance.html](https://asciimoo.github.io/searx/user/own-instance.html)

~~~
notheguyouthink
On that note, i wonder if you could build a secondary search engine at your
home, but only index a small part of the web you typically use?

This immediately sounds useless, but for me i feel[1] like i primarily search
a handful of sites. If i can index the sites i mainly care about, and fallback
to "normal" engines automatically or with a flag, then for the queries i care
about suddenly i get 100% accurate[2] and 100% private searches.

This _seems_ like a good idea to me. My only question, is how much bandwidth
and storage are needed to even index something like reddit? If it's too much
to run on a moderate home computer, then what's the use?

[1]: I don't have any data to back up this claim, though. Purely a hunch. [2]:
_edit_ , well i guess 100% accurate depends on the search implementation, but
it's still 100% private :)

~~~
garysieling
If you only use a handful of sites, you can also solve this problem by
building yourself a site that handles specific types of queries.

I'm doing something like this, by building a site to search lectures
([https://www.findlectures.com](https://www.findlectures.com)). The "fallback"
for me is to just use youtube, etc instead.

Rather than trying to index everything on Reddit or Youtube, if you just index
"good" parts it's a lot easier, since there is a lot of low quality material
either way. I think you're more limited by bandwidth, for what you can get
into your own index.

A search index is basically a mapping of hashed search tokens -> urls, so it
can be pretty efficient to store locally (e.g. for a video search engine, you
just need unique words in the transcript/title, not the entire video)
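
As a rough illustration of the "hashed search tokens -> urls" idea, here is a minimal sketch in Python. The names `TinyIndex`, `tokenize` and `token_key`, the truncated SHA-1 key, and the example URLs are all assumptions made up for this example; a real engine would add ranking, stemming and on-disk storage.

```python
import hashlib
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and return its set of unique word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def token_key(token):
    """Hash a token to a short fixed-size key (truncated SHA-1 hex digest)."""
    return hashlib.sha1(token.encode("utf-8")).hexdigest()[:12]


class TinyIndex:
    """Hypothetical in-memory index: hashed token -> set of URLs."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, url, text):
        # Only the unique words are stored, not the document itself.
        for token in tokenize(text):
            self.postings[token_key(token)].add(url)

    def search(self, query):
        # Return the URLs that contain every token of the query.
        tokens = tokenize(query)
        if not tokens:
            return set()
        hits = [self.postings.get(token_key(t), set()) for t in tokens]
        return set.intersection(*hits)


index = TinyIndex()
index.add("https://example.org/lecture-1", "Intro to distributed search")
index.add("https://example.org/lecture-2", "Distributed systems and consensus")
print(index.search("distributed search"))
```

Because only unique tokens per document are kept, the index for a transcript stays far smaller than the video it describes.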

------
tenken
All I want to know is: is there an easy way to limit search results to, say,
the past year? In Google this is the qdr search parameter in the URL... And
the Google API doesn't support it anymore, so far as I can tell.
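
For reference, a minimal sketch of building such a URL by hand in Python. It assumes the filter is passed as `tbs=qdr:<value>`, which is the form commonly seen in Google result URLs but is not a documented, stable API; the function name `recent_search_url` is made up for this example.

```python
from urllib.parse import urlencode


def recent_search_url(query, max_age="y"):
    """Build a Google search URL restricted to recent results.

    max_age is a qdr value: "d" (day), "w" (week), "m" (month), "y" (year).
    ASSUMPTION: the filter travels as tbs=qdr:<value>; this may change.
    """
    allowed = {"d", "w", "m", "y"}
    if max_age not in allowed:
        raise ValueError("max_age must be one of " + ", ".join(sorted(allowed)))
    params = {"q": query, "tbs": "qdr:" + max_age}
    return "https://www.google.com/search?" + urlencode(params)


print(recent_search_url("python requests timeout"))
```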

~~~
tenken
Oh joy, an unofficial clone of the project merged a commit 3 days ago that
adds support for it: a time range filter for search engines, up to a year.

I find it so odd something so critical for software development can be so
overlooked in such a tool.

~~~
arijun
Out of curiosity, what do you use it for? I can't remember ever needing that
feature.

~~~
tenken
To look for recent answers to programming tasks that are not too out of date
due to API versioning.

------
olalonde
Looks like this scrapes results from Google directly [0]. Doesn't Google
detect and IP block Searx instances? I'm fairly sure it's against their ToS.

[0]
[https://github.com/asciimoo/searx/blob/master/searx/engines/...](https://github.com/asciimoo/searx/blob/master/searx/engines/google.py#L89)

------
devoply
One really useful feature would be the ability to submit all the links you
visit to this directly via a browser extension and be able to search those
links. That would be very useful. Plus a bookmarking feature. BTW does it
support Altavista, WebCrawler, HotBot, or Lycos :)

~~~
verandaguy

         BTW does it support Altavista, WebCrawler, HotBot, or Lycos
    

I haven't heard of most of those, but AltaVista was acquired by Yahoo! and
integrated into their platform in 2003. The brand has been all but defunct for
over a decade now...

~~~
devoply
That's the joke. Those were meta search engines back in the day.

~~~
mcbits
My favorite was Metasearch or Metacrawler or something like that. It would
show you what other users were searching for in real time.

~~~
extempore
That was Metaspy, part of Metacrawler.

Source: I wrote Metaspy. That was almost 20 years ago!

~~~
devoply
w00t! big fan of metacrawler back in the day.

------
unicornporn
So, does anybody remember Seeks?
[https://beniz.github.io/seeks/](https://beniz.github.io/seeks/)

------
zitterbewegung
Man, this makes me think of using metasearch engines back in the day, when
Yahoo and Ask were the only ones in town. It appears that Dogpile is still
available.

------
scott_ni
This seems pretty cool! Are there browser plug-ins available?

I currently use DuckDuckGo. I'd switch to this if they implement the bang
syntax, which I use constantly.

~~~
rvern
You can use bangs:
[https://asciimoo.github.io/searx/user/search_syntax.html](https://asciimoo.github.io/searx/user/search_syntax.html).
Instead of redirecting you, it changes the engines that are used for the
search, so you still get anonymity. Also, you can use smart bookmarks[1][2] to
get essentially the same functionality as bangs, but done directly by your web
browser when you type keywords in the address bar, which skips an unneeded
request to the search engine.

[1]:
[http://kb.mozillazine.org/Using_keyword_searches](http://kb.mozillazine.org/Using_keyword_searches)

[2]:
[https://en.wikipedia.org/wiki/Smart_Bookmarks](https://en.wikipedia.org/wiki/Smart_Bookmarks)
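
To show what the browser does with a keyword search, here is a rough Python sketch of the expansion step: the first word typed is looked up as a keyword and the rest is substituted for `%s` in the bookmark URL. The keyword table, the `expand` function and the `searx.example` host are assumptions for illustration, not anything a browser or Searx actually ships.

```python
from urllib.parse import quote_plus

# Hypothetical keyword table, like bookmarks with a keyword and a %s placeholder.
KEYWORDS = {
    "w": "https://en.wikipedia.org/w/index.php?search=%s",
    "gh": "https://github.com/search?q=%s",
}

# Placeholder default engine; searx.example is not a real instance.
DEFAULT = "https://searx.example/?q=%s"


def expand(typed):
    """Turn address bar input into a URL, going straight to the target site."""
    keyword, _, rest = typed.partition(" ")
    template = KEYWORDS.get(keyword)
    if template is None or not rest:
        # No known keyword (or nothing after it): search the whole input.
        template, rest = DEFAULT, typed
    return template.replace("%s", quote_plus(rest))


print(expand("w metasearch engine"))
print(expand("privacy search"))
```

The lookup happens locally, so the default search engine never sees the query when a keyword matches.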

------
amelius
Could this commoditize search?

And if so, why should Google play along, and allow their search API to be used
like this?

Just wondering.

~~~
gravypod
If you're running one of these you're probably using an ad blocker, so Google
already doesn't care about you. What Google does care about is the aggregate
of searches made. With this in hand you can easily find some interesting data
that could be used to identify information that would be valuable.

There's always a way to make money, and Google has enough currently that it
can explore all of these avenues.

