
YaCy: Decentralized Web Search - chatmasta
http://yacy.net/en/index.html
======
chipsy
I looked at YaCy not too long ago. It's an interesting technology but needs an
economic incentive to work as a Google replacement, I think. If system
activity was (carefully and judiciously) tied to cryptocurrency payouts it'd
probably experience hockey-stick growth.

~~~
Oblouk
I'd love to see something like this. I would definitely contribute.

~~~
corv
This is a brilliant idea. I'm sure a lot of people would contribute resources
if they were compensated.

------
hysan
This has been around for a long time.[1] Have there been any noteworthy
changes/improvements that prompted a new submission? I've always
wondered if I should run this, but ever since DuckDuckGo came out, I never went
through with setting up a self-hosted search engine.

[1]
[https://news.ycombinator.com/item?id=3288586](https://news.ycombinator.com/item?id=3288586)

~~~
atmosx
Hm, it's a computationally expensive thing to do, and the crawler must be fed
and "crawl" 24/7... I mean it's either run by a _community_ running many, many
nodes or by an individual with huge resources; otherwise it isn't worth it.

~~~
zkhalique
I'm not sure why a federated search engine must be running 24/7 ... what's so
prohibitive about federating it?

------
chatmasta
Google derives 98% of its revenue from advertising. I'm not sure how
advertising revenue divides between content and search, but given the much
higher prices on search advertising, it seems revenue distribution is heavily
weighted toward revenue derived from search ads.

This is a huge liability for google. Any company with 98% of its revenues tied
to one product necessarily creates a fundamental liability for itself. I'm
sure google execs realize this, and that's why they are pouring so much
revenue into "moonshots" and subscription services (not to mention government
contracts). They are actively diversifying google's revenue because they
realize it comes from a near single source, search, which may represent a
fundamentally unsustainable product.

Can Google control the search market forever? Are people (specifically
technophiles and "early adopters") not growing increasingly frustrated with
them? The same groups of users that Google once relied upon as an initial
source of traction are now abandoning the company's products in search of more
open pastures.

Decentralization is an unstoppable trend with momentum across product
verticals. File sharing was the first "mainstream" invocation of
decentralization technology, and blockchain/Bitcoin is the most recent.
(Interestingly, Bitcoin itself is a meta-enabler of decentralization as it
introduces the possibility of an automated payment layer.)

Is it inevitable that a decentralized search engine will replace the
current centralized model that Google requires for its sustained business?
When will it happen? How can such a movement gain momentum?

If I were an executive at a google competitor, I would be _actively_ exploring
these questions, and finding ways to push decentralized search products into
the mainstream. Mozilla, Yahoo, are you listening?

~~~
ddorian43
I think they want a bigger piece of the pie, not to destroy the pie?

~~~
chatmasta
Mozilla currently has no search product. However, they do have a browser.
Google also has a browser. Destroying Google search would destroy Google
revenue, which would degrade browser quality. Long term, Mozilla should be
interested in destroying Google search.

~~~
logn
I can't imagine that declining revenues would hurt Chrome. There are lots of
other ways for Google to cut costs before limiting their investment in the
browser.

And Mozilla has always been happy to ride the coattails of Google Search.
Declining AdWords revenue just means declining Mozilla revenue. Even if the
two aren't partnering together going forward, Google helped Mozilla negotiate
a high bid with Yahoo.

Long term, I think Mozilla wants to see Firefox OS compete with Android.
There's a lot more revenue potential there and the best chance to rapidly grow
their user base.

------
jszymborski
I tried this a (long) while back... and it just wasn't practical as a search
engine.

Would love to see adoption of this though. It would be great if there was a
web interface for it for folks who are afraid to or can't install a fat client. It
wouldn't contribute to the network, but more people would use it, which might
lead to people with cash supporting it.

------
andrei_
The idea for this came to me a couple years ago, but after a quick search I
found that YaCy had already attempted it. I've been recently thinking of going
for it anyway though. There's enough room for innovation where a really clever
design could potentially lead to a useable search engine, and not just a
novelty.

------
haney
This is such an interesting project!

I didn't see a way for anyone who wasn't operating their own server to use the
service though. It seems like a lot of the quality control / model building
power of google has to do with the volume of inbound queries (they're able to
see what user satisfaction with results is based on clicks). It seems like
having a public facing server that interacted with the peer to peer network
might help with adoption from less technical users.

EDIT: Never mind, found this demo portal
([http://search.yacy.de/](http://search.yacy.de/)) although it's not very
prominent

------
bpetersen
I started a similar project some months ago; however, my idea was to put
the crawl and search software on a server so that the search engine can be
used even on mobile devices, tablets and so on. One server installation could
then serve all devices in a household or in a small company.

The software is far from ready, especially the kernel; the distributed
search is not really implemented, and I have not had much time to contribute in
the last few months. However, what is done is available here under an open source
license: [https://github.com/r10s/gosearch](https://github.com/r10s/gosearch)

------
zo1
It definitely has a nice feel to it, even if the search results aren't (at the
moment) perfect.

On the left hand side, after a search it allows you to "refine" your search by
categories. Stuff like author, site, filetype, language, with each one having
a count next to it to display the potential amount of hits with that
filter/category activated.
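
For anyone curious how those per-category counts can be computed, here is a minimal sketch in Python. It is not YaCy's implementation; the hit records and field names (`site`, `filetype`, `language`) are made up for illustration, and it simply tallies how many results share each value of a field.

```python
from collections import Counter

# Hypothetical search hits; the field names are assumptions, not YaCy's schema.
hits = [
    {"site": "example.org", "filetype": "html", "language": "en"},
    {"site": "example.org", "filetype": "pdf", "language": "en"},
    {"site": "example.net", "filetype": "html", "language": "de"},
]


def facet_counts(results, fields):
    """For each field, count how many results carry each value of that field."""
    return {
        field: Counter(r[field] for r in results if field in r)
        for field in fields
    }


if __name__ == "__main__":
    for field, counts in facet_counts(hits, ["site", "filetype", "language"]).items():
        # most_common() lists the biggest buckets first, like a refine sidebar.
        print(field, counts.most_common())
```

Clicking a category would then just filter the result list to the hits with that value and recompute the counts.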

Additionally, looks like it has some sort of "stealth" mode. It appears to
limit the search to your own peer, or what I'm guessing is already on your
local index. That could be handy by itself, if properly configured, to a local
or personalized search.

------
chatmasta
I've been thinking about this a lot since posting the link.

What are people's thoughts on combining social graph + blockchain +
decentralized search? The idea is that your searches will be somewhat similar
to others in your community, so the crawl index is sharded/partitioned to
optimize for social graph proximity. If you want to index pages non proximate
to you, you can get paid Bitcoin to do it.

This could be implemented with XMPP (look up the socialvpn/ipop project) for the
social layer, a Chord DHT for search, and an altcoin with modified PoW for incentive.
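
To make the DHT part concrete, here is a rough Python sketch of the Chord-style rule for deciding which peer holds the index entries for a search term: hash peers and terms onto the same ring, and give each term to the first peer at or after its position. This is only the key-to-node mapping with a global view of the ring. It leaves out Chord's finger-table routing, node joins, and everything about the social graph and payments. The peer names and the 32-bit ring size are assumptions for illustration.

```python
import hashlib
from bisect import bisect_left

RING_BITS = 32  # assumption: a small ring for readability (Chord uses the full 160-bit SHA-1 space)


def ring_id(key: str) -> int:
    """Hash a string (peer name or search term) onto the identifier ring."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** RING_BITS)


class Ring:
    """Global view of the ring; a real DHT node would only know a few neighbours."""

    def __init__(self, peers):
        # Keep (id, name) pairs sorted by ring position.
        self.nodes = sorted((ring_id(p), p) for p in peers)
        self.ids = [node_id for node_id, _ in self.nodes]

    def successor(self, key: str) -> str:
        """Return the peer responsible for key: first peer id >= hash(key), wrapping around."""
        idx = bisect_left(self.ids, ring_id(key))
        return self.nodes[idx % len(self.nodes)][1]


if __name__ == "__main__":
    ring = Ring(["peer-a", "peer-b", "peer-c", "peer-d"])  # hypothetical peers
    for term in ["yacy", "decentralized", "search"]:
        print(term, "->", ring.successor(term))
```

Sharding by social proximity would need a different placement rule than a plain hash, since hashing deliberately scatters terms uniformly across peers.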

~~~
Rarebox
There are (at least) two hard(ish) problems involved: 1. Preventing dishonest
results. 2. Not disclosing your queries to your friends.

------
benoliver999
I like the idea and ran a node for a year or so, but the truth is I never used
it to actually search - the results were astoundingly bad 80% of the time.

The rest of the time they were just about ok and I'd find what I wanted on the
2nd/3rd page.

It was a good way to throw up some left-field results though, and for that
reason it's worth keeping a node going if you can. Might get it up on the new
VPS.

------
no_gravity
Maybe an economic incentive could be integrated into YaCy via ads?

We might get to a pretty efficient system that way. We see this with bitcoin:
Mining is done with the cheapest power available and highly optimized
algorithms and hardware.

If the advertising income flows back into the spidering - It could become kind
of the perpetuum mobile of distributed websearch.

------
anewhnaccount
On the technical rather than the practical side, how does it prevent garbage
data/advertising from being added to the collective crawl?

How about using this type of technology for something Google can't (for legal
reasons)? Say for example full text search of the library genesis archive?
Over Tor or somesuch?

------
Animats
The search page for YaCy is at

[http://yacy.net/en/Searchportal.html](http://yacy.net/en/Searchportal.html)

but all it searches is the forums for the Free Software Foundation, Europe.
That's a job that could probably be handled by one server.

------
walterbell
[https://metager.de/en/](https://metager.de/en/) is a German metasearch engine
that includes YaCy P2P results.

------
_almosnow
What a twist! I remember when YaCy was more like a Tor alternative.

