
Turning web traffic into a Super Computer - anaxag0ras
http://ben.akrin.com/?p=5997
======
noahdesu
This is a great direction for a certain class of problems, but to call this a
general approach for building a supercomputer ignores the biggest challenge
of all: communication performance.

Leadership-class supercomputers have large budgets for network interconnects
like InfiniBand, because the hardest problems require ultra-low-latency,
high-throughput RDMA message passing between processes.

~~~
xtreme
Absolutely true. Embarrassingly parallel problems are generally not the most
interesting, because you can just throw compute power at them until they are
solved. Most applications running on actual supercomputers are communication
bound.

------
chroem-
I actually made this a few years ago for an ill-fated application to YC (I was
still a sophomore in college).

The first big problem is that you can't trust any of the results that you
receive from the workers. You can alleviate this somewhat by not accepting job
results until they've been replicated by several other workers (lowering the
efficiency of the system), but then you're still vulnerable to Sybil attacks.
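
A minimal sketch of that replication rule, assuming results are hashable values and each worker is identified by an opaque id. The names `accept_result` and `quorum` are hypothetical and not taken from the original project.

```python
from collections import Counter

def accept_result(submissions, quorum=3):
    """Return the agreed result once `quorum` workers report it, else None.

    `submissions` maps a worker id to the result that worker reported
    (hypothetical data shape, assumed for illustration).
    """
    if not submissions:
        return None
    counts = Counter(submissions.values())
    result, votes = counts.most_common(1)[0]
    return result if votes >= quorum else None

# Three honest workers agree, so the result is accepted.
honest = accept_result({"a": 42, "b": 42, "c": 42})

# One attacker posing as three workers outvotes the single honest one.
sybil = accept_result({"s1": 0, "s2": 0, "s3": 0, "honest": 42})
```

The second call shows why replication alone does not help against Sybil attacks: the rule counts identities, not independent parties.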

The second issue was the total lack of privacy for the data and computations.

~~~
ikeboy
Check 10% of submissions randomly - if some are wrong, blacklist that source.

You can send the checking to other nodes and check those randomly as well.
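
A rough sketch of random spot checking with blacklisting, assuming a trusted `recompute` function is available for the sampled jobs. All names here are hypothetical; the 10% rate is the figure suggested above.

```python
import random

def spot_check(submissions, recompute, blacklist, rate=0.10, rng=random):
    """Re-verify a random sample of submissions and blacklist wrong sources.

    submissions: list of (source, job, result) tuples.
    recompute:   trusted function mapping job -> correct result (assumed).
    blacklist:   set of sources, updated in place.
    Returns the submissions whose source is not blacklisted.
    """
    for source, job, result in submissions:
        if source in blacklist:
            continue
        # Check roughly `rate` of the submissions.
        if rng.random() < rate and recompute(job) != result:
            blacklist.add(source)
    return [s for s in submissions if s[0] not in blacklist]
```

In the variant where the checking itself is farmed out to other nodes, `recompute` would no longer be trusted and would need its own spot checks.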

~~~
chroem-
That's what you would initially think, but a clever adversary can produce
arbitrarily many identities, so even if results agree, there is a non-zero
chance that all of the agreeing nodes are compromised. Even if you begin
blacklisting nodes, an attacker can simply spoof new ones.

So then you need a proof-of-work scheme to defend against Sybil attacks, but
even then an attacker could choose to produce incorrect output only when they
detect that a quorum group is made up entirely of their compromised nodes, and
produce correct output otherwise. That would once again render blacklisting
ineffective.

I'm not saying it's impossible, but securing this sort of browser-based grid
computing scheme is a nontrivial task. And that's coming from someone who is
very fond of this technology.
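
To put a number on the "non-zero chance" that every node in a quorum is compromised, here is a small sketch. It assumes the quorum is drawn uniformly at random, without replacement, from all known identities; that sampling model and the function name are assumptions for illustration only.

```python
from math import comb

def p_quorum_all_sybil(honest, sybil, quorum):
    """Probability that a random quorum contains only attacker identities.

    honest: number of honest identities.
    sybil:  number of attacker-controlled identities.
    quorum: number of identities drawn without replacement.
    """
    total = honest + sybil
    if quorum < 1 or quorum > total:
        raise ValueError("quorum must be between 1 and the number of identities")
    # comb(sybil, quorum) is 0 when the attacker has fewer than `quorum` ids.
    return comb(sybil, quorum) / comb(total, quorum)

few = p_quorum_all_sybil(honest=90, sybil=10, quorum=3)
many = p_quorum_all_sybil(honest=90, sybil=900, quorum=3)
```

Under this model the probability grows quickly as the attacker creates more identities, which is why cheap identities undermine both consensus and blacklisting.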

~~~
ikeboy
Use homomorphic encryption to ensure attackers can't tell when they get the
same thing twice.

IPs are not so cheap, anyway. If you have enough legitimate traffic it would
be tough to spoof enough.

~~~
Cyphase
You're thinking in IPv4.

~~~
ikeboy
Most traffic will be IPv4; if you only use IPv4 sources for work, you'll do OK.

------
hexscrews
Ok, this is similar to Gridcoin ([https://gridcoin.io/](https://gridcoin.io/)),
Golem
([https://cryptoslate.com/coins/golem/](https://cryptoslate.com/coins/golem/)),
SPARC ([https://sparc.network/](https://sparc.network/)), et al. I could keep
going, but the point is that something like this already exists. And we are
seeing what happens when you embed a program that takes advantage of
processing power: you get things like Coinhive
([https://coinhive.com/](https://coinhive.com/)), which were meant to be
benign but turned into botnets.

~~~
huhtenberg
Tangentially related -
[http://www.gomezpeerzone.com/](http://www.gomezpeerzone.com/)

This was one of the first attempts to _monetize_ spare CPU capacity and to
build a business around massively distributed computing. It seems to be in
zombie mode now, but it was launched almost 10 years ago -
[https://web.archive.org/web/20080529034258/http://www.gomezp...](https://web.archive.org/web/20080529034258/http://www.gomezpeerzone.com)

------
tsneed290
For those having trouble viewing the site:
[https://web.archive.org/web/20180303173013/http://ben.akrin....](https://web.archive.org/web/20180303173013/http://ben.akrin.com/?p=5997)

------
clon
This will only work if you can cheaply validate the work performed, or if you
send it to several distinct visitors and compare results, driving down the
likelihood that the results are bogus.

~~~
noahdesu
Good point. My hunch is that the only way would be to duplicate work and check
for consensus. It seems like there isn't any way to cheaply validate results
for any interesting work.

------
0xFFFF0000
Something like this could be a way to finally get rid of advertisements and
help finance free services in a new way, as long as the CPU isn't
overcommitted. I thought the same about crypto mining... Many people might
prefer providing resources/computation over seeing ads.

~~~
Dylan16807
Good luck getting people to not double-dip.

Also good luck making it so I don't spend 10 cents of electricity or have my
phone die early to give a site <1 cent of value.

------
redka
> Unlike a regular computer cluster, the nodes are very ephemeral (as website
> visitors come and go) and can’t talk to each other (no cross site requests).

Not sure how much that would actually help, but they _can_ talk to each other,
to some extent, through WebRTC. I wonder if that would change anything here.

------
nukeop
This could be a good alternative for SETI or that protein folding project but
if it somehow turns into a plague of attempts to shift the computing
requirements of running the backend onto the users, our content blocking lists
will have to grow a little bit longer again.

------
AtomicOrbital
There's an opportunity here to create cloud computation platforms where people
submit work that gets executed on the machines of other people, who are paid
to let their browsers contribute compute ... eventually the price of AWS /
Google Compute Engine will drop ... win-win

~~~
fwip
The big problem I see to be solved with p2p computation is trust. In the
example, the problem to be solved is "Do any of these strings hash to a known
value?" An adversary can accept the job, sleep for a second, and report "Nope"
and collect their payment, without ever doing any work.

In contrast, I trust that if I run code on AWS, it'll actually run the code I
give it.

These are solvable problems, but important to think about.
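
A small sketch of that hash-search job, showing why a positive answer is cheap to verify while a "Nope" is not. SHA-256 and all function names are assumptions made for illustration; the example does not say which hash is used.

```python
import hashlib

def sha256_hex(text):
    # SHA-256 is an assumed choice of hash function.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def search_chunk(candidates, target_hex):
    """Honest worker: return the candidate that hashes to the target, or None."""
    for s in candidates:
        if sha256_hex(s) == target_hex:
            return s
    return None

def lazy_worker(candidates, target_hex):
    """Adversarial worker: does no work and always reports "Nope"."""
    return None

def verify_claim(claim, target_hex):
    """Coordinator: True/False for a positive claim, None if it cannot be
    checked without redoing the whole chunk (the "Nope" case)."""
    if claim is None:
        return None
    return sha256_hex(claim) == target_hex
```

A claimed match costs one hash to confirm, but the only way to check a `None` answer is to hash every candidate again, so the lazy worker is indistinguishable from an honest one on chunks that contain no match.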

~~~
archgoon
A bigger issue than lying (and randomly failing) clients is the fact that
you're farming out your users' data to third parties. You might be willing to
trust a single entity like Amazon, Microsoft, or Google, which has a
reputation to maintain and can be sued (potentially for a lot); but if you're
farming out to essentially anonymous individuals, there are no repercussions
for, say, misusing your users' credit card information.

Now, there are definitely use cases where you don't care, and those would be a
cool application for this. But unless we figure out how to do homomorphic
computing, this doesn't seem likely to affect cloud computing prices any time
soon, even if it were cheaper to simply use excess compute capacity (which
I'm not convinced it is).

------
antoineMoPa
I should code a fractal renderer that combines the work of many clients...
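
One way such a renderer might split and recombine work, as a sketch only: the image is cut into bands of rows, each band is an independent job, and the coordinator stitches the bands back together in row order. The Mandelbrot set, the viewport, and all function names are assumptions for illustration.

```python
def mandelbrot_rows(row_start, row_end, width, height, max_iter=50):
    """Work unit for one client: iteration counts for rows [row_start, row_end)."""
    rows = []
    for py in range(row_start, row_end):
        row = []
        for px in range(width):
            # Map the pixel to the complex plane (assumed viewport).
            c = complex(-2.0 + 3.0 * px / width, -1.5 + 3.0 * py / height)
            z, n = 0j, 0
            while n < max_iter and abs(z) <= 2.0:
                z = z * z + c
                n += 1
            row.append(n)
        rows.append(row)
    return rows

def split_jobs(height, rows_per_job):
    """Cut the image into (start, end) row bands, one per job."""
    return [(s, min(s + rows_per_job, height))
            for s in range(0, height, rows_per_job)]

def combine(results, height):
    """Stitch bands back together; `results` maps (start, end) -> rows."""
    image = []
    for band in sorted(results):
        image.extend(results[band])
    if len(image) != height:
        raise ValueError("some row bands are missing")
    return image
```

Because each band depends only on its own pixel coordinates, results can arrive in any order and a band lost to a departing visitor can simply be reissued.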

------
WalterGR
What are the performance numbers for work distributed like this?

