
So how many times do we get this story? Interestingly Herb Sutter wrote about it here:

http://www.gotw.ca/publications/concurrency-ddj.htm

And then he updated that here: http://herbsutter.com/welcome-to-the-jungle/ and we commented on it here: http://news.ycombinator.com/item?id=3502223

Then the Extreme Tech guys rip off the graphics and write about it again. (personally they should have given Herb a link but whatever)

Both Herb's article(s) and this one strike me like someone looking at an approaching storm front and detailing how much water is likely to come raining down when it breaks. That is accurate information but ultimately useless.

One answer is that existing architectures will work better on new materials (carbon, for example) because they can dissipate more heat, so keep your eye on the research about doping graphene wires into silicon or creating diamond substrates.

"Web 2.0" is all about new ways of computing which exploit parallelism. And while I don't see a lot of benefit in Google inventing a new language to express it (Go), the challenge is real. Folks have been designing chips which are essentially parallel at the transistor level. Little of that research has yet percolated into the software architectures being proposed.

It's one thing to say "The gas tank is about 3/4 empty, start looking for gas stations." and another to just go on and on about all the ways the remaining gas in your car is going to be consumed down to fumes :-)



You wrote, '"Web 2.0" is all about new ways of computing which exploit parallelism.'

I've seen many descriptions of "Web 2.0" but that's a new one to me. Or to quote Inigo Montoya, "You keep using that word. I do not think it means what you think it means."

Here's Tim O'Reilly's explanation: http://oreilly.com/web2/archive/what-is-web-20.html


Web 1.0 was bigger and bigger SMP servers, Sun was that 'dot' in "Dot Com" and "Grid" was the thing.

Web 2.0 is Beowulf type and other shared nothing clusters where the only 'fabric' between processes is the network and the parallelism and the service API is an emergent property of the cluster of servers not of any one server.

Web 2.0 is where you can run a web application locally in some co-location facility that is pulling its data off the S3 cloud at Amazon across the country.

Web 2.0 is the difference between a SQL server that creaks under the load of a million queries per day and a noSQL cluster that does billions of queries a day.

A looooooong time ago I challenged Sun's executive management with the question "What are we going to do when a 'big yellow hose'[1] runs through everyone's living room?" Eric Schmidt (who was the senior director of the Systems Group at the time) felt the challenge was a bit over the top, since getting 10 megabits of network bandwidth to everyone's house wasn't really on anybody's road map (and we had just done a deal with AT&T on the thinking that maybe in 10 years 30% of households would have an ISDN line). I had just come up for air after looking at how to build a network service that, like the Andrew File System, didn't exist on any one server; it existed on all servers. That was one of those 'oh wow' moments, sort of a 'we are all under-estimating this' kinda thing.

So for me the explosion of bandwidth was the fundamental moving force behind the evolution of the web, you could assume that data could be on the far side of the country and you'd have a chance of getting it to show to the user before they died of boredom. And when that is true, what were the boundaries of the system then? What were the invariants?

Working at Google, and now Blekko, is hugely exciting because the friction between data sets is so much lower you can do awesome things. So my 'backplane' can be 4,000 sq ft of data center and I can fit a whole lot of machines into that 4,000 sq ft, and I can easily give every one of them a small piece of a problem. Or the same problem where different things are assumed to be true. And they can all return their answers and those answers can be correlated, evaluated, formatted, and outputted in the blink of an eye. That, I contend, is Web 2.0.

[1] At the time the long haul version of 10 megabit ethernet was a large diameter yellow cable with marks where you could install vampire taps.


Web 1.0 was the web as a publishing platform with a sprinkling of limited applications that do all of their heavy lifting on the backend. Web 1.0 was about cgi-bin, it was about lycos and yahoo and mapquest, it was about geocities and "under construction" pages and Fortune 500 companies with static websites consisting of a handful of pages, none of them terribly interesting.

Web 2.0 was a dramatic reshaping of what was possible with the web. It was about the web as a full-fledged application and communication platform. Web 2.0 was about AJAX, and web-mail, and wikis, and blogging and commenting, and google maps, and sites with social features, etc.

In the Web 1.0 era if you wanted to share information with the world you bought some hosting and you set up your own site where you put up a handful of hand-edited html pages. If people wanted to have a conversation with you on the web they would have to email you or make a comment on their own site. In the Web 2.0 era you turned to one of many platforms (blogger, wordpress, web forums, livejournal, etc.) and you started blogging, or making podcasts, or making web comics, or doing whatever suited your fancy. And to have a conversation people used the same medium, they commented on your blog posts or they talked to you on a forum, or they commented on your flickr photos, etc.

Fundamentally web 1.0 is about static data from a handful of authors, web 2.0 is about dynamic data from a myriad of contributors.

Most of Web 2.0 runs on ordinary PHP and MySQL servers, the SQL vs. NoSQL division doesn't play a part in it.


I think Web2.0 was when many things started going super-linear.

1 customer isn't just another customer, they are more SEO, more content, more "Likes", more network effect.

It's also about standardization (which is also kind of super-linear) - if you code to a standard interface, you no longer need to code to every interface.

I'm not such a fan of AJAX. You can make a punch-the-monkey game in AJAX, and it's just web 1.0 all over again. You could be something like Facebook with little more than static HTML. AJAX is a tool, not a revolution. It's a good tool, but that's it. The same goes for NoSQL servers.


Your points are domain-specific (e.g. search) and rest on the fact that you can get away with delegating the "SM" of SMP to a cluster in the backend at "Google, and now Blekko". That's basically an economic model and for your domain, it works. It doesn't apply to everyone. The concern is that for general purpose computing the free lunch is over.

"Web 2.0 is the difference between a SQL server that creaks under the load of a million queries per day and a noSQL cluster that does billions of queries a day."

Right. No one is gonna sue you if your search results are affected by eventual consistency. The world is not just about "queries". Some happen to care about "transactions" at scale ..

I do agree that all the recent tech-pop fretting over this is a bit of a Johnny-come-lately phenomenon.


Hmm, I don't see it quite that way. Can you say more about "general purpose computing" ? Let me give a shot at how I think about it and tell me where I jump the boat.

Let's say you have an algorithm A which runs in O(n) time. We can define a property 'e' called 'entanglement' as follows:

The entanglement of algorithm A is defined to be the requirement of how much of A[n-1] must be computed before you can compute A[n]. An entanglement of 1.0 means that all of A[n-1] must be computed, and an entanglement of 0.0 means that the output of A[n] is independent of A[n-1].

Algorithms with low entanglement are considered to be 'highly parallelizable' and algorithms with high entanglement are 'sequential'.

Amdahl's law shows that the performance improvement of an algorithm is limited by both its level of entanglement and by the cost of handling the partitioning.
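To put rough numbers on that limit, here is a minimal sketch of the classic form of Amdahl's law in Python. The mapping from 'entanglement' to a serial fraction is an assumption made purely for illustration (the comment above does not define one), and the classic formula does not model partitioning cost, so that term is left out. The function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Classic Amdahl's law: speedup on `workers` processors when
    `parallel_fraction` of the work can be run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def speedup_from_entanglement(entanglement: float, workers: int) -> float:
    """Illustrative assumption only: treat entanglement e as the serial
    fraction, so e = 1.0 is fully sequential and e = 0.0 fully parallel."""
    return amdahl_speedup(1.0 - entanglement, workers)
```

Under that assumption, an entanglement of 0.1 caps the speedup below 10x no matter how many workers are added, which is the ceiling the rest of this comment is trying to get around.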

As a systems architecture, if you can partition the problem into partial computation, you can sidestep Amdahl's law by running multiple copies of the same algorithm with the assumption that each partial computation will come out in one of many possible ways.

The simplest example I can contrive of this is binary division.

In general, dividing two numbers requires computing the partial remainder at each step of the division until you reach a partial remainder that is between 0 and the divisor. Each step 'n' depends on step 'n' - 1 for its result. Let's say you were dividing a 16 bit value by an arbitrary 8 bit value. You can create an alternate form of the problem using a 48 MB memory (16M x 24 bits): use the address pins A23-A8 to hold the numerator and A7 through A0 to hold the denominator, and have the contents of the memory be the 16 bit result and an 8 bit remainder.

The way I think about this solution is that you've created 16 million partial solutions, and the address lines tell you which of those solutions will be the one you are looking for.
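A scaled-down sketch of that lookup table in Python, in case it helps make the idea concrete. The function names are made up for illustration, the table is 8-bit by 4-bit rather than 16-bit by 8-bit so it builds instantly, and storing `None` for a zero denominator is a choice made here, not something specified above.

```python
def build_division_table(num_bits: int, den_bits: int) -> list:
    """Precompute (quotient, remainder) for every numerator/denominator pair.

    The list index plays the role of the address lines: the high bits hold
    the numerator and the low `den_bits` bits hold the denominator.
    """
    den_mask = (1 << den_bits) - 1
    table = []
    for address in range(1 << (num_bits + den_bits)):
        numerator = address >> den_bits
        denominator = address & den_mask
        if denominator == 0:
            table.append(None)  # no entry for division by zero (our choice)
        else:
            table.append(divmod(numerator, denominator))
    return table


def lookup_divide(table: list, den_bits: int, numerator: int, denominator: int):
    """Divide with a single table read instead of a step-by-step loop."""
    return table[(numerator << den_bits) | denominator]


# 8-bit numerator, 4-bit denominator: 2**12 = 4096 precomputed answers.
# The 16-bit / 8-bit case described above would need 2**24 entries.
TABLE = build_division_table(8, 4)
```

For example, `lookup_divide(TABLE, 4, 200, 7)` reads the precomputed pair for 200 / 7 without performing any division at lookup time; all of the sequential work was paid for once, up front, when the table was built.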

I expect general purpose computing in a world where there are thousands or millions or billions of cpus available will evolve algorithms like the look up table. However, instead of addressing read only memory, the relevant initial conditions will be passed to many partial computation engines, and those engines will either respond with a value or not, because they compute that their speculative computation would not happen with those initial conditions. And the responding engines may feed a subsequent layer of engines, which will respond or not.

You need look no further than the Map/Reduce work, or existing biological systems like immune response to get a grasp on how such a system exploits a sea of resources to surface a solution set of viable outcomes more efficiently. Some of the early work in constraint logic languages points this way as do some hardware description languages.

My belief is that having a highly connected sea of general purpose compute engines, and some additional tools to factor algorithms along their entanglement borders into partial computation fragments will radically change the way things get done.


Thank you for your thoughtful reply. Note that you were not accused of jumping the boat; merely that your view is partially biased by subjective economic considerations.

Re. Amdahl, various concerns raised by physical distribution present the proverbial hair in the ointment:

- CAP applies. If the requirement is for a highly-available and highly-consistent system, then scaling up is preferred as we can do away with the 'P' concerns altogether.

- Higher latencies are a given. Realtime distributed e.g. map/reduce algorithms are beyond the reach of most.

- Certain e.g. graph constructs are difficult to partition. A single node compute engine that can scale up may turn out to be the more economically attractive option.

> My belief is that having a highly connected sea of general purpose compute engines, and some additional tools to factor algorithms along their entanglement borders into partial computation fragments will radically change the way things get done.

Right, so we're in agreement here (and I personally love to geek out thinking about that stuff) but with the caveat that such tools do not exist and it is not clear at all that the interface that will be provided to the end-user (read: your average programmer) will remain accessible.

1 - Not everyone can "[think] like a vertex" so Pregel is very nice indeed, but who will be coding for it? Where are these programmers being cranked out and how much will it cost me to hire and retain them?

2 - Not everyone can (afford to) manage a realtime m/r infrastructure (like Facebook). [Oddly enough, there was recently a cry to revolution here by PG regarding the chokehold of "Hollywood" which imho effectively boils down to distribution -- any one can make content these days.]

3 - And a subset of above will not want to trust a 3rd party provider to manage the required infrastructure.

4 - (Subjective) I remain sceptical of the computation efficiencies of the current models which rely on a high degree of redundancy and lots and lots of boxes. You must know quite a bit about this (per your HN bio), so I would like to hear your thoughts on that. Assume that tomorrow, the cost of running the HVAC and powering up the boxes goes through the roof. What approach to computation will provide the most efficient compute-engine? Let's flip this to physical delivery systems (to highlight the given /fixed cost/ of running a unit-engine) and consider what is more efficient: having a huge number of small cars deliver goods, or using trucks? (You may counter that "well, it is cheaper to train and hire n truck drivers than N mini drivers, but driverless cars will turn that equation upside down". If so, then see 2,3 above.)

In sum:

Not every company is Google/Facebook/Amazon -- with the attendant wealth of capital and human resources of these giants.

Further, not every computation need can be trusted to corporations. Some of us haven't given up on data privacy and insist on it. There are critical "social" applications that need to be written and most certainly Google, Facebook (and maybe even) Amazon are not trustworthy enough to host them.

Solving this at the hardware/OS level, imho, will be the more democratic way forward and will take it out of the user/devop land (where it is being addressed at the moment). After all, wasn't that what the PC revolution was all about? We did, after all, already have mainframes and dumb terminals way back when. And this back to the future of cloud and running m/r batch jobs is not democratic at all.

So I am personally rooting for the (future commodity) H/W solution to this problem. Naturally, my 'biases' are clear per above.


I think we're basically in agreement, the parts where we differ are:

"- Certain e.g. graph constructs are difficult to partition. A single node compute engine that can scale up may turn out to be the more economically attractive option."

The unstated bias is what I might call 'conservation of compute' which is to say most of the work that has gone into partitioning such problems does so assuming that you want to find the answer in the fewest number of compute cycles because that will lead to the quickest result.

One of the things I got to witness at Google was the notion that you could relax the constraint on minimum number of cycles if those cycles could be run in parallel. This is done on a small scale in current micro-architectures, where the processor speculatively continues computing past a branch on the 'bet' that the branch won't be taken, only to discard the results of that computation once it is known how the branch will go. This has been shown to increase performance even if your branch prediction is only 50/50, because you avoid the pipeline stall when you're right.

My assertion is that the number of 'cores' and hence the number of compute engines you can apply to the problem, if you can decompose it into many speculative copies, will let you essentially compute many possible solutions in parallel and select the one you want through a 'fingerprint' of which branches would be or would not be taken. (the path through the graph as it were).
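A toy sketch of that 'compute every path, select by fingerprint' idea in Python. Everything here is hypothetical scaffolding for illustration: the `speculate` helper, the two example predicates, and the `candidate` function are not from any real system. Threads are used only to show the structure; CPU-bound work in CPython would need processes or separate machines to actually run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def speculate(data, predicates, candidate_for):
    """Evaluate every possible branch outcome in parallel, keep the one that applies.

    predicates    : functions data -> bool (the branch conditions)
    candidate_for : function (data, fingerprint) -> result, computed under the
                    assumption that the branches resolve to `fingerprint`
    """
    fingerprints = list(product((False, True), repeat=len(predicates)))
    with ThreadPoolExecutor() as pool:
        # Launch all speculative candidates without waiting on the conditions.
        candidates = {fp: pool.submit(candidate_for, data, fp) for fp in fingerprints}
        # The branch conditions are evaluated concurrently with the candidates.
        outcomes = [pool.submit(pred, data) for pred in predicates]
        actual = tuple(future.result() for future in outcomes)
        # Every other candidate is wasted work, like a mispredicted branch.
        return actual, candidates[actual].result()


def is_even_sum(values):
    return sum(values) % 2 == 0


def is_long(values):
    return len(values) > 3


def candidate(values, fingerprint):
    even, long_list = fingerprint
    total = sum(values) if even else max(values)
    return total * 2 if long_list else total
```

Calling `speculate([1, 2, 3, 4], [is_even_sum, is_long], candidate)` runs all four candidate paths plus both conditions and returns the fingerprint together with the one result that matches it. The total work grows as 2^k in the number of branches, which is exactly the 'conservation of compute' constraint being relaxed: more cycles spent overall, fewer on the critical path.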

And this one:

"Not every company is Google/Facebook/Amazon -- with the attendant wealth of capital and human resources of these giants."

This is true, but both Google and Amazon have been making their infrastructure available on a pay per use basis. I expect this to continue. Then you will get domain specific portals into that infrastructure in which a middle layer of semantics sits between you and the infrastructure. Sort of like WolframAlpha putting Mathematica between you and WA's server farm, or S3 putting a storage layer between you and Amazon's hardware.

I too root for some of these problems to be solved at the OS/hardware level, but from the perspective of a company's willingness to put their data on another company's gear, at some point if you're being toasted by a competitor who has made that choice then the choice gets effectively made for you.

I'd love to write code on the bare metal of an upscale ARM chip with a graphics processor but as anyone who has tried to do this on an OMAP or Broadcom ARM chip can tell you, that isn't going to happen until you design your own ARM chip with GPU and build it yourself. I see this as an example of the choice I would like to make being denied me by forces outside of my control.


I took it to mean one of the side effects of Web 2.0 is increased familiarity with massively parallel computation.


The Extreme Tech guys do actually credit and link to the GOTW article in the caption of the first graphic. So they got that right.

The graphic itself grates on me. Firstly, it is wildly out of date for a current article (no data for 2010 onwards, and the last data point on the perf/clock series is from ~2007). Secondly, using ILP as a measure of perf/clock seems quite off (the latencies of various instructions do change quite a bit between generations).


There's some kind of rule that commercial sites are allowed to copy blogs because they're providing exposure or something.

I agree that this article isn't strictly useful for its readers since there's nothing they can do about it, but it is educational.



