Today I saw the future (brendaneich.com)
224 points by davidascher on May 4, 2013 | 48 comments



Clarification from the comments: "Perhaps I should wait until there’s more info on this announcement, but I just wanted to clarify: the key couldn’t-be-done-before technology here is the HD codec running in Javascript. The remote desktop technology is nice and cool and everything, but a completely separate thing that just happens to be possible because of the codec."

Frankly, 25% better compression than h.264 for the same content and signal-to-noise ratio would be such a newsworthy accomplishment that I am very skeptical... doubly so because that's not described as the innovation; instead, the fact that it runs in JavaScript and WebGL is supposedly the strong point.


That's a question you quote, not a clarification; they forgot a '?', which makes it ambiguous. Like that questioner, I think we're all keen to have the actual information distilled out from the marketing BS so we can figure out what the hell they are even claiming to have built.


h.265 claims double the compression. So 25% is pretty modest in comparison.


Something is fishy in the press release. TechCrunch is carrying a story where Jason Kincaid says:

"At one point, Jason says "one GPU core per instance." But actually we're told that the technology scales so efficiently and cost effectively that you can allow anywhere from a minimum of 10 users per GPU, up to 100 depending on the application"

10 users per GPU? 100? This sounds like wishful thinking and vast hyperbole to me. GPUs don't really like that much context switching, but more than that, a modern game tends to saturate system performance: not just RAM, but VRAM, GPU cycles, and CPU. The idea that you're going to virtualize a couple of instances of Crysis or Call of Duty onto a single GPU seems preposterous.

http://techcrunch.com/2009/06/16/videos-otoy-in-action-you-h...

OnLive had similar problems overhyping the scalability and economics of their service.


http://www.nvidia.com/object/grid-vgx-software.html

So Nvidia can do 25 users per GPU at the moment with their "hypervisor aware" VDI cards. That's first gen.

The real trick will be doing texture swaps in conjunction with compression scaling to account for changes in network latency. That shit will be fucking magic.

But even if you get this perfectly scaled, overall latency/contention will make remote gaming over a WAN suck for a mass roll-out with today's broadband, and thus limit the addressable market.

Edit for tenses; I haven't slept in 36 hours.


"...texture swaps in conjunction with compression scaling to account for changes in network latency."

What does this mean?


I thought the issue OnLive ran into wasn't excessive hype but overestimating the market while poorly promoting the service. I recall Perlman stating that the nails in the coffin came from investing way too much in infrastructure that went completely unused.


http://en.wikipedia.org/wiki/Onlive

"A web browser based demo service is also available for Windows PCs and Intel-based Macs running Mac OS X 10.5.8 or later enabling trials of games to be played without the need to download the OnLive Client."

So I guess this is like that, only delivered "natively" without the need for a Flash or Silverlight plugin? Is that the big game changer? I'm a bit confused.

The most interesting part of the OP was the comment about how CPU work can remain local while the GPU work is offloaded to a remote GPU cloud. That is a very intriguing idea and smacks of Plan 9 a little... However, I've been waiting for years for external Light Peak GPUs to appear so little laptops could become gaming laptops when docked to one. But alas.

And yet here we have... latency, on top of all the other problems. Somebody enlighten me.


I believe OnLive uses Flash or some other plugin to stream content.

As far as I can tell, the breakthrough that OTOY is claiming is this:

- They've developed a new video compression format and implementation that provides better compression ratios than h.264 [which implementation?].

- Their new format can be encoded very efficiently using a GPU-based encoder.

- Their encoder is very low-latency.

- They have a decoder written in JavaScript that can run in the browser.

- Their decoder is very low-latency.

All the cloud/streaming/remote desktop stuff is somewhat ancillary to this.
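To make the last two points concrete: a pure-JS decoder only pays off if getting the decoded pixels on screen is cheap, and WebGL makes that part straightforward. A minimal sketch of the display path (hypothetical, not OTOY's actual code; it assumes the decoder hands you raw RGBA frames, and drawFullscreenQuad is an assumed helper):

    // Hypothetical display path: take an RGBA frame from a JS decoder and push it
    // to the GPU as a WebGL texture, then draw it as a fullscreen quad.
    var canvas = document.getElementById('screen');
    var gl = canvas.getContext('webgl');
    var tex = gl.createTexture();
    gl.bindTexture(gl.TEXTURE_2D, tex);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);

    function onFrame(rgba, width, height) {   // rgba: Uint8Array produced by the decoder
        gl.bindTexture(gl.TEXTURE_2D, tex);
        gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, width, height, 0,
                      gl.RGBA, gl.UNSIGNED_BYTE, rgba);
        drawFullscreenQuad(gl);               // assumed helper: a trivial shader plus two triangles
    }

The hard part, of course, is whatever produces those frames fast enough, i.e. the codec itself.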


Or possibly vice-versa: they appear to have been touting a desktop client app with remote rendering tech including a "codec" called orbx for a couple of years. So this could just be compiling the decoder via emscripten like Broadway and Route9 did for H.264 and VP8.

However, one of the Mozilla guys is quoted as saying it was built from the ground up for browsers (and the vague descriptions of the old codec don't sound like a normal codec), so maybe they're just repurposing the name.
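If the emscripten theory is right, the JS glue would look roughly like this (a guess at the shape of such a port, not their API; only cwrap/_malloc/_free/HEAPU8 are real emscripten plumbing, while 'orbx_decode_frame', frameWidth and frameHeight are invented for the sketch):

    // Guessing at the glue around an emscripten-compiled decoder.
    var decodeFrame = Module.cwrap('orbx_decode_frame', 'number',  // invented function name
                                   ['number', 'number']);

    function decode(packet) {                    // packet: Uint8Array from the network
        var inPtr = Module._malloc(packet.length);
        Module.HEAPU8.set(packet, inPtr);
        var outPtr = decodeFrame(inPtr, packet.length);   // assume it returns a pointer to RGBA pixels
        var rgba = Module.HEAPU8.subarray(outPtr, outPtr + frameWidth * frameHeight * 4);
        Module._free(inPtr);
        return rgba;                             // e.g. hand this to a WebGL texture upload
    }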


I saw something similar 10 years ago. Java was just catching up. Wow, a 3D shooter in pure Java. Wow, pure Java can play video. Wow, a 3D desktop in pure Java...

Get over it. JS is slow, takes ages to load and drains the battery in minutes. Just try to open a large PDF in Firefox with its pure-JS viewer.


Yeah, it's interesting: we kind of dumped applets because they were a massive pain, and now JS is trying to turn into the bytecode of the browser. Let's hope it works out better this time.


IMHO, the "massive pain" was that applets were a literal black box. There wasn't a proper DOM API back then, so Sun were forced to do it like that to get something that worked in every browser.


Java and Flash are being dropped because they are proprietary and have a single maintainer, in both cases companies which shirk their support duties. JavaScript is open and has multiple implementors who are all - even Microsoft - aggressively improving performance, features, and security.


How is Java proprietary anymore? OpenJDK is the official Java now! Java has a specification, and popular technology like Java EE has multiple open-source implementations, unlike JavaScript, where you have only two.


The TCK test suite isn't released for Apache; the only official open source implementation is OpenJDK. Any spec for the language is run through a consortium of companies, but it's rigged heavily in Oracle's favor. The whole drama and fallout around Apache Harmony sort of shows this. Plus, Sun Microsystems promised Apache the TCK but never delivered. So Sun and Oracle rule Java with an iron fist.


How many people actually run a different JVM, though? Market share is really unbalanced, which is unhealthy, but that's due to the stranglehold on official compatibility certification that is the hallmark of a one-vendor world.


Whenever I see mention of displaying apps locally that are running in the cloud, I also see pretty sloppy language that could mislead people who don't read carefully. PC apps running in the browser? MS Office on the iPad? Not really. It makes me wonder if the only way to sell this concept is with misdirection.


Isn't this just the NetPC or Network PC from the 90s repackaged? Wasn't it Java-based technologies that were supposed to usher in "thin client" platforms that everyone would give up their desktop boxes for? The technology is better for something like this to occur these days, and maybe it's time. However, it took us quite a while to go from something like the Newton to where we are today with the iPhone 5. There was quite a bit of iterating and consumer experimentation going on between then and now. The only consumer-oriented success stories for cloud or network computing are remote desktop and backup services, which are more business oriented and certainly aren't threatening standard PCs.

Edit: I guess the main gist of this product is that it's remote desktop in your browser, without any plugins, and it can provide you with a full desktop experience, while boasting a light bandwidth footprint even with HD video applications. It also seems to allow you to tap in to cloud based computing infrastructure at the same time. While neat, it seems niche, and unlikely to rule the desktop. It also doesn't seem efficient to have an office full of computers streaming a remote desktop of MS Office, instead of just running it natively.


Amen! We have an embarrassment of riches when it comes to computing power, even in our smartphones. I think the future we should work toward goes like this: local computing, remote storage with smart local caching, and ever more sophisticated protocols between the local client and the remote server (i.e. not HTTP/REST as we know it today).


It's interesting where Mozilla draws the line on openness: proprietary code is fine as long as it's not part of the browser. Likewise proprietary services. Not our problem.


Not sure what you mean. Of course browsers run websites containing proprietary code all the time. They wouldn't have much to do if all they did was run just open source websites (no proprietary websites would mean no Google, no YouTube, no github, etc.).


I wouldn't advocate Firefox trying to block proprietary sites. But it's just interesting that Mozilla fought H.264 for years and now he's endorsing something that's even worse (it's not even documented or RAND licensed). Is Mozilla interested in freedom for users or freedom for Mozilla themselves?


I'm not sure whether he's "endorsing" it or not. All I see is him saying it's cool technology, which it definitely is. It opens up a lot of possibilities, and proves that a whole different set of approaches to a large set of problems is possible. That's exciting.

Will it end up being good or bad for users overall? It's hard to say with any new technology, but this does look promising in some respects.

But how is this "worse than H.264"? We don't know anything about it yet. (When it launches on the web, perhaps it will be documented? Licensed? Who knows.) Unless you have additional information not in this article?


It seems to me that we hackers/nerds ought to work at not being drawn to new technologies because they're "cool", and instead think about their broader social implications, especially those implications beyond immediate convenience.

In this case, I think wmf is right to point out that what the OP is talking about is a proprietary, as-yet undocumented video codec. Surely the inevitable outcome of such things is to concentrate power, rather than decentralize it. Is that the kind of thing that anyone associated with Mozilla wishes to encourage?


> I think wmf is right to point out that what the OP is talking about is a proprietary, as-yet undocumented video codec

Is it? They haven't yet officially announced any details of that nature that I can see. Unless I missed that part?

> Surely the inevitable outcome of such things is to concentrate power, rather than decentralize it.

If it is in fact proprietary, then that is not great. But the fact that it shows a downloadable codec can be comparable to a built-in one is more important than this specific codec. If they can do it, others can too.

And if their proving it is possible opens up a new industry of downloadable codecs that run in all web browsers and on all OSes, then that sounds like a good thing.

Compare it to the H.264 world, where no open source browser can ship the codec. And sadly, soon with EME we will have DRM in HTML that, again, cannot be shipped by any open source browser. Whereas if the codec is downloadable - just another website - then both of these problems are averted.

I agree with you that we should and must consider the broader social implications. It seems to me that this product has positive potential there. But again, it is far too soon - we don't have enough technical details nor enough legal details.


They fought h.264 because it would prevent open source browsers from competing, since the browser maker would have to pay for a license (or it might have evolved in that direction). If the codec is distributed as JavaScript (or some other patent-free open format), then any open source browser that implements a fast JavaScript engine can run it. The answer to "who pays?" gets moved from the browser maker to the video distributor (where in all fairness it belongs).


Whether something like this ever becomes widespread depends on WebGL becoming popular, which in turn depends on WebGL becoming sufficiently secure.

In an interview on the Debug podcast, Don Melton describes hardening WebGL, which involves hardening the whole stack down to the hardware level, as a significant challenge.

http://donmelton.com/2013/03/25/im-on-the-debug-podcast-this...

I wouldn't hold my breath.


Can you elaborate? Current Chrome and Firefox users have WebGL right now - is this not the metric you mean by "popular"?


IE is never going to support WebGL. They want to be the gatekeeper of video games.

They lost it with applications and the cloud, but they're holding on to the game segment pretty tightly. That's a theory a few programmers have about Microsoft and why they chose to create DirectX.


There is apparently evidence to the contrary in a leaked copy of the next version of Windows: http://withinwindows.com/within-windows/2013/3/30/blues-clue...


http://caniuse.com/webgl : 54%, of which 21% is "partial". So no, Chrome+Firefox isn't enough for "popular".


Interesting coincidence. One of my own 'seeing the future' moments also came from OTOY. We were sitting in the Techcrunch house late on Friday afternoon and there wasn't a lot going on. We got an email in our tips mailbox from Jules, the founder and CEO of OTOY explaining that he was in San Francisco up from LA for the weekend presenting at a conference, where he had a booth setup.

He sent us some info on what OTOY was working on, something about graphics processors in a server setup that could stream games to any client. We didn't really understand what he was talking about, since 'cloud' and all that wasn't well defined at the time (this was in '08, IIRC), but there was enough there to pique our interest. We emailed back asking if we could see him that afternoon, and he said that shouldn't be a problem - he was setting the booth up that evening before the conference proper started on the weekend, and it might be the best time since it was quiet there.

We drove up and found his tiny booth - the smallest of the lot, in a dark corner, next to the much larger rooms being set up by ATI, Nvidia, etc. He had one screen, one PC, a couple of phones and a Pocket PC (I believe it was an iPAQ). He plugged a PlayStation controller into the PC, opened up the OTOY gaming application and scrolled around a half-dozen or so games on the screen while he explained what was happening.

Nvidia had built a server-side GPU cluster which they were trialling. It was a number of servers, each with a dozen or two GPUs built into them, which would directly stream their output over the internet (compressed) to the client. He fired up Grand Theft Auto 3 and we watched while he walked around the city, got into cars, drove around and then hit top speed over hills, while the framerate didn't miss a beat.

The debug stats in the top corner showed that he was using 60-80KB/sec of bandwidth while playing. He explained that each frame was being sent down over the internet from a co-lo center. The ping time was 15ms, meaning there wasn't much lag. The servers were processing the input, playing the game, and then sending the output back. The codec in the prototype was nothing more than a series of JPEG images sent at 15-60+ frames per second. They said then that they would have something much better than that at some point, and I guess this Javascript + WebGL is what he was referring to.

We still weren't super impressed, since he had a controller and a big screen, so he could just as well have been running a PlayStation or other console underneath the table.

But then he picked up one of the phones, opened a Java applet and connected to the same game. This time I held the phone while he controlled the player, and we had this tiny phone - which could not have had more processing power than an early 386 - playing one of the newest computer games on the market at the time, in full color, at full screen and at full speed. I was totally and completely blown away. He switched in and out of other games within seconds.

The next demo was opening up a web browser, pointing it to the web address, then watching the same game render at ~30 fps as the browser pulled down JPEG images and refreshed them (again, the very simple "codec"). Utterly amazing.
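For the curious, that kind of "codec" really is about as simple as it sounds; a rough sketch of the idea (not their actual code, and the URL is made up) is just fetching JPEGs in a loop and drawing them onto a canvas:

    // Rough sketch of the JPEG-per-frame approach (not OTOY's code; frame URL is invented).
    var canvas = document.getElementById('screen');
    var ctx = canvas.getContext('2d');

    function pullFrame() {
        var img = new Image();
        img.onload = function () {
            ctx.drawImage(img, 0, 0, canvas.width, canvas.height);
            pullFrame();                            // immediately request the next frame
        };
        img.onerror = pullFrame;                    // skip a bad frame and keep going
        img.src = '/frame.jpg?t=' + Date.now();     // cache-bust so each request fetches a fresh frame
    }
    pullFrame();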

We ended up spending hours with him while he explained their plans and how the future of gaming would all be server-side. Since then, I have seen the OTOY name come up again and again in relation to breakthroughs in cloud-based gaming. These guys are really smart, and really ambitious.

Techcrunch ended up writing up an intro to what we saw a couple of months later (the conference was private, and our meeting was off the record, if I remember correctly).

The reason we didn't publish everything right away was that I was very skeptical about some of the claims. He was a very good salesperson, and it is always difficult to distinguish between those who are smart and sales-savvy but have substance, and those who are just full of shit. Jules and OTOY were definitely the former.

In terms of my questions, I was most concerned about the lag issue and how this would work for users around the world who couldn't be near servers. I went back and forth on tech issues for a while over email until I was thoroughly convinced that not only was this possible, but that it was almost certainly the future of gaming (why have all these GPUs that can't be upgraded in desktops and consoles when you can place them all in the cloud - better utilization, better upgrade path, better pricing - i.e. all the cloud benefits).

Here is the later story Techcrunch wrote:

http://techcrunch.com/2008/07/09/otoy-developing-server-side...

Here are all the OTOY stories at Techcrunch from the past 6 years; you can see the things they have been working on:

http://techcrunch.com/2008/07/09/otoy-developing-server-side...


Both OnLive and Gaikai tried this, but I don't think the lag or the economics really works out.

Consider the following: let's say the server renders at 60fps, i.e. a frame is rendered in 16ms. If the compression takes another 16ms (charitable!), the ping time is another 16ms, plus 16ms to decode on the client (plus 16ms to display!), then from the time you press the "fire" button to shoot until you see the output, 5-6 frames have gone by, or about 80-96ms, and that's assuming a pretty optimal best case.
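Spelling that budget out (all figures are the 16ms-per-stage assumptions above, not measurements):

    // Best-case latency budget from the assumptions above (all in ms).
    var stages = { render: 16, encode: 16, network: 16, decode: 16, display: 16 };
    var total = Object.keys(stages).reduce(function (sum, k) { return sum + stages[k]; }, 0);
    console.log(total + 'ms total, ~' + Math.round(total / 16.7) + ' frames at 60fps');
    // -> "80ms total, ~5 frames at 60fps"; add input sampling and vsync jitter and you
    //    land in the 80-96ms range before any real-world network variance.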

OnLive was measured in the range of 150ms to 250ms input lag. This is just not acceptable for twitch gaming IMHO.


Now that's some data Carmack's not going to like ;-)

> The debug stats in the top corner showed that he was using 60-80KB/sec of bandwidth while playing. He explained that each frame was being sent down over the internet from a co-lo center.

I'm wondering what kind of miracle makes a video game display streamable at that speed with any form of acceptable quality (not even talking about 1080p@60Hz), especially given the following:

> The codec in the prototype was nothing more than a series of JPEG images sent at 15-60+ frames per second.

A single acceptably lossy typical JPEG game shot at 1080p is north of 600~800kB, while even one at 720p (typical of an '08 laptop) comes out to 150~300kB. Below that you start to get noticeable artifacts.

Even if the game engine/platform knows what has been updated and only sends deltas of a sort (again taking the series-of-JPEG-images case), you still have to account for the worst case, which is basically the whole screen changing at once, sixty times per second.
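Running those numbers for the worst case (a rough back-of-the-envelope using the JPEG sizes above):

    // Worst case: the whole screen changes every frame, using the JPEG sizes above.
    var bytesPerFrame1080 = 700 * 1024;   // middle of the 600~800kB range
    var bytesPerFrame720  = 225 * 1024;   // middle of the 150~300kB range
    var fps = 60;
    console.log(Math.round(bytesPerFrame1080 * fps / 1e6) + ' MB/s at 1080p');  // ~43 MB/s (~344 Mbps)
    console.log(Math.round(bytesPerFrame720  * fps / 1e6) + ' MB/s at 720p');   // ~14 MB/s (~111 Mbps)
    // Either way, hundreds of times more than 60-80KB/s.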

And indeed, in the Steam video, you can see throughput data at the top: when nothing's moving the value is, as expected, very low, but you immediately notice the peak at 9Mbps when the L4D video ends[0], with typical values during the video at 4~6Mbps, which is much more in line with a 1080p stream. Even the fading Source logo (which produces very small diff areas) produces a 2.5Mbps stream, at 85% quality. There seems to be some cap at 48Mbps.

You can also see various timings, and the adaptive compression kicking in on the 9Mbps section, where you can read:

    capture: 2.196ms
    encode: 4.811ms
    decode: 6ms
    quality: 70%
There seems to be another timing value at the beginning, with an "encryption" label, hovering at 9ms.

BTW, what's with the stream = ${p}% value monotonically increasing with time? That makes it look like a streamed video, not an interactive one. Not to be suspicious, but it felt weird.

[0]: https://www.youtube.com/watch?v=FRtBuP2-_pA&feature=play...


Uhm - I'd be inclined to hope that a cluster of GPUs (or just the high-end ones being used for something like this) would be capable of rendering at rather higher than 60fps - whilst screen refresh rates may not match that, I would expect each frame to be rendered in rather less than 16ms.

Similarly, by using a decently parallelisable image compression codec (JPEG2000?) it should be possible to compress with the advantage of the many GPUs, too.

The 16ms transfer time is certainly kept generously low. The 16ms decode is something I won't comment on due to the variability of implementation speeds (though it would be done in native code, of course), but a further 16ms to display... 32ms to decode and display a JPEG? I'm pretty sure I've seen MJPEG streams at higher than 30fps before!

Back of a napkin they might be, but I think the figures in this post might require further thought.

Edit: again, to be clear, I'm not suggesting that they are using all of these advantages right now, but the idea that this can't reasonably be done for twitch gaming, even today, strikes me as bizarre, when they are trying to set up a system with whatever custom technology is required to make it work.


A GPU (or a cluster of GPUs) might be able to process, say, 10,000 frames in one second. This does not mean that the same GPUs can process one frame in (1/10,000) of a second.

Even with an infinite number of parallel GPUs, there will be an amount of latency required in copying memory to the GPU, running a job, and copying it back. After the frame is compressed, sent over the network, and picked up by the client, further delay (possibly tens of milliseconds) is added on before pixels appear on the screen.
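In other words, parallelism buys throughput, not latency; each frame still pays the full pipeline (illustrative numbers only, not measurements):

    // Pipelining: throughput is set by the slowest stage, latency by the sum of all of them.
    // The stage times below are purely illustrative.
    var stages = [2 /* copy in */, 8 /* render */, 5 /* encode */, 16 /* network */, 6 /* decode */];
    var latency = stages.reduce(function (a, b) { return a + b; }, 0);
    var fps = 1000 / Math.max.apply(null, stages);
    console.log(latency + 'ms end-to-end, up to ' + Math.round(fps) + ' fps');
    // -> "37ms end-to-end, up to 63 fps": plenty of frames per second, but every one of
    //    them still arrives ~37ms after the input that produced it.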

See the discussion around John Carmack's superuser post: http://superuser.com/questions/419070/transatlantic-ping-fas...


These numbers don't add up. If the prototype codec essentially was just a series of JPEGs, I don't see where the bandwidth number of 60-80KB/s is coming from, because it doesn't make sense. Each frame's JPEG would then have a size of 1-1.5 kilobytes, assuming a framerate of 60.
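The arithmetic, spelled out:

    // What 60-80KB/s implies per frame at 60fps (the figures quoted above).
    var low = 60 * 1024 / 60, high = 80 * 1024 / 60;
    console.log(Math.round(low) + '-' + Math.round(high) + ' bytes per frame');
    // -> "1024-1365 bytes per frame", i.e. roughly 1-1.5KB: far too small for a usable full-frame JPEG.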

OnLive and Gaikai worked quite well for what they were, but they had much higher bandwidth requirements and still didn't feature really high resolutions.


I am probably misremembering; the comment above prompted me to try to work out what the rate could have been, and it is likely that it was much higher. I tried to find the details but had no luck. They have since moved on from that prototype codec.

We did try out the web application from the office later on. It worked ok, but you could tell that bandwidth and lag were going to be issues.

Note to anybody else reading this as well: don't take my numbers as gospel; I probably shouldn't have included them since I was not 100% certain of them. What I do know for certain is that the method was just refreshing JPEGs, and you could tell the quality was low (it looked great, but at times the picture was the equivalent of a full-color Photoshop photo exported at a quality level of 2 or 3).

The compression definitely wasn't working across frames, just each frame.


They were using 60-80KB/s? I'm skeptical, since just sending reasonable-sounding audio takes 128kB/s or so, and video much more.


I don't know, one of the main use cases seems to be gaming, and I really really really can't see how this works out for action-based games, because of latency. No way. I can see the use case for Office, etc., but that's what VNC/RDP/Citrix and all the others are about.


Thin client computing in JS - very nice, but why do you need a browser at all? Why is this not part of the OS that you need in order to run the browser in the first place?

You can get streaming games and video today with RemoteFX on Windows Server, even streaming to ARM Windows RT devices over Remote Desktop. Gaikai has an implementation ready to go for the PS4. Other than the clever engineering required to get a decent video codec going in JS, I fail to see the benefit here beyond not needing to install anything.


If it supports both Hollywood and DRM proponents, a big thumbs down from me.


A future without flash?

Hell yeah.


Look, I love the web and what it represents, but good god, the overhead of GPUs in the cloud and decoding in JS and WebGL? Moore's law is great, but just how great?


The amount of buzzwords in that blog article is too damn high!


Not really.


Oops, I thought he saw a gay wedding.



