HTML5 Accelerator Card (twitter.com/_ninji)
190 points by LeoPanthera 5 days ago | 102 comments

The ARM instruction set was expanded to better support JavaScript[0], so why not?

[0]: https://stackoverflow.com/questions/50966676/why-do-arm-chip...

That instruction is super overrated. It has next to no architectural cost - all it does is specify a constant set of rounding and overflow flags instead of using the current FPU state. The only real win is a code size reduction.

JavaScriptCore didn’t even use it when it was first made available because that operation is relatively uncommon and the cost is dominated by the FPU logic that precedes rounding, so it essentially does not matter for real-world JS. It may be possible to write a microbenchmark that it helps, though I doubt it.
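For context, the conversion discussed above is JavaScript's double-to-int32 conversion (ToInt32 in the ECMAScript specification): truncate toward zero, then wrap modulo 2^32 rather than saturating or trapping. A minimal Python sketch of those semantics follows. The function name `to_int32` is made up for illustration, and this models the language rule only, not how any CPU or JavaScriptCore implements it.

```python
import math


def to_int32(x: float) -> int:
    """Sketch of ECMAScript ToInt32 semantics for a double."""
    # NaN and +/-Infinity convert to 0.
    if math.isnan(x) or math.isinf(x):
        return 0
    # Round toward zero, regardless of any ambient rounding mode.
    n = math.trunc(x)
    # Wrap modulo 2**32 instead of saturating on overflow.
    n &= 0xFFFFFFFF
    # Reinterpret the low 32 bits as a signed two's-complement value.
    return n - 0x100000000 if n >= 0x80000000 else n


print(to_int32(3.7))           # 3
print(to_int32(-3.7))          # -3
print(to_int32(2147483648.0))  # -2147483648 (wraps, does not saturate)
```

In JavaScript terms this is what `x | 0` computes, which is why a fixed rounding and overflow behaviour is all the instruction has to encode.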

I remember when Twitter was having a field day thinking that this instruction was the reason why the newest iPhones benchmarked well. There is no "magic instruction" that gets you a 40% increase in benchmarks. None. It would literally need to be multiple times faster and executed like half the time to get that increase, which is clearly not the case for a JavaScript floating point conversion. It took someone from the JavaScriptCore team replying with "we don't even emit that yet" for people to see sense.
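The reasoning above is essentially Amdahl's law. A back-of-the-envelope sketch, with hypothetical function names and illustrative fractions (nobody in the thread measured these numbers):

```python
import math


def overall_speedup(fraction: float, local_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when `fraction` of the run time
    becomes `local_speedup` times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


def required_local_speedup(fraction: float, target: float) -> float:
    """How much faster the affected part must get to hit `target` overall.
    Returns infinity when the target is unreachable for that fraction."""
    remaining = 1.0 / target - (1.0 - fraction)
    if remaining <= 0.0:
        return math.inf
    return fraction / remaining


# If the conversion were half of all run time (an assumption, and a generous one),
# it would need to get about 2.33x faster to yield a 40% overall gain.
print(required_local_speedup(0.5, 1.4))

# If it were 5% of run time (also an assumption), even an infinitely fast
# instruction caps the overall gain at about 5.3%.
print(overall_speedup(0.05, 1e9))
```

The second case is the important one: the less often the operation runs, the less any single instruction can move a benchmark score.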

Awesome. This must be the basis of that “HTML 5 super computer” from Nikola [1]. I’ve been waiting to catch a glimpse of this.

[1]: https://www.truckinginfo.com/330475/whats-behind-the-grille-...

> "The entire infotainment system is a HTML 5 super computer," Milton said.

> "That's the standard language for computer programmers around the world, so using it lets us build our own chips. And HTML 5 is very secure. Every component is linked on the data network, all speaking the same language. It's not a bunch of separate systems that somehow still manage to communicate."

That's what almost all news is like. It only sounds stupid because we have a deep understanding of the terms, e.g. what HTML is.

Evidence please?

It's a known phenomenon. Whenever you read something written by a reporter (i.e. someone who is not an expert) in an area that you yourself are an expert in, it will be weird or just plain wrong. Then the next day you read an interesting article (written by a reporter, i.e. a non-expert) on a subject you yourself have little experience with, and you happily accept it as the truth.

I think I'm supposed to hate this, but I think it's great. How is this different (apart from extremity) from adding hardware support for AES?

There are a lot of differences. AES will never change and will be recommended for decades. AES is very compute-intensive so a hardware implementation is much more efficient than software.

HTML/CSS/JS is ever-changing, control-intensive, and memory-intensive so it's probably not a good candidate for hardware acceleration.

> HTML/CSS/JS is ever-changing,

DOM Level 3 Core became a recommendation in April 2004. DOM4 went to last call in 2015. The fundamentals, I feel, are quite fixed, although many auxiliary systems do change.


> control-intensive, and memory-intensive

Latency to remote accelerators can be problematic for some control workloads. Ideally the control plane can offload itself onto the accelerator too.

I don't see memory-intensive as a barrier. The 8 vdom+ processors probably come with sizable multi-megabyte caches. Perhaps they could be early on-RAM processor architectures? After all, it seems they have a fixed-function diff pipeline. I'd also suggest that the hardware representation might be very effective at using low bit-depth encodings, saving gobs of memory. Keep text offboard & encode attribute values via some columnar representation & this could be a high-throughput HTMLElement slinger & differ!

So? We have ever-growing standards that we do hardware acceleration on. Some of the newer standards don't get hardware accel until new hardware comes out.

E.g. AV1, HEVC, H.264, etc. etc. All of these either have or are about to have hardware acceleration. Why not JS?

> control-intensive, and memory-intensive

The ideal "accelerator" for these kinds of jobs is a CPU with a big cache.

Video encoding has well defined control loops and data paths that don't arbitrarily interfere with each other, so it's a good candidate for custom hardware.

That is, you have a high-bandwidth, highly parallel fast path between framebuffer memory and functional units that compute FFTs and do motion vector operations, and a control plane that looks at a small handful of variables in order to decide which data plane operations to schedule and how to glue together the final result.

To run JS, you need a pile of functional units and lots of memory, and data for every operation needs to be able to come from / go to anywhere in memory. That's... just a general purpose computer.

All of those video standards are designed from day 1 to be hardware accelerated. They have a completely fixed pipeline with no control branches and very few data dependencies (stuff like A+B=C, C+A=D).

JS/CSS/HTML5 are not. They essentially have an open-ended and infinite amount of branching and data dependency, and I'm very skeptical a card could achieve much.

This is before we start talking about stuff like latency to the main CPU. CPUs are EXTREMELY fast in comparison to access over buses like RAM, and especially PCI-E. I would not be the least bit surprised if, even if some theoretical infinitely fast HTML5 accelerator card existed, it would still not be worth using due to the latency of fetches from the card.

It's already not worth offloading things like cryptography to accelerator cards, and every major crypto algorithm was designed to run fast in hardware. And this is before we start talking about stuff like AES-NI.

> JS/CSS/HTML5 are not,

Just for the sake of argument, I should point out that the bulk of the work done by JS/CSS/HTML involves primitive operations over a tree data structure. Conceptually, this paves the way to opportunities in hardware acceleration, similar to how the extensive use of polynomials in number crunching applications led to the addition of fused multiply-add instructions.


None of the compression methods you just listed will ever change. They are static and well defined, so making custom silicon to speed them up makes sense. JavaScript is a massive mess of ever-changing spaghetti (useful, delicious spaghetti, but still spaghetti), so custom silicon for it does not make sense.

...or Java, for that matter: https://en.wikipedia.org/wiki/Jazelle

The difference with Lisp machines is that they're the counterpart to von Neumann machines. C is probably the closest abstract representation of a von Neumann architecture.

Aside from all the great little jokes--there should definitely be a NIST standard Minion meme collection--the silliest part of this is probably that it's a dedicated PCI-Express card.

Well, that and the idea of permanently burning the, uh, unique design choices made by the web platform into hardware is a little horrifying.

Still, there are few other hypertext models that are worthy of note.

> How is this different (apart from extremity) from adding hardware support for AES?

For starters, is any part of javascript CPU-bound?

It’d probably be slower transferring the data to it and back than just using the CPU

Is that possible? Aren't there hardware video decoders that deliver, well, full video at high frame rates? I'm not sure how those things work. Maybe they have a direct route to the GPU and CPU just decides how things should get composited.

Hardware video decoders were... complicated.

However most of them worked by using the video-overlay feature on cards where the hardware video decoder injected its output directly to the GPU's output (after the framebuffer) via an internal header - or even injected themselves into the GPU's output VGA signal using a D-Sub-input on the back of the card.

For a very brief time in the late-1990s there were partial MPEG-2 decoder cards that hooked themselves into DirectShow to do the bulk operations needed for DCT and/or Motion Compensation but not rendering the entire MPEG scene - they'd feed their results back to the CPU rather than the GPU... IIRC.

I remember a time when Windows Media Player would sometimes do video playback by just drawing a very specific "close-to-black" RGB color in its window and then let the hardware superimpose the decoded video on to that part of the screen with that color. I could then open up MS Paint and draw my own custom shaped "window" into my video playback with that color by laying it on top of the WMP window.
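The colour-key trick described above can be sketched in a few lines. This is only an illustration of the idea in software: the key colour value and the names `KEY_COLOUR` and `overlay` are hypothetical, and real overlay hardware did the substitution on the output signal rather than on Python lists.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Frame = List[List[Pixel]]

# Hypothetical "close-to-black" key colour; actual players chose their own value.
KEY_COLOUR: Pixel = (16, 0, 16)


def overlay(desktop: Frame, video: Frame, key: Pixel = KEY_COLOUR) -> Frame:
    """Build the output frame: the video shows through wherever the
    desktop pixel equals the key colour, otherwise the desktop wins."""
    return [
        [v if d == key else d for d, v in zip(d_row, v_row)]
        for d_row, v_row in zip(desktop, video)
    ]


WHITE: Pixel = (255, 255, 255)
RED: Pixel = (255, 0, 0)

# A tiny 3x2 "desktop" with an irregular key-coloured region painted on it.
desktop = [
    [WHITE, KEY_COLOUR, WHITE],
    [KEY_COLOUR, KEY_COLOUR, WHITE],
]
video = [[RED] * 3 for _ in range(2)]  # stand-in for a decoded video frame

composited = overlay(desktop, video)
```

Because the substitution only looks at pixel colour, any application that paints the key colour (MS Paint included) ends up with video showing through it.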

Yes, that's Video Overlay: when the video is not rendered to the framebuffer but actually to the output signal directly. It was replaced by VMR (Video Mixing Renderer), which allows video to be rendered to each parent window's offscreen DWM buffer.

The funny thing is that overlays are coming back (soon, I hope!) - not for performance reasons, but because using a compositing window manager like the DWM introduces an additional frame of latency, but if a foreground window is being displayed 1:1 on the desktop then the GPU can simply overlay it directly to the output signal and thus eliminate that frame of latency. Some Linux WMs support it already, and Microsoft said they're working on it.

Any idea how the Apple Afterburner card works? It's a PCI-Express card that can decode a bunch of 8k ProRes streams in realtime.

Simply put: Afterburner is an FPGA you can add to your Mac Pro.

But the question is how does it deliver video to the GPU.

PCI-Express bus sharing, and custom Radeon firmware - or it could use DMA to main memory and copy it in the background.

It depends how often you have to round-trip between the CPU and the accelerator. If it's never or once per frame (as in video decoding) it's not bad, but if the JavaScript core ended up blocking on measurements from the layout core multiple times per frame you could end up losing a lot of performance.
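To put rough numbers on the round-trip argument above, here is a small sketch. The 4 ms of useful work and the 5 microsecond round-trip cost are placeholder assumptions, not measurements of any real bus or card, and the function names are invented for the example.

```python
FRAME_BUDGET_MS = 1000.0 / 60.0  # about 16.7 ms per frame at 60 Hz


def frame_time_ms(work_ms: float, round_trips: int, round_trip_us: float) -> float:
    """Total frame time if every CPU<->accelerator round trip blocks."""
    return work_ms + round_trips * round_trip_us / 1000.0


def fits_budget(work_ms: float, round_trips: int, round_trip_us: float) -> bool:
    """True when the frame still finishes inside the 60 Hz budget."""
    return frame_time_ms(work_ms, round_trips, round_trip_us) <= FRAME_BUDGET_MS


WORK_MS = 4.0        # assumed useful work per frame
ROUND_TRIP_US = 5.0  # assumed cost of one blocking round trip

for trips in (1, 100, 5000):
    total = frame_time_ms(WORK_MS, trips, ROUND_TRIP_US)
    print(trips, total, fits_budget(WORK_MS, trips, ROUND_TRIP_US))
```

Under these assumptions one round trip per frame is noise, while a few thousand blocking round trips (script repeatedly asking layout for measurements) would blow the frame budget on latency alone.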

The idea is quite realistic (apart from the GCs). But given how HTML is changing, the behaviour would become obsolete soon.

It could be great though for smaller parts. If someone can make a super fast font renderer accelerator, it could help in general. Alternatively we could adopt the GPU-accelerated one created for Servo.

You can update FPGAs over-the-air; they are reconfigurable silicon. You already have this for your CPUs whenever you install an "Intel microcode update".

Same. Want one.

I saw this and honestly couldn't tell if it was real or not.

I decided no due to the 256MB of Emoji-Cache...

My sarcasm detector is broken...or maybe it's just the web..

Didn't Flask start as a bad sarcastic joke that gained momentum and took on a life of its own?

What was the joke that created Flask?

EDIT: http://mitsuhiko.pocoo.org/flask-pycon-2011.pdf ugh pdf

I actually sized up the glyph cache & thought 256MB was probably pretty reasonable for high-DPI HDR. Hopefully it compresses down though!

This is JavaScript adjacent, so I think you can be forgiven for thinking this crazy waste of space might actually be a real thing. I mean, some of the most popular and useful projects in JavaScript started out as such, so there's a long tradition. ;)

I hate to break it to you, but you’re probably using one of these to read this comment.

If you don’t believe me, disable hardware acceleration on your machine (force the video card to VESA or framebuffer mode or something), and try to read the news, use web apps, etc. Compare them to native 2D apps, which should still generally work just fine.

In my experience, a modern, headless 24-core Xeon with 128 GB of RAM and 2x10 Gbit NICs can’t even use Jenkins and Jira comfortably at the same time.

News sites with JavaScript enabled and no ad blocker are just not usable at all on such a machine.

I'm not sure whether this is sarcasm or not. This comment (and most other web pages) works fine without any hardware acceleration enabled. Test setup: Firefox on a Windows guest OS, with 3D acceleration disabled in the virtual machine settings. AFAIK browsers in the early 2000s didn't even have GPU acceleration for rendering.

This is not sarcasm. I've encountered news sites that do over 15,000 requests and load chains and waterfalls of tracking, invasive garbage, stuff that is illegal in Europe, from over 250 domains on a single pageview.

> I've encountered news sites that do over 15,000 requests

I'm not sure if you're joking or not, but in the event you were being serious I should point out that the performance impact of doing a lot of requests is not due to the CPU but time wasted while waiting for the request to arrive.

You'd be hard pressed to find a hardware-based strategy for the client-side that would make servers send their replies faster.

And while the client waits for a reply, their CPU just idles.
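The distinction between waiting and computing is easy to see by timing both clocks. In this sketch `time.sleep` stands in for waiting on a network reply (an assumption made for the sake of a self-contained example; no real request is sent), and `timed_wait` is a made-up helper name.

```python
import time


def timed_wait(seconds: float) -> tuple:
    """Return (wall_seconds, cpu_seconds) spent while 'waiting for a reply'."""
    wall_start = time.perf_counter()  # wall-clock time
    cpu_start = time.process_time()   # CPU time used by this process
    time.sleep(seconds)               # stand-in for a blocked network request
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


wall, cpu = timed_wait(0.2)
print(f"wall: {wall:.3f}s, cpu: {cpu:.3f}s")
```

The wall-clock figure tracks the wait, while the CPU figure stays close to zero, which is the sense in which a faster client-side chip does nothing for time spent waiting on servers.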

I'm not joking, but the real problem here is all the hot steaming dogshite that they choose to load, not really the number of requests, IMO.

Many of those requests are also running GPU code to fingerprint clients. Canvas and WebGL. To "prevent clearing cookies to bypass paywall" fraud and ban scrapers.

> I'm not joking, but the real problem here is all the hot steaming dogshite that they choose to load, not really the number of requests, IMO.

Neither making requests nor transferring data around are CPU-bound activities.

> Many of those requests are also running GPU code to fingerprint clients. Canvas and WebGL. To "prevent clearing cookies to bypass paywall" fraud and ban scrapers.

Even taking these statements at face value, considering that JavaScript is single-threaded by default, it still sounds like a perceived performance problem (not a real performance problem) resulting from a poor software architecture. Expensive tasks are expected to be offloaded from the main thread to avoid blocking it.

GPU hardware acceleration is not the same as a dedicated accelerator card.

I'm amused, scared, and a little tempted by this vision.

Prior art (kind of): XML Accelerator XA35 - http://soasecure.com/xml-accelerator-xa35/

> The XML Accelerator XA35 is a highly efficient XML processing engine that makes use of purpose-built features such as optimized caches and dedicated SSL hardware to process XML at near wire-speed.

> The appliance can be used inline in the network topology, not as a coprocessor that hangs off a particular server. A popular use for the appliance is to receive XML responses from servers and transform them into HTML before forwarding the response to the client.

This just gives me a "Wordfence" 503 for no apparent reason. Here's an archive.org link for the benefit of anyone else who might want to read the product information despite the site owner's arbitrary restrictions: http://web.archive.org/web/20180829042526/http://soasecure.c...

Thanks! I suppose the site is "secure" by blocking an overly broad set of IP addresses..

The whole thing is kind of "last century", but amusing to know that dedicated hardware exists for XML.

I wouldn't be surprised if we start seeing such hardware for web view rendering, or JavaScript execution.

Wow these are very interesting and amusing. Very nice find!


Or we could just use html, css and native javascript without using libraries.

Sure it’d make coding harder but the user experience would probably be better.

Or maybe I’m just talking out of my ass here.

> Or maybe I’m just talking out of my ass here.

Only half so. It’s not that much harder.

The problem I run into sometimes is when I use only HTML or CSS or native JavaScript to accomplish something. Someone will always ask "Why aren't you using the <library's code> here?" Answer: "Because it's easier and/or more efficient to accomplish the same thing doing this instead."

I get it, sometimes you want to keep the syntax the same for future maintainability, while other times I just want to use something I already know will work just as well.

Ah well.

It should also add a few dozen GB of ram for all those chrome tabs.

I can't tell if that card would be a great thing or a terrible thing if it really existed. On the one hand, encoding decisions in hardware might slow down the pace at which the web shifts around. On the other hand, web page complexity would expand to fill the available processing power, so they would only be fast for the web devs and anyone else who has the expansion card.

this is going into some real "blursed image" territory.

Tell me, experts, why is this not practical? Can you put an HTML renderer on an FPGA?

Getting data to and from the FPGA, and the workload being significantly pointer-chasing based, would probably nullify any advantage an FPGA could have.

Realistically, you'd use the FPGA as your NIC.

Because web rendering is already fast and hardware driven. The slowness comes from unnecessary frameworks and bad developers.

For example that card graphic mentions Virtual DOM, which isn’t a technology. It’s bullshit that comes with the React framework.

Don't modern browsers already use GPU acceleration?

Do they accelerate DOM parsing, or just media that already has acceleration hooks provided by the OS?

DOM parsing is an almost perfect job for a CPU to do and not really something for dedicated hardware.

Firefox uses the GPU via WebRender[0]. I don't know what specific things it's used for, but I'm pretty sure it's not any of the DOM parsing. The GPU is used for the actual graphical rendering down the line -- compositing and such[1].

[0] https://github.com/servo/webrender

[1] https://hacks.mozilla.org/2017/10/the-whole-web-at-maximum-f...

HTML renderer? Maybe not. But perhaps you could make an FPGA that's good at running whatever bytecode your JavaScript engine lowers scripts to.

It's not April 1st but I'm not opposed to this.

>Jank Free Scrolling.

Apple's Mac Pro, AirPods Pro or iPhone 12's webpages, Facebook Feed, etc etc....

High Quality Image with Animation and Jank Free Scrolling is still not done right (or not even possible) by any major tech company in 2020. And that is just Web Pages, not even Web Apps.

And preferably doing so without my Quad Core MacBook Pro with GPU accelerated Browser ever warming up my lap.

This would be an appalling thing to happen to the web, so therefore it absolutely will be a real product in the next 5 years.

They're called Chromecasts: independent HTML/browser hardware systems.

You'd need a bunch of those $15 HDMI-in (to USB) adapters to use the Chromecast accelerators, but the idea is very much there: hardware that does the web.

Considering how slow modern webpages have got, especially without ad blockers, people would actually buy this.

Or maybe we could just come up with a way to test and catalog web pages that are actually fast and not bloated. Like a browser extension with a test suite and database or something.

Maybe we need a mainstream search engine that would actually put their users’ interests first and heavily penalize websites that are bloated/have ads/autoplay video/etc?

2040: World’s first Electron.JS Silicon, your Slack will never be the same.

Will this support all the electron apps that have taken over my computer?

I feel like I'm crazy based on how much I loathe electron apps.

Don't get me wrong it's a great concept, I like the idea of portability everywhere, but I can't get past the fact it's basically just a stripped-down Chrome browser with the "app" effectively being plain-ole HTML/CSS/JS. It just seems pointless when you can run the exact software in the web browser you likely already have running, with less overhead to boot.

The music streaming service Tidal really highlighted some of these issues for me, and started my hate-train. Their desktop application is electron-based, which supports HiFi, as does Chrome. The crazy thing is Chrome is the only browser that supports HiFi, and has been this way since the service launched in 2014, despite countless requests from FF users to add support. If Tidal is going to spend 6 years ignoring everything but Chrome for the sake of their electron app, IMO other companies are going to follow suit and continue the march towards Internet Explorer 2.0

I totally understand. Electron is cancer. I don't want a "native app" to come in an 80-150 MB executable, bundled with not only all its dependencies, but a full fucking Chrome engine.

Electron is cancer. I hate it and I'm not backing down off that.

Is 150 MB much nowadays? What would you have otherwise done with that disk space?

> It just seems pointless when you can run the exact software in the web browser

This isn’t necessarily true though. Electron gives you file system access (among other things). Most electron apps I’ve used at least do something that a browser cannot do. Though definitely not all.

Also, being able to alt/cmd + tab to the application you want is often convenient.

It's not so much the tech but yes, having to ship an entire browser for each app... This isn't sustainable, as many cheap devices today are sold with only 64/128 GB SSD...

One alternative is a PWA, but it gets no interaction with the OS since it is sandboxed, and of course different platforms == different browsers with more or less support for PWAs, so it's not fit for a Git desktop client, for instance.

Check the FAQ for specific electron app support.

This has gotta be a joke. How would you even sell any of these?


however, they're messing up by making it a PCI-e card. They need to make it USB-C or with a lightning connector so that people can use it on laptops or mobile devices. No need to upgrade your phone when you can plug in this accelerator!

How does this work together with my normal graphics card?

it sends a webrtc stream which your video card's decoders & overlays draw

It is a joke, yet it is being debated seriously here. This strongly suggests, to me anyway, that our field is becoming a joke.

I work on Firefox and I cannot believe how seriously people are discussing this. At its core, people outrageously underestimate how complex browsers are. Never mind "how fast the web is moving"; even Servo cannot reliably render the wide web correctly, so good luck even writing a clean-slate implementation matching a snapshot of web specs taken today!

"It's one browser, Michael. What could it cost? 10 engineers?"

A great deal of complexity is due to technical debt, notably HTML/CSS technical debt because "don't break the web". Well maybe we should "break the web" once to allow new browser engines to be easier to craft, after all, the web started rather simple 30 years ago...

It’s not necessary. CSS 2.1 makes up most of the modern web without other modules and it’s not too hard to implement. It’s just that no one writes compositors.

Partial implementations can exist as valid user agents, and most developers don’t even seem to realize you can have a compliant browser that doesn’t render to CSS 2.1 box model specifications.

The thing is, if you can make it run the top 10 websites (Google, Facebook, YouTube et al.) and convince the maintainers of those websites to always test on this thing before each release, that would already be a giant win for most users on the planet. Websites that use features outside the scope of this thing would fall back to standard rendering. In such a world (and if it gains momentum), developers would slowly move to avoid features that it doesn't support in order to get the performance benefits it brings.

The sentiment isn't, "hardware acceleration presents genuine technical advantages for HTML and CSS", it's "the web is so slow and bloated it feels like playing Crysis without a video card."

Have you actually tried rendering CSS 2.1 yourself? It’s not hard. Writing a tiling compositor is more difficult.

Firefox aside, I think anyone who took that joke as a serious product idea needs to brush up on how computers work, how software works, and what good engineering looks like.

I feel like this should be an indication that html is going in the wrong direction.

The image is satirical, so yes, that's the point.

I would still love to buy one though.

>no water cooler

Obviously fake, could not possibly make React tolerable.

This is satire, but parallel HTML rendering has untapped potential for both speedup and power savings (as you distribute single-core workload over a larger number of lower-voltage cores).
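The power argument above rests on dynamic power scaling roughly as capacitance x voltage squared x frequency. A toy comparison follows; the capacitance, voltages and clock speeds are invented for illustration, the workload is assumed to parallelise perfectly, and static leakage is ignored.

```python
def dynamic_power(capacitance: float, voltage: float,
                  frequency_hz: float, cores: int = 1) -> float:
    """Approximate dynamic power: cores * C * V^2 * f (arbitrary units)."""
    return cores * capacitance * voltage ** 2 * frequency_hz


C = 1e-9  # assumed switched capacitance per core

# One fast core: 3 GHz at 1.0 V (hypothetical operating point).
single = dynamic_power(C, voltage=1.0, frequency_hz=3.0e9)

# Four slow cores: 0.75 GHz each at 0.7 V (hypothetical operating point).
# The aggregate cycle rate is the same 3 GHz.
quad = dynamic_power(C, voltage=0.7, frequency_hz=0.75e9, cores=4)

print(quad / single)  # about 0.49 under these assumptions
```

The saving comes entirely from the lower voltage that the lower clock allows; if the work does not split cleanly across cores, the comparison gets worse.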


(PDF of a Servo talk from 2014)

I wonder if this could be one of ARM Macbook's killer features: a web browser with a Servo-like parallel engine, written for custom silicon with a number of small-ish, low-frequency ARM cores. Or would that be excessive?

Probably not worth it to add dedicated cores for web processing, but iPhone already takes advantage of special ARM instructions for faster JavaScript execution [0]. That will almost certainly be used on the new ARM Macs.

[0] https://news.ycombinator.com/item?id=18163433

That is not true. As it turns out, Safari didn't use that instruction [1] at the time, and even when it did, 99% of those performance gains in Speedometer had nothing to do with that instruction.

[1] https://twitter.com/saambarati/status/1049202132522479616?s=...

Ah, thanks for the correction. I had not seen the follow up since that was originally posted.

I guess it's not really your fault. We have KOLs that put out information on their sites and channels (in this case DaringFireball and Twitter) and they, for whatever reason, never really correct themselves (even when they know they were wrong).

Sounds perfect for a Chromebook.
