For VLC there are a bunch of out of bound reads and heap buffer overflows.
f2b1f9e subtitle: Fix potential heap buffer overflow
611398f subtitle: Fix potential heap buffer overflow
ecd3173 subsdec: Fix potential out of bound read
62be394 subsdec: Fix potential out of bound read
775de71 subtitle: Fix invalid double increment.
The Kodi issue was a zip archive path traversal (i.e. no protection against zip files extracting files to parent directories).
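To make the traversal concrete, here is a minimal sketch of the kind of check that stops an archive entry from escaping the extraction directory. It is not Kodi's actual fix; `is_safe_member` and the paths are hypothetical names made up for illustration.

```python
import os


def is_safe_member(dest_dir: str, member_name: str) -> bool:
    """Return True only if member_name resolves to a path inside dest_dir.

    Hypothetical helper for illustration, not taken from any real player.
    """
    dest_root = os.path.realpath(dest_dir)
    # Joining with an absolute member name discards dest_root entirely,
    # which is exactly the case the check below has to catch.
    target = os.path.realpath(os.path.join(dest_root, member_name))
    try:
        return os.path.commonpath([dest_root, target]) == dest_root
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

An extractor would call this for every entry name before writing anything, and skip or abort on the first entry that fails.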
The fact that it's multiple, independent vulnerabilities makes me feel a little better. I've used Kodi and OpenSubtitles before while watching a movie to search and download subs for the movie without ever leaving Kodi. When it works, it's nothing short of magical.
Yes, those are very different issues.
From what I understood, one is an XSS (popcorn-time), one is a heap-based buffer overflow (VLC), and one is a zip path traversal (Kodi).
And tbh, I don't see how you can exploit the bug for VLC (with ASLR and HEASLR).
So it becomes a game of luck getting some users exploited.
Also, you've posted many uncivil and/or unsubstantive comments. We ban accounts for that too, so please don't do that either.
I also didn't check for executable heaps at the time but given that all heaps are non executable (which they really shouldn't be executable in VLC) again I don't see how RCE is possible. Maybe there is some way to validate and therefore brute force addresses? I don't know. But there was no VLC POC and I'm sure they would have made one if they could have.
Use VLC it's the most secure media player I've seen.
Having ASLR is not bulletproof against remote code execution, e.g. iOS has had ASLR for a long time and can still be jailbroken (which usually involves code injection etc). The key is an info leak, e.g. if you can somehow reliably find the memory location of the open() syscall, the memory location of the whole libc can be inferred, and libc is usually large enough to construct a ROP chain. (I haven't worked in the security area for a long time, so correct me if I'm wrong).
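The inference step is plain arithmetic, roughly as sketched below. The offsets are invented for illustration; real ones depend on the exact libc build, and the function names are hypothetical.

```python
# Hypothetical offsets, invented for illustration only.
OPEN_OFFSET = 0x10F000    # assumed distance of open() from the start of libc
GADGET_OFFSET = 0x2A3E5   # assumed distance of some useful code sequence


def libc_base_from_leak(leaked_open_addr: int, open_offset: int = OPEN_OFFSET) -> int:
    """ASLR moves the library as one block, so one leaked address fixes the base."""
    return leaked_open_addr - open_offset


def rebase(base: int, offset: int) -> int:
    """Every other location in the same library is the base plus a known offset."""
    return base + offset
```

This is why a single reliable leak undoes the randomization for that whole library: the randomness is in the base, not in the layout.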
The researcher being unable to provide a POC for VLC could simply mean it's hard due to ASLR, but not impossible.
Also: I believe ASLR is a compiler option (with a supported OS), it should be relatively easy for Kodi and Popcorn Time to start using ASLR.
Scriptless 0day RCE is still possible in a ROP+ASLR world, but exploitation is a real bitch. Ex : https://scarybeastsecurity.blogspot.fr/2016/11/0day-exploit-...
Also, the security researcher did not provide a demo for the VLC exploit. Their demo is only on Kodi and popcorntime.
But anyway, security issues mean releases.
Address space randomization is not "protection". It's a form of security by obscurity. The odds of an exploit working are reduced, at the expense of more crashes due to exploit failure.
It helps developers ignore bugs, since they can no longer reproduce them.
"Only" security by obscurity is the best we can get in the C/C++ world without compiling for a virtual machine.
This is somewhat akin to saying "Randomly generated passwords are not 'protection'. They are a form of security by obscurity."
If things are random enough that an attacker is significantly hampered in most cases, that's one measure of security, no?
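A rough way to put numbers on "random enough", assuming each attempt is an independent blind guess against a freshly randomized layout (a simplification; the entropy figures you plug in are your own assumptions, not measurements of any player):

```python
import math


def hit_probability(entropy_bits: int, attempts: int) -> float:
    """Chance that at least one of `attempts` blind guesses lands correctly.

    Assumes 2**entropy_bits equally likely layouts and independent attempts.
    """
    p_single = 2.0 ** -entropy_bits
    if p_single >= 1.0:
        return 1.0  # no randomness at all: the first guess always works
    # 1 - (1 - p)^n, computed in a numerically stable way for tiny p.
    return -math.expm1(attempts * math.log1p(-p_single))
```

With 8 bits of entropy, 256 attempts give roughly a 63% chance of a hit; every extra bit of entropy halves the per-attempt odds, and every miss is typically a visible crash.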
Does modern ASLR increase costs (time, difficulty, money, skill, etc.) necessary for exploitation and decrease benefits (privs, chances of success, etc.)? If yes, then it's a protection. Any security engineer will tell you unequivocally ASLR is a protection. And one of the most successful ones to date.
Still you're perfectly right that ASLR does not provide perfect safety, but merely makes exploitation way harder.
Well, it would crash, so they can reproduce it, no?
More related to the article, you would think that subtitles are literally the easiest file format in existence to safely handle. It's incredibly well-defined in terms of textual data and times.
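For the simplest case this is roughly true. A sketch of a strict parser for one SRT-style timing line, assuming the common `HH:MM:SS,mmm --> HH:MM:SS,mmm` layout with two-digit hours (an assumption; `parse_timing` is a hypothetical name and real files vary):

```python
import re

# Strict pattern: fixed-width ASCII digits only, nothing before or after.
_TIMING = re.compile(
    r"(\d{2}):([0-5]\d):([0-5]\d),(\d{3}) --> "
    r"(\d{2}):([0-5]\d):([0-5]\d),(\d{3})",
    re.ASCII,
)


def _to_ms(hours: str, minutes: str, seconds: str, millis: str) -> int:
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_timing(line: str) -> tuple[int, int]:
    """Return (start_ms, end_ms) or raise ValueError on anything unexpected."""
    match = _TIMING.fullmatch(line.strip())
    if match is None:
        raise ValueError("malformed timing line")
    groups = match.groups()
    start, end = _to_ms(*groups[:4]), _to_ms(*groups[4:])
    if end < start:
        raise ValueError("cue ends before it starts")
    return start, end
```

The point of the sketch is the shape: reject anything that does not match exactly, rather than scanning character by character and trusting the input to be terminated where you expect.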
Well, which one of them? There are nearly a hundred different subtitle formats, and each one has a whole set of variants. Timed Text alone (XML) can have more layouts than one could count, especially since it's meant to be able to replicate technically all previous industry formats.
On the other hand, images and videos are likely to be handled using some library, which might be better at safely handling the files.
Even the DVD subtitle format, which is just a mostly transparent image overlaid on the picture? In XML?
Depends on the format. SSA for instance can have embedded font and image files, which presumably have much more complex decoders.
> Fix potential heap buffer overflow
> Fix potential out of bound read
> Fix invalid double increment.
The exploit would presumably involve structuring your data so that the excess increment skips over a terminator of some sort. If it's scanning until it hits a zero byte, and you get it to skip over the zero byte, then you have a buffer overflow.
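A small simulation of that idea, with a bytes object standing in for the heap. This is a made-up scanner, not the actual VLC code: the escape character and both function names are assumptions for illustration.

```python
ESCAPE = ord("\\")


def scan_buggy(memory: bytes, start: int) -> bytes:
    """Scan to the terminator, stepping twice on an escape character."""
    i = start
    while memory[i] != 0:
        if memory[i] == ESCAPE:
            i += 1  # skip the escaped character without checking what it is
        i += 1      # normal step; after an escape this is the second increment
    return memory[start:i]


def scan_fixed(memory: bytes, start: int) -> bytes:
    """Same scan, but never step over the terminator."""
    i = start
    while memory[i] != 0:
        if memory[i] == ESCAPE and memory[i + 1] != 0:
            i += 1
        i += 1
    return memory[start:i]


# The subtitle text ends in an escape, followed by the zero terminator and
# then unrelated data that happens to sit next to it in memory.
heap = b"ab\\\x00SECRET\x00"
```

Here `scan_buggy(heap, 0)` runs on into the neighbouring data while `scan_fixed(heap, 0)` stops at the terminator. In Python running off the end of the buffer would raise an IndexError; in C the same loop just keeps reading or writing whatever is on the heap.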
Do we have to build it from source?
Maybe we should stop random people from contributing to complex C projects?
(*(psz_text + 1 ) ) == '~'
psz_text[1] == '~'
Also on a more personal note, if you're going to be putting things inside parentheses with whitespace, make it symmetrical.
The problem is that boring stuff can also be very security sensitive.
Yes I do, this is the internet after all!
> seems you know your stuff,
Now you lost me :)
I wish :)
All those projects are under-funded, done by volunteers, on countless platforms, doing very low-level stuff, and supporting many formats.
This has nothing to do with one project or another.
What if there's a bigger security fix you need to push to people asap?
We usually wait between 24 hours and a few days before doing an upgrade, to watch for possible regressions.
From tag to release to updates can take only 4 hours, if we want enough mirrors.
id3 -d *.mp3 ; id3 -2 -d *.mp3
I don't do this for security - I just don't like mp3 metadata competing with metadata in the filename and most mp3 metadata is laughably bad anyway so I just wipe it.
 /usr/ports/audio/id3mtag on FreeBSD
 Misspellings, First Last instead of Last, First, ALL CAPS ALL THE TIME and using special characters/unicode that always breaks car stereo implementations.
Yes, please - that sounds fantastic.
For example if MP3 genre field is 999 bytes long cut it down to 32 bytes.
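Something like the sketch below, assuming the tags have already been split into raw byte fields. Only the 32-byte genre limit comes from the comment above; the default limit and the names are hypothetical.

```python
# Per-field byte limits. Only the genre value is from the thread;
# the default is an arbitrary assumption.
FIELD_LIMITS = {"genre": 32}
DEFAULT_LIMIT = 256


def clamp_tags(tags: dict[str, bytes]) -> dict[str, bytes]:
    """Cut every field down to its limit before any player sees it.

    Truncating raw bytes can split a multi-byte character; a real tool
    would have to decide how to handle that.
    """
    return {
        name: value[: FIELD_LIMITS.get(name, DEFAULT_LIMIT)]
        for name, value in tags.items()
    }
```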
I figured that subtitles were an obvious place to start because you can download them in small files, play them back alongside a video, and they are designed to be "timed out" to synchronize with a video already.
I looked into it for a bit but never really found a way (within my abilities at least) to do anything like this from within a .srt file or similar. I'd be interested in hearing if anyone else has more info on how you might do more with that "framework" than displaying text on screen.
Is there any more clarity around the situation now?
There is no excuse that these kind of applications are not completely sandboxed. All you need is some kind of DLL, raw data in, raw pixels out. In case of hardware accelerated codecs, raw pixels in, surface pointer in, nothing out. There is no need to be able to access the filesystem, etc.. To render subtitles on top of the video it's the same.
I wish a fraction of the energy we put into DRM would go into sandboxing instead.
So, let me shed some light on the sandboxing for multimedia (I work on VLC).
If you sandbox an application like VLC, in the current way of doing sandboxing, which we've done for macOS, WinRT/UWP, and snaps, you still need a lot of permissions.
- you need to be able to open files without user interactions (no file picker), in order to open playlist, MXF or MKV files;
- you need the same if ever you have a database of files (media center oriented);
- you need raw access to /dev/* to play DVDs, CDs and other optical discs (and the equivalent on Windows);
- you need ioctl on such devices, to pass the MMC for DVD/Bluray;
- you need raw access to /dev/v4l* for your webcams and be able to control them;
- you need access to the GPU stack, which is running in kernel-mode, btw, to output video and get hw acceleration;
- you need access to the audio stack, also in low-level mode;
- you need access to the DSP acceleration (not always the GPU);
- on Linux, you have access to X11 for the 3 above features, which is almost root;
- you need access to /etc/ (registry) for proxy information, fonts configuration and accessibility;
- many OpenGL client libraries need access to the /etc too;
- you need access to the network, as input and output (think remote control);
- you need access to the system settings to disable screensavers, and adjust brightness;
- you need access to mounts to be able to see the insertion of DVD/Bluray/USB/SD cards and such;
- you need to expose an IPC (think MPRIS on Linux);
- you need to unzip, untar, decrypt, decipher and so on;
- you need access to the fonts and the fonts configuration (see fontconfig).
and I probably forgot one or another case.
The point is, all those features have good reasons to exist and very good use cases; but the issue is that for a media player, it will request almost all permissions except GPS and address book.
And quite a few of them are very close to kernel mode.
So, what is the solution?
Probably do a multi-process media player, like Chrome is doing, with parsers and demuxers in a different process, and different ones for decoders and renderers. Knowing that you probably need to IPC several Gb/s between them.
I've been working on such a prototype, but it's a lot of work... I accept donations :)
Other points may be more tricky, and it's a good list of potential issues, but we can start chipping away some stuff right now. There's a lot we can fix without fixing everything at the same time.
Not on Windows or on macOS.
> new thread that will read from the existing FD and only send you simple, time sorted messages over a shared IPC/pipe is not that crazy.
Of course that does not solve anything, because your demuxer|decoders|output needs access to the FS, have access to kernel-mode and those are the dangerous parts.
> Not on Windows or on macOS.
It's a shame, then, that Windows & macOS are holding back security improvements for software running on Linux. I understand (& even agree with!) your desire to have a sandboxing mechanism which runs acceptably on all supported systems; it's just sad that this security mechanism in the Linux kernel can't be taken advantage of in vlc.
The demo is on Windows. The goal is to do a sandbox that works on most OSes.
And, it will not solve the decoder issue, since it is on the decoding side, which still has access to the GPU/Aout and the kernel.
> Read access to an existing FD is not the same as full FS access, and there's no demux involved here.
You're totally missing the point here. The issue is demuxers/decoders/output, not really the access.
Reading from an FD or not would not solve the buffer overflow exploitation (if it was actually exploitable).
Feels kinda pointless, since all threads in a process share the same memory protection.
It's more coding, certainly, but it's possible. Security is an option if we wanted it.
The shared memory segment can be a GPU image buffer, so I don't think that's true.
Either you need to have multi-process and correct IPC, or you need to copy.
That's not actually how Chrome's renderer sandboxing works. Both Windows and OS X allow you to share a GPU-resident texture between processes (DXGI shared surfaces and IOSurface respectively), so there's no need to copy any video data.
The last part is just one of the issues, very far from all of them.
Seriously, stop thinking that no one has given a thought to the question...
As for isolating decoders from video filters and chroma conversion, I'm not sure why that would be necessary, since those shouldn't require any additional privileges. I understand that retrofitting an existing program to use a multi-process sandboxing model is far from easy, and I'm definitely not volunteering to do it, but I don't think there is anything specific about a video player that is harder to sandbox than a web browser.
Yes, that's the core of the issue.
I will refrain from answering to such attacks. As you seem to know better, I'm waiting for your patches.
That doesn't change the fact that none of the things you listed are unsupported by Chrome's sandbox model, and if you only need to establish a barrier around the video pipeline (and not e.g. VLC's ability to notice device status or interact with webcams) you don't even need 3/4 of what Chromium's sandbox has implemented. Like I said, I've actually walked the walk when it comes to using their sandbox for Windows and Linux with a process that needed to access certain user files, the GPU, and even each platform's font server equivalent, so this isn't me just spitballing about some theoretical solution.
Then with 4K60 + HDR, displaying is quite a lot of bandwidth.
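Back-of-the-envelope numbers for that, assuming this means 3840x2160 at 60 fps, uncompressed, with 4:2:0 chroma subsampling and 10 bits per sample for HDR. These are assumptions about the pixel format, not figures from VLC.

```python
def raw_420_bits_per_second(width: int, height: int, fps: int, bit_depth: int) -> int:
    """Uncompressed bit rate for 4:2:0 video.

    4:2:0 carries 1.5 samples per pixel on average: one luma sample plus
    two chroma samples shared between four pixels.
    """
    return width * height * fps * bit_depth * 3 // 2


uhd_hdr = raw_420_bits_per_second(3840, 2160, 60, 10)   # about 7.5 Gb/s
full_hd = raw_420_bits_per_second(1920, 1080, 30, 8)    # about 0.75 Gb/s
```

So decoded frames at that resolution are in the "several Gb/s" range if they are actually copied between processes, and around a tenth of that for ordinary 1080p30.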
Yeah, you can come up with high bandwidth scenarios like stereo VR 144 Hz 4k HDR running on barely capable hardware. But 99% of users don't require such tricks and never see any upside from the performance-over-security compromise.
Even if you decide basic IPC is not fast enough, a shared memory buffer for raw frame data is reasonably secure too.
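A minimal sketch of the shared buffer idea using Python's standard `multiprocessing.shared_memory` module. The helper names are hypothetical, and a real player would also need synchronization and would treat the buffer contents as untrusted.

```python
from multiprocessing import shared_memory


def publish_frame(frame: bytes) -> shared_memory.SharedMemory:
    """Decoder side: copy one raw frame into a new shared segment."""
    shm = shared_memory.SharedMemory(create=True, size=len(frame))
    shm.buf[: len(frame)] = frame
    return shm  # the caller passes shm.name to the other process


def read_frame(name: str, size: int) -> bytes:
    """Renderer side: attach to the segment by name and read the frame.

    The size is passed explicitly because the segment itself may be
    rounded up to a whole number of pages.
    """
    shm = shared_memory.SharedMemory(name=name)
    try:
        return bytes(shm.buf[:size])
    finally:
        shm.close()
```

Only the segment name and the frame size need to travel over the IPC channel; the pixel data itself is never sent through a pipe. The creating side calls `close()` and `unlink()` when the frame is no longer needed.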
I am only interested in these features:
Is there a lighter version where these features are cut?
I don't have a remote, so I'd like it to be disabled by default. I don't need any access to the network.
etc. etc. etc
> I don't need any access to the network.
90+% of what I use it for comes from my NAS or the Internet.
> I don't have an optical drive
Most of the rest is from optical discs.
> I'm perfectly fine with a default / embedded font. [...] I'm fine opening a subtitle file myself.
It's _fine_ but far from ideal. Both are useful quality of life features.
> Why would I need to unzip anything?
Non-essential, but being able to play video from a ZIP is a useful feature.
* Arbitrary number.
But when it comes to the desktop everything runs as $user and that's the end of it. While this makes sense for multi user "mainframe"-style systems, for modern desktops it's an anti-pattern almost. I wish I could run my browser as its own user, my password manager as another, my code editor/toolchain as another, the closed source Spotify client as yet another etc...
It's kind of doable today but it's not exactly friendly to set up. In particular Xorg is not exactly designed with client isolation in mind as far as I can tell; preventing one window from taking over another without being too cumbersome is left as an exercise to the reader.
But really, at the OS level I feel like we already have all the functionality we need and we just completely ignore it. On a desktop the critical account isn't really root per se, rather it's the user account that contains all of my data.
Maybe we've just been doing it wrong the entire time and we should just log into our single-user desktop computers as root and then spawn our shells and other applications as various unprivileged users as necessary (this could easily be scripted in launcher scripts). I wonder if anybody has attempted to do that, but again I don't expect that Xorg would work very well in this configuration.
Replace '...as various unprivileged users' with '...as completely isolated virtual machines' and you've got the gist of what QubesOS does. I haven't tried it personally, but it sounds really interesting.
Using VMs sounds a bit more heavy handed than what I had in mind, but I guess on modern machines with good hardware support it should be pretty workable.
It's dockerized applications (as close to a VM as possible).
Woah, what a sense of entitlement! What's your excuse for not having submitted a patch years ago?
"There is no excuse for ___" definitely crosses the line from criticism to entitlement. :^)
A correct statement might have been "there is no technical reason that prevents sandboxing, given a ton of work".
And realistically once you do that you have another component out there with all that complexity and permissions to exploit. That's exactly what happened with Android. The apps have a clean sandbox, so all the exploits target the mediaserver process instead.
The most dangerous areas are the demuxers and decoders so they have to be sandboxed.
So yes, you're right, but this doesn't solve the problem of moving the buffers between processes.
And the performance is not easy to obtain.
I'm also not sure where you'd lose much performance. If you hand the file handles/sockets and backbuffer to the renderer, you only need enough IPC to synchronize the drawing. Sending small messages on the order of 100 times per second between processes is not going to be a bottleneck.
It is not easy. If it was, people would have done it already.
Do any of those components need unrestricted/unpredictable file access? Because if they don't you can just open the files in the main process that handles the UI and send them to the sandboxed process via IPC. None of Windows/OSX/Linux do permission checks when file handles are read from, they only check when the file is initially opened.
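On Unix-like systems that handoff can be sketched with descriptor passing over a Unix socket. This uses `socket.send_fds` and `socket.recv_fds` (Python 3.9+, Unix only); Windows would need handle duplication instead, and the function names here are hypothetical.

```python
import os
import socket


def send_file_to_worker(sock: socket.socket, path: str) -> None:
    """Privileged side: open the file, then hand only the descriptor over."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # At least one byte of ordinary data must accompany the descriptor.
        socket.send_fds(sock, [b"F"], [fd])
    finally:
        os.close(fd)  # the worker's copy stays valid after this


def receive_file(sock: socket.socket) -> int:
    """Sandboxed side: receive a descriptor it could not have opened itself."""
    _data, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    return fds[0]
```

The worker can read from the descriptor it receives, but it never needs permission to call open() on a path of its own choosing.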
Don't try and start doing weird things with something like subtitles and we are fine.
Why does VLC or one of the other programs feel the need to do anything more than that, resulting in gaping security vulnerabilities? Is there any good justification? Or is this again about some overflow with unexpectedly long strings or something like that? (In such case it is the not so careful programming on VLC side that is the problem)
Furthermore the subtitles are often inside the video graphical data itself. I've actually never used a subtitles file. I tried a few times, but every single damn time they were off, and not only off but exponentially off, which made it impossible to get the correct text for all play positions in the video. If you ask me, so far all the subtitle files I tried for any movie suck anyway.
(This is ignoring any subtitle file specifications, which might exist.)
I think the option to have positioning information was a good idea.
Embedded web engines should probably have a minimalistic safe mode.
Obvious problems, like unplayable, etc. Minor problems (chroma placement error, transfer function, etc) seem to occur very frequently.
That's one of the reasons these days there's a tendency to use text-based representations like JSON, but of course anything size-sensitive such as images and movies is still generally binary.
What I really don't understand is Acrobat Reader. It has a "Protected View", which is the first WTF: .pdf files are read-only, so there should be absolutely zero active code running anyway. The next, much bigger WTF is that you need to exit Protected View to print the document.
How can the program read and render the document on screen, but not print it?! How is this even possible?
PDF is an old complex format with a lot of features used in a lot of special cases that go light years beyond looking at a simple text file. It's the reason for all the issues, but keeping it useful as it is and magically waving away all issues is not really easy.
HTTP or HTTPS does not change that.
The ingenuity that goes into RCE exploits never ceases to amaze (and terrify) me. Can't wait for more details to be released.
Rust isn't the only alternative to write native code safer than C will ever allow.
Ada was and still is a quite modern language, designed for software development done by large teams, where I can several years later still understand what I wrote.
The downside, if you want to call it that, is a more prominent syntax (keywords instead of curlies, upper-case keywords, etc).
On the upside it lacks any unsafe operations, except for dealloc. In addition, it has actual modules in lieu of includes, hence it's blazingly fast to compile and/or recompile. It's a pity it didn't catch on; the language lacked a company to back and promote it. AT&T promoted C, Apple promoted Objective C, Microsoft promoted VB...
Actually Apple promoted Object Pascal, but then they decided to cater to the growing UNIX market and replaced the Mac OS SDK with a C and C++ (PowerPlant) one.
Then, however, some dipshit decides to extend the format by adding tags for things like bold, italics, underline etc. This is completely unnecessary for subtitles because the emphasis can be inferred from the dialogue. The unnecessary complexity increases the potential for vulnerabilities.
Then some total dickhead decides to add an HTML5 tag, for no reason whatsoever, and it all goes to hell.
This is illustrative of the problem with most software: the absence of a clear-headed benevolent dictator to say, "no; you are an idiot; we're not doing that."
> This is completely unnecessary for subtitles because the emphasis can be inferred from the dialogue.
For example: I recently watched the movie "The Handmaiden" which includes both spoken Korean and Japanese. The language the characters speak in any given situation is relevant to the story. If all the subtitles were the same I would not have noticed this distinction.
Emphasis of an entire line can be inferred, but how can emphasis within a line be inferred when you don't know which utterances within the line correspond to which words in the subtitles (which, if you need subtitles because you don't know the language being spoken, you won't)?
While uncommon, I've occasionally seen font variants used for emphasis on professional subtitles for that reason.
Or even more extreme, if you need subtitles because you are deaf.
Fonts that have a virtual machine in...
That doesn't solve the problem. http://langsec.org/
I vote for SUB-DURAL HEMATOMA
Well, last year's exploits against iOS, Android and Ubuntu were all related to media metadata processing. It is only natural that the same folks screw up this one too.
Plus you're dissing some very complex projects. I think you're underestimating the complexity of the work these "same folks" are doing.