Show HN: Dangless-malloc – Safe dangling pointer errors (master thesis) (gaborkozar.me)
66 points by mettamage 5 months ago | 11 comments

This thesis is from a friend of mine. In the following discussion [1] someone said that this warrants its own post.

I asked him permission to post this. He reads HN but is a bit shy in commenting, which is why I am posting it as show HN.

[1]: https://news.ycombinator.com/item?id=18212312

Errata: I asked him permission means I asked my friend permission.

That isn't really an error, at most you need to add the word between him and permission. Most English speakers should have no trouble parsing the original though.

> at most you need to add the word between him and permission

I think you're missing the word to be added... did you mean to suggest "add the word 'for' between him and permission"?

Alternatively, "I asked his permission..." would also have worked well.

Thanks. I wish there were a website like lang-8.com, but for people who know English quite well except for its finer points. I am not sharp with languages in general (even Dutch, my native language) and my English is not really refined, so I feel prone to these types of mistakes once the edit functionality of a comment is gone.

Front page? Front page! Seeing this was the perfect moment to ask my friend Gábor some questions that were asked in the previous thread about it [1]. Again, in my opinion he is quite shy in commenting and he told me to go have fun :D What are my credentials on this? Not much, other than that I followed two courses with him on security from Herbert Bos (VU Amsterdam) and filmed his presentation -- other than that I'm more a mobile/web app person. He did say: "Tonight I'll add a link to the HN thread to the project website."

Here is what Gábor has to say about it. I edited his answer slightly (mostly I rewrote "we" as "I" since I -- Mettamage -- know that he worked more or less alone on this, aside from getting some academic advice from time to time). I added some YouTube links for computer system memory concepts (e.g. the EPT [6] and TLB [4]) for people who are foggy on the concepts. Anyways here he is.

__About performance__

Basically, TLB misses are the main reason for performance overhead.

Regarding performance cost, there's a perf analysis here [2, 3]. The benchmarks are for SPEC 2006, with the exception of two benchmarks that crash: 400.perlbench and 447.omnetpp. It seems like the machine runs out of identity-mapped EPT, but I'm not sure exactly why that is. The hardware is smarter than I thought.

It's not exactly clear to me either why it's reasonably low. We expected the Translation Lookaside Buffer [4] to be thrashed. To be fair, existing publications at big conferences, like Oscar, also skim over the question of TLB pressure. So ¯\_(ツ)_/¯ Presumably because they don't have a response either. If you look at Oscar's numbers [5], they also have some extremely low overheads (and some extremely high ones as well).

__About dangling pointers themselves and potential new security threats__

Regarding fulafel: indeed, dangling pointers still exist, but because virtual memory is never re-used, any accesses through them are guaranteed to crash. So they no longer pose a security threat.
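To make the "never re-used" idea concrete, here is a toy Python model. It is only a sketch under my own assumptions, not how Dangless is implemented: `ToyAddressSpace`, `PageFault`, and one page per allocation are all made up for illustration. Physical frames get recycled, but every allocation gets a fresh virtual page, so a dangling pointer always hits an unmapped page.

```python
PAGE_SIZE = 4096


class PageFault(Exception):
    """Raised when an unmapped virtual page is accessed."""


class ToyAddressSpace:
    """Toy model: physical frames are recycled, virtual pages never are."""

    def __init__(self):
        self.next_virtual_page = 0   # only ever increases, never reset
        self.next_frame = 0          # next never-used physical frame
        self.free_frames = []        # physical frames available for reuse
        self.page_table = {}         # virtual page -> physical frame
        self.frames = {}             # physical frame -> stored value

    def malloc(self, value):
        # Reuse a freed physical frame if there is one, else take a new one.
        if self.free_frames:
            frame = self.free_frames.pop()
        else:
            frame = self.next_frame
            self.next_frame += 1
        # Always hand out a fresh virtual page (assumption: one per allocation).
        vpage = self.next_virtual_page
        self.next_virtual_page += 1
        self.page_table[vpage] = frame
        self.frames[frame] = value
        return vpage * PAGE_SIZE     # the "pointer"

    def free(self, pointer):
        vpage = pointer // PAGE_SIZE
        frame = self.page_table.pop(vpage)   # invalidate the virtual page
        self.free_frames.append(frame)       # physical memory is reusable

    def load(self, pointer):
        vpage = pointer // PAGE_SIZE
        if vpage not in self.page_table:
            raise PageFault(f"access to unmapped address {pointer:#x}")
        return self.frames[self.page_table[vpage]]
```

After `p = space.malloc("secret")` and `space.free(p)`, a later `malloc` may land in the same physical frame, but `space.load(p)` still raises `PageFault` instead of reading the new contents.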

A potential new security threat is the following: because the application now runs in ring 0 (though virtualized), it has direct access to e.g. page tables. So if the application is compromised, the attacker can potentially modify the page tables, or the interrupt descriptor table, or fun stuff like that, basically gaining full control over the application. With that said, it's still all just in the virtual environment, so the threat is not significantly greater than the same thing happening in userspace (ring 3).

Neither the Extended Page Table nor the host page tables are vulnerable, of course [6].

__Other methods of achieving the same result with different performance__

The closest technology to mine that I know of is Oscar, which basically does the exact same thing as me, except without virtualization/Dune. So they have to do syscalls, which makes Dangless much, much faster generally. To respond to DenisM on this: sure, you can do that. That's what Oscar does, but that's very inefficient, due to syscall overhead.

__Questions that were unknown to me__

Intel APX: haven't heard about it, don't have time to look into it

Regarding the difference with Page Heap: I'm not familiar with the techniques referenced here, but the main novelty of this approach is that I use lightweight virtualization to gain direct access to page tables inside the virtual environment. This allows for very efficient virtual memory management (remapping and invalidating of pages). Traditionally, this would require system calls (e.g. mremap() and mprotect() on Linux), which are expensive.

[1] https://news.ycombinator.com/item?id=18212312

[2] https://dl.gaborkozar.me/dangless/perf_diff_with_oscar_analy...

[3] https://dl.gaborkozar.me/dangless/perf_diff_with_oscar_analy...

[4] https://www.youtube.com/watch?v=95QpHJX55bM

[5] https://github.com/shdnx/dangless-malloc/blob/master/papers/...

[6] https://www.youtube.com/watch?v=Vw1B-U0Frws

Will some long-running programs that allocate and free a lot eventually run out of virtual addresses, and then be forced either to terminate or to reuse virtual addresses, thus removing the use-after-free protection?

For example, if you are using 4096-byte pages and 48 bits of virtual address space, then you have 2^36 virtual pages before you run out. It seems feasible that some programs will make more than 70 billion allocations during their lifetime.
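The arithmetic can be checked with a few lines of Python. This is a back-of-the-envelope sketch: the function names are made up, it assumes every allocation consumes one fresh page, and the allocation rate used below is a hypothetical figure, not a measurement.

```python
def virtual_pages(address_bits=48, page_size=4096):
    """Number of distinct pages in an address space of the given width."""
    return 2**address_bits // page_size


def seconds_until_exhausted(allocs_per_second, address_bits=48, page_size=4096):
    """Time to use every page once, assuming one fresh page per allocation."""
    return virtual_pages(address_bits, page_size) / allocs_per_second


print(virtual_pages())                             # 68719476736, i.e. 2**36
print(seconds_until_exhausted(1_000_000) / 3600)   # hours at 1M allocations/s
```

At a hypothetical one million allocations per second, that works out to roughly 19 hours before every virtual page has been handed out once.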

It's very cool, the tricks some people are doing with virtual memory aliasing.

I haven't spent all that much time taking a look at this (quick feedback: the PDF on that website is a bit annoying to read, since the "viewport" is really small), but how does this compare to a solution such as zeroing weak pointers in ARC?

In a language like C and its conventional implementations, how do you intend to find all the pointers in order to zero them?

Mike Ash's article goes into more depth on how something like this is implemented in Objective-C: https://www.mikeash.com/pyblog/friday-qa-2010-07-16-zeroing-.... In short, references are kept in a big dictionary and zeroed out on deallocation.
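As a rough illustration of the "big dictionary" idea, here is a toy Python sketch. It is not the Objective-C runtime's actual implementation: `WeakSlot` and `WeakTable` are hypothetical names, and deallocation is an explicit call rather than something the runtime triggers.

```python
class WeakSlot:
    """Hypothetical stand-in for a variable holding a weak pointer."""

    def __init__(self):
        self.target = None


class WeakTable:
    """Maps each object to the slots that weakly point at it."""

    def __init__(self):
        self.slots_by_object = {}   # id(object) -> list of WeakSlot

    def store_weak(self, slot, obj):
        # Register the slot so it can be found when obj is deallocated.
        slot.target = obj
        self.slots_by_object.setdefault(id(obj), []).append(slot)

    def deallocate(self, obj):
        # Zero every registered slot that still points at obj.
        for slot in self.slots_by_object.pop(id(obj), []):
            if slot.target is obj:
                slot.target = None
```

The catch for C is the same one raised above: this only works because every weak store goes through `store_weak`, which plain C pointer assignments do not.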

What would be the work involved to port this to Windows? Does it already have everything provided by Dune in some way?
