I asked him permission to post this. He reads HN but is a bit shy in commenting, which is why I am posting it as show HN.
I think you're missing the word to be added... did you mean to suggest "add the word 'for' between him and permission"?
Alternatively, "I asked his permission..." would also have worked well.
Here is what Gábor has to say about it. I edited his answer slightly (mostly I rewrote "we" as "I" since I -- Mettamage -- know that he worked more or less alone on this, aside from getting some academic advice from time to time). I added some YouTube links for computer system memory concepts (e.g. the EPT and TLB) for people who are foggy on the concepts. Anyway, here he is.
Basically, TLB misses are the main reason for performance overhead.
Regarding performance cost, there's a perf analysis here [2, 3]. The benchmarks are for SPEC 2006, with the exception of two benchmarks that crash: 400.perlbench and 447.omnetpp. It seems like the machine runs out of identity-mapped EPT, but I'm not sure exactly why that is. The hardware is smarter than I thought.
It's not exactly clear to me either why it's reasonably low. We expected the Translation Lookaside Buffer to be thrashed. To be fair, existing publications at big conferences, like Oscar, also skim over the question of TLB pressure. So ¯\_(ツ)_/¯ Presumably because they don't have an answer either. If you look at Oscar's numbers, they also have some extremely low overheads (and some extremely high ones as well).
__About dangling pointers themselves and potential new security threats__
Regarding fulafel: indeed, dangling pointers still exist, but because virtual memory is never re-used, any accesses through them are guaranteed to crash. So they no longer pose a security threat.
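To make the "never re-used" argument concrete, here is a toy model in Python. It is only a sketch of the idea, not how Dangless is implemented: the names `OneTimeAllocator` and `SegFault` are made up for illustration, a dict stands in for the page table, and every allocation is assumed to get its own virtual page.

```python
PAGE_SIZE = 4096


class SegFault(Exception):
    """Stands in for the page fault a real process would take."""


class OneTimeAllocator:
    """Toy heap: each allocation gets a fresh virtual page, never reused."""

    def __init__(self):
        self.next_vpage = 0    # only ever grows, so page numbers never repeat
        self.page_table = {}   # virtual page number -> backing memory

    def malloc(self):
        vpage = self.next_vpage
        self.next_vpage += 1
        self.page_table[vpage] = bytearray(PAGE_SIZE)
        return vpage * PAGE_SIZE          # the "pointer"

    def free(self, ptr):
        vpage = ptr // PAGE_SIZE
        if vpage not in self.page_table:
            raise SegFault(hex(ptr))      # double free also faults
        del self.page_table[vpage]        # unmap; the page number stays retired

    def _lookup(self, ptr):
        vpage, offset = divmod(ptr, PAGE_SIZE)
        if vpage not in self.page_table:
            raise SegFault(hex(ptr))      # access through a dangling pointer
        return self.page_table[vpage], offset

    def load(self, ptr):
        page, offset = self._lookup(ptr)
        return page[offset]

    def store(self, ptr, value):
        page, offset = self._lookup(ptr)
        page[offset] = value


heap = OneTimeAllocator()
p = heap.malloc()
heap.store(p, 42)
heap.free(p)
q = heap.malloc()      # lands on a new virtual page, so q != p
try:
    heap.load(p)       # p is dangling
except SegFault:
    print("dangling access faulted instead of reading q's data")
```

In a conventional allocator `q` could end up at the same address as `p`, and the stale load would silently read the new object. Here the old page number is never handed out again, so the stale access can only fault.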
A potential new security threat is the following: because the application now runs in ring 0 (though virtualized), it has direct access to e.g. page tables. So if the application is compromised, the attacker can potentially modify the page tables, or the interrupt descriptor table, or fun stuff like that, basically gaining full control over the application. With that said, it's still all just in the virtual environment, so the threat is not significantly greater than the same thing happening in userspace (ring 3).
Neither the Extended Page Table nor the host page tables are vulnerable, of course.
__Other methods of achieving the same result with different performance__
The closest technology to mine that I know of is Oscar, which basically does the exact same thing as me, except without virtualization/Dune. So they have to do syscalls, which makes Dangless much, much faster generally. To respond to DenisM on this: sure, you can do that. That's what Oscar does, but that's very inefficient, due to syscall overhead.
__Questions that were unknown to me__
Intel APX: haven't heard about it, don't have time to look into it
Regarding the difference with Page Heap: I'm not familiar with the techniques referenced here, but the main novelty of this approach is that I use lightweight virtualization to gain direct access to page tables inside the virtual environment. This allows for very efficient virtual memory management (remapping and invalidating of pages). Traditionally, this would require system calls (e.g. mremap() and mprotect() on Linux), which are expensive.
like if you are using 4096-byte pages and 48 bits of virtual address space, then you have 2^36 (about 69 billion) virtual pages before you run out. it seems feasible that some programs will make more than that many allocations during their lifetime.
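The arithmetic behind that, as a quick Python check. The page size and address width are the ones from the comment; the allocation rate and the "one fresh page per allocation" rule are assumptions picked purely for illustration.

```python
PAGE_SIZE = 4096   # bytes per page, from the comment
VA_BITS = 48       # bits of virtual address space, from the comment

total_pages = 2**VA_BITS // PAGE_SIZE
assert total_pages == 2**36   # 68,719,476,736 pages


def seconds_until_exhausted(allocs_per_second, pages_per_alloc=1):
    """Time until every virtual page has been used once.

    Assumes each allocation consumes at least one fresh page and that
    pages are never handed out again.
    """
    return total_pages / (allocs_per_second * pages_per_alloc)


rate = 1_000_000   # hypothetical allocation rate, not a measured number
hours = seconds_until_exhausted(rate) / 3600
print(f"{total_pages:,} pages, gone after {hours:.1f} hours at {rate:,} allocs/s")
```

At a sustained million allocations per second that works out to roughly 19 hours, so a long-running, allocation-heavy process could plausibly hit the limit under these assumptions.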
it's very cool, the tricks some people are doing with virtual memory aliasing.