
> In io_uring's case sharing a userspace mapping with kernel space is simply always going to be dangerous.

Out of curiosity, have you ever used a syscall which writes or reads a userspace buffer?



It's not the same thing. io_uring remaps a contiguous chunk of pages in both the kernel and userspace vas. For read/write the data is copied between userspace and kernel space.


You're swapping terms around to draw distinctions where there are no real differences. There's no such thing as a "kernel page" and a "userspace page". There's only pages of memory which are mapped into one or both. All pages accessible to userspace are also mapped in the kernel. That means that the ring buffers live in perfectly ordinary, mapped-in-both-places userspace pages. There is zero basis in fact for your claims that these are "kernel pages" which have been mapped to userspace. You have taken the exact same phenomenon, pages accessible to both userspace and the kernel, and called it by a new scary name "kernel pages accessible by userspace", instead of the other safe and ordinary name, "userspace pages". Now, the addresses used for the kernel mapping may be different than normal, but (a) that is completely inconsequential to security and (b) as you point out yourself, kernel address space is inaccessible to userspace so it could not possibly matter where the kernel maps the shared pages.

Now, it is in general a real security bug when the kernel operates multiple times on data mapped into userspace instead of taking a one-time copy of that data and using that copy for all of the operations. However, the whole point of a ring buffer is that it specifically operates in such a shared environment. Moreover, it is just as necessary for the kernel to take a snapshot copy out of an SQE (and other shared structures) before beginning a sequence of operations on them. The only difference with io_uring is when that copy is performed: at the time of a syscall or during other operations. That too is completely inconsequential to security.
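To make the "one-time copy" point concrete, here is a minimal userspace sketch of the double-fetch problem and the snapshot fix. Everything in it is hypothetical and for illustration only: `struct shared_req`, `DST_CAP`, the `race_hook` callback (which stands in for a concurrent userspace write between the check and the use), and both handler functions. It is not io_uring code; in the kernel the single fetch would typically be done with something like `READ_ONCE` on the shared field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request living in memory that an untrusted party can
 * also write to (an illustrative stand-in for an SQE-like structure). */
struct shared_req {
    uint32_t len;       /* claimed payload length */
    uint8_t  data[64];
};

#define DST_CAP 16u     /* assumed capacity of the private destination */

/* Stands in for a concurrent writer racing with the handler. */
typedef void (*race_hook)(struct shared_req *);

/* Example hostile writer: enlarges len after it has been checked. */
void hostile_writer(struct shared_req *req)
{
    req->len = 4096;
}

/* Buggy pattern: len is fetched twice, once to check and once to use.
 * Returns the length that would be used for the copy (0 if rejected). */
size_t handle_double_fetch(struct shared_req *req, race_hook hook)
{
    assert(req != NULL);
    if (req->len > DST_CAP)     /* first fetch: validation */
        return 0;
    if (hook)
        hook(req);              /* the shared memory changes here */
    return req->len;            /* second fetch: may now exceed DST_CAP */
}

/* Snapshot pattern: fetch once, validate the private copy, and use
 * only the private copy afterwards. */
size_t handle_snapshot(struct shared_req *req, race_hook hook)
{
    assert(req != NULL);
    uint32_t len = req->len;    /* single fetch into private storage */
    if (len > DST_CAP)
        return 0;
    if (hook)
        hook(req);              /* later writes no longer matter */
    return len;
}
```

With a request whose `len` starts at 8, `handle_double_fetch(&req, hostile_writer)` ends up using 4096 even though the check passed, while `handle_snapshot(&req, hostile_writer)` still uses 8. The same discipline applies whether the snapshot is taken at the syscall boundary or deeper in the kernel.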

To sum up: It is correct to state that the kernel must be careful with shared mappings. It also would have been correct to state, had you reached this far, that io_uring is moving the userspace-copy boundary "deeper" into the kernel rather than isolating it at the syscall layer. It is, however, incorrect to state that io_uring contains a new, more dangerous kind of shared mapping. It is incorrect to state that the shared mapping used by io_uring is itself in any way a threat to kernel security.


Let's go over what I said:

io_uring remaps a contiguous chunk of pages in both the kernel and userspace vas.

This is a true statement - io_uring makes a compound page and calls remap into the userspace vas in mmap, and I did not say that the pages were kernel or userspace pages. However, you've said "userspace pages" in your own argument, which by your own admonition is incorrect. You are correct in saying that pages are just pages, because a page is just a chunk of physical addresses assigned to a pfn and has no meaning in userspace or kernel space without a vma.
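As a rough userspace analogy for "a page has no meaning without a mapping", the sketch below maps the same file-backed page at two different virtual addresses inside one process and shows that a write through one mapping is visible through the other. The function name `same_page_two_mappings` is made up for illustration, and this only demonstrates aliasing of one page by two mappings; it is not how io_uring sets up its rings and it does not involve a kernel mapping at all.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps one page of a temporary file twice with MAP_SHARED.
 * Returns 1 if the two mappings sit at different virtual addresses and
 * a write through the first is visible through the second, 0 if not,
 * and -1 if the setup itself failed. */
int same_page_two_mappings(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    FILE *f = tmpfile();                 /* anonymous backing file */
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    if (ftruncate(fd, (off_t)len) != 0) {
        fclose(f);
        return -1;
    }

    /* Two independent mappings of the same underlying page. */
    char *a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED)
            munmap(a, len);
        if (b != MAP_FAILED)
            munmap(b, len);
        fclose(f);
        return -1;
    }

    strcpy(a, "written via A");          /* write through mapping A */
    int ok = (a != b) && strcmp(b, "written via A") == 0;

    munmap(a, len);
    munmap(b, len);
    fclose(f);
    return ok;
}
```

The two pointers differ, yet they name the same memory. Which address a mapping uses, and what permissions it carries, are properties of the mapping rather than of the page itself.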

There is a difference between kernel and userspace mappings. Mapping userspace virtual addresses onto direct-mapped kernel addresses that the kernel is manipulating is dangerous, and many CVEs have taken advantage of these types of command buffers on other kernels.


Does the distinction between sharing VA mappings and copying buffers to/from kernel matter from a security perspective? (I assume it does, but I don't know why.)


Yes. With a shared mapping you're looking at kernel pages through userspace virtual memory mappings, which isn't the case with copy to user. There you're just copying data between a userspace page and a kernel page, and only in kernel mode. You don't get to "see" kernel pages, and in fact post-Spectre/Meltdown the kernel is unmapped in userspace.



