This would mean that every container has its own buffer cache, that you can no longer have intentional shared state (K8s secrets, shared volumes, etc.), and that you must construct block overlays instead of cheap file overlays. You'd definitely be losing some of the advantages a container brings.
There are other advantages — low fixed resource costs, global memory management and scheduling, no resource stranding, etc. — but the core intent of gVisor is to capture as many valuable semantics as possible (including the file system semantics) while adding a sufficiently hard security boundary.
I'm not saying that moving the file system up into the sandbox (which is basically what a block device gives you) is bad, just that there are complex trade-offs. The gVisor root file system overlay is essentially that (the block device is a single sparse memfd, with metadata kept in memory), but applied only to the parts of the file system that are modified.
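To make the "only the modified parts" point concrete, here is a toy sketch of a file-level overlay in Python. It is not gVisor's code: `FileOverlay`, its methods, and the dict-backed layers are all hypothetical names I made up for illustration, and a plain `bytearray` stands in for the sparse memfd. The idea it shows is just that unmodified files keep being served from the shared lower layer, while each sandbox privately holds only what it has written.

```python
class FileOverlay:
    """Toy file-level overlay (illustrative only, not gVisor's implementation)."""

    def __init__(self, lower):
        self.lower = lower    # path -> bytes; shared, never mutated here
        self.upper = {}       # path -> bytearray; only files this sandbox modified
        self.deleted = set()  # whiteouts: lower-layer files removed in this sandbox

    def read(self, path):
        if path in self.deleted:
            raise FileNotFoundError(path)
        if path in self.upper:
            return bytes(self.upper[path])
        if path in self.lower:
            return self.lower[path]  # served straight from the shared layer
        raise FileNotFoundError(path)

    def write(self, path, offset, data):
        if path not in self.upper:
            # Copy-up on first write; a whiteout means we start from empty.
            base = b"" if path in self.deleted else self.lower.get(path, b"")
            self.upper[path] = bytearray(base)
            self.deleted.discard(path)
        buf = self.upper[path]
        end = offset + len(data)
        if len(buf) < end:
            buf.extend(b"\0" * (end - len(buf)))  # zero-fill any gap
        buf[offset:end] = data

    def remove(self, path):
        self.read(path)  # raises FileNotFoundError if the file is not visible
        self.upper.pop(path, None)
        if path in self.lower:
            self.deleted.add(path)


# Two sandboxes sharing one read-only image.
image = {"/etc/hostname": b"base\n", "/bin/tool": b"binary-bytes"}
a = FileOverlay(image)
b = FileOverlay(image)
a.write("/etc/hostname", 0, b"box-a\n")

print(a.read("/etc/hostname"))  # b'box-a\n'
print(b.read("/etc/hostname"))  # b'base\n'
print(sorted(a.upper))          # ['/etc/hostname']
```

Note that `/bin/tool` never gets copied anywhere, which is the cheap part. With a per-container block device, the sandbox would own the whole file system image and its cache, whether or not anything in it changed.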