
Well, on that part I was referring more specifically to the vdev mess and raidz, and how you cannot just add a disk to the file system. That is pretty silly, considering that at the time there were plenty of ideas for how to do it much better, although for nodes in distributed systems rather than for disks. But apparently no one on the ZFS team had any idea about any of it.



ZFS doesn't work that way: it works on pools of storage. Disks are added to the pool.

Filesystems (or datasets) are carved out of a pool, not out of physical disks directly. There are vdevs on top of physical disks.
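
To make that layering concrete, here is a toy Python model of it. This is only a sketch, not ZFS code or any real ZFS API: the `Vdev` and `Pool` classes, their method names, and the capacity arithmetic are all made up for illustration, and the arithmetic ignores metadata, padding and everything else real ZFS accounts for.

```python
from dataclasses import dataclass, field


@dataclass
class Vdev:
    """Hypothetical stand-in for a vdev: a group of physical disks."""
    kind: str          # "stripe", "mirror" or "raidz1" (illustrative only)
    disk_sizes: list   # disk sizes in GB

    def usable(self) -> int:
        smallest = min(self.disk_sizes)
        count = len(self.disk_sizes)
        if self.kind == "mirror":
            return smallest                 # every disk holds the same data
        if self.kind == "raidz1":
            return smallest * (count - 1)   # roughly one disk's worth of parity
        return sum(self.disk_sizes)         # plain stripe


@dataclass
class Pool:
    """Hypothetical stand-in for a pool: vdevs below, datasets on top."""
    vdevs: list = field(default_factory=list)
    datasets: dict = field(default_factory=dict)  # dataset name -> GB used

    def add_vdev(self, vdev: Vdev) -> None:
        # Growing the pool means adding a vdev, not touching any dataset.
        self.vdevs.append(vdev)

    def capacity(self) -> int:
        return sum(v.usable() for v in self.vdevs)

    def free(self) -> int:
        return self.capacity() - sum(self.datasets.values())

    def write(self, dataset: str, gb: int) -> None:
        # Datasets have no disks of their own; they all draw on pool space.
        if gb > self.free():
            raise ValueError("pool is out of space")
        self.datasets[dataset] = self.datasets.get(dataset, 0) + gb


pool = Pool()
pool.add_vdev(Vdev("raidz1", [1000, 1000, 1000]))
pool.write("home", 1500)
pool.add_vdev(Vdev("mirror", [500, 500]))  # every dataset sees the new space
```

The point of the sketch is that nothing in `write` names a disk: datasets only ever see the pool, and capacity changes happen one level down, at the vdev layer.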

That's like people who are stuck back in the '80s because of ext3 arguing that ZFS is crap because it doesn't come with fsck or "some sort of repair tool" -- they just can't wrap their heads around the ZFS concepts.


That was a pretty common pattern at Sun. Engineers there did a lot of great work, no denying it. Like many other projects, ZFS solved some pretty difficult and important problems while ignoring others. Which ones get talked about ad nauseam? We all make mistakes, of course, but some of those ignored problems were well known at the time to be worth solving. Did those self-professed "complete engineers" learn anything from those times they missed the target? The fact that Sun doesn't exist any more suggests an answer.


The problem is that they cannot all be solved with one single thing. Jeff Bonwick and his team knew that. They knew what was left on the table and they had to make that call. A completely fault-tolerant distributed system which can correctly arbitrate a write to a single shared resource (usually a file or even a block) from multiple nodes at the exact same time is a problem which has not been solved correctly yet, to the best of my knowledge and belief. I myself have spent a considerable amount of time (20 years, to be precise) trying to solve that particular problem correctly for all edge cases. That problem is terra incognita in computer science and has yet to be thoroughly researched, let alone understood. No attempt at solving that particular problem 100% correctly has been successful, by anyone, but if you know of someone, please do tell; I'd love to be corrected on this and start using such technology immediately.

The ZFS team was well aware of this and that's why they made the trade-offs they did. They picked a subset of problems and they solved them. Even in hindsight, to me at least, it's obvious why.

But to suggest that Sun Microsystems doesn't exist because the company's engineers picked only a subset of problems they knew they could solve is... either terribly ignorant or wilfully malicious, or both, since malice often stems from ignorance. Sun Microsystems lost because they were too expensive, more expensive than garbage Intel personal computer servers, and because Solaris wasn't open source in 1993. It's that simple.



