If you think that was bad, just listen to what Theo de Raadt had to say. "NFSv4 ...

wahern · on June 21, 2022

Trash or not, the demand for the features is there. OpenBSD enjoys the luxury of simply telling people who need more sophisticated features to piss off, at least until the time a protocol or interface has been hashed out and settled into a static target.

Notably, OpenBSD has an IPv6 and IPSec (including IKE) stack second to none. If OpenBSD developers actually had a need for the features provided by NFSv4, I'm sure OpenBSD would have an exceptionally polished and refined--at least along the dimensions they care about--implementation. But they don't. What they do have are relatively well-maintained NFSv3 and YP stacks (not even NIS!), because those things are important to Theo, especially for (AFAIU) maintaining the build farm and related project infrastructure.

bgm1975 · on June 21, 2022

YP is NIS. It was renamed by Sun due to the trademark on Yellow Pages. Maybe you’re thinking of NIS+ (which was an abomination). TBH, they are both horrible for their own reasons.

wahern · on June 21, 2022

Ah, yes, NIS+. Thank you for the correction.

I also had in mind that OpenBSD deliberately and rigorously only refers to "YP" ("Yellow Pee"). Google "OpenBSD" and "NIS" and most of the hits you'll see directly from the OpenBSD project are from commit logs for patches removing accidental usages of "NIS" in initial YP-related feature commits. I'm not quite sure why they do that. I've kind of assumed it's to make clear that they have little interest in addressing vendor compatibility issues, and to emphasize that YP support, such as it is, is narrowly tailored to supporting the needs of the OpenBSD project itself. That's quite different from IPv6, IPSec/IKE, and even NFSv3, where cross-vendor interoperability is a concern (within reason).

DonHopkins · on June 21, 2022

Speaking of YP (which I always thought sounded like a brand of moist baby poop towelettes), BSD, wildcard groups, SunRPC, and Sun's ingenuous networking and security and remote procedure call infrastructure, who remembers Jordan Hubbard's infamous rwall incident on March 31, 1987?

https://news.ycombinator.com/item?id=25156006

https://en.wikipedia.org/wiki/Jordan_Hubbard#rwall_incident

>rwall incident

>On March 31, 1987 Hubbard executed an rwall command expecting it to send a message to every machine on the network at University of California, Berkeley, where he headed the Distributed Unix Group. The command instead began broadcasting Hubbard's message to every machine on the internet and was stopped after Hubbard realised the message was being broadcast remotely after he received complaints from people at Purdue University and University of Texas. Even though the command was terminated, it resulted in Hubbard receiving 743 messages and complaints, including one from the Inspector General of ARPAnet.

I was logged in on my Sun workstation "tumtum" when it happened, so I received his rwall too, and immediately sent him a humorous email with the subject of "flame flame flame" which I've lost in the intervening 35 years, but I still have a copy of his quick reply:

    From: Jordan K. Hubbard <jkh%violet.Berkeley.EDU@berkeley.edu>
    Date: Tue, Mar 31, 1987, 11:02 PM
    To: Don Hopkins <don@tumtum.cs.umd.edu>
    Subject: re: flame flame flame

    Thanks, you were nicer than most.. Here's the stock letter I've been
    sending back to people:

    Thank you, thank you..

    Now if I can only figure out why a lowly machine in a basement somewhere
    can send broadcast messages to the entire world. Doesn't seem *right*
    somehow.

                                        Yours for an annoying network.

                                        Jordan

    P.S. I was actually experimenting to see exactly now bad a crock RPC was.
    I'm beginning to get an idea. I look forward to your flame.

                                                Jordan

Here's the explanation he sent to hackers_guild, and some replies from old net boys like Milo Medin (who said that Dennis G. Perry, the program manager of the Arpanet in the Information Science and Technology Office of DARPA, had said they would kick UCB off the Arpanet if it ever happened again), Mark Crispin (who presciently proposed cash rewards for discovering and disclosing security bugs), and Dennis G. Perry himself:

    From: Jordan K. Hubbard <jkh%violet.Berkeley.EDU@berkeley.edu>
    Date: April 2, 1987
    Subject: My Broadcast

    By now, many of you have heard of (or seen) the broadcast message I sent to
    the net two days ago. I have since received 743 messages and have
    replied to every one (either with a form letter, or more personally
    when questions were asked). The intention behind this effort was to
    show that I wasn't interested in doing what I did maliciously or in
    hiding out afterwards and avoiding the repercussions. One of the
    people who received my message was Dennis Perry, the Inspector General
    of the ARPAnet (in the Pentagon), and he wasn't exactly pleased.
    (I hear his Interleaf windows got scribbled on)

    So now everyone is asking: "Who is this Jordan Hubbard, and why is he on my
    screen??"

    I will attempt to explain.

    I head a small group here at Berkeley called the "Distributed Unix Group".
    What that essentially means is that I come up with Unix distribution software
    for workstations on campus. Part of this job entails seeing where some of
    the novice administrators we're creating will hang themselves, and hopefully
    prevent them from doing so. Yesterday, I finally got around to looking
    at the "broadcast" group in /etc/netgroup which was set to "(,,)". It
    was obvious that this was set up for rwall to use, so I read the documentation
    on "netgroup" and "rwall". A section of the netgroup man page said:

      ...
         Any of three fields can be empty, in which case it signifies
         a wild card.  Thus
                    universal (,,)
         defines a group to which everyone belongs.  Field names that ...
      ...

    Now "everyone" here is pretty ambiguous. Reading a bit further down, one
    sees discussion on yellow-pages domains and might be led to believe that
    "everyone" was everyone in your domain. I know that rwall uses point-to-point
    RPC connections, so I didn't feel that this was what they meant, just that
    it seemed to be the implication.

    Reading the rwall man page turned up nothing about "broadcasts". It doesn't
    even specify the communications method used. One might infer that rwall
    did indeed use actual broadcast packets.

    Failing to find anything that might suggest that rwall would do anything
    nasty beyond the bounds of the current domain (or at least up to the IMP),
    I tried it. I knew that rwall takes awhile to do its stuff, so I left
    it running and went back to my office. I assumed that anyone who got my
    message would let me know.. Boy, was I right about that!

    After the first few mail messages arrived from Purdue and Utexas, I begin
    to understand what was really going on and killed the rwall. I mean, how
    often do you expect to run something on your machine and have people
    from Wisconsin start getting the results of it on their screens?

    All of this has raised some interesting points and problems.

    1. Rwall will walk through your entire hosts file and blare at anyone
       and everyone if you use the (,,) wildcard group. Whether this is a bug
       or a feature, I don't know.

    2. Since rwall is an RPC service, and RPC doesn't seem to give a damn
       who you are as long as you're root (which is trivial to be, on a work-
       station), I have to wonder what other RPC services are open holes. We've
       managed to do some interesting, unauthorized, things with the YP service
       here at Berkeley, I wonder what the implications of this are.

    3. Having a group called "broadcast" in your netgroup file (which is how
       it comes from sun) is just begging for some novice admin (or operator
       with root) to use it in the mistaken belief that he/she is getting to
       all the users. I am really surprised (as are many others) that this has
       taken this long to happen.

    4. Killing rwall is not going to solve the problem. Any fool can write
       rwall, and just about any fool can get root priviledge on a Sun workstation.
       It seems that the place to fix the problem is on the receiving ends. The
       only other alternative would be to tighten up all the IMP gateways to
       forward packets only from "trusted" hosts. I don't like that at all,
       from a standpoint of reduced convenience and productivity. Also, since
       many places are adding hosts at a phenominal rate (ourselves especially),
       it would be hard to keep such a database up to date. Many perfectly well-
       behaved people would suffer for the potential sins of a few.

    I certainly don't intend to do this again, but I'm very curious as to
    what will happen as a result. A lot of people got wall'd, and I would think
    that they would be annoyed that their machine would let someone from the
    opposite side of the continent do such a thing!

                             Jordan Hubbard
                             jkh@violet.berkeley.edu (ucbvax!jkh)
                             Computer Facilities & Communications.
                             U.C. Berkeley

    From: Milo S. Medin <medin@orion.arpa>
    Date: Apr 6, 1987, 5:06 AM

    Actually, Dennis Perry is the head of DARPA/IPTO, not a pencil pusher
    in the IG's office.  IPTO is the part of DARPA that deals with all
    CS issues (including funding for ARPANET, BSD, MACH, SDINET, etc...).
    Calling him part of the IG's office on the TCP/IP list probably didn't
    win you any favors.  Coincidentally I was at a meeting at the Pentagon
    last Thursday that Dennis was at, along with Mike Corrigan (the man
    at DoD/OSD responsible for all of DDN), and a couple other such types
    discussing Internet management issues, when your little incident
    came up.  Dennis was absolutely livid, and I recall him saying something
    about shutting off UCB's PSN ports if this happened again.  There were
    also reports about the DCA management types really putting on the heat
    about turning on Mailbridge filtering now and not after the buttergates
    are deployed.  I don't know if Mike St. Johns and company can hold them
    off much longer.  Sigh...  Mike Corrigan mentioned that this was the sort
    of thing that gets networks shut off.  You really pissed off the wrong
    people with this move! 

    Dennis also called up some VP at SUN and demanded this hole
    be patched in the next release.  People generally pay attention
    to such people.

                                            Milo

    From: Mark Crispin <MRC%PANDA@sumex-aim.stanford.edu>
    Date: Mon, Apr 6, 1987, 10:15 AM

    Dan -

         I'm afraid you (and I, and any of the other old-timers who
    care about security) are banging your head against a brick wall.
    The philsophy behind Unix largely seems quite reminiscent of the
    old ITS philsophy of "security through obscurity;" we must
    entrust our systems and data to a open-ended set of youthful
    hackers (the current term is "gurus") who have mastered the
    arcane knowledge.

         The problem is further exacerbated by the multitude of slimy
    vendors who sell Unix boxes without sources and without an
    efficient means of dealing with security problems as they
    develop.

         I don't see any relief, however.  There are a lot of
    politics involved here.  Some individuals would rather muzzle
    knowledge of Unix security problems and their fixes than see them
    fixed.  I feel it is *criminal* to have this attitude on the DDN,
    since our national security in wartime might ultimately depend
    upon it.  If there is such a breach, those individuals will be
    better off if the Russians win the war, because if not there will
    be a Court of Inquiry to answer...

         It may be necessary to take matters into our own hands, as
    you did once before.  I am seriously considering offering a cash
    reward for the first discoverer of a Unix security bug, provided
    that the bug is thoroughly documented (with both cause and fix).
    There would be a sliding cash scale based on how devastating the
    bug is and how many vendors' systems it affects.  My intention
    would be to propagate the knowledge as widely as possible with
    the express intension of getting these bugs FIXED everywhere.

         Knowledge is power, and it properly belongs in the hands of
    system administrators and system programmers.  It should NOT be
    the exclusive province of "gurus" who have a vested interest in
    keeping such details secret.

    -- Mark --

    PS: Crispin's definition of a "somewhat secure operating system":
    A "somewhat secure operating system" is one that, given an
    intelligent system management that does not commit a blunder that
    compromises security, would withstand an attack by one of its
    architects for at least an hour.

    Crispin's definition of a "moderately secure operating system": a
    "moderately secure operating system" is one that would withstand
    an attack by one of its architects for at least an hour even if
    the management of the system are total idiots who make every
    mistake in the book.
    -------

    From: Dennis G. Perry <PERRY@vax.darpa.mil>
    Date: Apr 6, 1987, 3:19 PM

    Jordan, you are right in your assumptions that people will get annoyed
    that what happened was allowed to happen.

    By the way, I am the program manager of the Arpanet in the Information
    Science and Technology Office of DARPA, located in Roslin (Arlington), not
    the Pentagon.

    I would like suggestions as to what you, or anyone else, think should be
    done to prevent such occurances in the furture.  There are many drastic
    choices one could make.  Is there a reasonable one?  Perhaps some one
    from Sun could volunteer what there action will be in light of this
    revelation.  I certainly hope that the community can come up with a good
    solution, because I know that when the problem gets solved from the top
    the solutions will reflect their concerns.

    Think about this situation and I think you will all agree that this is
    a serious problem that could cripple the Arpanet and anyother net that
    lets things like this happen without control.

    dennis
    -------
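
The wildcard rule quoted from the netgroup man page in Jordan's message (an empty field matches anything) can be sketched in a few lines of Python. This is only an illustration, not Sun's implementation: the `(host, user, domain)` field order follows the usual netgroup(5) convention, the names `Triple`, `parse_triple`, and `matches` are hypothetical, and other netgroup details (nested groups and so on) are left out.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """One netgroup entry; None stands for an empty (wildcard) field."""
    host: Optional[str]
    user: Optional[str]
    domain: Optional[str]


def parse_triple(text: str) -> Triple:
    """Parse a string such as "(,,)" or "(violet,jkh,)" into a Triple."""
    inner = text.strip()
    if not (inner.startswith("(") and inner.endswith(")")):
        raise ValueError(f"not a netgroup triple: {text!r}")
    fields = [field.strip() for field in inner[1:-1].split(",")]
    if len(fields) != 3:
        raise ValueError(f"expected three fields: {text!r}")
    # An empty field becomes None, which matches() treats as a wildcard.
    return Triple(*(field or None for field in fields))


def matches(triple: Triple, host: str, user: str, domain: str) -> bool:
    """True if every non-empty field of the triple equals the given value."""
    return all(
        wanted is None or wanted == actual
        for wanted, actual in zip(triple, (host, user, domain))
    )
```

With this reading, `matches(parse_triple("(,,)"), host, user, domain)` is true for any arguments at all, which is the "group to which everyone belongs" the man page describes. Nothing in the triple itself limits that to a local domain.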

Also:

http://catless.ncl.ac.uk/Risks/4.73.html#subj10.1

https://everything2.com/title/Jordan+K.+Hubbard

tptacek · on June 21, 2022

Immediately one of the all-time great HN posts.

dekhn · on June 21, 2022

Thanks, always fun to read this history.

smarks · on June 21, 2022

OK, I read the Olaf Kirch article, and the "NFS Sucks" title is mostly clickbait. There are indeed a bunch of shortcomings in NFS that he points out, that are partially addressed by NFSv4. He also admits that (as of 2006) there isn't anything better.

Locking has historically always been a problem in NFS. Kirch mentions that NLM was designed for POSIX semantics only. I frankly don't know if NLM is related to `rpc.lockd` which appeared in SunOS 4 and possibly even SunOS 3 (mid 1980s at this point) which well predates anything having to do with POSIX. Part of the problem is the confused state of file locking in the Unix world, even for local files. There was BSD-style `flock` and SYSV-style `lockf` and there might even have been multiple versions of those. Implementing these in a distributed system would have been terribly complex. Even at Sun, at least through the mid 1990s, the conventional wisdom was to avoid file locking. If you really needed something that supported distributed updates, it was better to use a purpose-built network protocol.
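
To make the split between the two locking families concrete, here is a minimal sketch using Python's `fcntl` module on a local file. It assumes Linux or BSD local-filesystem semantics. Note that Python's `fcntl.lockf` is a wrapper around the `fcntl()` record-locking calls rather than the C `lockf` function, and the helper names are made up for illustration. Behaviour over NFS may well differ.

```python
import fcntl
import os
import tempfile


def second_flock_blocked(path):
    """BSD-style flock: is a second open() of the same file refused the lock?"""
    with open(path, "w") as first, open(path, "w") as second:
        fcntl.flock(first, fcntl.LOCK_EX)
        try:
            # LOCK_NB makes the call fail immediately instead of blocking.
            fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        else:
            return False
        finally:
            fcntl.flock(first, fcntl.LOCK_UN)


def second_record_lock_blocked(path):
    """Same experiment with fcntl()-based record locks via fcntl.lockf."""
    with open(path, "w") as first, open(path, "w") as second:
        fcntl.lockf(first, fcntl.LOCK_EX)
        try:
            fcntl.lockf(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return True
        else:
            return False
        finally:
            fcntl.lockf(first, fcntl.LOCK_UN)


def demo():
    """Run both experiments against a scratch file in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "lockfile")
        return second_flock_blocked(path), second_record_lock_blocked(path)
```

On a local Linux filesystem I would expect `demo()` to return `(True, False)`: the second descriptor is refused the `flock`, while the record lock is granted because those locks belong to the process rather than to the descriptor. Two APIs that look interchangeable already disagree within a single process on a single machine.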

One thing "willy" got right in his comment is that NFS is an example of "worse is better". In its early version, it had the benefit of being relatively simple, as acknowledged in the LWN article. This made it easy to port and reimplement and thus it became widespread.

Of course being simple means there are lots of tradeoffs and shortcomings. To address these you need to make things more complex, and now things are "ridiculous" and "bloated". Oh well.

Enderboi · on June 21, 2022

The funny thing is that we're running our large NFS servers on OmniOS (with Linux NFS clients), as plugins for a certain large PHP-based blogging platform love to sprinkle LOCK_EX flocks all over the place.

Sadly with a Linux NFS server, lock state eventually corrupts itself to extinction but OmniOS can tick along past 300 days uptime without a problem.

Of course... these issues only show up under production levels of load, and we have never been able to distill them into a reproducible test case.
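
For anyone stuck with code that takes exclusive `flock`s on a shared mount, one defensive pattern is to never block indefinitely. The sketch below is in Python rather than PHP and is an assumption-laden illustration, not a fix for the server-side problem: `exclusive_lock` and its `timeout` and `poll` parameters are hypothetical names, and it only turns a wedged lock into a visible error instead of a hung process.

```python
import contextlib
import fcntl
import time


@contextlib.contextmanager
def exclusive_lock(path, timeout=5.0, poll=0.05):
    """Hold an exclusive flock on path, giving up after timeout seconds."""
    deadline = time.monotonic() + timeout
    # Mode "a" creates the file if needed without truncating existing content.
    with open(path, "a") as handle:
        while True:
            try:
                # Non-blocking attempt; raises BlockingIOError if already held.
                fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
                break
            except BlockingIOError:
                if time.monotonic() >= deadline:
                    raise TimeoutError(
                        f"could not lock {path} within {timeout}s"
                    )
                time.sleep(poll)
        try:
            yield handle
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

Typical use is `with exclusive_lock("/some/shared/file.lock"):` around the critical section, with the `TimeoutError` logged so that stuck lock state shows up in monitoring.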

FML, and FNL ( = fricking NFS locking :P)