Thanks to Microsoft's recent release of the MS-DOS 2.0 source code, we can now peek under the hood and confirm that Microsoft specifically intended for the DOS 2.0 file APIs to be compatible with Unix. From XENIX.ASM [1], the code that implements the new API:
;
; xenix file calls for MSDOS
;
TITLE XENIX - IO system to mimic UNIX
And the CONFIG.DOC [2] file discusses the 'AVAILDEV' option which lets the system mimic Unix even more:
AVAILDEV = <TRUE or FALSE>
The default is TRUE which means both /dev/<dev> and
<dev> will reference the device <dev>. If FALSE is
selected, only /dev/<dev> refers to device <dev>,
<dev> by itself means a file in the current directory
with the same name as one of the devices.
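As a rough illustration of the rule in that excerpt, here is a minimal Python sketch. The function name `refers_to_device` and the short device list are assumptions made for this example; none of it is taken from the DOS source.

```python
# Hypothetical sketch of the AVAILDEV rule quoted above; not DOS's actual code.
DEVICES = {"con", "prn", "aux", "nul"}  # illustrative subset of DOS device names

def refers_to_device(name: str, availdev: bool = True) -> bool:
    """Return True if `name` would be treated as a device under the quoted rule."""
    lowered = name.lower()
    if lowered.startswith("/dev/"):
        # /dev/<dev> always refers to the device.
        return lowered[len("/dev/"):] in DEVICES
    # A bare <dev> is a device only while AVAILDEV is TRUE.
    return availdev and lowered in DEVICES

print(refers_to_device("con", availdev=True))
print(refers_to_device("con", availdev=False))
```

With AVAILDEV = FALSE, a bare name such as `con` falls through to ordinary file lookup in this sketch, which is the Unix-like behaviour the document describes.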
Finally, an example CONFIG.SYS file from the same document:
A typical configuration file might look like this:
BUFFERS = 10
FILES = 10
DEVICE = /bin/network.sys
BREAK = ON
SWITCHAR = -
SHELL = a:/bin/command.com a:/bin -p
I think it's pretty clear how Microsoft intended for MS-DOS to be configured, but alas IBM had other ideas...
I really look forward to the day NT's source is released. It's a truly fascinating kernel. Imagine somebody building an entire Unix on top of NT! Like WSL, but even beyond.
Apart from fork() and custom async io (epoll, iocp) aren’t they pretty compatible even without cygwin? GNU relies on POSIX (afair), so wouldn’t NT features be unused extra?
* It is (with a lot of ifs and buts) a microkernel
* it is optimized for integration with third party software from people who don’t have the source, so for instance the driver model is interesting
* the built in configuration system (the registry) and how it’s used throughout
* the (underused) personalities system you can use to show different apis to different binaries
* the security model is much more interesting, while in Linux you have ‘root’ and everything else, on Windows this is much more granular (unfortunately it’s so complex it’s basically impossible to use).
The architecture is really quite interesting, even though Microsoft didn’t make a lot of use of a large part of it.
If I'm not mistaken, the Win32 API is actually a subsystem to the NT kernel. You can call the kernel itself a layer below with functions beginning with 'nt*'.
Hardware is actually mapped as an object namespace, which is presented to the user as the drive letters. This was exposed with Windows XP booting in safe mode; during the boot process it would print file paths as object paths, not drive paths.
Much as I'm more in the *nix way now, there's plenty of curiosities to explore and tinker in Windows.
The big difference is when you're in a multi-computer (Active Directory / NIS / LDAP) environment. On UNIX all the IDs are smallish integers, so you have to be careful to ensure they're unique and non-overlapping. On Windows you have a "SID" which is variable length and (for users) usually a big random number.
Linux user IDs have been 32-bit integers for decades; you can just use some mapping system to allocate them automatically and you’ll never run out.
The limitation is that there is one user ID, 0 which can do everything and all the other IDs can do almost nothing.
This has nothing to do with domains and everything with the distinction you describe between the Windows Administrator, local system or even more powerful trustedinstaller accounts.
Yes and no. Bear in mind, a large part of a Windows SID is a namespace - the actual ID within that namespace has so far, without exception, fit within 32 bits. An entire Active Directory domain (read: single domain, not forest) is actually limited to 2^30 RIDs being issued - after which no new accounts (including computer accounts) can be created, period. You can technically unlock an extra bit and issue 2^31 RIDs starting with WS2012, but compatibility is a potential issue and MS's documentation says you should only use it while planning a migration to a new domain (and for good reason).
This does technically give Windows some advantage here as SIDs are namespaced - you can have multiple domains in a forest, domain trusts, etc - but I don't think it makes much of a difference as far as the realistic number of users accessing a network goes.
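To make the "namespace plus ID" split concrete, here is a small Python sketch that separates a textual SID into its prefix and trailing RID. The helper name `split_sid` and the SID value are made up for illustration; the sketch only looks at the string form and does not call any Windows API.

```python
# Illustrative only: the SID below is a made-up example, not a real account.
def split_sid(sid: str) -> tuple:
    """Split a textual SID into its namespace prefix and trailing RID."""
    prefix, _, rid = sid.rpartition("-")
    return prefix, int(rid)

domain, rid = split_sid("S-1-5-21-1111111111-2222222222-3333333333-1105")
print(domain)        # the part shared by every account in the domain
print(rid)           # the per-account relative ID
print(rid < 2**30)   # whether it sits under the RID ceiling mentioned above
```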
Where it does suck on Linux, however, is user namespaces. 32 bits is a lot when it comes to just giving out accounts, but it's nowhere near enough to give every user a 16-bit chunk of accounts for mapping the traditional 0-65535 (because nobody) ranges for use with unprivileged user namespaces. I'd really like to see a push for 64-bit uids/gids for this reason.
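The arithmetic behind that complaint, as a back-of-the-envelope Python sketch. It assumes every user gets one full 16-bit range and ignores any reserved IDs, so treat the numbers as upper bounds rather than what a real allocator would hand out.

```python
UID_BITS = 32     # width of a Linux uid, per the comments above
RANGE_BITS = 16   # one traditional 0-65535 range per user

max_users_32 = 2 ** UID_BITS // 2 ** RANGE_BITS
max_users_64 = 2 ** 64 // 2 ** RANGE_BITS   # with the hoped-for 64-bit ids

print(max_users_32)   # 65536
print(max_users_64)   # 281474976710656
```

So a 32-bit ID space carved into 16-bit chunks supports at most 65,536 such users, which is the shortfall being described.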
And yet in practice the only problem with them is when mapping Windows SID is needed. Otherwise, they are fine.
Also, Windows SIDs are fixed-size 128 bit. They were supposed to be GUIDs, but they are not that random; user SIDs contain common prefix from the domain SID.
Almost every time I talked to pseudo NT gurus, IOCP was mentioned as something NT did and Linux didn't. Now, io_uring seems to have finally closed the gap.
No, like io_uring which Linux got in 2019 and still doesn't cover everything yet (e.g. currently mkdirat() is being added). It also often falls back to a kernel-level thread pool, since many e.g. file systems don't implement async IO.
>since many e.g. file systems don't implement async IO
If we're picking on Linux here we should also mention that this issue cannot exist on Windows because it doesn't support anything else besides NTFS and a couple of options dating back to the 18th century or so.
Oh, and WinFS, of course, F being short for future which is like the horizon, or the communism, always out there just a month or two away.
Sure, and there are third party drivers for ext* and btrfs and god knows what else… but we're talking about official support here, I think. You can find all sorts of craziness in out-of-tree patches for the Linux kernel.
“In the mid-1980s, Microsoft and IBM cooperated to develop the OS/2 operating system, which was written in assembly language for single-processor Intel 80286 systems. In 1988, Microsoft decided to make a fresh start and to develop a “new technology” (or NT) portable operating system that supported both the OS/2 and POSIX application programming interfaces (APIs). In October 1988, Dave Cutler, the architect of the DEC VAX/VMS operating system, was hired and given the charter of building this new operating system.”
Is the fact that it's a very stable and robust kernel that evolved over different means, with different priorities and expectations not enough?
The kernel itself is very interesting, regardless of what happened in the Linux world. If all you ever look at is Linux, you get stuck in Linux ideas of how things should, or even can, be done.
What? Can you elaborate? I mean if you want non blocking IO from an fd in Linux, you can just do that. Not sure what defaults have to do with anything. Your code will still have to be written appropriately.
Default in linux is read()/write() and be blocked. Doing async is a lot more work and until relatively (to NT timelines) recently quite limited (select).
On NT the standard way is async, and doing things in the block-and-wait way is abnormal and unusual.
Defaults matter. Because that's what most people will do.
Linux uses read(2)/write(2) for both blocking and non-blocking I/O. I still don't understand what you mean by how NT does it. Either your code is written to accommodate async IO or not.
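For readers following this exchange, a small Unix-oriented Python sketch of the point about read(2): the same call is used either way, and only the descriptor's mode changes. It uses a pipe purely as a convenient descriptor and says nothing about how NT's completion model works.

```python
import os

# Unix-oriented sketch: the same read call, with the descriptor in
# non-blocking mode.
read_fd, write_fd = os.pipe()

os.set_blocking(read_fd, False)   # switch the descriptor to non-blocking mode
try:
    os.read(read_fd, 16)          # nothing has been written yet
    outcome = "got data"
except BlockingIOError:
    outcome = "would block"       # the call returns at once instead of sleeping

os.write(write_fd, b"hi")
data = os.read(read_fd, 16)       # same call now returns the pending bytes

os.close(read_fd)
os.close(write_fd)
print(outcome, data)
```

Without the `os.set_blocking` line, the first `os.read` would simply wait, which is the default behaviour the earlier comment is pointing at.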
In addition to its use as a path separator in DOS and Windows, the backslash character itself is also interesting because it is very likely a modern invention, with the first attestation in 1940s (!). Its original use in ASCII (1960s) was for ALGOL operator digraphs `\/` and `/\`. Its early use as the C escape sequence (1970s, replacing `*` in BCPL) suggests that it carried no significant semantics at that time.
It was designed at a time when keyboards around the world (German QWERTZ, Cyrillic JCKUEN/ЙЦУКЕН) and text encodings (remember that this is pre-Unicode, so we're dealing with ISO 646 and Eastern Asian character sets) had only the subset of Latin characters used in the US. Nowadays, it is strongly recommended to simply use the standard American keyboard in programming (outside of comments).
I still cannot decide whether it's better to comment in English or in my native language. English works, but it's a weird fit with words from the application domain. Those, I don't really want to translate, but I have to keep doing it because there is so much legacy code with the translations. Sure, I could do a rampage through the code base and fix everything up (would be done in an afternoon thanks to IntelliJ), but my team would almost certainly give me lots of flak for this. At the same time, it also feels weird to conjugate English words in my native language.
One of the comments on that blog post says the backslash was invented by Bob Bemer at the time he developed ASCII in the 1960s, also linking to a page that supports the claim:
I briefly checked the comments and missed that. ;-) I would be more cautious about saying that he invented backslashes, as it's unknown whether he was aware of earlier uses of backslashes. (I've checked both the page and its citations and they gave no more information.)
> I would be more cautious to say that he invented backslashes
You're right of course, I was just relaying what the post there said. There was prior usage as early as in the 1940s as you mentioned. According to Wikipedia, [1] its origins are unknown:
> As of January 2021, Wikipedia editors have not been able to find the origin of this character nor even the purposes to which it was put before the 1960s. The earliest known reference found to date is a 1945 bulletin from the Teletype Corporation that lists it as a replaceable part for its Wheatstone perforator.
On the other hand, I think it's safe to say it would have remained an obscure character had it not been for Bob Bemer.
While it was included in early character sets it just wasn't used much - largely because the ubiquitous 029 card punch had no \ key (or [] ... one had to learn the multipunches)
I have to suppress the urge to correct people both when they refer to it as a backslash and when they use the term "forward slash". I've already lost all my friends by being that guy; I don't need to also earn the enmity of random strangers on the internet.
I live in Britain and don't have an issue with it, '#' is originally a symbol for pounds, a shorthand for 'lb', and we are used to the idea that pounds can mean weight, as well as our currency, so why not this symbol as well?
It's believed the 'pound' in pound sterling came from a pound of silver or silver coins in weight originally as well.
I believe it is related to them both being called 'pound', and it is very annoying. The two are not the same symbol, and I don't know why they were equated. Technically, in a distant root, they are related, but they are distinct in usage and meaning.
That's not the same at all. Hash-tag is just a name for that symbol. It's not the original name, but it is a name. "/" is slash! "\" is a backslash! People who do not type windows paths have probably never actually encountered a backslash in their entire lives, but when they see a url they think that the symbol they see all the time in other contexts (/) suddenly has a different name!
A 'hash-tag' is a feature of internet apps like Twitter where you put a '#' in front of some topic name to relate your content with similar content. It's named after the character which is known as a 'hash' among other things.
Or are you saying 'hash-tag' is the name because although it's a mistake it's used so much now it's considered language?
Language isn't one big blob. Among many people it could now be considered an alternative pronunciation, and eventually it could be adopted even in places where people would otherwise know better, but right now, among people in tech and certainly on this site, using 'hash-tag' to mean the character is incorrect and confusing.
But there isn’t another character that anyone calls “hashtag”. It’s like if everyone nontech called “:” a “semicolon” only when it’s preceded by “http”. There is something else called semicolon and it’s not the thing in the url.
I remember it being the number sign when first learning to type in school. I then heard it referred to as pound sign. And once I started into the dev world, it became the ubiquitous comment. Then my favorite became the shebang when paired with the friendly bang/exclamation/pling.
Contextually, yes. Sometimes it matters, sometimes it doesn't. Same thing goes for the dash/underscore distinction. I imagine there must be similar grappling in the editor world between usages of dash vs. hyphen
In 2007 I joined a company that sold a derivatives trading platform and ran VMS on the core component, the order routing engine that dispatched trading instructions to the respective exchange.
It was really weird working on it at first after having used both Unix systems and DOS for so long. I was very familiar with DOS and even used CP/M way back in the day when I was a teenager (I'm that old). VMS was a weird amalgam of a pretty advanced OS with a very capable command line environment like Unix, but with a lot of conventions that felt familiar from DOS. I was aware of some of the history of VMS influence on DOS so it was fascinating to work on and I quite enjoyed it even though it was clearly a dead end at that point.
It was an unfortunate choice because \ is one of the 12 characters that vary between country versions of ISO 646 (of which ASCII was the US profile). This was why Japanese MS-DOS used ¥ instead of \ for the directory separator: it occupied the same code point in the Japanese profile of ISO 646 that \ did in ASCII, and Shift-JIS encoded that as a single byte with the high bit clear.
The complete list of the 12 was: # $ @ [ ] \ ^ ` { } | ~. Notice that / is safe.
They avoided the problem for all the Western languages by inventing their own 8-bit code (this was before the ISO 8859 standards) and always using ASCII in the lower half.
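A small Python sketch of the code-point clash described above. The table is hand-written and covers only three ISO 646 national variants (the German entry is added here as an extra example); it is not a real codec, and the helper name `render` is made up.

```python
# Hand-written table of what byte 0x5C displays as in a few ISO 646 national
# variants. Only three variants are listed; this is not a complete codec.
CHAR_AT_0X5C = {
    "US (ASCII)": "\\",
    "Japan (JIS X 0201)": "\u00a5",    # yen sign
    "Germany (DIN 66003)": "\u00d6",   # capital O with diaeresis
}

def render(path_bytes: bytes, variant: str) -> str:
    """Show how the same bytes would appear under the given variant."""
    text = path_bytes.decode("ascii")
    return text.replace("\\", CHAR_AT_0X5C[variant])

raw = b"C:\\DOS\\COMMAND.COM"
for variant in CHAR_AT_0X5C:
    print(f"{variant:22} {render(raw, variant)}")
```

The bytes on disk never change; only the glyph drawn for 0x5C does, while `/` (0x2F) would look the same in every row.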
The person that made this decision was the author of the original DOS - Al (Allan) Alcorn. I had a conversation with him nearly 2 decades ago about this very subject. I remarked to him how using the "opposite slash" caused grief for untold numbers of developers. His reply was "yeah, I know. It was a poor decision, and I remember it clearly. I was trying to make DOS a bit unique, and that was all. It was stupid, in retrospect." From the author's mouth.
Funny, I've always imagined it was chosen only because it wasn't a regular slash like other systems. I never mentioned it though because I figured I was wrong.
All wrong. "/" was the options-switch-character in CP/M, which had no sub-directories. DECs and VAXes had nothing to do with this decision, because only a few humans had access to such machines. I even remember I had problems adapting to directory trees in MSDOS because files were kinda hidden and lost in a floppy directory.
It's true that DOS was originally a CP/M clone, but CP/M didn't have a defined standard option switch character, and in fact many command option switches weren't preceded by a switch indicator or differed as to what option switch character was used. If anything + and - were the most common prefixes for an option switch.
CP/M was surely the direct influence. Wikipedia says CP/M was itself influenced by TOPS-10, which I think also used the slash for options, so the ultimate origin may be with DEC.
Can anyone explain why that was a problem, from a technical point of view? They already had paths starting with "driveletter:", not like unix with just "/". Why would it be a problem for the parser to distinguish between file paths and argument switches?
Even worse, there was (and still is) no requirement for a command and its switches to be separated with whitespace.
"DIR/W" is the same as "DIR /W".
Which would have made it impossible to determine whether you want to invoke the command "DIR" in the current directory, or the command "W" in a subdirectory named "DIR".
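The ambiguity can be seen in a toy Python tokenizer. This is a deliberately crude sketch, not how COMMAND.COM or CMD actually parse a line, and `split_command` is a made-up name: it simply treats every "/" as the start of a switch, with or without whitespace in front.

```python
# Toy tokenizer, not the real parser: every "/" begins a switch,
# whether or not whitespace precedes it.
def split_command(line: str):
    command, *switches = line.replace(" ", "").split("/")
    return command, ["/" + s for s in switches]

print(split_command("DIR/W"))
print(split_command("DIR /W"))
```

Both lines produce the same command and switch, so under this rule there is no spelling left over that could mean "the program W inside the directory DIR".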
The DEC operating systems (eg. RSX with its MCR shell and VMS with its DCL shell) also handled command-line switches this way (eg. PIP/LIST to list files in a directory).
Of course, they didn't use unixy paths for directories. A fully-qualified file name would be something like (and it's been decades so if I get it wrong forgive me) DRA0:[SYS.USERS.BREGMA.PROJECT.SOURCES]HELLO.C;1 and anyone who was sane would use logical names in DCL to make things readable.
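For anyone who has never seen one of these, a Python sketch that pulls apart the file specification quoted above. The regular expression only covers the device:[dir.dir]name;version shape from that one example and is an assumption for illustration; real VMS specifications allow more forms.

```python
import re

# Pattern for the device:[dir.dir]name.ext;version shape shown above.
# A sketch only: real VMS file specifications allow more forms than this.
VMS_SPEC = re.compile(
    r"^(?P<device>[^:]+):\[(?P<dirs>[^\]]+)\](?P<name>[^;]+);(?P<version>\d+)$"
)

def parse_vms(spec: str) -> dict:
    match = VMS_SPEC.match(spec)
    if match is None:
        raise ValueError(f"unrecognized file specification: {spec!r}")
    return {
        "device": match["device"],
        "directories": match["dirs"].split("."),
        "name": match["name"],
        "version": int(match["version"]),
    }

parsed = parse_vms("DRA0:[SYS.USERS.BREGMA.PROJECT.SOURCES]HELLO.C;1")
# Rough Unix-style rendering of the same location, dropping device and version.
print("/".join(parsed["directories"] + [parsed["name"]]))
```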
Yes, and this is absolutely maddening when trying to do cross-platform work because some programs support / in filenames while others interpret them as arguments.
(Another key difference is that on UNIX the shell expands '*' before passing it to a program, but CMD doesn't so each program has to do its own globbing)
Pretty much all Windows programs support / in filenames.
The main issue is some cmd builtins like mkdir/cd/del which have weird parsing rules where every / is interpreted as starting a command-line switch, even if not preceded by a space.
But even there, you only need to use quotes (mkdir "c:/test") to suppress the interpretation as a switch and then you can use paths with slashes just fine in batch files.
Paths can drop the drive letter e.g. del \file will try to delete a file in the root of the current drive.
On modern Windows you can use slash, but you have to quote the argument e.g. dir "C:/windows"
You can even mix both types together e.g. dir "C:/windows\system32" which is convenient when using code modules that only output unix style paths. No need to clean them up.
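The same tolerance for mixed separators shows up in Python's `ntpath` module, which implements Windows path rules and can be imported on any platform. This sketch only demonstrates that library's behaviour on the example path above; it does not touch the filesystem or prove anything about a particular Windows command.

```python
import ntpath  # Windows path rules, importable on any platform

mixed = "C:/windows\\system32"

print(ntpath.normpath(mixed))      # separators unified to backslashes
print(ntpath.splitdrive(mixed))    # drive letter split off either way
print(ntpath.basename("C:/windows/system32/cmd.exe"))
```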
Personally I find unix or whatever OS chose / as the bad choice. / is a commonly used character, at least in the USA as dates. It would be completely normal for someone to want to name a file "Meeting 12/20/1980.txt" or "Budget Sep/12/1985.doc"
Backslash has no common use I know of outside of computer related stuff like regular expressions and escaping things.
Some of you might also have forgotten but on Mac pre OS-X, at least in standard Mac devtools provided by Apple the separator was colon :
RM/COS used . as a path separator. I think this is from OS/360. It makes sense to me in that member selection in C also uses '.'.
Yeah, from the OS/360 wiki: "The file naming system allows files to be managed as hierarchies with at most 8 character names at each level, e.g. PROJECT.USER.FILENAME. This is tied to the implementation of the system catalog (SYSCTLG) and Control Volumes (CVOLs), which used records with 8 byte keys."
My son used to use \ for dates when he was learning to write; to the point I wondered if he was somewhere above 0 on the dyslexia scale. He's 20 now, and I think has grown out of it.
>*nix defines hierarchical paths with a simple hierarchy rooted at "/" - in *nix's naming hierarchy, there's no way of differentiating between files and directories, etc (this isn't bad, btw, it just is).
I don't see the problem, but I guess it's just a personal preference.
During the time DOS 2.0 was in development, I visited Microsoft with a group of colleagues. (We were touring computer manufacturers both in Seattle and in Silicon Valley, where we visited Digital Research, Intel and many others; in those days it was comparatively easy to visit these enterprises for a tour.)
We were at Microsoft quite some hours (and we seemed important enough to be fed lunch which consisted of very good sandwiches). One of the people who toured us around was a DOS developer (who in his 'spare' time also contributed to the Flight Simulator development). He spent considerable time discussing the new DOS subdirectory matter as it was a hot topic back then. Anyway, he received somewhat of a tongue-lashing from us about the backslash 'problem' and it was very clear to us that he too was not in favor of it although for obvious reasons he chose his words carefully.
This brings me to more ergonomic problems that Microsoft has never bothered to solve with its operating systems, DOS or Windows. The first I'll mention is the annoying reserved character problem, specifically: < > : " / \ | ? * cannot be used in a filename. I'm aware these characters are also deemed illegal in the filenames of other operating systems but I fail to see why, after about 30 years, we still have to worry about avoiding them. If Microsoft had fixed the problem back then, then there would have been pressure for other operating system developers to also fix the problem. Just because other operating systems were behind the times, it didn't mean Microsoft had to be; after all, in the early days, Microsoft went to considerable trouble to please users in the usability stakes, even to the extent that it put security severely at risk in the process.
I fail to see why Microsoft couldn't have coded around this problem and allowed the use of these characters. It went part of the way by allowing spaces within filenames in Windows and it also allowed spaces to be entered into the command line filenames with quotes "My first Name.doc". The fact that these characters cannot be used has caused considerable trouble for IT staff over the years.
I'd hate to think how many thousands of hours have been wasted by both users and IT staff over the past three decades or so on what ought to have been a trivial matter to fix. Similarly, I hate to think how many times I've had to enter a ¿ into a filename just because the damn operating system will not let me enter a normal question mark: ?.
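The manual workaround described here can be automated. Below is a minimal Python sketch that swaps the reserved characters listed above for stand-ins. The function name `sanitize` and the substitution choices are assumptions for illustration: only the "?" to "¿" mapping mirrors the comment, and everything else falls back to an underscore.

```python
# Characters listed above as unusable in Windows filenames.
RESERVED = '<>:"/\\|?*'

# Hypothetical look-alike substitutions; "?" -> "¿" mirrors the workaround
# described above, the rest fall back to "_".
LOOKALIKES = {"?": "\u00bf"}

def sanitize(name: str) -> str:
    """Replace each reserved character with its stand-in."""
    return "".join(
        LOOKALIKES.get(ch, "_") if ch in RESERVED else ch for ch in name
    )

print(sanitize("Who framed Roger Rabbit?.txt"))
print(sanitize("Report: Q1/Q2.doc"))
```

A mapping like this is lossy, so it does not remove the underlying complaint: the original title still cannot be recovered from the stored name.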
Another major stuff-up is the maximum filename length/max path length of 254/255 - 260 when the path length could be potentially 32,767 characters—as it already calculates the path to this length internally (the exact length varies between O/S versions). These days, this limit is ridiculous. If, say, you have a file with a filename of say 245 characters long in directory \MyFiles then move the directory way down deep into nested directories then one automatically has a problem that one's not necessarily aware of until a cannot-continue crash occurs during a backup. Having to regularly run a Max-Path-Length utility across the disk to search for potential problems is a damn nuisance and it ought to be completely unnecessary.
Same problem occurs when saving web pages with long names, these often exceed the maximum filename length and the page cannot be saved without manual intervention. To say ≈255 characters for a filename is long enough is just not being realistic these days. Here's another instance: say one wants to save a book with a long title from the Internet Archive and to avoid confusion later over having a cryptic filename one adds the book's title to the already-cryptic IA filename, i.e.:
Many a time I've had the title combined with the IA O/S filename exceed 255 characters, and sometimes it's by a large margin. Shortening the filename at this juncture wastes considerable time, especially if there are many files involved.
Oh, and there's another PIA worth mentioning: .MSI files cannot be loaded from a directory when the directory has a leading blank (space) in its filename whereas an .EXE file can. Now how did that come about (and it's never been fixed)? [Leading spaces in directories are useful as directories and files are automatically sent to the top of the file manager tree, which is a very useful technique I've adopted for years to highlight temporary work files or sorting directories, etc. Again, this is necessary due to another operating system limitation, which is that neither DOS nor Windows has any way of allowing a user to order the file/directory structure to meet his or her needs.] Other obvious limitations are that we cannot highlight filenames or directories in that we cannot make them different colors or even have filenames with different typefaces. Why not?
As I've said for years, operating system developers don't care much about user ergonomics. If they did then by now we'd even have a new file system to replace the existing one which is truly antiquated. A new file system would include metadata extension(s) within files that OSes and programs would both understand (but that's a far too big a matter to discuss here).
When one thinks about it, we users really have been shortchanged by the likes of Microsoft and others over the years.
> If Microsoft had fixed the problem back then, then there would have been pressure for other operating system developers to also fix the problem.
I don't know about all of the other reserved characters, but the colon is the path-separator character in classic Macintosh APIs and I doubt they would have ever been able to "fix" that.
> To say ≈255 characters for a filename is long enough is just not being realistic these days.
This was a limitation of old APIs and has been possible using the Unicode-aware APIs for a couple decades now, but as of Windows 10 it's possible to use long paths via the traditional APIs as well if an application declares a special manifest flag. Check out "Enable Long Paths in Windows 10, Version 1607, and Later" here: https://docs.microsoft.com/en-us/windows/win32/fileio/maximu...
Otherwise, try using those long paths as e.g. "\\?\C:\Users\Lammy\Downloads\Books.<…>.with_very_long_names_are_common_on_the_IA_+_the_Internet_Archive_filename_abcxzy123.pdf"
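A hedged Python sketch of building such a path programmatically. The helper name `to_extended_length` is made up, and it handles only plain drive-letter paths. It converts forward slashes first because paths with this prefix are passed through without the usual separator normalization.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"   # the four literal characters \\?\

def to_extended_length(path: str) -> str:
    """Hypothetical helper: add the extended-length prefix to a drive path."""
    if path.startswith(EXTENDED_PREFIX):
        return path                       # already prefixed, leave untouched
    normalized = ntpath.normpath(path)    # also turns "/" into "\"
    drive, rest = ntpath.splitdrive(normalized)
    # Only handle absolute "X:\..." paths; UNC shares need a different prefix.
    if len(drive) != 2 or drive[1] != ":" or not rest.startswith("\\"):
        raise ValueError("expected an absolute path with a drive letter")
    return EXTENDED_PREFIX + normalized

print(to_extended_length("C:/Users/Lammy/Downloads/book.pdf"))
```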
My number one peeve for a long time was "Documents and Settings" instead of "Users" on Windows XP, but I've come around to that once I realized it was probably intentionally-annoyingly-named to force app developers to use modern APIs since it seems to intentionally break the 8.3 length convention and force you to deal with escaping the spaces.
"I don't know about all of the other reserved characters, but the colon is the path-separator character in classic Macintosh APIs and I doubt they would have ever been able to "fix" that."
The fundamental issue is that users should be able to type any characters including colons that appear in day-to-day use, whether it be a book title, report name, movie title or whatever without ever having to worry about it.
The fact that they cannot do so and that they deliberately have to transcribe a name to another or shorten it to accommodate an operating system's limitations wastes time and leads to errors and confusion (anyone who has ever run an IT help desk in a large organization knows this).
I'm aware of the Win 10 'long paths' fix and I've also seen the registry patch which some suggest possibly fixes earlier versions of Win 10 (I've not tried the patch). That said, the filename length limitation remains a problem for two reasons: the filename is still too short, and many programs are likely to crash if filenames were to exceed 255 characters (as they'd be unaware of it). To overcome this, the operating system would also have to present a shortened 255-character filename to the program, in the fashion it did back in the early days for programs that only understood 8.3 filenames. As I see it, it's only a halfhearted too-little-too-late tweak and the job needs to be done properly.
"My number one peeve for a long time was "Documents and Settings" instead of "Users" on Windows XP..."
This is still a problem, in fact it's a real pain. Right, D&S was a pain but so too is 'Users' and 'ProgramData', both should be completely movable even to the extent of having them work from a USB stick. Whenever I set up Windows it takes me days to configure my programs so as they dump their data/save files to specific locations on other drives (this makes transferring data to Linux etc. much easier and it's much safer too if files can be kept in locations where they aren't expected to be found). In fact, this should apply to all user info including users' program files. As I see it, these limitations are just bloody-mindedness on Microsoft's part (where the user isn't an administrator, the administrator would be able to still lock user directories and files to suit the local policy).
Yes, there was an excuse for the problem 20-30 years ago but not nowadays. Unfortunately, this is a 'mindset' problem of programmers. They're so used to acronyms and shortening things that they don't realize or care that ordinary users do not understand why they just cannot type anything as they did on typewriters (and something I've not yet mentioned: why they cannot type between already-typed lines or within margins, as the old-timers continually claim they once could do with ease but can no longer do so).
Yes that's right, the versions to retain was configurable, and the whole versioning mechanism was really useful and a great miss from today's OSes. The VMS file and directory syntax otoh was a real pain, Unix definitely wins there.
The version number was used the same way in TOPS-20 (which I never used) and Tenex, its predecessor (which I did). Emacs still has a facility for numbering backup files in the same way.
At the time (in MS-DOS 2.0) you could set SWITCHAR=- in CONFIG.SYS to override this setting. @kiwidrew's post above even has an excerpt from the source that demonstrates it:
> This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.[26]
So much time, and therefore money, working around `\` escaping characters in file paths.
Imagine their horror when having to type AltGr+- to get a backslash. At least with Shift you can choose which of the Shift keys to use, but there is only one AltGr key (it is to the right of the spacebar, the US keyboard layout has the right Alt key there). It used to be you could use Ctrl+Alt+-, too, and at least those were available on the left hand side, too. But I do not know if Ctrl+Alt works nowadays.
Those that are too stubborn to use QWERTY, yes. Seriously, the German keyboard layout is horrible for programming and the few umlauts can easily be entered using compose keys / dead keys / alt combinations / whatever you fancy.
Another viewpoint: the problem is caused by Unix allowing any character in filenames (other than slash and null). Other platforms are more restrictive in what characters filenames are allowed to contain. If you banned the '-' character from starting a filename, the problem wouldn't happen.
I personally think Unix allowing almost any character in filenames was a mistake. You can put newlines and other control characters in filenames. That has very little legitimate use, and is a potential source of security and other bugs. There is a proposal to amend the Unix standards to disallow control characters in filenames. But it doesn't look like it is going to be successful: https://www.austingroupbugs.net/view.php?id=251
Shell scripting would be so much more sane (and safe) if filenames couldn't contain spaces (or control chars) and couldn't begin with a '-' character. Then the shell's default $IFS would work as intended in the presence of pathname expansion, and there would be no need to use '--' to delineate the filename arguments from the option arguments when executing commands.
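To show the hazard and the '--' workaround side by side, here is a small Python sketch using the standard `argparse` module as a stand-in for an rm-like tool. The program name and flags are illustrative only, and nothing is deleted.

```python
import argparse

# A stand-in for an rm-like tool; the flags here are illustrative only.
parser = argparse.ArgumentParser(prog="rm-sketch")
parser.add_argument("-r", action="store_true")
parser.add_argument("-f", action="store_true")
parser.add_argument("files", nargs="*")

# A file literally named "-rf" (say, produced by a glob) is read as two flags...
globbed = parser.parse_args(["-rf", "notes.txt"])
# ...unless "--" marks the end of options first.
guarded = parser.parse_args(["--", "-rf", "notes.txt"])

print(globbed)
print(guarded)
```

In the first call the file named `-rf` silently vanishes from the file list and turns both flags on; in the second it is kept as an ordinary filename.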
> the problem is caused by Unix allowing any character in filenames (other than slash and null).
Most Unixes don't allow any character; they allow any byte other than ascii slash or zero. Turning bytes into characters is outside the scope of most Unix kernels and the filesystems therein.
> Turning bytes into characters is outside the scope of most Unix kernels and the filesystems therein.
All the major contemporary Unix(-like) kernels do have code in them to do file path charset translation. It is very important when dealing with removable media (ISO-9660, UDF), FAT filesystems, network filesystems (especially CIFS/SMB, but even some NFS implementations), filesystems defined in terms of Unicode such as NTFS, HFS+, APFS.
Traditional Unix filesystems don't do this, but they were originally designed at a time when few clearly distinguished the concept of byte from the concept of character.
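The bytes-versus-characters distinction is easy to see from user space. The Python sketch below takes a filename that is valid as a byte string but not as UTF-8 and round-trips it with the `surrogateescape` error handler, which is the mechanism Python itself uses for undecodable filenames on POSIX systems. The example name is made up.

```python
# The name below is Latin-1 encoded "café.txt"; it is not valid UTF-8.
raw_name = b"caf\xe9.txt"

try:
    raw_name.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# "surrogateescape" smuggles the undecodable byte through str so the
# original byte string can be recovered exactly.
as_text = raw_name.decode("utf-8", errors="surrogateescape")
round_trip = as_text.encode("utf-8", errors="surrogateescape")

print(strict_ok, round_trip == raw_name)
```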
The only solution to that confusion would be not allowing the options identifying character in file/directory names. Changing the character would just "change" the confusion.
Your comment is based on the premise that there is, in common use, a directory called “-rf” but not one called “/rf”? Why would the former be any less likely than the latter?
I like using rmdir even for individual directories that I think are empty just so that I don't accidentally delete any files that I did not expect to be there.
I use it whenever I expect a directory to be empty and want an error if for whatever reason it isn't. It's a bit niche, but good to have in your toolbelt.
Sounds a bit provincial. That might have been true before mainstream internet access, but the average user these days is likely more familiar with unix style paths via URLs than local filesystem paths.
Also, most non-unix operating systems that don't happen to be made by Microsoft also use the forward slashes for paths.
> the average user these days is likely more familiar with unix style paths via URLs than local filesystem paths
The average user ignores the contents of the address bar. "That's all tech gobbledegook". Increasingly, browsers even hide its contents from the user, just displaying the domain name, making the average user even less aware of it.
> Also, most non-unix operating systems that don't happen to be made by Microsoft also use the forward slashes for paths.
What are "non-unix operating systems that don't happen to be made by Microsoft". Non-Microsoft operating systems in common use – Linux (including Android), macOS/iOS/Darwin/XNU, *BSD – are Unix-like, and hence I wouldn't really call them "non-unix" (even if they are not strictly speaking certified as such)
If we look at non-Microsoft non-Unix(-like) operating systems (none of which are commonly encountered nowadays), we see a lot which use neither forward nor backslashes for directories. For example, OpenVMS and RISC OS both use dots, classic MacOS used colons. Stratus VOS uses the greater-than sign, which it inherited from Multics. The IBM mainframe operating system MVS (nowadays called z/OS) uses dots to separate the components of a dataset name – although those components aren't exactly directories. (It also supports forward slashes in its Unix compatibility subsystem, but that wasn't around for the first 25 years of its existence.)
You can make that exact same argument regardless of what switch character Unix uses. The real problem is not the switch character itself, but the approach used by standard Unix tools to "parse options until parsing fails, then assume it's an argument" which becomes an amazingly great foot gun if the actual option is inserted by a shell glob or a shell variable.
[1] https://github.com/microsoft/MS-DOS/blob/master/v2.0/source/...
[2] https://github.com/microsoft/MS-DOS/blob/master/v2.0/bin/CON...