
What are some Unix design decisions that proved to be wrong or short sighted? - protomyth
https://www.reddit.com/r/unix/comments/9rfj8t/what_are_some_unix_design_decisions_that_proved/
======
haolez
Interesting insight from the linked thread:

“Short-sighted - users and groups.

It makes sense on a single-host system being shared at a university. It is
weak for pretty much any other setting. It acts as a distraction in pretty
much every setting where we now use unix: on personal systems, on cloud
deployments, in the enterprise. There are constant hassles of aligning users
and groups between systems, and ensuring that applications are structured
along those lines.

To offer an alternative. We would be in a better place if unix provided some
kind of standard process-group sandboxing, along the lines of jails.
Permissions would be applied at the jail level rather than the filesystem
level. The sandboxes in smart-phone OSs hint at the way.”

I feel this pain in some cloud deployments, but I’ve never thought about the
whole users and groups concepts as the core of this issue.

~~~
jimktrains2
Isn't that what cgroups are in Linux? Not jails in the bsd sense, but a way of
grouping processes and setting specific ulimits and other permissions?

~~~
xemdetia
It is a layer of abstraction that helps the application piece, but I wouldn't
say it solves the real problem being mentioned. That problem is closer to top-
level uid management. When you are working with 50k systems, you have to deal
with things like "because there was a slight deviation in install order, tomcat
is uid 105 on one system and 106 on another set of systems, so some tarballs
are screwed and need a second perm pass, otherwise they'll come up as owned by
a different user."

And while there are ways to improve this with LDAP and the things on Linux
covered by nsswitch and so on, they just aren't very nice. And because so much
is driven by uid internally, finding that one system out of 100 where a
constraint is violated is needlessly troublesome; compare this to Windows and
AD, where at any moment there is _always_ a clear source of truth.

Cgroups fixed that in the local space, but what's still missing is a clean
logical mapping that is consistent across all implementations, where things are
wholly separate by default and there is no bleeding-in of the parent OS, the
way a VM provides.
------
jonjacky
Ken Thompson once said he wished he spelled the 'creat' system call with
another 'e'.

Seriously, see these slides by Rob Pike from his talk in 2001: "The good, the
bad and the ugly: the Unix legacy":

[http://herpolhode.com/rob/ugly.pdf](http://herpolhode.com/rob/ugly.pdf)

Pike writes: "What makes the system good at what it’s good at is also what
makes it bad at what it’s bad at. Its strengths are also its weaknesses. A
simple example: flat text files. Amazing expressive power, huge convenience,
but serious problems in pushing past a prototype level of performance or
packaging." And so on, with several other examples.

~~~
xte
The unix-haters mailing list, condensed into The UNIX-HATERS Handbook, also
contains many interesting points of view.

------
twblalock
Shared, dynamically linked libraries.

This was understandable in a time when disk space was scarce, but it causes
all kinds of dependency hell when you want to install programs that need
different versions of the same library, or when you upgrade one of those
libraries and break a bunch of programs.

Plan 9 got this right with static linking a long time ago:
[https://9p.io/wiki/plan9/why_static/index.html](https://9p.io/wiki/plan9/why_static/index.html).
Sadly this has still not been adopted by major Unixy operating systems.
Flatpak and Snap seem promising for isolation of dependencies in Linux without
requiring full containerization.

~~~
kbenson
The downside, and I view it as a fairly big downside, is that security updates
are complicated. When there's a problem in libjpeg or zlib, you can get 99% of
the affected programs by updating the shared lib and restarting the affected
applications. What's more, determining if you are affected is much easier.

Compare that to trying to figure out which of your statically compiled
applications happen to include the affected version of the library.
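The shared-library case really is scriptable; a rough sketch with `ldd`, where
the scanned directory and the library name are just examples:

```shell
# List installed binaries that would pick up a fixed zlib on restart.
for bin in /usr/bin/*; do
    ldd "$bin" 2>/dev/null | grep -q 'libz\.so' && echo "$bin"
done
```

There is no equivalent one-liner for statically linked binaries; at best you
can grep them for embedded version strings and hope the vendor left them in.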

As with so many choices we need to make in software development, it doesn't
distill down to one way simply being _better_. It's a _trade off_ , and in
this case it's not just for disk space, but also for maintainability and
easily discovering how exposed you are with respect to security.

~~~
GordonS
I was going to say much the same thing - static and dynamic linking both have
their pros and cons, you can't really say one is categorically better than the
other.

~~~
zzzcpan
Dynamic linking doesn't have many pros beyond saving some disk space and an
insignificant amount of build time.

~~~
GordonS
I agree those are small things these days, but IMO the biggest pro is security
related - it makes it easier to update libraries when security issues arise.
For example, if a new issue is found in OpenSSL, you can update the OS's
shared library, but for statically linked apps you're dependent on vendors
rebuilding and publishing those apps to link to the new version.

------
blattimwind
Signals, pthreads, vintage memory management stuff (sbrk and friends),
undefined file system path encoding, octal access control instead of ACLs,
file locking, no real support for asynchronous I/O beyond select() for sockets
and many, many more.

~~~
tinus_hn
Really poor how they chose ‘vintage memory stuff’ in the early 70s

~~~
kbenson
> Really poor how they chose ‘vintage memory stuff’ in the early 70s

You can choose to read it as "they chose vintage memory stuff" or as "one of
the mistakes is what we might now think of as vintage memory stuff". It's all
about context.

------
xte
In philosophical terms UNIX, IMVHO, got a few concepts wrong. Lisp Machines
demonstrated how well a fully integrated system can work; unix, by contrast,
chose to be a "platform" of independent components communicating via simple
IPC. At first that wins, because it is far simpler to develop; in the long
term it loses. As a comparison, try a standard unix environment vs Emacs.

Another concept is a certain separation between users and programmers. On the
Lisp Machines, the Xerox Alto, and even Plan 9, users can smoothly program
their environment, so they have the maximum flexibility possible and a usable,
effective system, thanks to the high level of integration and freedom. For a
present-day comparison, try Squeak, a few Scheme implementations, or Emacs
itself with an elisp config; they are not, of course, as integrated and
effective as the ancient Lisp Machines or the Alto, but they demonstrate well
enough how powerful such a paradigm can be.

A last thing: "everything is a file" is really limited in unix, and well
implemented in Plan 9. Everything-is-a-file is a super-nice concept, but
"everything" must really mean everything.

------
amaccuish
For me, it's got to be UID/GIDs and ACLs. Windows SIDs and ACLs are much more
elegant, and SIDs have the benefit that they are valid across different
computers/domains, with no conflict.

------
justinsaccount
text streams are only a universal interface when working with text. It's one
thing to use tr, sort, and uniq to count the number of words in a file. It's
another thing entirely to take commands like ifconfig that output human
readable data and then try to do things like

ifconfig | grep 'inet addr:' | grep -v '127.0.0.1' | tail -1 | cut -d: -f2 |
awk '{ print $1 }'

(grabbed from first SO question I could find)

Text streams were great 40 years ago, but you run into the limitations of what
you can do properly with awk and sed pretty quickly.

Passing structured data through pipes using a standard format would have been
a universal interface. Text streams are really just a void* pointer. Sure, you
have the data, but good luck making sense of it.
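For comparison, here is what the same extraction looks like against
line-oriented output; the sketch hardcodes sample `ip -o -4 addr show` output
so it is self-contained, and the interface names are invented:

```shell
# 'ip -o' prints one record per line, so a single awk pass replaces the
# grep|grep|tail|cut|awk chain quoted above.
sample='1: lo    inet 127.0.0.1/8 scope host lo
2: eth0    inet 192.168.1.23/24 brd 192.168.1.255 scope global eth0'
printf '%s\n' "$sample" |
  awk '!/127\.0\.0\.1/ { split($4, a, "/"); print a[1]; exit }'
# prints 192.168.1.23
```

Even this still relies on knowing which whitespace-separated column the
address lands in, which is exactly the fragility being complained about;
`ip -j` (JSON output) goes further in the structured direction.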

~~~
dsr_
The really odd thing is that there's a great method of doing this that many --
but by no means all -- tools support: dedicated data separator characters.

01 SOH (Start of Header)
02 STX (Start of Text)
03 ETX (End of Text)
04 EOT (End of Transmission)
05 ENQ (Enquiry)
06 ACK (Acknowledgement)
21 NAK (Negative Acknowledgement)
22 SYN (Synchronous Idle)
23 ETB (End of Transmission Block)
28 FS (File Separator)
29 GS (Group Separator)
30 RS (Record Separator)
31 US (Unit Separator)

You've got everything you need there to build really useful serialized data
pipelines, and it would only be a day's work to implement them as a library in
pretty much any language.
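A minimal sketch of such a pipeline using only printf and awk, with US (unit
separator, octal 037) and RS (record separator, octal 036) as the delimiters;
the names and numbers are made-up sample data:

```shell
# Emit two records ("alice",30) and ("bob",25) separated by RS, with
# fields separated by US, then parse them back with awk.
printf 'alice\037%s\036bob\037%s\036' 30 25 |
  awk 'BEGIN { RS = "\036"; FS = "\037" } NF { print $1 ": " $2 }'
# prints
# alice: 30
# bob: 25
```

Because the separator bytes cannot appear in ordinary text, no quoting or
escaping layer is needed, unlike with spaces, tabs, or commas.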

~~~
AnIdiotOnTheNet
In-band signaling is not a good idea. Case in point: spaces and quote
characters in path names.

------
jtth
The perpetuation of the distance between a program and its source code in an
environment that encouraged small utility programs.

------
maxxxxx
Personally I think making file names case-sensitive was not a good idea. I
know some people like this, but I have seen a ton of errors just because
somebody typed "MyFile" instead of "Myfile".

I also think that file names should be more constrained. If we didn't have
whitespace in file names we could have avoided uncounted hours of people
trying to get quoting right.

~~~
Symbiote
That would be much more complicated to implement. The file system would need
to understand the lower/uppercase version of every Unicode character (e.g.
Greek, Cyrillic), and that can change depending on what language we're using
(e.g. Turkish I and İ).

I think it's Windows and Mac OS X that made the mistake, with case-insensitive
filesystems. How do they handle I and İ?

~~~
TheAceOfHearts
FWIW, macOS lets you use a case-sensitive filesystem, it's just not the
default. In my experience, almost everything works perfectly fine out of the
box.

The only serious problem I've encountered, which I'd imagine is a real deal-
breaker for some, is that all of Adobe's products refuse to install on a case-
sensitive filesystem.

IntelliJ editors also show you a warning telling you to add the following
custom property: `idea.case.sensitive.fs=true`. Although I find it odd that
they make you manually set the value, since they can clearly detect when it's
needed...

On rare occasions homebrew will fail to install something, but it's usually
fixed quickly with a small tweak to the formula. Ever since I switched a
couple years back I've probably had to fix around 2 or 3 obscure packages, and
I always push the change upstream.

------
inetknght
Not sure if it's _specifically_ unix design, but it's something I attribute to
Unix. Something I think was short-sighted was making stdin and stdout two
separate file descriptors, having everyone assume they're pipes, and having
everyone assume that pipes are unidirectional.

I want stdio to be fd0 and it to be both readable and writeable. I want it to
be socket-like. Read from "stdin", and you read from fd0 and it's a socket.
Write to "stdout", and you write to the _same_ fd0. It would simplify child
process handling code since you can use one call to `socketpair()` for stdio.
Stderr could be implemented on top of the out-of-band stream, and for program-
to-program communications on Linux with a domain socket, you can pass file
descriptors and discover the PID of your peer.

~~~
jstanley
And how does that work in a shell pipeline? Stdin comes from one process and
stdout goes to another. This is easy if stdin and stdout can be separate
places.

~~~
gizmo686
There is no real reason why the read call and write call on a file descriptor
need to go to the same place.

To make this work well, I think you would need the concept of a pipeline at
the file level. Eg, if process A has a pipeline file pipeA, and process B has
a pipeline file pipeB, you would need some form of linkPipe(pipeA, pipeB)
syscall that causes the in-buffer of pipeB to be the same as the out-buffer of
pipeA.

Granted, this breaks the general model of a file representing a single block
of bytes that both reads and writes operate on, but that model is already gone
in UNIX (eg, the in and out buffers for sockets are separate).

I'm not convinced this is actually a good idea, but seems doable.

------
ryl00
I'd imagine having to maintain last-access time on files can be constraining
for filesystem performance.

~~~
czinck
At least Linux doesn't do that by default anymore (or only down to a certain
resolution). So yes, proven very wrong/short sighted.

~~~
gizmo686
I think Linux still requires you to set the noatime mount option to disable
this.

Granted, this is done by default in every modern distro I have seen.
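For reference, the knob in question is a mount option; modern kernels default
to relatime, which is why most of the cost went away, and noatime disables the
updates entirely. The device and mount point below are invented:

```shell
# /etc/fstab line disabling access-time updates entirely:
#   /dev/sda2  /srv/data  ext4  defaults,noatime  0 2
# Or applied to a live mount without editing fstab (requires root):
mount -o remount,noatime /srv/data
```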

------
frou_dh
All I'm thinking of are sh/bash gotchas. And it's not particularly fair to
attribute those to capital U Unix, because the cool thing about shells in unix
is that they're just normal user-mode programs that can be (and are) written
by anyone.

------
billfruit
I do not know how much this is related to unix design decisions, but have you
tried to restrict how much RAM a process can use on Linux/Unix? Some years ago
I thought it was very complicated to do.
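It is still fiddly, mostly because there are two different mechanisms with
different semantics; a sketch of both, where the program name and the limits
are placeholders:

```shell
# 1. Classic rlimit: cap the process's virtual address space at 512 MiB.
#    Cheap and portable, but it counts address space, not resident memory.
( ulimit -v 524288; exec ./some_big_program )

# 2. cgroups (here via systemd): cap actual memory use, which rlimits
#    never measured well.
systemd-run --scope -p MemoryMax=512M ./some_big_program
```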

------
codedokode
File descriptors being inherited by child processes by default.
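The default is easy to demonstrate from a shell; the file read here is
arbitrary, any readable file works:

```shell
# Open fd 3 in the parent shell; the child (cat) inherits it silently.
exec 3< /etc/hostname
cat <&3       # reads through the inherited descriptor
exec 3<&-     # closing it again has to be done explicitly
              # (the C-level opt-out is O_CLOEXEC / FD_CLOEXEC)
```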

------
dmitrygr
IO being blocking by default certainly must be up there.

