
> Everything they can refer to is a file.

In what sense is (say) a signalfd a "file"? How about a pidfd?



They represent resources managed by the kernel, exposed to userspace via an integer handle. The already established convention for opening and closing files fits that use case. You can duplicate and pass that handle to child processes, and the resource itself is reference-counted for free. They are "abstract files" with specific behaviours, if you will.
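A minimal sketch of that handle behaviour in Python, assuming a POSIX-like system. A pipe stands in for the kernel resource and the variable names are only illustrative; inheritance by child processes is not shown.

```python
import os

# A pipe is a kernel-managed resource; userspace only sees integer handles.
read_fd, write_fd = os.pipe()

# Duplicate the write end: two integers now refer to the same kernel object.
dup_fd = os.dup(write_fd)

# Closing one handle drops one reference; the pipe itself stays alive.
os.close(write_fd)
os.write(dup_fd, b"still open")
data = os.read(read_fd, 64)

# Dropping the last write reference is what makes the reader see EOF.
os.close(dup_fd)
eof = os.read(read_fd, 64)
os.close(read_fd)

print(type(read_fd).__name__, data, eof)
```

The resource is only torn down once the last descriptor referring to it is closed, which is the reference counting mentioned above.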


Originally, a file in computing referred to a group of punch cards stored in a box. If this is the definition you hold on to, then signalfd and pidfd do not represent a file. In fact, by that definition nothing in modern computing is a file.

However, as the technology evolved and punch cards went out of style, everyone (except maybe you) accepted "file" as an abstract representation of said box. signalfd and pidfd fit within that abstraction.

Should I preemptively start into your next question of why Linux has tty when there is no teletype to be found?


I don't think using the term "file" for any kernel-managed resource is common outside of Unix terminology. It is certainly not the terminology of Windows. It is not even really true on modern Linux either, nor was it ever fully true in Unix land - Plan9 is probably the only OS that really went for "everything is a file" to the utmost degree.

For example, in either Unix or Linux, a socket is not really a file, even though an open socket has an associated fd. But you can't open a new connection by using open()/read()/write() on any Unix or Linux system (you can in Plan9). So these are not really files. Similarly, most devices expose special ioctl() codes to perform operations on them and barely interact with read() or write().
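To make the socket case concrete, here is a small Python sketch, assuming a Unix-like system. It uses `socket.socketpair()` as a stand-in for a real connection: the generic calls work once the socket exists, but creating it needs a socket-specific call and seeking is refused.

```python
import errno
import os
import socket

# Creating the connected pair needs a socket-specific call;
# there is no path that could be handed to open() instead.
a, b = socket.socketpair()

# Once connected, the generic fd calls do work on the raw descriptors.
os.write(a.fileno(), b"ping")
reply = os.read(b.fileno(), 16)

# Other "file" operations do not apply: a socket has no offset to seek to.
try:
    os.lseek(a.fileno(), 0, os.SEEK_SET)
    seek_errno = None
except OSError as exc:
    seek_errno = exc.errno

a.close()
b.close()

print(reply, errno.errorcode.get(seek_errno))
```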


> Plan9 is probably the only OS that really went for "everything is a file" to the utmost degree.

Plan9 took the "everything is a file" metaphor and extended it to "everything is a file in the filesystem". A file doesn't necessarily have to be found in a filesystem, even if that does add a level of convenience. If we look back to what a file is, a system to keep track of all of your punch card boxes is clearly useful, but if you scatter those boxes around the office haphazardly they are still files.


Outside of Unix land, a Process or a Socket or a Mutex or an Event or a [...] are not considered files. The Windows APIs for example have no concept of "a file" as a generic thing. In Windows, you could say that File, Mutex, Socket, Process, Thread, Event and many others are all "Objects" referenced by "Handles".

It's also notable that the term "file" most likely derives more from the concept of a paper file as the place where one stores information rather than the specific punch card files you mention. The expression "[something is] on file" to mean that "some entity is storing this information" predates computers and is very likely related to use of the term "file" for "something that is stored [by a computer system]".

Regardless, using the word "file" to refer to a process or a TCP connection (in the form of a Berkeley socket) is entirely specific to Unix and systems inspired by it, and is seen as a key part of the Unix philosophy (via the expression "everything is a file").


> It's also notable that the term "file" most likely derives more from the concept of a paper file

Yes, that's right. The dictionary defines file as "a folder or box for holding loose papers that are typically arranged in a particular order for easy reference." Which is literally what a box of punchcards is – hence how the word made its way into computing.

Obviously modern computers do not actually use files in any way. All we have today is a loose abstraction that pretends to represent a file, and within that abstraction Unix-like systems place processes and TCP connections. These are all files in the abstracted sense. Other systems may choose to do things differently. Again, modern computers don't have actual files, just the pretense that they do.

For better or worse, we have come to agree that abstraction can also be named "file", even if it is not actually a box full of paper. Which is true of a lot of things in computing. It turns out that floppy disk button you often see doesn't have anything to do with floppy disks. Hard to believe, I know.


My point is that the way this made it to computing is not so much the specific "file of punch cards", but through a more abstract route: "a physical file is a way to store information in an organization" -> "anything that stores data in a system in general can be called a file" -> "data stored on a computer system is called a file". We used the expression "to have on file" / "to put on file" to mean "to have stored in some way, not necessarily in a physical file" even before computing was a thing.

So, "a file" in the figurative sense means "something that is stored" - whether that's a physical file in a file cabinet or someone's memory, or an entry in a ledger etc. This matches the term "file" as it is used in Windows and in most other operating systems, and the way it was used in the earliest Unix systems. However, it doesn't actually match the way it is used in modern Unix systems - a process or an event or a mutex is not "something that is stored". So, calling these things "files" is unusual and doesn't match the normal figurative meaning of "file", within or without computing.


> My point is that the way this made it to computing is not so much the specific "file of punch cards", but through a more abstract route

That is not what the history books tell. If you have a different story to share, I'm sure the record would love to be updated. I'm rather surprised you haven't already shared your story with us. I guess you don't really have a point.

> So, "a file" in the figurative sense means "something that is stored"

I expect you are thinking of the parallel definition of "file", the one which often goes along with the word "folder". Which is a funny one as in the real world the folder is the file as it is abstracted in that model, but anyway... What we speak of may not be a file in that sense, but as it happens, words can have multiple meanings. In fact, you just spent half your comment telling us that, so there is no turning back now...


> If you have a different story to share, I'm sure the record would love to be updated.

Sure, just looking at the Wikipedia article on "Computer file" [0], we see that the "punched card file" was only an early use of the word in relation to computing.

But they show a slightly later use [1] that matches our modern notion much better, and that uses the exact expression "on file", without referencing punched cards in any way, so quite clearly not in a way derived from the practice of keeping information in punched card files:

> [...] But such speed is valueless unless - with comparable speed - the results of countless computations can be kept "on file" and taken out again [emphasis mine].

> Such a "file" now exists in a "memory" tube, developed at RCA Laboratories.

Wikipedia also shows that "file" was often used in the early days of computing to refer to the storage location, the physical storage device, not the contents in memory. For example, what we'd today call "disks" were called "disk files".

Wikipedia ultimately claims, unfortunately without a citation, that the modern sense of the term "file" came to denote the contents rather than the physical storage once the first "file systems" started being used, and those were managing multiple "virtual files" (that is, what we'd call virtual disks today).

> I expect you are thinking of the parallel definition of "file", the one which often goes along with the word "folder"

Yes, since that is quite obviously the metaphor most lay people have been taught in modern times at least: your data is organized in file folders, each containing multiple files. Basically in this context, "file" is more or less a synonym for "document". This is the most directly applicable non-computing equivalent of the modern computing term "file" (and it, again, doesn't match most things that are "files" in Unix).

And yes, of course words have many different meanings. A file can also mean something that you use on your nails, but that is obviously not relevant to what we're discussing.

[0] https://en.wikipedia.org/wiki/Computer_file

[1] https://web.archive.org/web/20220109114611/https://books.goo...


> signalfd and pidfd fit within that abstraction.

Seriously, I'll bite, explain exactly how a signalfd or a timerfd fit the abstraction of a box of punchcards?

Even the FreeBSD man pages (because their authors have an ounce of sense) stopped calling them files. They call them object descriptors or just descriptors and they're literally "references to objects"


Explain how they don't. Realize it is all pretend. There is no actual box of paper anywhere anymore, just the pretending thereof. The fun thing is, when pretending, you can pretend whatever you want. If you want to pretend that a signal is a file, then it can be a file. There is nothing tangible behind it that ensures a signal can't be a file.

There is a separate computer abstraction that emerged which you may know of by something along the lines of "files and folders". Funnily enough, that one is, indeed, confused. A file, by definition, is what the folder is trying to abstract in that model. But this parallel use of "file" is no doubt why FreeBSD moved to using "object", to try and avoid using the same word to mean different things. But in the end, words are allowed to have more than one meaning, so "file" is equally usable and it just comes down to arbitrary preference.


> Explain how they don't.

Simple, because the intersection of all the interfaces of these different "file" types is nothing other than close(). At that point the abstraction has lost all ability to describe anything common better than just OS managed object. When a noun is used as an abstraction, this is often a good benchmark of utility, better than just your arbitrary preference.

File in computing may have always been an abstraction, but it started out in relation to some sort of data storage and an appropriate interface, usually also addressable by some index (a filesystem). If you have an object with open/close/read/write/seek, you at least have something. Then we drop seek, and add ioctl. Now the cracks have formed, but there is still something there, there is still data involved and things still have a common source: the file system.

But at this point, for these other fds, the only thing in common is close().
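A rough way to check this, sketched in Python and assuming Linux (`select.epoll` is Linux-only). The three descriptor kinds and the probed operations are my own picks for illustration; an epoll descriptor stands in for the newer fd types, and since signalfd and timerfd do accept read(), the exact intersection depends on which kinds you include.

```python
import os
import select
import tempfile

def works(op, fd):
    """Return True if the call succeeds on fd, False if the kernel refuses it."""
    try:
        op(fd)
        return True
    except OSError:
        return False

# Operations to probe, in order. close must stay last: it invalidates the fd.
OPS = {
    "write": lambda fd: os.write(fd, b"x"),
    "lseek": lambda fd: os.lseek(fd, 0, os.SEEK_CUR),
    "close": os.close,
}

# A regular file, backed by the filesystem.
file_fd, path = tempfile.mkstemp()
os.unlink(path)

# A pipe; only the write end is probed so nothing blocks.
pipe_r, pipe_w = os.pipe()

# An epoll instance; dup the descriptor so this code owns the raw fd.
ep = select.epoll()
epoll_fd = os.dup(ep.fileno())
ep.close()

fds = {"regular file": file_fd, "pipe (write end)": pipe_w, "epoll": epoll_fd}
results = {
    name: {op: works(fn, fd) for op, fn in OPS.items()}
    for name, fd in fds.items()
}
os.close(pipe_r)

# Operations that succeeded on every descriptor kind.
common = {op for op in OPS if all(r[op] for r in results.values())}

for name, r in results.items():
    print(name, r)
print("common to all:", common)
```

With these three kinds, write() is refused by the epoll descriptor and lseek() by the pipe, so close() is the only probed call left standing.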

You can try to go a different (Plan 9) route, but that requires something to tie these together, a filesystem. And that is not something the systems in question have.

> But in the end, words are allowed to have more than one meaning

But once an abstraction has lost its usefulness people do not have to go along with the stupidity. I don't even care much about this. Call them files. But it is a bit irritating when people still go on to say "everything is a file" when describing useful attributes of modern Unix-likes, as if this illustrates some deep insight of this abstraction. This "everything" involves one operation: throwing the thing away. It's stupid.

We should retcon file as a shorthand for "roundfile".



