What Is PID 0? (dave.tf)
407 points by todsacerdoti on June 7, 2024 | 73 comments



People online are far more confident than their knowledge warrants. The definitive, confident tone most online commenters adopt should probably be reserved for experts speaking about their own fields.

I sometimes wonder if that is why LLMs can so confidently hallucinate — because they were trained on piles of overconfident human texts.

It's an interesting thing to ponder.


People want to listen to folks who are confident.

And that sentence right there is an example of what I mean. I could write 10 words, 100 words or 1,000 words adding caveats to "People want to listen to folks who are confident," but most people don't want to hear it and they'd tune out. But nine words, they'll listen to and use that, even if it's not right all the time.

This isn't just an "online" issue. Anecdotally, I'd say it's in human nature. I've read plenty lamenting how men are (over)confident at work and garner (unwarranted) success relative to less confident women. And IME, confidence at work is pretty successful, if only because folks _try_ the confident suggestion. The person with a host of caveats might have a better suggestion, but they are less confident in their result, which folks sense and shy away from.

And then there are casual situations (which most of "online" discourse is), where I regularly see strangers confidently offer one another advice which is usually received positively. A lot of the advice is wrong, but that doesn't really matter.

> I sometimes wonder if that is why LLMs can so confidently hallucinate — because they were trained on piles of overconfident human texts.

The LLMs that I have worked with have no concept of "true" and "false". They have no sense of confidence in what they say.

They _phrase_ it definitively because that's what we want.

"What is the capital of Australia."

"The capital of Australia is Timbuktu."

The LLM doesn't know if that's true. It's just making a statement we asked it to make.


Exactly right, they do not have a concept of true and false as unsupervised learning simply makes them good at predicting the next token. But I think there is an over-confidence bias in the training data sample. On top of that, instruction tuning wants definitive answers, as you say. And finally, RLHF probably favors over-confident answers because people like that. From start to finish, over-confidence bias is everywhere — we both produce over-confident training data, and tune for over-confident answers.

Or, well... that's what I think. See, I've not trained an LLM, I have only read about it online, and very little in books I have on the topic. I did some machine learning exercises in university, and that's the extent of my practical knowledge. And as I say that, the impact of my words goes down, right? They are taken less seriously than if someone said all that stuff about LLMs but never said they don't have practical experience. And yet, this makes the information as it is presented more exact, the limitations are clear, so it is more useful.

More useful, but far less appealing... This is a really interesting topic.


There is a reason con man is short for confidence man.

People are extremely susceptible to someone who sounds confident.


People should be suspicious of statements regardless of tone. Conmen, hackers, cult members, job applicants, and AIs are all trying to trick people who only listen to tone.


It takes a lot of cognitive work to doubt and analyze everything. It's not really feasible, is it?


It's also not really necessary a lot of the time. If some random person online confidently says that the newest tesla uses an engine which contains ball bearings made in Indonesia by child slaves, I don't have to spend the time to doubt and analyze that because it doesn't impact me personally. I'd only ever need to take the time to double check that if I were going to buy a tesla or before I went and spread that information around as if it were fact. How true or false it is doesn't affect my life in any way. It can just be something a random person said online and I can treat it as such.

Whenever you see information that sounds like it could be extremely important to you and your situation (and when being wrong could really hurt you) then no matter how authoritatively the information was delivered that's really when you should invest the time to verify it. Much of the time that investment is just a quick internet search anyway.


Review enough code, and even 2 + 2 can look sus.

Where's the operator overload? ;)


In the garbage language that we don't use anymore. Right? Right!? :)


Unfortunately, due to budget cuts, we could not afford to vanquish all of the antiques in the architecture. We do have an infinite spell of Ben Gay, however....


s/spell/supply/


You think 2.add(2) is more trustworthy?


well yeah but there's that legacy system, the replacement isn't ready for GA yet so...


As with many things, it becomes easier with practise. Also, you can pace accordingly: do I quickly read 10 articles today, or pick 2 and peruse them in depth?


So save the effort for the things that actually matter in your life.


Some nice archaeology here, but I think it's important to say "pid 0 is part of the [Linux] kernel" (much less the further details) is only useful from a certain perspective—if you are debugging the kernel itself, using its more idiosyncratic interfaces like trace points within e.g. eBPF to examine the system as a whole, etc.

From the perspective of a userspace process using standard APIs, I think a more useful approximation is "pid 0 refers to myself". It's what fork returns in the child. It's what you pass in to kill(2) to signal your own entire process group. Probably other variations too.
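
A minimal sketch of the process-group case, assuming a Linux box where `setsid` is installed; `demo_kill_zero` is a made-up name for illustration. The inner shell gets its own session so that `kill 0` only reaches the demo's own group, not the shell you launched it from:

```shell
#!/bin/sh
# Demo: `kill -s TERM 0` signals every process in the caller's process group.
# `setsid` (assumed to be installed) starts the inner shell in a fresh session
# and process group, so the signal cannot reach the shell that runs the demo.
demo_kill_zero() {
  status=0
  setsid sh -c '
    sleep 30 &              # a second member of the same process group
    kill -s TERM 0          # pid 0 = my whole process group, myself included
    echo "not reached"      # the shell is already dead at this point
  ' || status=$?
  echo "inner shell exit status: $status"
}

demo_kill_zero
```

Both the inner shell and the background `sleep` sit in the targeted group, so the `echo` after `kill` should never run, and the caller sees the shell's 128 + SIGTERM exit status (143 on Linux).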


There is no PID 0; it's an ABI convention, just as passing a negative PID is a convention for addressing a PGID instead, as in kill -9 -1. PIDs start at 1. The Linux kernel also allocates PIDs to kernel threads, which are "processes" without a separate address space that run in privileged mode with their own stacks and generally ignore kill() signals.

Command to get when a Linux box was started as opposed to just running uptime -s:

    ps -p 2 -o lstart=


Okay. I said approximation, and saying that the approximation is not completely true is not that interesting.


Even in your own examples 0 means different things. It’s not an approximation, it’s wrong.


Being slightly wrong is kind of the thing that makes an approximation an approximation


All models are wrong, some are useful. I found his explanation useful while knowing it was not correct


But this statement is wrong and should not be propagated: "pid 0 refers to myself".

pid 0 refers to the thing that is running 'myself', not to myself, itself. It is the thing which gave way to allow 'myself' to be executing in the current context.

It is more accurate to say "pid 0 is the source of cpu 'attention' which allows myself 'awareness'", as it discovered I was not idle, and granted me power to proceed with processing ..


I think generalization would have been a better word but everyone understands what they meant


Why not

    ps -p 1 -o lstart=

?

In real life the difference should rarely be visible. And then, do we want to know when it started booting or when it became usable? The latter would be rather tricky (even defining what exactly is meant by that).


On NT-based Windows, PID 0 is "System Idle Process" and is quite similar in function to the Linux one. On DOS-based Windows, IIRC there is no such thing as PID 0 since PIDs there are actually kernel memory pointers and thus very high: http://www.thescarms.com/VBImages/RunningProcs.gif -- instead, the idle loop is inside VMM32.


Having written some code that did some DKOM in Windows, I'd say PID 4 comes closer.


Hah. Another topic where the "common knowledge" is just utter garbage and actual research yields a different picture. That doesn't stop people from being convinced of it.

The author of this post did the only correct thing and checked the kernel's source code, which is the authoritative source for this information.

The conclusions at the end are a bit whacky.


I can't help sharing one of the loveliest uses of `kill 0` I know:

    #!/bin/sh
    #
    #    Usage: upto LIMIT COMMAND
    #        Run COMMAND until LIMIT seconds have passed, then exit.
    #
    test "$1" -gt 0 || {
     printf '%s\n' " Error: first argument must be a positive integer."
     exit 1
    }
    
    sh -ic '
     sleeptime=$1
     shift
     exec 3>&1 2>&3
     { 
      "$@" >&3
      kill 0
     } |
     {
      sleep "${sleeptime}"
      kill 0
     }
    ' sh "$@"


Nice for learning.

For practical use I would prefer https://www.man7.org/linux/man-pages/man1/timeout.1.html
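
For comparison, a rough sketch of the same job done with `timeout`, assuming the GNU coreutils version is the one on your PATH (the 1 and 5 second values are arbitrary):

```shell
#!/bin/sh
# Give `sleep 5` at most one second to finish (assumes GNU coreutils timeout).
status=0
timeout 1 sleep 5 || status=$?

# When the limit is hit, timeout sends SIGTERM to the command and itself
# exits with status 124; otherwise it passes the command's own status through.
echo "timeout exit status: $status"
```

The distinct 124 status is the practical difference for scripting: the caller can tell "ran out of time" apart from "the command itself failed".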


What are the benefits of one over the other?


Your script does not handle stderr, stdout, and SIGINT correctly, probably more.

With enough work at least some of them could be fixed, maybe even all. But why reinvent the wheel if timeout is contained in coreutils and available on nearly every Linux machine?


There's a third use of PID 0, besides the already-mentioned "idle" and "self".

On Linux, `getppid` returns 0 if the parent is a process in another PID namespace.
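
One way to see this without writing C, sketched under the assumption that util-linux `unshare` is installed and unprivileged user namespaces are allowed (they often are not inside containers); `ppid_across_pidns` is a made-up helper name:

```shell
#!/bin/sh
# ppid_across_pidns: start a shell as the first process of a new PID namespace
# and print the parent PID it sees. The parent (the unshare process) stays in
# the outer namespace, so it has no PID that is visible from the inside.
# Prints "unavailable" if the namespaces cannot be created on this system.
ppid_across_pidns() {
  unshare --user --map-root-user --pid --fork sh -c 'echo "$PPID"' 2>/dev/null \
    || echo "unavailable"
}

ppid_across_pidns
```

Where the namespaces can be created, the inner shell's parent lives outside its PID namespace, so `$PPID` should come back as 0.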


This is very interesting. For those interested in following all the parts of early kernel booting that were out of scope for this article, please read this fantastic resource: https://0xax.gitbooks.io/linux-insides/content/


On Darwin/macOS it's easy - kernel_task shows up with PID 0 right there in top!


It's interesting that the v4 code linked reuses pid 0 (the code that assigns pids simply does "p->p_pid = ++mpid;") - I vaguely remember having to fix that in kernels we were working on in the mid 80s


It seems like TID would be a better term for the in-kernel concept of a "pid", which could stand for "task id" or "thread id", whichever you prefer. And wouldn't be as confusing.


And afaik that is sometimes a thing via gettid()
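
A quick way to look at those IDs from userspace, as a Linux-only sketch that relies on the `/proc/PID/task` directory; `list_tids` is a hypothetical helper name:

```shell
#!/bin/sh
# list_tids PID: print the thread (task) IDs that belong to one process.
# Linux-specific: every thread of PID shows up as /proc/PID/task/TID.
list_tids() {
  ls "/proc/$1/task"
}

# A plain shell is single-threaded, so its only TID equals its PID.
echo "pid:  $$"
echo "tids: $(list_tids "$$")"
```

For a multi-threaded program the same call lists several numbers, and the one equal to the PID is the main thread.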


PID 0 on most academic Unixes used for teaching operating system design still swaps out the entire process, and calls into the memory subsystem ("paging") to do so.

Linux is not the owner of the concept of PID 0. Saying that PID 0 frequently is involved with paging in and out memory is not incorrect.


Please, name these academic Unixes! I would love to go see what they do. Down-thread there's a mention of minix, which does the normal thing: demand paging, context switches only the page table directory pointer, and process memory images are moved around the storage hierarchy indirectly, through page faults. Which other academic Unix or Unix-like did you have in mind?

Linux is indeed not the owner of the concept of PID 0. It's fortunate that I didn't say that! It is, however, not frequently involved with paging in and out memory.


xv6 and its many forks are what I'm thinking about

You address this somewhat in the post:

> Going back to the Wikipedia article, it seems the author of that edit wanted to write “swapping”, in the classic Unix V5 sense of swapping out whole processes as a consequence of scheduling. But the edit didn’t clarify that “swapping” was being used in an archaic sense that was likely to confuse the modern reader.

> context switches only the page table directory pointer

Swapping out the PTD pointer is exactly what I'm thinking of. I'm wrong, because I didn't have the common colloquial meaning of "swapping" (paging out memory to disk) in my mind.

I think it's a little strange such a meaning has come to dominate; at least in a classroom setting, it is still fairly common to discuss the operation of the scheduler as "swapping pages".


Yeah, admittedly it's confusing terminology generally, because it's still natural to say you're swapping the page tables out when you do a context switch on current systems. I probably do at some point in the post!

The distinction I was trying to get at was that, in early Unix, all process bytes were being actively streamed to and from disk as part of scheduling because the hardware didn't yet have a concept of virtual memory. So, if you wanted to make a program ready to run, you had to fully load it into memory, and shove anything else out of the way right then and there. That makes the scheduling function 5% deciding what should run, and 95% playing memory sokoban to make that happen.

OTOH, on systems with paged virtual memory, the scheduler is almost entirely "what's a good thing to run?", and implementing that decision is updating a couple of pointers. The only place the memory hierarchy creeps in, is if the scheduling algorithm wants to be fancy and account for things like NUMA nodes in its ranking of tasks.

I think it's reasonable, looking at it in isolation, to describe this part of the kernel as a "swapper", or the operation as "swapping". I think where it turns into a bear trap is when presenting these concepts to folks less familiar with kernel internals, where words like "swap" and "pages" are firmly the domain of the memory subsystem. And so, if I hand them a task and say "this is the swapper", IMO the majority will interpret that as being a component of virtual memory management, and they wouldn't be at fault for thinking that.

Empirically this happened in the 2008 wikipedia edit: "swapping" mutated to "paging" because in modern vmm-land that's a valid synonym, and that in turn became "this task is sometimes called 'sched' for historical reasons, and it handles paging" on the web. And cue a decade of confused students and stackoverflow users asking followups like "but if this task does paging, why does linux have all these kswapd threads?" That to me suggests that, for better or worse, the memory subsystem owns those words now, and the rest of the kernel has to be very careful if it uses them to mean something else, if it wants to avoid casual onlookers creating false associations. Something something naming things is still the hardest thing in computer science :)


I checked xv6 and it doesn’t swap out processes. What teaching OSs are you thinking of?




I agree with the author's complaint about the problems of Wikipedia being taken as authoritative on operating systems. I have seen all kinds of bizarre claims which are at odds with reality, but which, being described in a wikipedia page, are taken for gospel, and end up reproduced and sometimes embellished further.

His article is probably quite a good discussion of what happens on Linux. It is over-reaching, however, if it is supposed - as it seems to be in the conclusions - to be talking about modern Unix-likes generally.

PID 0 on NetBSD (and I suspect of Free, DragonFly, Open, etc, as well) simply means the kernel process. Here are a few of the threads that run under the kernel process in NetBSD:

  PID PPID CPU LID NLWP PRI NI   VSZ   RSS WCHAN    STAT TTY     LTIME COMMAND
    0    0   0 118  106 123  0     0 28132 physiod  DK-  ?     0:00.00 [system]
    0    0   0 117  106 125  0     0 28132 pooldrai DK-  ?     0:00.00 [system]
    0    0   0 116  106 124  0     0 28132 syncer   DK-  ?     0:00.00 [system]
    0    0   0 115  106 126  0     0 28132 pgdaemon DK-  ?     0:00.00 [system]

These are all true and authentic threads, they just don't spend any time executing userland code. The work they carry out is, respectively: to carry out I/O to/from buffers in userland, because this may incur page faults and cannot therefore be done in a soft interrupt which has no thread context; to reclaim pages from the pool (slab) allocator; to lazily synchronise dirty buffers back to disk; and to carry out page replacement.

All of these listed above carry out memory management, so it is not correct to say that PID 0 "has nothing to do with memory management" or that the Wikipedia article is wrong to discuss paging as a responsibility of PID 0. That's what pgdaemon is doing!

There are many other threads that are part of the kernel process (or "PID 0") on NetBSD - modern kernels generally use a lot of them to carry out all sorts of tasks. A few others on NetBSD include worker threads for running asynchronous I/O completions and for processing various kinds of input in the networking stack.

Illumos should also be considered. Looking at its PID 0:

      root     0     0     1     1   0 11:22:53 ?           0:02 sched

We can see it is called sched. Why sched? This article talked about the historic role of PID 0 in process swapping. Process swapping is a scheduling problem (like a lot of problems in software). This is why swappers are traditionally called medium-term or memory schedulers. Illumos generally gives most groupings of kernel worker threads their own processes with their own PIDs, but one, called "sched", remains in PID 0, and its responsibility? Process swapping:

https://github.com/illumos/illumos-gate/blob/579c23696ac6891...

The Wikipedia article has now been hastily edited, and replaces a claim that was true only of certain Unixes other than Linux with a claim true only of certain Unixes including Linux. Is this an improvement?


> I have seen all kinds of bizarre claims which are at odds with reality, but which, being described in a wikipedia page, are taken for gospel, and end up reproduced and sometimes embellished further.

Editing Wikipedia is fun, but sometimes it is hard to know where to edit. Here's a simple process to find correctable errors!

(1) Sneak onto your local university's campus.

(2) Sit in any operating systems lecture.

(3) Watch for nonsense in the slides.

(4) Locate the relevant Wikipedia page.

(5) Rewrite the entire two-page section around whatever was on that slide, because chances are it's all nonsense.


Isn't it lovely how all these detailed code references are just a link away via GitHub?


I don't like that, it's not good practice.

One should give links to original sources, i.e. https://kernel.org as far as Linux is concerned. Example: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

Even if git guarantees that the content is the same (if someone bothers to verify that the SHA-1 is the same and we exclude the possibility of a SHA-1 collision in git, which is yet to be demonstrated).

kernel.org existed before github.


When I type "ps -aux", why do I get all the information from other users too? Shouldn't this be private information by default?


Why? Because Unix was written in the 1970s, when privacy and computer security were not a topic yet. Hardenings have been added later, but the one you mention is still not the default.


You can harden your system using hidepid.

Chapter 4.1: https://www.kernel.org/doc/html/latest/filesystems/proc.html


My point: why isn't this private __by default__?

(Also, I fear that by setting this flag it will break a lot of tools that expect the flag to be 0)


On FreeBSD there's a sysctl to control it and there's a kernel module you can configure to allow specific UIDs/GIDs to see everything so you can still allow your monitoring tools to do their job without being root, for example. It's a standard config in places I've worked


It's configurable - "hidepid=1" mount option on procfs.


Careful, this won't stop systemctl from showing the full process command line in its output.


Hmm, that's not what I am seeing

             ├─kopano-server.service 
             │ └─2685 n/a
             ├─systemd-logind.service 
             │ └─1037 n/a
             └─kopano-gateway.service 
               ├─    959 n/a
               ├─   1413 n/a
               ├─  52489 n/a


Why not just edit wikipedia to be correct?


> I’d love to go edit the Wikipedia article and set the record straight… But would that count as “primary research”? Would the edit be reverted because it disagrees with most of the web? Does publishing this post mean editing wikipedia would now count as some variation of sockpuppeting or self-promotion?


Wikipedia has now been edited, citing this blog post as a source.

Him editing Wikipedia to include his own research is "primary research". Him editing using his own blog post as a citation is presumably self-promotion. But someone else editing it in seems okay.


Kind of clickbait/misleading title? The article even states that PID is a userspace concept. There is no PID 0. PIDs are how userspace refers to processes in syscalls. You can't refer to a PID 0 because `kill(0)` has a special meaning.

But "What is TGID 0?" is much less catchy of a title. Hence, clickbait.


Specifically, kill()ing pid 0 kills every process in the process group of the calling process. Pid 0 is more like a NULL pointer. You can put something in memory there, but a lot of systems will treat it as an invalid identifier for their own purposes.

I had some ruby code that called kill(str.to_i) and killed every process when the string wasn't an integer, which is when I painfully found out that to_i returns 0 on invalid strings and kill(0) kills (potentially) a lot of processes.
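
The same trap exists in shell scripts, where an unvalidated variable can end up as `kill 0` or `kill -1`. A minimal sketch of a guard, written in sh rather than Ruby; `is_valid_pid` and `safe_kill` are hypothetical helper names, not part of any tool mentioned here:

```shell
#!/bin/sh
# is_valid_pid VALUE: succeed only for a strictly positive decimal integer.
is_valid_pid() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit (rejects "-1" too)
  esac
  [ "$1" -gt 0 ]               # rejects "0", the process-group wildcard
}

# safe_kill SIGNAL PID: refuse anything that is not a plain positive PID.
safe_kill() {
  if is_valid_pid "$2"; then
    kill -s "$1" "$2"
  else
    printf 'refusing to signal suspicious pid: %s\n' "$2" >&2
    return 2
  fi
}

# A garbage value is rejected instead of silently becoming pid 0.
safe_kill TERM "not-a-number" || echo "kill skipped"
```

The point of the design is to fail loudly on bad input rather than let a parsing fallback pick a PID with special meaning.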


[flagged]


install DarkReader and never be mad about this again


And instead be mad at the extension's flakiness and random style changes.


What? Why not? Many many websites have white backgrounds


dark mode cultists are truly a strange breed. personally i like neither dark nor light mode, but more of a sepia tone at a lower brightness that I feel is much more soft on my eyes.

both dark and light mode feel way too harsh, both in terms of contrast and the usual color palettes people often choose (bright whites [ #eee and above] and dark blacks [ #111 and below] )


Lack of automatic dark mode support on websites is very annoying at night, when you've turned dark mode on and everything else is dark, except that one website you just opened, and it now hurts your eyes or bothers other people in the bedroom.


I never use dark mode on a laptop (because I hate it, it's a personal preference), not even at night. I don't bring a laptop to the bedroom if anyone else is there.


That's why automatic dark mode on websites doesn't have any effect unless you turn on dark mode. It lets you choose.


We need to put the UA back into UA and let people decide how they want sites to look.


White background is fine.

What is not fine is gray text on white background.

Sincerely, anyone who utilizes low contrast between text and background has utterly no respect for people's eyes. Yes, that includes Hacker News with downvoted comments.


Wow, this entire article is completely off the mark.

0 is a special value in the POSIX kill function. It denotes every process in the process group of the caller.

That's it.

(OK, so because of that, a given OS based on POSIX can internally get away with using the value in some ways, to denote some process that nothing in user space would ever know about, let alone try to send a signal to.)

Don't get nerd-sniped! Know the 15 second answer and move on.


Maybe it was only used as the `all in process group` sentinel because it was used by the kernel and could never be the pid of a userspace process?

There's no reason that Unix couldn't have defined `PID_ALLPROCGRP = -2` instead.

I mean, what do you think came first in the development of Unix? The pid taken by the kernel thread used for scheduling, or a special value to a syscall when someone realised "hey, maybe being able to kill the entire process group would be handy. What spare values do we have we can use to indicate that?"

(And remember, POSIX codified existing practice. Unices don't avoid handing out pid 0 because POSIX says so; POSIX says so because that's what Unices did)


> Maybe it was only used as the `all in process group` sentinel because it was used by the kernel

Ah, but for that we have to look long before Linux. kill(0, signal) existed before a line of Linux was written.

Linux followed the existing POSIX spec which gave it PID 0 to do whatever.



