Hacker News

It is sometimes used to allow one binary to be the symlink target of hundreds of commands.

Android does this for most common shell commands. Toybox and busybox are examples of such implementations.

https://github.com/landley/toybox

https://en.m.wikipedia.org/wiki/BusyBox
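The mechanism is just a switch on the invocation name: the kernel passes whatever name was used to invoke the binary as argv[0], and the multi-call binary dispatches on its basename. A minimal sketch (the applet table is made up for illustration, not busybox's actual set):

```python
import os
import sys

# Toy applet table: each "command" is a function taking the rest of argv.
APPLETS = {
    "true": lambda args: 0,
    "false": lambda args: 1,
    "echo": lambda args: (print(" ".join(args)), 0)[1],
}

def dispatch(argv):
    # A symlink like "echo -> multicall" makes argv[0] end in "echo",
    # so the one binary behaves as whichever name it was invoked under.
    applet = os.path.basename(argv[0])
    handler = APPLETS.get(applet)
    if handler is None:
        print(f"{applet}: applet not found", file=sys.stderr)
        return 127
    return handler(argv[1:])
```

In C the shape is the same: main() inspects basename(argv[0]) and jumps to the matching applet's entry point.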




I just learned that rustup/rustc/cargo etc. work like this too. I couldn't understand why the gentoo formula was symlinking the same binary to a bunch of aliases.


On my system, these are hardlinks (regular files with a link count >1 and the same inode) rather than symlinks, though I'm not sure why.


Maybe to avoid broken links if you move the original files? That's the main benefit of hardlinks vs symlinks in my mind at least.
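The difference is easy to see with a throwaway file: a hard link is a second name for the same inode, so moving the original doesn't break it, while a symlink only stores a path string and dangles when the target moves (a sketch, not how any particular distro installs things):

```python
import os
import tempfile

d = tempfile.mkdtemp()
orig = os.path.join(d, "tool")
with open(orig, "w") as f:
    f.write("#!/bin/sh\necho hi\n")

hard = os.path.join(d, "tool-hard")
soft = os.path.join(d, "tool-soft")
os.link(orig, hard)      # hard link: second name for the same inode
os.symlink(orig, soft)   # symlink: a file containing the path string

assert os.stat(orig).st_nlink == 2                 # link count > 1
assert os.stat(orig).st_ino == os.stat(hard).st_ino  # same inode

os.rename(orig, os.path.join(d, "tool-moved"))     # "move the original"
assert os.path.exists(hard)       # hard link unaffected by the move
assert not os.path.exists(soft)   # symlink now dangles
```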


That can also be a downside: you believe you have moved stuff, but now you can have different versions of programs that don't expect that to be a possibility.


If there is a symlink, a hardlink and an executable, all with the same name, which one will it run? Which one will the shell object to? Which one should the shell object to? If a virus/SUID program overwrites a symlink, no problem, but if it traces the symlink to the executable and then overwrites that...


And that makes a lot of sense, especially for binaries that are statically linked (as Rust binaries usually are), since that could save a lot of disk space!


clang does this too.


Also if you want a program to call itself, which is sometimes useful, this way lets you actually call the same program, rather than assuming the name and path.


Don't do this - if you (reliably) want the path to the current executable there is no portable way to do it: on Linux you readlink /proc/self/exe, and on macOS you call _NSGetExecutablePath. I forget the API on Windows.
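A hedged sketch of the per-platform lookup, in Python for brevity. The /proc/self/exe branch is real Linux behavior; Python doesn't expose _NSGetExecutablePath directly, so sys.executable stands in for the non-Linux platform calls here (under a plain interpreter it does name the running executable):

```python
import os
import sys

def current_executable():
    """Best-effort absolute path of the running executable."""
    if sys.platform.startswith("linux"):
        # The kernel maintains this magic symlink; it names the real
        # executable regardless of what argv[0] claims.
        return os.readlink("/proc/self/exe")
    # macOS: the C API is _NSGetExecutablePath(); Windows: GetModuleFileName().
    # sys.executable is the closest stand-in from Python.
    return os.path.realpath(sys.executable)
```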


I would not say it in such an absolute way - /proc/self/exe has downsides as well. Since it resolves all symlinks, it breaks everything that depends on argv[0]: nice help messages, Python's virtualenv, name-based dispatch, and checking whether the program was executed via a symlink or not.

A lot of times you know you never called chdir(), in which case I'd actually recommend executing argv[0], as this is the nicest thing for admins. If you are really worried, you can use /proc/self/exe for the program name and pass argv[0] through as-is, but that's overkill a lot of the time.


Those are all cases where you're using argv[0] as an argument to the program where it's appropriate. Using it as the path to spawn a child process is incorrect. You're free to re-use it as an argument.

I have fixed enough software that made this mistake that I'm confident to be absolute about it. It's a very easy mistake to make but it's really annoying when software makes it and someone needs to deal with it at a higher level. It's better for developers to know that argv[0] isn't the path to the executable it's what was used to invoke the executable.


What’s the issue with using argv[0] as a way to spawn yourself? I don’t recall running into a lot of issues.


If it's a relative path, then changing the working directory will break it (chdir("/") is a very common tactic at the top of main()).

It's possible/desirable for the parent to change the PATH of a child process, particularly one that spawns other processes. So the argv[0] used to spawn the original process may be garbage for spawning children.

Similarly in any kind of chroot jail (which may or may not be docker these days), relative paths and PATH can be garbage even if they don't change.

The real problem is that I've seen in-house and open source frameworks/libraries that have a function like `get_executable_path` that reads `argv[0]` and this is just incorrect behavior. Spawning yourself is one of the less risky things you can do, but there are gotchas and a way to avoid them!
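The relative-path failure mode is easy to reproduce: a stand-in argv[0] stops resolving the moment the process chdir()s away, while an absolute path keeps working (directory and file names here are made up for the demo):

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
os.chdir(workdir)
os.mkdir("build")
open(os.path.join("build", "mytool"), "w").close()

argv0 = "./build/mytool"        # how a shell might have invoked us
assert os.path.exists(argv0)    # resolves fine from the start directory

os.chdir("/")                   # the classic daemon move
assert not os.path.exists(argv0)  # same string now points nowhere

# An absolute path recorded up front survives the chdir:
assert os.path.exists(os.path.join(workdir, "build", "mytool"))
```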


Hmm... I generally have so many issues with chdir (e.g. someone gives you a relative path to a file you need to read and now that's screwed up because you did a previous chdir) that I just avoid all use of it in the first place.

Generally I don't run into chroot all that often these days, and docker gives you a fully virtualized environment where, if a relative path is garbage, you may have other problems too (e.g. being given relative paths to files). You certainly have to be careful around chroot/docker anyway, as I think resolving /proc/self/exe is probably dangerous too, for all the same reasons; you need to be careful to use the literal "/proc/self/exe" string for the spawn command, require that /proc is mounted, and remember to pass through argv[0] unmolested (or mutated as needed depending on use-case).

There are enough corner cases that I'd hesitate to give blanket advice, as it requires knowing your actual execution environment; there are lots of valid choices that aren't outright "wrong". And some software may be portable, where argv[0] is a fine choice that works 90% of the time without the burden of maintaining a better solution on Linux.


It's very common for daemons/servers to chdir("/") at the top of main. Relative paths sent by clients getting broken is a feature, not a bug. (In fact I just fixed a bug related to this an hour ago because a relative path was not being canonicalized before being passed to the daemon I'm working on and it caused a file to be written to the wrong place).

There's no way to create a process such that /proc/self/exe is incorrect, except if the process itself performs a chroot, or someone has overwritten what it points to. I'm talking about some other program running the process, where those challenges don't show up.

> And some software may be portable where argv[0] is a fine choice that works 90% of the time without worrying about maintaining a better solution on Linux

Except it's broken on MacOS and Windows, too!

I'm pretty confident saying that if you want to get the path to an executable, use the bespoke method for your platform because it ain't argv[0]. I have seen that codepath break so many times that there should just be a standard library method for it (and there often is, depending), and I have written this function at several companies.

There are not any edge cases that I'm aware of, except for a few esoteric ones. But there are quite a few edge cases for using argv[0], they exist on all platforms, and it's very annoying for people that have to fix or work around it because a software author didn't understand what argv[0] was.


For the C programmer, adding a dependency is so difficult that he would rather roll his own 99% solution than use a library. It does protect him from supply chain attacks, I suppose.


> It's very common for daemons/servers to chdir("/") at the top of main. Relative paths sent by clients getting broken is a feature, not a bug.

I instead put that in the launch script / systemd unit. That way, when I run the server locally for development, weird shit doesn't happen in my root.


Yeah, that's why my post literally said, "A lot of times you know you never called chdir()..."

Sure, don't put this in the library, but there is nothing wrong with using it in the app where you know no one makes this call.


I think you forget that the exec system call's first argument is a path to an executable, followed by an array of arguments, where argv[0] lives.

I can't find issue with exec("/proc/self/exe", [program, ...]).
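You can even watch the two arguments being independent without replacing the current process, by spawning a child whose executed *file* is /proc/self/exe while its argv[0] is whatever you choose. A Linux-only sketch (under a Python interpreter, /proc/self/exe names the interpreter itself, and the argv[0] string is made up):

```python
import subprocess
import sys

def respawn_self():
    # args supplies argv[] (including a fake argv[0]); executable=
    # supplies the file actually handed to execve(). After fork(), the
    # child's /proc/self/exe still names our own binary, so this
    # re-runs "ourselves" - here, python executing a one-liner.
    proc = subprocess.run(
        ["whatever-argv0", "-c", "raise SystemExit(7)"],
        executable="/proc/self/exe",
    )
    return proc.returncode

if sys.platform.startswith("linux"):
    assert respawn_self() == 7
```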


Well, it could be, for example, that /proc is not mounted. A lot of software breaks because of this, when there's really no need for it. Also, that approach only works on Linux; if you want to write portable software, what do you do?


I am mainly pointing out that argv[0] is still valid. Writing portable software is an entirely different topic.


Note though that both of these solutions are racy, and so should not be done if "someone symlinking really fast and swapping the binaries" is in your threat model. Execing the literal /proc/self/exe path is safe though, just not the path you get back from readlink().


Well that's true, but also something that can't be addressed within a currently running process afaik.


There's also this very handy and tiny cross-platform library:

https://github.com/gpakosz/whereami


Four cardinal sins of programming: 1. Self-modifying code. (The word 'recalcitrant' comes to mind.) 2. Calling your own program to execute itself. 3. Interrupting the flow of control with a jump. 4. Non-graceful exit. 5. Renaming 'hack' as 'vi' or 'ps'.


There's no guarantee that the name and the path are still the same executable that is running, or that they even exist anymore.


In most of the variants of exec*() there are separate arguments for the thing to be executed and the *argv[] list. Argv[0] being the executable is just a convention. In perl $ARGV[0] is the first positional parameter. In

    $ perl myscript.pl a b c
$ARGV[0] is "a".
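The convention is easy to demonstrate: execve() takes the file and the argv[] vector separately, so nothing stops argv[0] from being an arbitrary string. A POSIX sketch using subprocess's executable= parameter (the fake name is made up):

```python
import subprocess

# Run /bin/sh, but lie in argv[0]. The shell sets $0 from its argv[0],
# not from the path that was actually executed.
proc = subprocess.run(
    ["not-really-sh", "-c", "echo $0"],
    executable="/bin/sh",
    capture_output=True,
    text=True,
)
assert proc.stdout.strip() == "not-really-sh"
```

This is exactly the separation exec*() exposes: the first argument names the file, and argv[] - including argv[0] - is whatever the caller chooses to pass.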


I mean sure. All software is built on assumptions. Make sure the assumptions you’re making are appropriate in context.


Unless you are on Windows


You can actually rename an executable that is running, on Windows. That's a way to handle self updates: rename the executable, create its replacement, execute the new one to make it remove the old executable.


Beware TOCTOU (time-of-check to time-of-use) problems when doing this.


You can do this without assuming the name by execing /proc/$PID/exe. Then you're not vulnerable to the argv[0] spoofing described in the article. (But of course since argv[0] does exist, you should set it properly and pass through your own argv[0] unchanged.)


That's not portable, though. OpenBSD, for example, doesn't have /proc.


That’s Linux only. Wouldn’t even work on macOS, which would likely be a significant number of your users.


coreutils-static did this too. The advantage of shared libraries and multi-call single static binaries is that they're only loaded once.


The article discusses this.



