Surprising Consequences of macOS’s Environment Variable Sanitization (hynek.me)
95 points by ash on Jan 11, 2023 | 23 comments



It’s definitely surprising that macOS does this, but it also makes me wonder whether the authors considered just modifying the binary so that the dynamic linker could find the library: https://www.joyfulbikeshedding.com/blog/2021-01-13-alternati...


They did (it's me :)) and unfortunately rpath can't be used in this case, because that particular Python driver uses ctypes (https://docs.python.org/3/library/ctypes.html) to open the binary drivers, which, oversimplified, means that there is no binary to modify.

I hope I make it clear in the preamble that this is bad and one should not have to deal with this – but it happens in practice and I hope such a summary is useful.

For posterity: if you want to wrap / use a C library in Python, you should go for CFFI (Cython works too and is overall faster, but has other downsides). This PyCon US video is a great up-to-date summary: https://www.youtube.com/watch?v=gROGDQakzas


Maybe it's possible to patch the dependency? For example, for Node.js / npm, there are automated ways to do this, like [patch-package](https://www.npmjs.com/package/patch-package). Does Python / pip have something similar?


You mean at the Python driver level? Unfortunately, that doesn't work with ctypes.

I've tried it by adding the DYLD_LIBRARY_PATH to os.environ before calling ctypes.LibraryLoader.LoadLibrary (https://docs.python.org/3/library/ctypes.html#ctypes.Library...) and it didn't work. I suspect ctypes gets somehow initialized much sooner and adding environment variables in your apps doesn't help.

TBH I didn't research it further, since the problems of the post are more general and it can happen that you trip into them regardless of runtime.


Here’s the relevant source: https://github.com/python/cpython/blob/8dd2766d99f8f51ad62dc...

I believe you should just be able to replace the library name with an absolute path to the library, and remove the need for the lookup at all?
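
A minimal sketch of that idea, assuming you control the code that calls ctypes: resolve the library to an absolute path yourself and hand that path to `ctypes.CDLL`, so no dyld search path is consulted for the library itself. The helper names `find_in_dirs` and `load_by_absolute_path`, and the file name `libexample.dylib` in the usage note, are hypothetical and not part of the driver or of CPython.

```python
import ctypes
from pathlib import Path
from typing import Optional, Sequence


def find_in_dirs(filename: str, dirs: Sequence[str]) -> Optional[str]:
    """Return the absolute path of the first dirs entry that contains filename."""
    for directory in dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return str(candidate.resolve())
    return None


def load_by_absolute_path(filename: str, dirs: Sequence[str]) -> ctypes.CDLL:
    """Load a shared library by absolute path instead of by bare name."""
    path = find_in_dirs(filename, dirs)
    if path is None:
        raise OSError(f"{filename} not found in {list(dirs)}")
    # An absolute path means dlopen() does not need DYLD_LIBRARY_PATH
    # to locate this particular file.
    return ctypes.CDLL(path)
```

Something like `load_by_absolute_path("libexample.dylib", ["/opt/vendor/lib"])` would then replace the name-based lookup. Libraries that the loaded file depends on are still resolved by the dynamic linker as usual.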


Right, but that would take some really deep patching of code I don't control, just so it works in development.

If this were a production issue, I would probably fork the driver and do what you're suggesting (it's not like it's actively maintained or something :|).


Why not fork it? If only to have the code under your control in case it abruptly becomes inaccessible.

Then the fix is a single if statement. Seems worth it to me.


You can still do it.

rpath doesn't just apply to executable binaries, it applies to all shared objects including libraries. So, what you can do is modify the dylib that the python/ctypes module will eventually dlopen(). You should be able to find it by rooting around in the files installed by the module in question.

rpath is evaluated not just at binary execution, but also upon any dynamic linking -- including dlopen(). So it will work for both ctypes and CFFI.

Happy to provide further pointers if you need!


Out of curiosity, what are the downsides of Cython?


Conceptually, Cython is mainly for accelerating Python code, and can _also_ access C code. Meanwhile CFFI is specifically for calling C code and nothing else. I recommend the video for the differences.

One concrete thing that pops to my mind is that Cython doesn't support Py_LIMITED_API which means that you need to ship a lot more binary wheels. At least the issue is still open (https://github.com/cython/cython/issues/2542) and Cython projects IME need new wheels for each minor Python release. Compare that to cffi projects that (musl & pypy aside) only have to ship wheels for one Python version / architecture: https://pypi.org/project/argon2-cffi-bindings/#files


As a developer I really dislike this security feature of macOS. We have a whole system for running binaries from the build directory (ie. without installing) which relies on environment variables including DYLD_*, and it works everywhere except macOS. Unless you do some scary-looking changes to the system at boot time.

(If I used macOS for anything other than ssh-ing in for development, I'd probably appreciate it, especially if running $random stuff).


You can work around that by using relative linker paths (rpaths).

See 'install_name_tool -add_rpath' (to add an rpath search path to a library or executable for finding other libraries) and 'install_name_tool -id' (to change the ID of the library to '@rpath/<name of library>'), i.e.:

   install_name_tool -add_rpath @loader_path/../lib <path-to-library>
   install_name_tool -id @rpath/<name-of-library> <path-to-library>

Works with Frameworks too.

Agree it is a PITA to set up though.
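
To take some of the pain out of repeating those two commands for every library in a build tree, here is a small sketch that generates them. It assumes the same `@loader_path/../lib` layout as above; the function names `rpath_commands` and `apply_rpath` are made up for illustration, and the default is a dry run that only prints.

```python
import os
import shlex
import subprocess
from typing import List


def rpath_commands(library: str, rpath: str = "@loader_path/../lib") -> List[List[str]]:
    """Build the two install_name_tool invocations for one library."""
    name = os.path.basename(library)
    return [
        # Add a search path that is relative to whatever loads the library.
        ["install_name_tool", "-add_rpath", rpath, library],
        # Change the library's ID so dependents look it up via @rpath.
        ["install_name_tool", "-id", f"@rpath/{name}", library],
    ]


def apply_rpath(library: str, dry_run: bool = True) -> None:
    """Print the commands, or run them when dry_run is False (macOS only)."""
    for cmd in rpath_commands(library):
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Running `apply_rpath("build/lib/libfoo.dylib")` prints the commands for review before anything in the binary is changed.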


Can you use the linker to hardwire the build directory for this case?


The author writes that this kind of ad-hoc breakage of system interfaces is a "good thing in general" but it absolutely is not. I've dealt with this kind of lazy, reactionary response to security extensively. It's pernicious and becomes incredibly destructive at scale.

Interfaces must be simple and consistent. Inconsistencies make systems dramatically more difficult to use -- but it's notoriously hard to measure the cost of system complexity and the countervailing pitch to confound simplicity is very straightforward: "Bad guys did this thing. This is a rarely used thing. Let's break this thing."

As this blog illustrates, it is now prohibitively difficult to explain or predict how environment propagates in OSX. There is a dramatic increase in the likelihood of bugs. It would be better to simply delete the functionality of DYLD_LIBRARY_PATH from the dynamic linker. Of course, we can't do that because DYLD_LIBRARY_PATH exists for an important reason -- but that important reason is also why DYLD_LIBRARY_PATH's functionality should be protected and not capriciously broken in this fashion.

This kind of approach to security will always result in fragile, buggy systems in the long term. It will also undermine developer comfort, ultimately reducing the number of people willing to develop software for a platform (one straw among many, but they add up and eventually there's a final straw)

Are attackers really going to be deterred by destroying the convenience of setting environments in shells? Of course not. Figuring out how to set environment without using Apple's broken userland is a trivial task. So what's the point?


I wonder if you could write a LD_PRELOAD library that overrides setenv() (or whatever is used to sanitize the environment) calling the real setenv found via dlsym(RTLD_NEXT, "setenv") in most cases but suppressing the sanitizing of DYLD_LIBRARY_PATH, then run a shell with this library preloaded and Bob's your uncle. Would be kind of gross, and if you legit needed to clear DYLD_LIBRARY_PATH, you'd be out of luck.


Something like this maybe:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <dlfcn.h>
    
    typedef int (*unsetenv_prototype)(const char *name);
    static unsetenv_prototype real_unsetenv = NULL;
    
    int unsetenv(const char *name)
    {
     if (!real_unsetenv) {
      const char *msg;
      dlerror(); /* clear any stale error before the lookup */
      *(void **) &real_unsetenv = dlsym(RTLD_NEXT, "unsetenv");
      msg = dlerror();
      if (msg) {
       fprintf(stderr, "Failed to override unsetenv(): %s\n", msg);
       fflush(stderr);
       real_unsetenv = NULL;
       return -1;
      }
     }
     if (strcmp(name, "DYLD_LIBRARY_PATH") == 0) /* suppress sanitizing this */
      return 0;
     return real_unsetenv(name);
    }

Assuming unsetenv() is what is used to do the sanitizing. It might be more likely that execve() is what needs to be intercepted though, and by then it's too late (unless you just need to jam in a constant value known ahead of time).

Edit: another idea. If it only sanitizes certain variables, but not random ones, override setenv() and getenv(), and when passed DYLD_LIBRARY_PATH, instead set/get "MY_SPECIAL_LIBRARY_PATH", which isn't sanitized, and just set that instead of DYLD_LIBRARY_PATH.
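
A rough sketch of that aliasing idea, done in a small Python launcher rather than by overriding setenv()/getenv(): carry the value in a variable without the DYLD_ prefix and copy it back just before exec'ing the real program. It assumes that only DYLD_* names are stripped and that the final program is not itself a protected binary, neither of which I have verified. `MY_SPECIAL_LIBRARY_PATH` is the made-up name from above, and both function names are hypothetical.

```python
import os
from typing import Dict, List, Mapping

ALIAS = "MY_SPECIAL_LIBRARY_PATH"  # made-up carrier variable, not a macOS feature
TARGET = "DYLD_LIBRARY_PATH"


def restore_library_path(environ: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of environ with TARGET refilled from ALIAS if it was stripped."""
    env = dict(environ)
    if TARGET not in env and ALIAS in env:
        env[TARGET] = env[ALIAS]
    return env


def exec_with_restored_path(program: str, argv: List[str]) -> None:
    """Replace the current process with program, passing the restored environment."""
    os.execve(program, argv, restore_library_path(os.environ))
```

The launcher has to be the last hop before the program that needs the variable; any shell in between would presumably strip it again.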


LD_PRELOAD is an ELF thing, macOS has DYLD_INSERT_LIBRARIES, but it doesn't matter because this security feature is implemented in the dynamic loader itself, which also precludes using DYLD_INSERT_LIBRARIES anyway.


I am not sure why this person has not considered wrapping every command in /usr/bin/env? Regardless of the shell, env(1) would preserve that variable and do an execve.


That doesn't help you if there's a surprise call to /bin/sh or /bin/bash somewhere in the call stack. Keep reading, I know it's long but I tried to make it comprehensive. :)
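
A quick way to check for such a surprise shell hop on your own machine is to compare what a child process sees when it is started directly and when it is started through `/bin/sh`. This is only a probe sketch; the function name `child_sees` is made up, and I'm deliberately not claiming what it prints on any particular macOS version.

```python
import os
import shlex
import subprocess
import sys


def child_sees(var: str, value: str, via_shell: bool) -> str:
    """Start a child Python with var=value and return what the child reads back."""
    code = f"import os; print(os.environ.get({var!r}, ''))"
    argv = [sys.executable, "-c", code]
    env = dict(os.environ, **{var: value})
    if via_shell:
        # shell=True runs the command through /bin/sh on POSIX systems,
        # which is the extra hop where sanitization can happen.
        result = subprocess.run(shlex.join(argv), shell=True, env=env,
                                capture_output=True, text=True, check=True)
    else:
        result = subprocess.run(argv, env=env,
                                capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

Calling it with `"DYLD_LIBRARY_PATH"` once with `via_shell=False` and once with `via_shell=True` shows whether the shell hop drops the variable. The result also depends on whether `sys.executable` is itself a protected system binary.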


The underlying problem (as is often the case) is using references/names (like /bin/sh) without defining them.

FYI the GNU Make in Nixpkgs doesn't have this problem, since it patches the reference https://github.com/NixOS/nixpkgs/blob/nixos-22.11/pkgs/devel...

Nixpkgs also provides wrappers/shims around Python, etc. but it also patches all #! lines to hard-code the exact paths of specific binaries (e.g. `#!/nix/store/iffl6dlplhv22i2xy7n1w51a5r631kmi-bash-5.1-p16/bin/bash`) https://github.com/NixOS/nixpkgs/blob/master/pkgs/build-supp...


To see how much havoc such a core change causes for Nix package builders, please see my troubleshooting report on building Postgres in Nix [1].

BTW, my conclusion was that Postgres, as distributed by the Nixpkgs, is untested because they very likely never bothered to run `make check`. And hence I would not recommend anyone to rely on it for production use.

[1]: https://www.postgresql.org/message-id/flat/CABwTF4VBTKLORbKM...


> To see how much havoc such a core change causes for Nix package builders, please see my troubleshooting report on building Postgres in Nix

I consider this to be an excellent example of things working as expected. The point of Nix is not to make things easy, it's to expose all of the mistakes we've made (like depending on unspecified files, or attempting network connections, etc.)

In this case, Nix(pkgs) has found and patched a bug in GNU Make: that it relies on undefined behaviour, namely the existence and behaviour of a /bin/sh executable. That's important, since the behaviour of /bin/sh differs between distros, e.g. some use Dash, some use Bash in POSIX-compatibility mode, macOS uses a version of Bash that's over a decade out of date, etc. Since GNU Make is used to build very low-level components, like GCC, that bug was causing pretty much everything in Nixpkgs to be under-specified/undefined. I'm very glad they fixed it!

It sounds like the Postgres test suite is also under-specified, since it attempts to inherit this undefined behaviour from GNU Make. Thankfully it sounds like Nix is doing its job, by rejecting this incomplete package definition. Even better, the problem is exposed when running the test suite; great job Postgres devs, that's exactly what test suites are for!


It should be possible to avoid problems with dynamic linking with judicious choice of $ORIGIN at compilation time.



