
Libc on macOS invokes Perl as a subprocess for string processing (2017) - DyslexicAtheist
https://twitter.com/PttPrgrmmr/status/918705233072594945
======
coldtea
Well, FreeBSD does it by calling a sh builtin:

    
    
      		/*
      		 * We are the child; make /bin/sh expand `words'.
      		 */
      		(void)__libc_sigprocmask(SIG_SETMASK, &oldsigblock, NULL);
      		if ((pdes[1] != STDOUT_FILENO ?
      		    _dup2(pdes[1], STDOUT_FILENO) :
      		    _fcntl(pdes[1], F_SETFD, 0)) < 0)
      			_exit(1);
      		if (_fcntl(pdesw[0], F_SETFD, 0) < 0)
      			_exit(1);
      		execl(_PATH_BSHELL, "sh", flags & WRDE_UNDEF ? "-u" : "+u",
      		    "-c", "IFS=$1;eval \"$2\";"
      		    "freebsd_wordexp -f \"$3\" ${4:+\"$4\"}",
      		    "",
      		    ifs != NULL ? ifs : " \t\n",
      		    flags & WRDE_SHOWERR ? "" : "exec 2>/dev/null",
      		    wfdstr,
      		    flags & WRDE_NOCMD ? "-p" : "",
      		    (char *)NULL);
      		_exit(1);
        	}

~~~
MichaelMoser123
funny that glob doesn't do that
[http://xr.anadoxin.org/source/xref/macos-10.14-mojave/Libc-1...](http://xr.anadoxin.org/source/xref/macos-10.14-mojave/Libc-1272.200.26/gen/FreeBSD/glob.c)
however wordexp does.

~~~
coldtea
Glob is easier to code than wordexp. Much fewer rules.
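
For comparison, here is a minimal sketch of calling glob(3) directly, assuming a POSIX system with `<glob.h>`. As the linked glob.c suggests, pathname patterns are matched by library code rather than by a subprocess. The helper name `count_glob_matches` is made up for illustration.

```c
#include <assert.h>
#include <glob.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the thread): expand a pathname pattern
 * with glob(3). Prints each match and returns the number of matches,
 * 0 if nothing matched, or -1 on any other glob() error.
 */
long count_glob_matches(const char *pattern)
{
    glob_t g;
    long count;
    int rc;

    assert(pattern != NULL);

    rc = glob(pattern, 0, NULL, &g);
    if (rc == GLOB_NOMATCH)
        return 0;
    if (rc != 0)
        return -1;

    for (size_t i = 0; i < g.gl_pathc; i++)
        printf("%s\n", g.gl_pathv[i]);

    count = (long)g.gl_pathc;
    globfree(&g);
    return count;
}
```

glob() only has to handle pathname patterns, while wordexp() also covers tilde expansion, variable expansion, quoting and field splitting, which is where the extra rules come from.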

------
self_awareness
Seems it's very old code. New Libc doesn't do this. This is Libc from Mojave:

[http://xr.anadoxin.org/source/xref/macos-10.14-mojave/Libc-1...](http://xr.anadoxin.org/source/xref/macos-10.14-mojave/Libc-1272.200.26/gen/FreeBSD/wordexp.c#105)

The 'perl' code was a part of Libc v825.24, which seems to be included between
10.7 (Lion) and 10.8 (Mountain Lion).

Of course I still find it hilarious that even the old code did that!

~~~
opencl
The current version replaces the perl subprocess with an sh subprocess.
Doesn't seem like much of an improvement.

~~~
anyfoo
Well, wordexp's purpose is literally to "perform shell-style word expansions",
as quoted from the man page. It even supports command substitution if you
don't pass WRDE_NOCMD.

So really, the entire premise of that POSIX function is horrible[1]. Just like
system(), which also explicitly executes the given command line using the
shell. These functions are not safe to use with untrusted input (e.g.
remotely), ever.

EDIT: [1] But arguably only as horrible as calling out to the shell is in
general. If you e.g. use it as part of a shell utility that assumes full
POSIX-permissioned access to your user anyway, it's not unreasonable because
there isn't any privilege escalation at all. Though I'd argue that in the case
of system() it's probably more clear to the developer that a shell callout is
happening. And also, that the "shell-style" expansion performed here is kinda
muddily defined.
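
A minimal sketch of what this looks like from the caller's side, assuming a POSIX system with `<wordexp.h>`. The helper name `count_expanded_words` is hypothetical. It passes WRDE_NOCMD so that a request for command substitution is reported as the error WRDE_CMDSUB instead of being run. That flag is a convenience, not a sandbox, so the point about untrusted input still stands.

```c
#include <assert.h>
#include <stdio.h>
#include <wordexp.h>

/*
 * Hypothetical helper (not from the thread): expand `input` with
 * shell-style word expansion, refusing command substitution.
 * Prints each resulting word and returns the word count,
 * or -1 if wordexp() reports any error.
 */
int count_expanded_words(const char *input)
{
    wordexp_t we;
    int count;
    int rc;

    assert(input != NULL);

    rc = wordexp(input, &we, WRDE_NOCMD);
    if (rc != 0) {
        if (rc == WRDE_CMDSUB)
            fprintf(stderr, "rejected: command substitution requested\n");
        return -1;
    }

    for (size_t i = 0; i < we.we_wordc; i++)
        printf("word[%zu] = %s\n", i, we.we_wordv[i]);

    count = (int)we.we_wordc;
    wordfree(&we);
    return count;
}
```

Depending on the libc, the call to wordexp() above may or may not start a shell behind the scenes; the caller cannot tell from the API.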

------
panic
Discussed in 2015 (the code was out of date even then):
[https://news.ycombinator.com/item?id=9025572](https://news.ycombinator.com/item?id=9025572)

------
josteink
That's obviously wrong.

It should use Emacs instead.

~~~
protomyth
I’m actually surprised there isn’t a library called libemacs. It would fulfill
the mythos and be really useful for a lot of tools.

~~~
gnulinux
Emacs doesn't do anything in C other than running Lisp; Emacs really is just a
Lisp implementation with a GUI and a text-editing focus. libemacs would be just
like Lua or CPython.

~~~
kazinator
Emacs has over a quarter million lines of C, which must be there for a reason.

------
fouronnes3
Perl doesn't depend on libc?

~~~
detaro
It does, but that's "not a problem" as long as it doesn't use this function to
implement something that's executed during this function.

------
jslabovitz
Turns out this is actually documented in the manpage for wordexp()! (And
refers to the mentioned fact that it now calls 'sh' directly.)

> BUGS

> Do not pass untrusted user data to wordexp(), regardless of whether the
> WRDE_NOCMD flag is set. The wordexp() function attempts to detect input that
> would cause commands to be executed before passing it to the shell but it
> does not use the same parser so it may be fooled.

> The current wordexp() implementation does not recognize multibyte
> characters, since the shell (which it invokes to perform expansions) does
> not.

------
senozhatsky
Shall somebody send a pull request? [0]

    
    
        -    /* XXX this is _not_ designed to be fast */
        +    /* XXX this is _not_ designed to be safe */
    

[0] [https://github.com/Apple-FOSS-Mirror/Libc/blob/2ca2ae7464771...](https://github.com/Apple-FOSS-Mirror/Libc/blob/2ca2ae74647714acfc18674c3114b1a5d3325d7d/gen/wordexp.c#L192)

~~~
acura
Isn't the line below that one enough? /* wordexp is also rife with security
"challenges",

~~~
senozhatsky
My assumption was that those security "challenges" were related to expansion
(wordexp()) per se; not to the way wordexp() was implemented in this
particular case.

------
MichaelMoser123
Now I see why you can't change /usr/bin on macOS. Actually, there are both
perl5 and python2.7 in /usr/bin, so libc does have a choice (that is, if the
tweet is true)...

[https://en.wikipedia.org/wiki/System_Integrity_Protection](https://en.wikipedia.org/wiki/System_Integrity_Protection)

~~~
olliej
You can’t change /usr/bin because that’s a common malware attack vector.

It also has the nice effect of forcing user installed utilities to install in
the /local/ variants (which user build projects should be doing on Linux
iirc), so an OS update doesn’t overwrite user data.

~~~
tinus_hn
Also if it bothers you, you can just turn it off. It's just difficult enough
people can't simply put some screenshots on a webpage to work around it
because they are too lazy to code properly.

------
fit2rule
I find it hard to believe that there's any software out there that doesn't,
eventually, invoke Perl as a subprocess .. I mean, it's Perl.

~~~
leejo
That's essentially what the OG tweet is saying: _Pinnacle of software
development: you can solve the problem with three lines of Perl, but you
don’t, because of a non-argument against Perl._ Since there didn't use to be
that many arguments against perl/Perl it worked its way into a lot of systems
even if it wasn't actually implementing the system.

Of course Perl, having fallen out of vogue, probably wouldn't be used today
but it used to be _everywhere_ so its footprint is still pretty large.

Also - I can't help but see the irony in shelling out _to_ perl given
experienced Perl developers always tell the less experienced ones to avoid
shelling out _from_ Perl if possible and to only do that as a last resort if
there isn't an existing library to solve the problem.

~~~
rurban
Only loud-mouthed inexperienced middle managers tell their juniors to follow
NIH and do everything in pure perl. It's pure fear to be broken by changed
dependencies.

More experienced managers tell their devs to shell out to standard tools like
sh, wget, curl, dig, mysqlclient and not use the builtin pure-perl libs. The
tools are much better, the code is 10x smaller and faster, and you are getting
updates for free (e.g. ssl). Even in C I very often call
system("wget http://...") and avoid libcurl.

~~~
leejo
_Only loud-mouthed inexperienced middle managers tell their juniors to follow
NIH and do everything in pure perl. It's pure fear to be broken by changed
dependencies._

It's not NIH if you're advocating using CPAN modules or existing libraries.

Sure, use the right tool - but don't advocate to juniors a method that can
lead to OS command injection because they're not experienced enough to know
to, or how to, sanitise their inputs.
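
One way to keep using an external tool while avoiding the injection risk is to skip the shell and pass the arguments as an argv array. Below is a minimal sketch using posix_spawnp(3), assuming a POSIX system; the helper name `run_without_shell` is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * Hypothetical helper (not from the thread): run argv[0], looked up in
 * PATH, with the given arguments and no shell in between.
 * Returns the child's exit status, or -1 if the spawn failed or the
 * child did not exit normally.
 */
int run_without_shell(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;

    /* Retry the wait if a signal interrupts it. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this approach an argument such as `hello; exit 3` reaches the program as one literal string, because no shell ever parses it. The same string interpolated into a system() command line would be interpreted by the shell.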

------
the_mitsuhiko
Not exactly related to the link but apparently the author of that tweet
blocked me. I don’t recall having ever had any interactions with them. Is
there a way to contact that person and figure out why? I have no idea what I
did and I’m quite puzzled.

~~~
stonogo
You find out someone has blocked you, and your instinct is to communicate with
them? The specific thing they have explicitly disallowed? I'd reconsider this,
and just move on.

~~~
the_mitsuhiko
Is that how you are supposed to deal with that? How is one supposed to improve
as a human if one does not know where their mistakes are? If I did something
wrong I would like to know.

I did not contact that person anyways and I doubt there is a way. But at least
knowing when I was blocked could help me deduce what might be the reason.

Honestly I do find this quite upsetting.

~~~
Shish2k
> Is that how you are supposed to deal with that?

I believe the commonly accepted answer is "Yes, it's not my job to give you
the necessary information to improve yourself". The same people who say that
also often ask "Why aren't the people around me improving themselves?" :P

(Personally, I would prefer it if people pointed out my mistakes, and I do the
same for others as a courtesy, not an obligation. I do understand if they
don't have the energy to do that, but I think if they don't, then they forfeit
the right to complain about a lack of improvement)

~~~
the_mitsuhiko
> I believe the commonly accepted answer is "Yes, it's not my job to give you
> the necessary information to improve yourself". The same people who say that
> also often ask "Why aren't the people around me improving themselves?"

But that’s in no way comparable. When I screw up in the real world, I can tell
from people’s responses: body language, their actions, etc. In this case, one
just discovers at some point that someone has blocked them, without any
indication of when that happened or why.

//edit: also, even weirder: in an effort to see where our interactions might
have been, I found a tweet from 2014 by myself about the same topic:
[https://mobile.twitter.com/mitsuhiko/status/5264923088676700...](https://mobile.twitter.com/mitsuhiko/status/526492308867670017)

------
acura
Doesn't look good at all, but the latest commit to that repo is from Oct 11,
2012.

So how well does it reflect reality?

~~~
raimue
The official sources are published by Apple on
[https://opensource.apple.com](https://opensource.apple.com)

This repository is just a snapshot that somebody else prepared and uploaded to
GitHub, but apparently it is not maintained.

