
Never run ‘python’ in your downloads folder - ingve
https://glyph.twistedmatrix.com/2020/08/never-run-python-in-your-downloads-folder.html
======
codethief
Python's import system would be my biggest complaint about the language. I've
used Python for more than ten years, but there are so many pitfalls with
regards to imports that I still have to look things up on a regular basis.
What makes things even worse is that packages sometimes even meddle with how
imports work. (See e.g. Tensorflow – I recently had to debug the import magic
they're doing. It was a nightmare.)

Don't get me wrong, I LOVE Python! It's my go-to language for both small and
big tasks. But sometimes I do wish Python had a function like PHP's require()…
(Obviously for package-local imports, not for importing other packages.)

~~~
nonbirithm
Lua uses a "require()" function instead of an import statement for loading
modules. I think that this is a crucial difference from a lot of programming
languages that is vastly understated. What this means is that you can
completely replace "require()" with your own function that has complete
control over how the module importing system works. Because of this it is
possible to build your own runtime code hotloader entirely in Lua without
needing to modify the source of the compiler, and thus Lua is the only
"mainstream" programming language listed in the list of notable live coding
environments on Wikipedia[0]. (I'm surprised that Clojure isn't listed there,
though.)

The only way this can be accomplished is by designing the import system from
the start to use a function instead of a statement. It's too late for Python
to redo things and change its import system, unfortunately. It really is the
one language feature you have to get right the first time or you'll miss out
on those kinds of enhancements forever.

[0]
[https://en.m.wikipedia.org/wiki/Live_coding#Notable_live_cod...](https://en.m.wikipedia.org/wiki/Live_coding#Notable_live_coding_environments)

~~~
yzmtf2008
Untrue, you could literally replace `__import__` in python and achieve the
same thing:
[https://docs.python.org/3.8/library/functions.html#__import_...](https://docs.python.org/3.8/library/functions.html#__import__)
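
A minimal sketch of what that replacement could look like. The helper name `install_logging_import` and the `seen` list are made up for illustration; only `builtins.__import__` and its signature come from the linked docs, which also say that replacing it is strongly discouraged in favor of import hooks.

```python
import builtins


def install_logging_import(log):
    """Wrap builtins.__import__ so each import request is appended to `log`.

    Returns a function that puts the original hook back.
    (Hypothetical helper, not part of the standard library.)
    """
    original_import = builtins.__import__

    def logging_import(name, globals=None, locals=None, fromlist=(), level=0):
        log.append(name)  # custom behavior goes here
        return original_import(name, globals, locals, fromlist, level)

    builtins.__import__ = logging_import

    def restore():
        builtins.__import__ = original_import

    return restore


seen = []
restore = install_logging_import(seen)
try:
    import json  # the import statement is routed through logging_import
finally:
    restore()

print("json" in seen)
```

The wrapper only logs and then delegates, so it shows the hook point rather than a hot reloader.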

~~~
nonbirithm
Sorry, I think you're right. I think the actual problem I was trying to get at
is that Python makes it easy to both import from a module and bind the
imported functions/values to local names at the same time. This means the
values get bound privately in the module, instead of always being late bound
by accessing the module as a dictionary. And because the code in libraries
uses this way of importing things, you can't get at the old names if you try
to hotload a module.

The docs for Python's importlib say as much:

> If a module imports objects from another module using from … import …,
> calling reload() for the other module does not redefine the objects imported
> from it — one way around this is to re-execute the from statement, another
> is to use import and qualified names (module.name) instead.
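
To make the quoted behavior concrete, here is a small sketch. The module name `hot_demo_mod` and its `greet` function are invented for the demo; the module is written to a temporary directory, imported both ways, rewritten, and reloaded.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_reload_binding():
    """Return (result via from-import name, result via qualified name) after a reload."""
    old_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # keep stale .pyc files out of the demo
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "hot_demo_mod.py"  # hypothetical module
        source.write_text("def greet():\n    return 'v1'\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            import hot_demo_mod
            from hot_demo_mod import greet  # binds the current function object

            source.write_text("def greet():\n    return 'version 2'\n")
            importlib.reload(hot_demo_mod)
            return greet(), hot_demo_mod.greet()
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("hot_demo_mod", None)
            sys.dont_write_bytecode = old_flag


early, late = demo_reload_binding()
print(early)  # v1
print(late)   # version 2
```

The name bound by `from ... import` still points at the old function, while the qualified lookup sees the reloaded one.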

But Lua isn't actually different here since you can still bind the members of
an imported module to local names manually, it's just that there is no "from"
keyword that does this for you. My personal theory is because of the lack of
something like "import * from" in Lua, it became more common for library
authors to use late binding and access the returned module by indexing into
it. However this is just my personal theory.

Also, now that I think about it you still have to make adjustments to your
code in order to have hotloading be feasible, even in Lua. I deliberately
chose to use late binding in the code I wrote with "qualified names" (really
just table indexing). And all the third party dependencies are versioned in
the project's repository, and don't have dependencies on other external
modules, so there was no need to worry about a dependency breaking hot
reloading because it uses imports differently. I think this might be a
cultural thing. A lot of Lua libraries are written in a single-file, no-
dependency manner which makes them much easier to integrate in the ways I
want. Python seems to have a much larger ecosystem with many external
dependencies depending on others.

So yes, I think it makes sense that Python could have module reloading, but
the problem is that the language encourages the use of local binding from
imports which is incompatible with it, and there is already a lot of code that
imports things like that, so it's not as practical if you're using libraries
from pip or use "from <...> import" anywhere in your codebase.

~~~
anonymoushn
I think you have problems if you just write local foo = require ("foo") in two
different files and you want to replace the module object in both of them at
once.

Edit: the below is probably wrong about what you're doing. It sounds like your
require-replacement merges the new return value of a module into the old one,
so code that accesses stuff through the module's top level table will get the
right result.

Wrong stuff: Maybe the hot reloading system you use relies on modules mutating
the module object rather than making a new one. This is pretty unusual among
Lua modules though. I usually see people unconditionally make a table and
return it.

~~~
nonbirithm
Yes, the table merging is what mine actually does. It clears the old table
reference and inserts the contents of the new module into it. It seems to work
well enough for my use-case.
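
For comparison, a rough Python analogue of that clear-and-merge approach. This is only a sketch and not the Lua code described above: `load_from_source`, `hot_swap`, and the `plugin` module are hypothetical names, and the modules are built in memory rather than loaded from files.

```python
import types


def load_from_source(name, source):
    """Build a module object in memory from a source string."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


def hot_swap(live, fresh):
    """Empty the live module's namespace and copy the fresh contents into it.

    The module object keeps its identity, so every existing reference
    to it sees the new definitions on the next attribute lookup.
    """
    live.__dict__.clear()
    live.__dict__.update(fresh.__dict__)
    return live


plugin = load_from_source("plugin", "def run():\n    return 'old behaviour'\n")
holder_a = plugin  # two places in the program holding the same module
holder_b = plugin

replacement = load_from_source("plugin", "def run():\n    return 'new behaviour'\n")
hot_swap(plugin, replacement)

print(holder_a.run())  # new behaviour
print(holder_b.run())  # new behaviour
```

As with the Lua version, this only helps callers that look names up through the module each time; a function object saved earlier still refers to the old code.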

------
heavenlyblue
Well, to be clear if you accidentally downloaded a main.c file with a virus
and then accidentally copied someone’s advice to compile an app using “cpp
main.c” while being accidentally in your Downloads folder, you will also
accidentally end up with a virus on your computer.

This article is too long for what it is trying to say. Although I will give the
author kudos for being able to write so well about nothing - it took me
scrolling through about a third of the article to realise that what they were
saying was blatantly obvious.

PS: the only reason I clicked on it was because I thought python by default
would import files with some “magic” names in the current directory.

~~~
kgwgk
For most people it’s not blatantly obvious that running

    python -m pip install ./totally-legit-package.whl

in a folder will execute a file pip.py in that folder.
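
A harmless way to see this for yourself, as a sketch: the temporary directory stands in for a downloads folder, and the `pip.py` written there only prints a marker. `PYTHONSAFEPATH` is removed from the child environment because, if set on newer Pythons, it would keep the current directory off `sys.path`.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_dash_m_pip_in(directory):
    """Run `python -m pip --version` with `directory` as the working directory."""
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        cwd=directory,
        env=env,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as downloads:
    # Stand-in for a file that landed in the folder without you noticing.
    (Path(downloads) / "pip.py").write_text("print('local pip.py executed')\n")
    output = run_dash_m_pip_in(downloads)

print(output)
```

If the real pip had run, the output would be its version string rather than the marker.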

~~~
verroq
The real question is who runs python -m pip when there is a perfectly good
/usr/bin/pip?

~~~
coldtea
That's an easy question to answer.

python -m pip is often advised over pip or /usr/bin/pip, because it uses the
version of pip associated with that particular major+minor python install.

[https://snarky.ca/why-you-should-use-python-m-pip/](https://snarky.ca/why-you-should-use-python-m-pip/)

~~~
travisoneill1
Wait, but isn't the point of the article that it will not do that, and rather
just run the pip.py in your current working directory?

~~~
Sebb767
Only in the (probably unlikely) case that you have a `pip.py` in your CWD. In
all other cases, it will execute pip as intended.

------
taeric
Python seems to embrace more magic and other subtle visual cues than any other
language. I am convinced it is only "beginner friendly" because of how much
folks insist it is so.

~~~
kevin_thibedeau
Python has little magic in it. It's 3rd party libraries that do un-Pythonic
things.

~~~
quietbritishjim
The descriptor protocol is definitely magic!

It normally doesn't _matter_ that it's magic because most people don't need to
interact with it. And it causes a very intuitive result in the part of the
language that most people actually do interact with: that x = obj.foo; x() is
the same as obj.foo() (not true in Javascript!). But that doesn't change the
fact that method binding happens in a more complex way under the hood than
you'd expect.
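
A short sketch of what happens under the hood. The class `Greeter` is invented for the example; `__get__` is the real descriptor hook that functions implement.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def foo(self):
        return f"hello from {self.name}"


obj = Greeter("obj")

# The intuitive part: a method pulled off the instance stays bound to it.
x = obj.foo
print(x() == obj.foo())  # True

# The hidden part: the class stores a plain function, and attribute lookup
# on the instance calls its __get__ to produce a bound method.
raw_function = Greeter.__dict__["foo"]
bound_by_hand = raw_function.__get__(obj, Greeter)
print(bound_by_hand())                # hello from obj
print(bound_by_hand.__self__ is obj)  # True
```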

------
rurban
With Perl we fixed that security risk earlier (3 years ago) and better. The
path of the script is never added to the global script path anymore. This
fixed all these attacks automatically, whether the script was in /tmp,
~/Downloads or just ~/. We don't need to warn our users to take care, they won't. Most of
them don't even read such best practice warnings.

~~~
JJMcJ
Similar to how Bash usually doesn't have the current directory on the default
PATH for executables: you must explicitly reference it as ./local-executable,
for similar reasons.

~~~
rahimnathwani
csh (tcsh?) used to be my default shell, and the current directory was first
in the path. That was really convenient. At the time, I didn't realise the
risks.

------
dwheeler
Hmm.. it seems that part of the problem is that Python interprets "empty entry
in PYTHONPATH" as being the same as ".". That just seems wrong. If you want
the current directory as one of your PYTHONPATH entries, then you should be
required to insert ".".

Perhaps the sharp edge would be reduced if Python skipped empty entries in
PYTHONPATH. This might require some people to explicitly add "." in their
configurations, but they should have done that anyway, and it wouldn't require
_code_ to change. PYTHONPATH is also getting less used anyway, so this might
be a relatively unnoticed change by most.

What do you think? Would that be worth proposing?

~~~
fanf2
It is probably for compatibility with the way POSIX handles $PATH
[https://pubs.opengroup.org/onlinepubs/9699919799.2018edition...](https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/basedefs/V1_chap08.html)

~~~
jwilk
Right; but for PATH this treatment of empty entries is mostly harmless,
because the variable is normally non-empty.

~~~
dwheeler
Yes, that is _exactly_ the problem. People treat PYTHONPATH just like PATH,
but the "easy scripting tricks" that are fine with PATH become subtle security
vulnerabilities when you do the same thing with PYTHONPATH.

E.g., people will often add a directory to PATH by doing:

PATH="$PATH:my_new_dir"

This is practically always fine for PATH, since it's practically always set.
If you use exactly the same syntax with PYTHONPATH, you have an instant
security vulnerability. Now the first directory searched is the current
directory, and if you ever wander into directories with untrusted files (like
"Downloads/" or "somebody_elses_container/") it could become serious fast.
This is a sharp edge that was never intended.
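
A small sketch of the string mechanics, with the same idiom modeled in Python. `naive_append` and `safe_append` are hypothetical helpers; the code only shows where the empty entry comes from and does not itself test how Python treats it.

```python
def naive_append(existing, new_dir, sep=":"):
    """Mirror VAR="$VAR:new_dir"; an unset variable expands to ""."""
    return f"{existing}{sep}{new_dir}"


def safe_append(existing, new_dir, sep=":"):
    """Append new_dir without ever producing an empty entry."""
    entries = [entry for entry in existing.split(sep) if entry]
    entries.append(new_dir)
    return sep.join(entries)


# PYTHONPATH is usually unset, so the naive form starts with an empty entry.
print(naive_append("", "my_new_dir").split(":"))  # ['', 'my_new_dir']
print(safe_append("", "my_new_dir").split(":"))   # ['my_new_dir']
print(safe_append("/opt/lib", "my_new_dir"))      # /opt/lib:my_new_dir
```

In the shell itself, the equivalent of the safe form is PYTHONPATH="${PYTHONPATH:+$PYTHONPATH:}my_new_dir", which only adds the separator when the variable is already set and non-empty.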

I'm going to look into proposing this as a PEP. The change is trivial, but
there may be surprising uses. A PEP would mean that it'd get more scrutiny (so
the change will be less likely to be reverted later).

------
teddyh
The Python 3 executable has the “-I” (isolated mode) option to mitigate this
kind of thing.

~~~
jwilk
-I is hardly useful, because it also removes the user's site-packages
directory (typically ~/.local/lib/python3.X/site-packages/) from sys.path.

So packages installed with, say, "python3 -m pip install --user" wouldn't be
importable.
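
Both effects can be checked with a short sketch. The module name `shadow_demo` is made up, and `PYTHONSAFEPATH` is stripped from the environment so the non-isolated run behaves like a default setup.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_module(flags, directory, module="shadow_demo"):
    """Run `python [flags] -m module` with `directory` as the working directory."""
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    return subprocess.run(
        [sys.executable, *flags, "-m", module],
        cwd=directory,
        env=env,
        capture_output=True,
        text=True,
    )


with tempfile.TemporaryDirectory() as workdir:
    (Path(workdir) / "shadow_demo.py").write_text("print('ran from cwd')\n")
    normal = run_module([], workdir)
    isolated = run_module(["-I"], workdir)

print(normal.stdout.strip())     # ran from cwd
print(isolated.returncode != 0)  # True: the module in the cwd is not found

# The trade-off described above: -I also turns off the user site directory.
flag_report = subprocess.run(
    [sys.executable, "-I", "-c",
     "import sys; print(sys.flags.isolated, sys.flags.no_user_site)"],
    capture_output=True,
    text=True,
).stdout.strip()
print(flag_report)  # 1 1
```

Python 3.11 later added a separate -P option (and the PYTHONSAFEPATH variable) that keeps the current directory off sys.path without disabling the user site directory.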

------
tgflynn
Aside from the Python related stuff one thing that confuses me about this
article is the claim that a browser can put files in your Downloads directory
without you requesting that. Exactly what browser/HTTP feature allows that?

I don't think I've ever encountered this behavior and any time I've
downloaded a file I've had the opportunity to see and change the name it is
stored under.

~~~
Sebb767
It is possible when you select 'always save to downloads' in your preferred
browser.

~~~
AlphaSite
In safari it will not let you download without explicit user interaction.

~~~
masklinn
Safari will absolutely let you do it, though it might not be the default
anymore (certainly used to be, and you can configure it in Preferences >
Websites > Downloads).

Way more problematically, Safari will also open “safe” files (which includes
zips and PDFs) by default after having downloaded them.

------
richard_g
Reminds me of a Wordpress site that ended up having a bunch of files uploaded
to the uploads directory with the extension .php -- and the webserver allowed
them to be executed. Unfortunately things like this are often overlooked!

------
Const-me
I wonder who and why decided that downloading files from the internets into a
fixed location is a good idea? A downloaded file may be a PDF document,
installer of a software, a song, a photo, CAD file, anything at all. Makes no
sense to save everything into the same location.

Hierarchical file systems were invented for a reason. For the same reason, I
want browsers and chat apps to ask me where to save each file I download.
Fortunately, most of these apps have an option to enable such behavior.

As a nice side effect, this prevents security issues like the one described
in the article.

------
fomine3
Some Windows programs have a similar issue (DLL preloading attack). The
vulnerability has been well known for around 10 years, but the mitigation is not
enabled by default so some apps/installers are still vulnerable.

------
jwilk
Relevant Python bug report:

[https://bugs.python.org/issue33053](https://bugs.python.org/issue33053)

------
cool-RR
Great observation about Python scripts in the downloads folder. Here's a way
to make this vulnerability even easier to exploit: Place a `sitecustomize.py`
file in there. That way, you don't even need the `-m pip`, since that file
will be imported automatically on any Python run.

------
choeger
SELinux is made to prevent this kind of issue. Or rather: To encode the
safety advice made by the author as a set of rules/constraints and ship it to
everyone.

Now the question is: Is SELinux the sawstop the author asks for?

~~~
jmnicolas
SELinux is bearable on a server, but I don't know if I could live with it on a
dev workstation though (to be fair, as a mainly Windows dev, I never made the
effort to "learn" SELinux properly).

------
axilmar
All these security issues stem from the fact that all the programs have a
common view of the system.

It's a major operating system flaw, and no solution for it is on the horizon.

------
ezekiel68
Given that python 0.9.0 was released in 1991, it's kind of astonishing (in a
good way) that issues like these take nearly 30 years to creep into our
collective consciousness.

In my view, the focus on low friction and getting new users quickly productive
with minimum fuss (shared by many scripting languages) sets the stage for
these inevitable discoveries. And I'm not just hand-waving with the vague
"these" characterization. Focus on ease-of-use creates roving best-practices
factions that wax and wane over the years, with consequences perfectly
captured in the famous xkcd "Python Environment" cartoon.[0] The fact that the
empty string interpretation in PYTHONPATH hasn't been a bigger deal much
earlier raises interesting questions. (Are bad guys not trying as hard as we
imagine? Will newbie-friendly languages always end up with these kinds of
exploitable surfaces? etc)

[0] [https://xkcd.com/1987/](https://xkcd.com/1987/)

------
da39a3ee
This isn't a solution, but I recommend always using /tmp as a Downloads folder
(i.e. symlinking ~/Downloads to /tmp under macOS) so that it is at least purged
every time you reboot.

