chances are, vital files can be accessed with the rights your application has.
When you deploy your webapp, NEVER run your web server or application container server as root. I deploy python using nginx and uWSGI where nginx runs as the "nginx" user and group, and uwsgi runs as the "uwsgi" user and group.
Since uWSGI is spawning all the python instances, if I want to be able access files on disk such as for storing and reading sessions, I must explicitly change the folder owner to the uwsgi group for the webapp to even launch.
If you take only one security precaution for your production python app, make sure none of the instantiating servers run as root and that they must be given explicit permission to access folders.
Moreover, this whole article embodies the statement NEVER, EVER, IN ANY CIRCUMSTANCES TRUST USER INPUT.
Which is why it says "at the very least". Werkzeug's secure filename function explicitly only whitelists characters in filenames.
> Moreover, this whole article embodies the statement NEVER, EVER, IN ANY CIRCUMSTANCES TRUST USER INPUT.
The problem is that often you will assume that certain APIs were written with that in mind when they were not. os.path.join for instance looks very innocent. I expect my database abstraction layer also to handle SQL injection projection for me, so I would be very confused if SQLAlchemy turned out to not escape strings passed in as placeholders.
Eh. I'm inclined to offer very little sympathy to anyone who believes that the os module in python provides any web injection protection. The os module is named "miscellaneous operating system interfaces" and anyone who got above a C in their OS course in college should know that no UNIX programmer took web injection vulnerabilities into account when designing the os interface.
I think it is fair to take a moment to consider maintenance, deployment and configuration before adding complexity to a project, but a monolithic script is usually only the answer for relatively simple questions.
Although it would prohibit some legitimate uses, you could just prohibit any usage of '../' anywhere in the path.
path = posixpath.normpath(path)
return not path.startswith(('/', '../'))
Now obviously doesn't apply if you figure out which page to serve based on a submitted GET variable, like the OP seems to suggest doing. I'd see about using PATH_INFO instead (it would also make the Urls easier to read, obscure the implementation a little and potentially be more portable) But it's reasonable, and safer, to just say "the page variable has a specific subdir/subdir/file.html structure and immediately 404 if any of those components don't match ([\w–]), rather than trying to use filesystem-path semantics to resolve to a file and trusting that'll be secure and does what you want.
Don't ever allow the client to specify a filename; use an indirect reference such as a uuid. If you must, always perform path normalization and/or only allow alphanumeric characters.
>>> import posixpath
More generally, I think we should have a reasonable expectation that any library/API that gives one access to resources that are dealt with using text (file paths, HTTP requests, SQL stuff, interfaces to a command line, etc.) should include sanitization functionality.
Wouldn't that make a nice general design guideline?
His code seems to only check for the case where it starts with ../ of course this requires knowing of an existing directory in e current folder but that is not insurmountable for an enterprising hacker.
I'm on an iPad so I cannot check right now.
Edit: Just tried it out on my mac and it works like I remember.
I think SELinux is now standard part of the Linux kernel, and if you're using Ubuntu there's AppArmor.
Freudian slip? Or something halfway between a typo and a misheard word?
Do you mean that it's evil when called with untrusted input by a trusting program or evil in general and not to be used under any or most circumstances?
The article suggests the former but your comment the latter. I would think the function is pretty safe and okay when you use it with a known input. Better than writing your own, anyway.