

Common Mistakes as Python Web Developer - alanthonyc
http://lucumr.pocoo.org/2010/12/24/common-mistakes-as-web-developer

======
davidhollander
Good advice for problems, but one should focus more on the underlying problem
mentioned:

 _chances are, vital files can be accessed with the rights your application
has._

When you deploy your webapp, NEVER run your web server or application
container server as root. I deploy python using nginx and uWSGI where nginx
runs as the "nginx" user and group, and uwsgi runs as the "uwsgi" user and
group.

Since uWSGI spawns all the Python instances, if I want the webapp to be able
to access files on disk, such as for storing and reading sessions, I must
explicitly change the folder's group to uwsgi; otherwise the webapp won't even
launch.

If you take only one security precaution for your production python app, make
sure none of the instantiating servers run as root and that they must be given
explicit permission to access folders.
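
As a rough illustration of the "never run as root" rule, an application could refuse to start when its effective user id is 0. This is only a sketch for POSIX systems; the function names `is_privileged` and `refuse_root` are made up for this example and are not part of nginx, uWSGI, or any framework.

```python
import os
import sys


def is_privileged(euid: int) -> bool:
    """Return True when the given effective user id is root (0 on POSIX)."""
    return euid == 0


def refuse_root() -> None:
    """Abort startup if the process runs as root (hypothetical helper, POSIX only)."""
    if is_privileged(os.geteuid()):
        sys.exit("refusing to start as root; run as a dedicated user instead")
```

Calling `refuse_root()` at the top of the WSGI entry point would make a misconfigured deployment fail loudly instead of running with full privileges.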

------
Locke1689
The author also makes a common mistake: blacklisting as safe string escaping.
My standard is to only accept alphabetic characters in any file-based user
input system (I find very few filesystem problems require more than this). For
example, one of my applications allows people to upload files. They never
access the file names themselves, but are allowed to input a "file name." This
file name is all alphabetic and is checked for length restrictions and encoded
in my own file naming system. The translations are maintained in the
application database. This prevents a wide variety of attacks on the
filesystem itself.
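
A minimal sketch of that whitelist approach, assuming ASCII letters only and a 64-character limit (both limits are assumptions, as are all the names below). A plain dict stands in for the application database that maps generated storage names to the user-supplied display names.

```python
import re
import uuid

MAX_NAME_LENGTH = 64  # assumed limit, not taken from the comment
_ALPHA_ONLY = re.compile(r"[A-Za-z]+")


def validate_display_name(name: str) -> str:
    """Accept only non-empty, purely alphabetic names within the length limit."""
    if len(name) > MAX_NAME_LENGTH or not _ALPHA_ONLY.fullmatch(name):
        raise ValueError("file name must be 1-64 ASCII letters")
    return name


def register_upload(name_map: dict, display_name: str) -> str:
    """Store the display name under a generated key and return that key.

    The key, never the user input, is what would be used on the filesystem.
    """
    checked = validate_display_name(display_name)
    storage_name = uuid.uuid4().hex
    name_map[storage_name] = checked
    return storage_name
```

Because the on-disk name is generated by the application, nothing the user types ever reaches a filesystem call.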

Moreover, this whole article embodies the statement NEVER, EVER, IN ANY
CIRCUMSTANCES TRUST USER INPUT.

~~~
the_mitsuhiko
> The author also makes a common mistake: blacklisting as safe string
> escaping.

Which is why it says "at the very least". Werkzeug's secure filename function
explicitly only whitelists characters in filenames.

> Moreover, this whole article embodies the statement NEVER, EVER, IN ANY
> CIRCUMSTANCES TRUST USER INPUT.

The problem is that often you will assume that certain APIs were written with
that in mind when they were not. os.path.join for instance looks very
innocent. I expect my database abstraction layer also to handle SQL injection
protection for me, so I would be very confused if SQLAlchemy turned out to not
escape strings passed in as placeholders.
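
To show what that expectation looks like in practice, here is a small sketch using the standard library's `sqlite3` module rather than SQLAlchemy (a deliberate substitution to keep the example self-contained). The table and the values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

# A value that would change the query's meaning if pasted into the SQL text.
hostile = "alice' OR '1'='1"

# Passed as a bound parameter, the value is treated purely as data.
hostile_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

legit_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", ("alice",)
).fetchall()
```

Here the hostile lookup matches nothing, while the legitimate one finds the single row.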

~~~
Locke1689
_The problem is that often you will assume that certain APIs were written with
that in mind when they were not. os.path.join for instance looks very
innocent._

Eh. I'm inclined to offer very little sympathy to anyone who believes that the
os module in python provides any web injection protection. The os module is
named "miscellaneous operating system interfaces" and anyone who got above a C
in their OS course in college should know that no UNIX programmer took web
injection vulnerabilities into account when designing the os interface.

~~~
the_mitsuhiko
Sympathy or not, these things happen.

------
settrans
Common mistakes as an experienced developer: over-complicate your system with
a complex topology of multiple, independent services with separate
maintenance, deployment and configuration concerns when a simple, monolithic
script could do.

~~~
huxley
There's a world of possibilities that exist between over-complicated systems
and monolithic scripts.

I think it is fair to take a moment to consider maintenance, deployment and
configuration before adding complexity to a project, but a monolithic script
is usually only the answer for relatively simple questions.

------
jimmybot
Hmm, startswith('/', '../') in is_secure_path doesn't check for paths like
this: './../'

Although it would prohibit some legitimate uses, you could just prohibit any
usage of '../' anywhere in the path.
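
A sketch of that stricter rule, assuming POSIX-style paths separated by `/`. The function name is hypothetical; it rejects absolute paths and any `..` component, including harmless ones such as `a/../b`.

```python
def is_strictly_relative(path: str) -> bool:
    """Return True only for relative paths with no '..' component anywhere."""
    parts = path.split("/")
    return not path.startswith("/") and ".." not in parts
```

Checking whole components rather than substrings keeps names like `notes..txt` valid.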

~~~
tav
You've overlooked the critical call in the previous line:

    import posixpath

    def is_secure_path(path):
        # Collapse '.' and '..' segments before inspecting the prefix.
        path = posixpath.normpath(path)
        return not path.startswith(('/', '../'))

The call to `normpath` normalises the path, e.g.

    >>> from posixpath import normpath
    >>> normpath('./../foo')
    '../foo'

~~~
jimmybot
Ah, you're right, thanks.

------
ggchappell
Shouldn't sanitization tools of the kind he is talking about, be in the Python
standard library? The os package is supposed to let you deal with OS-specific
things in an OS-independent way. Isn't filename sanitization one of those
things?

More generally, I think we should have a reasonable expectation that any
library/API that gives one access to resources that are dealt with using text
(file paths, HTTP requests, SQL stuff, interfaces to a command line, etc.)
should include sanitization functionality.

Wouldn't that make a nice general design guideline?

------
weaksauce
It's been a while since I tried it but doesn't this also work to get you up a
directory?

ExistingDir/../../parentRestrictedDir/passwords

His code seems to only check for the case where the path starts with ../. Of
course, this requires knowing of an existing directory in the current folder,
but that is not insurmountable for an enterprising hacker.

I'm on an iPad so I cannot check right now.

Edit: Just tried it out on my mac and it works like I remember.

~~~
tav
I've also explained this in another comment, but basically, the call to
`normpath` in his code takes care of this, e.g.

    >>> from posixpath import normpath
    >>> normpath('ExistingDir/../../parentRestrictedDir/passwords')
    '../parentRestrictedDir/passwords'

------
Ysx
Django avoids directory traversal with django.utils._os.safe_join()

[http://code.djangoproject.com/browser/django/tags/releases/1...](http://code.djangoproject.com/browser/django/tags/releases/1.2.4/django/utils/_os.py#L24)
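
The general idea behind such a helper can be sketched with the standard library alone: join, normalise, then verify that the result is still inside the base directory. This is not Django's implementation, only an illustration; `safe_join_sketch` is a made-up name, the base is assumed to be an absolute POSIX path, and symlinks are not resolved.

```python
import posixpath


def safe_join_sketch(base: str, *parts: str) -> str:
    """Join parts onto base and reject any result that escapes base."""
    base = posixpath.normpath(base)
    candidate = posixpath.normpath(posixpath.join(base, *parts))
    # Compare against "base/" so that "/srv/app/static2" does not pass
    # as being inside "/srv/app/static".
    inside = candidate.startswith(base.rstrip("/") + "/")
    if candidate != base and not inside:
        raise ValueError("joined path escapes the base directory")
    return candidate
```

Checking the final path, instead of filtering the input, covers both `..` segments and absolute components in one place.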

------
deno
Parsing paths properly is certainly a good idea. But perhaps a more important
issue is why you're not running your web application in a secure sandbox in
the first place.

I think SELinux is now a standard part of the Linux kernel[1], and if you're
using Ubuntu there's AppArmor[2].

[1] <http://en.wikipedia.org/wiki/Selinux>

[2] <https://help.ubuntu.com/10.04/serverguide/C/apparmor.html>

------
telemachos
> A few weeks ago I had a _hatred_ discussion with a bunch of Python and Open
> Source people at a local meet-up about the way Python's path joining works.
> (emphasis mine - "heated"?)

Freudian slip? Or something halfway between a typo and a misheard word?

------
macco
I have a question about the "Mixing up Data with Markup"-Problem. If you don't
have markup in your database - how do you structure your content? What is a
header, what is a paragraph?

~~~
deno
If your input is HTML then your data is HTML. The author just warns against
converting plain text, etc., to HTML before storing it, which bakes
assumptions about presentation into the data.
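
A small sketch of that separation, assuming the stored value is plain text: the text is kept exactly as entered and escaped only at render time. The variable and function names are hypothetical.

```python
import html

# Stored exactly as the user typed it: plain text, no markup decisions yet.
stored_comment = "I <3 Python & <script>alert(1)</script>"


def render_comment(text: str) -> str:
    """Escape plain text at output time and wrap it in a paragraph."""
    return "<p>" + html.escape(text) + "</p>"


rendered = render_comment(stored_comment)
```

The same stored text could later be rendered to another format without first undoing any HTML conversion.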

~~~
macco
Ah ok, that is something different. My fault. Thanks.

------
astrofinch
I recommend Unipath for dealing with the os.path.join issue:

<http://sluggo.scrapping.cc/python/unipath/>

------
bcl
os.path.join is evil. Don't use it. Yes, I recently learned this myself
(luckily not the hard way).

~~~
TomasSedovic
Well, that's a pretty broad statement.

Do you mean that it's evil when called with untrusted input by a trusting
program or evil in general and not to be used under any or most circumstances?

The article suggests the former but your comment the latter. I would think the
function is pretty safe and okay when you use it with a known input. Better
than writing your own, anyway.

~~~
bcl
Evil with untrusted input. And dangerous when you forget that passing it a
new root (an absolute path component) will wipe the previous paths.
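
For anyone who has not seen the behaviour, a short demonstration using `posixpath` (which is what `os.path` resolves to on POSIX systems). The directory and file names are invented for the example.

```python
import posixpath

upload_root = "/var/www/uploads"

# A relative component is appended as expected.
normal = posixpath.join(upload_root, "avatar.png")

# An absolute component discards everything that came before it.
hijacked = posixpath.join(upload_root, "/etc/passwd")
```

`normal` is `/var/www/uploads/avatar.png`, while `hijacked` is simply `/etc/passwd`.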

------
obiefernandez
Of course the most common mistake is not using Ruby instead :-p

