

A guide to preventing SQL injection - fogus
http://bobby-tables.com/

======
tptacek
Both the guide and the strip are inaccurate. Parameterized SQL statements are
also vulnerable to injection: you can't parameterize identifiers, only values.
So any dynamic column references, dynamic sort orders, or dynamic limit
statements remain injectable.

To see why this matters, consider every data table view ever implemented in
any web app: table columns are selectable with checkboxes, you can click the
columns to change sort order, and the data is paginated.

You should obviously be using parameterized queries. They are clearly more
secure than concatenated SQL queries. But they aren't a magic totem against
injection, and we have much better resources for developers than XKCD. Start
with (yes really) Microsoft:

<http://msdn.microsoft.com/en-us/library/ms161953.aspx>

~~~
axod
Do people really use variables based on raw userdata to select table/columns?
Seems like an incredibly rare case, and bad design. I've never seen/used it.

Parameterized queries _are_ a magic totem against injection, unless you mix
them with concatenation of unclean userdata. Which would be silly.

~~~
tptacek
Of course they do. Sort columns send "ASC" and "DESC", pagination limits and
offsets are sent in offset= parameters, etc. How else do you think people do
it, other than the most obvious way?

Don't make the problem even worse than it is by trying to downplay it just
because _you_ know what you're doing.

~~~
axod
OK you got me on those 2 cases. For sort order:

$sort_order=="ASC"?"ASC":"DESC"

For offsets and limits convert/parse it to a number first and ensure it's
within sane bounds.

You're right though. Parameterization isn't a magic bullet that allows you to
forget about all other unclean data _if_ you're using other unclean data in an
SQL query.

Surely _we_ all know this though? :/

~~~
tptacek
My advice is, pass page= as your pagination and pass 1/0 in for ASC or DESC.
If you have selectable columns, number them in the parameters, don't name
them.

Point being, when you compose the query, you should never be inserting text
from parameters; you should be converting parameters into known-good values.

------
brown9-2
I think it might be a bit more helpful to explain to people _why_ or _how_
parameterized queries are better than "doing it yourself" for this sort of
thing.

~~~
patio11
I'll take a stab at it.

If you think you are good enough to write code resistant to SQL attacks, you
are wrong. If you think the best coder in your company is good enough to write
code resistant to SQL attacks, you are wrong. Now realize that it is neither
you nor the best coder in your company that you have to worry about, but the
performance of the worst guy employed by the team in Bangalore on his first
day back at work after his mother's funeral.

All code by necessity has a weakest link. The best you can do is make sure it
is the library you are using.

(Incidentally, this also describes manual memory allocation.)

~~~
tptacek
Sigh.

You're still "doing it yourself" even if you use parameterized statements.

My take on this article has changed in the past hour; before I thought it was
cute but inaccurate, but its state is transitioning to "actively evil".

The reality is that companies that have a lot of SQL but never have SQLI vulns
do _all of the following_ :

* Use an ORM like Hibernate (or AR or Django) for "front-line" database queries

* Standardize on parameterized statements for the complex stuff

* Design and implement a "house style" for query builders and "modular" SQL statements

* Factor as much as possible into stored procedures in the database

* Run databases in least-privilege mode, so that code that only needs read access to a few tables can "revoke" the unneeded privileges

* Sweep their codebases for SQL statements and audit them with a team signoff

By "do all of the following", I mean "A-L-L of the following". The ones that
don't are the ones that tell us "there's no way you're going to find SQLI in
your audit; you'd get fired for having concatenated SQL here", and then lose
their entire database in the first week.

I don't mean to sound obnoxious. I just hate the idea of people walking around
thinking their code is safe because they switched to ? instead of "'" + x +
"'".

~~~
statictype
Edit: errr, nevermind. I read your posts further down.

Well to be fair, the article isn't claiming that parameterized queries will
make all your DB transactions secure, just that they will prevent injection
attacks. Which is true isn't it?

Unless you have dynamic code generation in your sql, parameterized queries
make injection attacks impossible, don't they?

~~~
tptacek
No, statictype. It isn't true.

------
WilliamLP
So why can't I use mysql_real_escape_string? Let's assume that I have to use
PHP and there's no way in hell this app is ever going to need to be ported to
a different db. (Or wrap it in a "quote" method.)

Edit: Instead of a downmod, how about an answer?

~~~
tetha
I think, escaping the input from the user in some way is a way which might
work, but basically does not address the core issue of sql-injection attacks,
pretty much like putting transparent tape over a crack in a window. It sort of
works against most known cracks, but there might (and in security-terms, that
means: there will) be a crack which cannot be taped well.

But, analogies are no good, let's get to the core of this.

Executing an sql-statement basically means that the server parses, optimizes
and executes the statement. Whenever you type something like

    
    
      "Select name from dogs where tag=" + mysql_real_escape_string($user_tag)
    

you assume: the server will parse this query as a select-statement, with the
field 'name' selected, whereas the table to examine is 'dogs' and there is
some predicate on the tag, and some properly secured value goes into the
predicate.

If an injection attack is successful, the assumption that the parsed SQL-
statement and the SQL-statement in the code match fails, because a successful
SQL-injection infiltrates the actual structure of an SQL-statement and
modifies it pretty much arbitrarily.

Now, it is true that it might be possible to escape all known possible user
inputs (which might be possible, as SQL should not be turing complete, but I
am not sure about the newest standards), so the user input cannot infiltrate
the structure of the SQL-Query, because in the user input, no control
characters are working. But this is, as I said, mostly like saying: Well,
there is a hedgedog on your chair, so use a pillow before you sit down or it
might hurt both of you. It does not address the core of the problem.

The core of the problem is: you are doing two things at once. You are
transmitting the actual (static) query you want to execute -- the structure of
the query -- and the actual values in a single go. The real way --
parametrized queries -- first transmit the structure and after this, they
transmit the value. Thus, it is guaranteed that the parsed structure in the
database will be exactly the structure you specified in your code, and thus,
an attacker cannot modify the structure of your query by inserting values into
the structure, because at this point, it is clear that the values are values,
and not actual structure code. And this is the reason, why parametrized
queries are better than assuming that a certain function (or multiple
functions, who knows) can handle ALL possible injection attacks any hacker (or
cracker :/) genius might ever think of. Transferring the structure first and
the values separately is like picking the hedgehog up, sitting down and having
a happy hedgehog sitting on your lap ;)

Note that I am aware of the problem that this just guarantees that the
_structure_ of the query is transmitted properly. If the _structure_ of your
query allows arbitrary _behaviour_ of your query via appropiate values ('if
param1 == "admin" then eval param2' sort of rings), then attacks are possible
again. However, these attacks are different from the sql structure injections,
because these attacks do not aim to modify the structure of your query, but
they rather aim to abuse the behaviour of your query. If you like to think in
analogies, a structure injection would be like patching the linux kerlen
somehow to get a backdoor going, while a value attack against a query with
dynamic behaviour is more like replacing the 0-page with arbitray code doing
nasty things.

Also note that not using parametrized queries with dynamic behaviour opens up
_two_ attack vectors: Attacking the structure of your query (bobby tables) and
abusing the behaviour of your queries.

And also note, that securing a single query (or, all queries in your
application) is also no substitute for actually securing your database server,
because overall, your data is the goal of an attacker. The data is the money,
not the application. Thus, your application is just another attack vector
against your actual data base and your actual data, and your application
consists of more or less attack vectors (which might be injection attacks,
behaviour attacks, attacks via session hijacking, ...). Thus, you might use
good queries, but if every user has weak passwords and also maximum privileges
in the database, an attacker will ignore the application and probably try to
attack the database directly.

Sorry for the wall of text, I just wanted to answer this and did not have that
much time to make it shorter, HTH, Tetha

------
amalcon
Also remember that parameterized queries are likely to be faster. They play
better with the optimizer cache.

------
umjames
Hate to sound pedantic, but the Java example is a little off.
java.sql.PreparedStatements have their parameters start at index 1, not 0. So
the first "?" is parameter 1 and the second is parameter 2.

It's one of the few places in Java where indexing starts at 1 instead of 0.

~~~
tetha
... which probably is the reason he got it wrong. Consistency is a blessing.

------
TallGuyShort
This is similiar to the advice to NOT write your own encryption code: Odds
are, somebody already wrote it 10 times better than you could.

------
raquo
Why did they not include PHP?

------
_ck_
Someday every xkcd strip will have its own website.

    
    
      (that wouldn't be so bad actually)

~~~
lucasoman
So if you want comic-strip security, just go to this site. Perfect.

~~~
lucasoman
Why the downvotes? It's a superficial glance at SQL injection fixes with
little explanation or education. Just like you can't put a novel in a comic-
strip, you can't explain SQL injection security with a single block of code.

------
ilyak
I was expecting some kind of PHP bullshit about replacing quotes.

The article turned out to be actually good. Cheers.

