
Bobby Tables - nikbackm
http://blog.cleancoder.com/uncle-bob/2017/12/03/BobbyTables.html
======
red_admiral
If I could downvote topics, this would get one.

We have an API for that. It's called prepared statements.

We also have a lot of other APIs that try and do away with SQL, and fail in
one of two ways. The first is people write code like
personDAO.fetchAll().toList().size() - admittedly you can make the same
mistake with SQL but I haven't noticed this quite as much. The second way is
that SQL is a very powerful language once you combine cursors, grouping,
inline views etc. and I haven't yet met an ORM libary that can come close. So
you're back to fetchAll() and do the work on the web server if you think SQL
is a scary language that you don't want to touch.

~~~
Shoothe
Couldn't agree more. Just recently I've used SQL to write a "lock and fetch"
data query in one relatively simple instruction reducing the application logic
complexity considerably.

By the way can anyone recommend a good book on advanced SQL (not a specific
DBMS)?

~~~
red_admiral
There was a HN topic on this about a year ago:
[https://news.ycombinator.com/item?id=13417326](https://news.ycombinator.com/item?id=13417326)

The problem with "advanced SQL, not a speficic DBMS" is the more advanced you
get, the less overlap there is between different dialects. MariaDB/MySQL for
example didn't support WITH until the 10.x versions (WITH RECURSIVE was added
in 10.2.2 according to the site) and it still needs an explicit alias for
SELECT ... FROM (SELECT ...) style constructions. SQLite can't do RIGHT/FULL
OUTER JOIN etc.

Karwin's SQL Antipatterns is a good book for a "second look at SQL" but I
wouldn't really call it advanced yet. The manuals for postgres and MariaDB,
while written as manuals and not tutorials, contain a lot of useful
information and a fair part of it applies to most relational databases.

Performance tuning is even more specialised. Although I've been praised as a
consultant before now for simply adding an index to something, my experience
is mostly on the MySQL side of things - I don't think it would be hard to beat
me in an "advanced postgres" quiz.

------
Areading314
Is this a joke? I've never met someone who had a hard time with safe query
parameterization.

It's wrong to say sql can be replaced with "an api", it does not have the same
flexibility, power, performance, etc. And would drastically slow down your
development. Also, what is "the api" going to use? Text files?

It's better to teach people to practice safe SQL than to require abstinence.

~~~
jgh
I feel like this is a blog post from 2003 or something, and the audience is
PHP developers who are writing inline SQL statements.

------
paulddraper
Proper SQL parameterization is a great example of where library/language
design has a lot to offer.

For example, in Java

    
    
        // IMPROPER
        PreparedStatement statement = connection.prepareStatement(
          "SELECT * FROM users where email = '" + email + '"'
        );
        statement.execute();
    

vs

    
    
        // PROPER
        PreparedStatement statement = connection.prepareStatement(
            "SELECT * FROM users WHERE email = ?"
        );
        statement.setString(1, email);
        statement.execute();
    

The first code is easier and simpler than the second code -- especially if
have lots of parameters, or if your queries start mixing arbitrary data and
variable parts of the SQL. So people are sometimes going to write the first
code.

\---

In contrast, Scala has more sophisticated facilities than Java. I contribute
to
[https://github.com/lucidsoftware/relate](https://github.com/lucidsoftware/relate)
where the example would simply be

    
    
        // GOOD
        sql"SELECT * FROM users WHERE email = $email".asSeq[User]()
    

It's just so easy that no one is ever even _tempted_ to do the wrong thing.
Use developers' laziness to your advantage, not your detriment :)

You can keep the full power of SQL (index hints, hand-tuned locking, DBMS-
specific features), without constantly fighting the propensity towards SQL
injection.

~~~
rhizome31
Django ORM is also less error-prone than the Rails example shown:

    
    
        User.objects.get(email=email_input)
    

Raw Python DB-API, on the other hand, can easily be misused, as I've see on
several occasions.

------
wrs
This is a brilliant suggestion! We should also stop using HTML to prevent HTML
injection, and stop using XML to prevent XML injection, and stop using the
stack to prevent stack overflow.

(Actually, using an API _to build_ SQL, HTML, and XML, instead of doing string
substitution, is exactly the right way to prevent ___ injection, but that's
not quite what this says.)

------
BjoernKW
> What would replace SQL? An API of course!

Turtles all the way down. At some point you have to use some sort of data
access language.

Providing an API instead of using SQL queries only works if you have a fairly
limited number of different procedures to be called and data structures to be
returned. Otherwise you'll end up with a combinatorial explosion of API
methods and input values.

If you view your SQL database as mere data storage that approach could be
valid. If however, you're working with complex relational models and data
processing providing an API is a pointless endeavour.

I've seen (legitimate, mind you) SQL queries which were more than a hundred
lines long and in turn called multiple additional embedded queries. Some of
those queries used groups and window functions alongside with normal tabular
data. Good luck with providing that as an API!

~~~
walshemj
Only a Hundred? I have seen some that where over a 1000 in BT's CSS system

------
catnaroek
I know of at least one language[0] that makes SQL injection impossible, by
simply type-checking SQL alongside with the rest of your application, but I'm
not sure Uncle Bob would like using it[1].

[0] [http://impredicative.com/ur/](http://impredicative.com/ur/)

[1] [http://blog.cleancoder.com/uncle-
bob/2017/01/11/TheDarkPath....](http://blog.cleancoder.com/uncle-
bob/2017/01/11/TheDarkPath.html)

~~~
Shoothe
For anyone interested in how it looks like:
[http://impredicative.com/ur/todomvc/todomvc_db.ur.html](http://impredicative.com/ur/todomvc/todomvc_db.ur.html)

It seems DELETE FROM requires WHERE... Good idea :)

~~~
catnaroek
Far more important IMO are the following features:

(0) The type checker ensures that database and application types are
synchronized. It's literally impossible to query inexistent tables, inexistent
fields, or assume they have a type that they in fact don't have.

(1) Tables can be private to a module, just like everything else in an ML
dialect. This allows you to establish “table invariants” that other modules
cannot possibly break. If this still sounds too abstract, here's a concrete
benefit: You can dernormalize private tables, without risking other modules
putting them in an inconsistent state. If I ever go back to programming ERP
systems, I would love to do it in Ur/Web.

~~~
Shoothe
Do you mean the type checker that runs during compiling connects to the
database? Seems interesting but on most of my projects there were several
environments so it wouldn't solve all problems. A similar, but simpler
solution for other languages would be to generate types from the database
schema before compiling.

Do you have a sample code of these table invariants at hand? I'd gladly take a
look...

~~~
catnaroek
> Do you mean the type checker that runs during compiling connects to the
> database?

Okay, it doesn't go that far. But the compiler produces the table creation
scripts. So, if your build process includes running said scripts, then
database and application types are guaranteed to be synchronized.

> Do you have a sample code of these table invariants at hand? I'd gladly take
> a look...

Imagine a general ledger that's guaranteed to never go out of balance, no
matter what third party modules do that interact with the accounting
subsystem, because the only module that can directly manipulate accounting
tables is the accounting subsystem. Guaranteed by the type checker.

I haven't implemented such a thing in Ur/Web, because implementing accounting
systems from scratch is a frigging lot of work, most of it utterly boring, and
nobody has paid me to do it. But it's feasible, and the ML module system is
the reason why.

------
skywhopper
This is a really poor article, that doesn't give reasonable advice. SQL
injection attacks are a big risk, for sure. But SQL persists because it's
incredibly useful. "Stop using SQL" is completely pointless advice.

Meanwhile, in the real world, every interface/API/protocol/whatever has the
potential for being exploited to do things that weren't intended. You may as
well say "stop using computers!"

------
whack
_" What would replace SQL? An API of course! And NOT an API that uses a
textual language. Instead, an API that uses an appropriate set of data
structures and function calls to access the necessary data."_

Isn't this similar to what frameworks and libraries such as Jooq do? They
provide data structures and function calls where you pass in the user inputs,
and the library will safely construct/execute prepared statements. However, I
still think of them as being SQL based tools. Is this what Bob is
recommending, or does he have something completely different in mind?

------
mnm1
"And if you make absolutely certain that no text that crosses that boundary
going towards the SQL engine has any SQL in it. Then you will absolutely
prevent SQL attacks."

Not to mention your app will be extremely inflexible and likely slow beyond
all hell. This is exactly what was done at my current work. The query overhead
of the ORM system is 10x slower than the actual runtime of the query. 10 times
slower. For overhead. Unnecessary overhead. For an engine that is completely
inflexible, unusable, and increases development time by 2-10x (Doctrine 2 ORM
if you're curious). Really, this is the worst advice I've ever seen on HN.
People have been creating such systems as the author describes for years and
they have all failed in comparison to SQL. SQL prepared statements, on the
other hand, are available to all languages and prevent injections. Other than
specialized systems or DBs like Linq and Datomic, the only decent, solid
relational database access systems I've ever seen in my life all relied on
SQL.

I really can't even imagine what the author is referring to at this point.
Should I build out my own Linq or Datomic like layer for every app? Should I
use an ORM with the right architecture that hopefully won't slow my app 10x
with its overhead? Should I just write a new DB from scratch? The article says
to build an API. Well how is said API supposed to access the DB? Clearly, the
author hasn't thought this out at all.

~~~
Glide
What he wants is to stir up some controversy so that an executive is convinced
that his engineers are doing things wrong and will bring him in to "fix"
things.

His part about pushing that stuff down the stack reads to me like: "read my
latest book!"

------
cdevs
Clean code and clean architecture are great books. I find sql injection from
kids fresh out of college all the time in php but to be honest I really can't
tell the difference in that rails code. At work I've build a query helper
function that takes a query string and a array of bindkeys/bindvalues and
everyone must use this, easy logging, easy security etc...

In this articles defense something SHOULD have been done instead of all the
books that came out of their years that start off teaching sql injection and
then barely mention injection near the end. It is a bit silly we still hear of
this problem and the ongoing amazon s3 open links. The solution is typically
some start up but the original software should have fixed its own problem, why
did we end up treating database queries as low level and as powerful as
letting users toss in assembly code.

~~~
pmontra
The unsafe Rails code accepts a string as argument and copies it as is into
the SQL code. The safe statement accepts a map and transforms it into SQL code
using all the safety features made available by the underlying database
driver. It's what your helper function seems to do.

A solution would be banning any string argument, maybe except the inevitable
one for fin_by_sql.

The post probably argues for databases exposing directly their inner API.
Let's suppose there is a standard. I bet that there would be still ways to
write vulnerable code.

------
RmDen
is he trolling or what? You want an API.. it already exists.. Stored Procs,
parameterized queries or ORMs which will use parameters........

~~~
Double_a_92
Just another senile moment of our beloved uncle...

------
meritt
I was considering purchasing this author's books (Clean Coder & Clean Code)
but after reading this unintelligible garbage, I'll pass.

~~~
pkrumins
Question to fellow coders:

Can anyone comment how, if at all, these books are different from classics in
the field, such as The Practice of Programming, The Elements of Programming
Style, The Pragmatic Programmer, or Complete Complete?

I can't think of anything new that another book on writing good code could
say, that hasn't been said already.

That's why I haven't looked at these books yet. It feels to me these books are
like another book about Newtonian mechanics, there just can't be anything new.

Thoughts?

~~~
meritt
Some quick research led to a number of threads that were fairly critical of
this author: "his writings are only here to promote himself, his books and his
consultancy" and "a windbag" and "self-aggrandizing".

[https://news.ycombinator.com/item?id=14664210](https://news.ycombinator.com/item?id=14664210)

[https://www.reddit.com/r/programming/comments/mpusp/9_things...](https://www.reddit.com/r/programming/comments/mpusp/9_things_i_learned_from_reading_the_clean_coder/)

[https://www.reddit.com/r/programming/comments/3q5n9q/just_fi...](https://www.reddit.com/r/programming/comments/3q5n9q/just_finished_reading_this_clean_code_robert_c/)

------
mangecoeur
> What would replace SQL? An API of course!

That literally describes all of 'NoSQL'. Everyone is adding SQL back into
their nosql systems, because it's insanely useful. Since anyone remotely
interested in databases knows this, I can only assume this is a kind of weird
joke.

------
GuB-42
Argument by ignorance.

The safe/unsafe example is obvious to me, and I don't even know Ruby. The
unsafe cases have parameters, the others have not (they just build a string).
And anyone who knows the basics of injection should know that building a
string to be evaluated is naturally unsafe and should be done very cautiously.

The author makes it as some obstacle that is impossible to overcome, like you
have to know some arcane symbols and magic. These things are basic operators.
They are similar in most languages and any professional programmer should have
a pretty good intuition on the matter.

Here is the reasoning :

User.where("email = '#{email}'") : unsafe / the variable is between quotes, it
should be a red flag

User.where("email = '%{email}'" % { email: email }) : unsafe / again, a quoted
variable, bad. The "%" operator looks like a string manipulation operator
rather than a argument.

User.where(email: email) : safe / uhm... looks like a language idiom, need to
look that up

User.where("email = ?", email) : safe / a common way of doing prepared
statements, seems OK. Unlike with the "%", the "," seems to imply that "where"
is going to process the arguments, which is good.

User.where("email = :email", email: email) : safe / another common way of
prepared statements, again, the "," implies several arguments.

What happen here is that too many people decide to switch off their brains.
The "unsafe" forms are a relic, but some guy do it that way because it is how
it used to be done, and some other guy copy because it "works", then another,
then another, etc... No one in the chain seems to question the practice, even
though safer and more convenient forms exist. Having a new API won't help,
because if people are not even bothered to learn the safe way of doing SQL,
there is no way they will use that API.

And sure, hiring inexperienced programmers, not giving them proper training,
and making sure they can't take the time to actually understand what they are
doing doesn't help...

------
perlgeek
There are APIs around SQL that are pretty good in avoiding SQL injections; for
example SQLAlchemy (a Python ORM) allows you watch pretty much any SQL query
without writing literal SQL. You can use raw/literal SQL, but it's buried
pretty deep in the docs.

Having a textual representation (with placeholders, of course) has some
advantages I wouldn't want to give up:

* learn once, use it everywhere

* it's independent of the host language

* it's a reification of a query or an action, which is very useful for logs (like slow query log), debugging and the likes

------
msangi
Wait, I thought that Uncle Bob said that programmers should just be
'professional' and stop being sloppy instead of reaching for tools that can
help them

[http://blog.cleancoder.com/uncle-
bob/2017/10/04/CodeIsNotThe...](http://blog.cleancoder.com/uncle-
bob/2017/10/04/CodeIsNotTheAnswer.html)

------
obsession
Now that we are on the topic of prepared statements, does anyone know how to
do "SELECT * FROM WHERE x IN (list of values...)" properly? As far as I know,
you cannot do it with prepared statements (as they can only be single values)
and you are forced to roll your own escaping mechanism.

~~~
tatersolid
ANSI SQL and Postgres has had an array type since forever, MSSQL has table
variables and types, other SQL databases have similar if not perfectly ANSI
standard features (well probably not MySQL)

------
shapiro92
There is only blame in this article. I kept scrolling bellow when he mentioned
API, hoping for an elaborate answer. Nothing just complaining.

------
dctoedt
Give the author (a bit of) credit for this much, though: His title seemingly
alludes to [https://xkcd.com/327/](https://xkcd.com/327/) but he never
mentions it, perhaps taking it granted that his readers will be familiar with
it. Maybe the title was designed to attract attention; if so, it worked, at
least with me.

