
My name still causes SQL errors - slyall
https://twitter.com/gintveld/status/730317679886794752
======
ergl
I know this too well. Some pages will tell me that my name or address is
'invalid' because it contains spaces or forbidden characters.

I'm sorry, but 'ñ', apostrophes and accent marks are not forbidden, and no
name should be ever 'invalid'. How can a name even be 'invalid'? Sometimes I
think too many sites assume (wrongly) that all customers will be an average
english-speaking, US-resident person.

~~~
lmm
Sites don't (or at least shouldn't) have any obligation to serve everyone.
Most people are content to render their name in plain ASCII. How much extra
are you willing to pay to be able to use those symbols? Or do you expect
everyone else to subsidize you?

~~~
atomwaffel
Certainly, most websites have no obligation to serve me as a customer, but if
they tell me that my name is "invalid", they shouldn't be surprised if I take
my custom to a competitor who isn't quite so rude.

But you seem to be missing the fact that "Gijs in 't Veld" is perfectly valid
Ascii, as would be names like "O'Leary" or "O'Reilly" that are perhaps more
common in the English-speaking world.

~~~
candu
And some websites, such as government portals to various departments and
services, _do_ have implicit (and often even explicit) obligations to serve
everyone.

I wonder how well GP would take it if, say, the IRS webpay or DMV sites (or
their equivalent wherever GP happens to live) balked at their name.

~~~
atomwaffel
Yes, government services was what I thought of when I wrote " _most_
websites". Then again, I feel like those tend to be the worst user experience
offenders (which is what this comes down to). It took me several attempts to
mangle my home address into something the US visa application form would
accept and I've seen _ö_ turned into _Ã¶_ on my own country's official
websites so many times that I can remember the mojibake.

I wonder why that is. I guess part of the reason is that government doesn't
need to invest heavily in UX because there is no threat of people going
elsewhere – when you need a passport, you need a passport, and you're not
going to order one from Amazon. In fact, there's probably a disincentive for
government to continuously improve their UX precisely because any changes to a
system that's in place and working well enough only cost them money.

Are there any counterexamples of government websites that offer good user
experience?

~~~
bzbarsky
The tax e-filing system in Massachusetts is a government website that offers a
quite good user experience. It could be a bit better (e.g. it could save your
general W-2 info from year to year on the assumption that you probably didn't
change employers), but it's fairly nice in most ways; the W-2 thing is the
only problem I've run into with it, and it has a fairly clear UI. Especially
given the complexity of the actual paper MA tax forms.

On the other hand, the MA website where you pay things like unemployment
insurance for your employees is terrible, complete with being offline outside
of normal business hours (because afaict they take it offline while they do
batch imports from physical media that companies send them). Note that this is
not as much of a "voter-facing" website, of course. ;)

------
atemerev
I wonder how everyone in the original thread comments is still falling for
this infamous "sanitize input!" trap.

If "the input" is sanitized, poor guy would be left without an apostrophe in
his name, and rightfully pissed off.

Prepared statements are the only way to do it.

~~~
MattBearman
I agree that prepared statements are much better than sanitising inputs, but
surely a sanitised version of his name would be _Gijs in \ 't Veld_ Leaving
the apostrophe in tact

~~~
Scarblac
That would be silly, then the backslash will be shown everywhere.

~~~
paulasmuth
I can't tell if you're trolling or not, but this is not how string escaping
works in SQL or any other programming language that I know.

    
    
        >> SELECT 'here is an apostrophe: \'';
        >> returns: here is an apostrophe: '
    

The backslash is not part of the string but just a hint to the compiler. The
literal string '\'' represents a one byte (if ASCII) string containing only a
single apostrophe character.

[https://en.wikipedia.org/wiki/Escape_character](https://en.wikipedia.org/wiki/Escape_character)

~~~
jameshart
There's a difference between saying that

    
    
        Gijs in \'t Veld
    

is a 'sanitized' version of

    
    
       Gijs in 't Veld
    

and saying that

    
    
       'Gijs in \'t Veld'
    

is a SQL String literal representing the string

    
    
       Gijs in 't Veld

------
pilif
Aside of the obvious SQL injection (heck - the way this is set up, it
obviously even takes multiple queries in one string - this is pure heaven from
an injection standpoint), I'm also wondering what kind of schema this is?

Creating an user as a sequence of operations? Why?

Imagine the ways this data can get corrupted? What if for some reason only one
of the queries executes? What if somebody removes one of the rows for a user
but not the other when doing support or maintenance?

~~~
Macuyiko
In a past life, I converted an intranet application for some government
department from Oracle/Coldfusion.

The whole application, every SQL query taking input, was vulnerable to SQL
injection.

For many insert operations, I'd find a pattern appearing of first inserting a
row with some random value in one of the columns (`temp_key`), then performing
a select based on this key to get the row back to know the primary key, and
then continue to update other fields in the same row, and adding other records
to other tables referencing this primary key.

Obviously, transactions was a foreign concept to the original developers. I
still remember taking a deep breath once I found a code snippet that would try
an insert again if the insert failed because the randomly generated `temp_key`
was already used before, a problem that seemed to really start bothering them
as the database grew larger...

~~~
delgaudm
I'm stretching my memory as a "full stack" dev back in the 90s but I seem to
recall that CF couldn't do transactions for its first few versions. People
were begging for <cftransaction> to become a thing. I also don't remember
being able to call an oracle stored procedure from CF in version 1 or 2.. But
I'm hazy on that for some reason I remember the ODBC driver not supporting it.
As a result I recall resorting to hacky stuff like what you describe. It
wouldn't surprise me in the least if there is government cfml code out there
from the 90's. It was a challenging time with far fewer tools.

------
83457
That reminds me of the approach taken by many contact forms for members of the
US congress. I believe the vendors have to follow certain security processes
which includes sending the user to a 404 page if certain malicious attempts
are detected. The end result in many cases is that there is a list of words
that can not be submitted successfully through their form that changes over
time and just check in certain fields based apparently on however the vendor
wants to implement the security measure. Is your name Walter? Sorry but you
sound like a sql injection attack so we will redirect you and the message you
just spent an hour on over to a 404 page.

------
stepvhen
Very related: [http://www.kalzumeus.com/2010/06/17/falsehoods-
programmers-b...](http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-
believe-about-names/)

~~~
huuu
I understand what you are saying but this has nothing to do with names but
with SQL injections, escaping and prepared statements (the lack of).

~~~
jameshart
It has everything to do with names. Among the things that programmers
sometimes believe about names, as in this case, is _that when you stick
apostrophes around them they are always a valid SQL string literal_.

~~~
biot
Inexperienced programmers naively believe that about all string input, so the
fact they believe it about names isn't particularly interesting.

------
greenleafjacob
Prepared statements are actually worse than escaping in Postgres because
prepared statements can sometimes use the wrong query plan. Refer to the
documentation:

> In some situations, the query plan produced for a prepared statement will be
> inferior to the query plan that would have been chosen if the statement had
> been submitted and executed normally. This is because when the statement is
> planned and the planner attempts to determine the optimal query plan, the
> actual values of any parameters specified in the statement are unavailable.
> PostgreSQL collects statistics on the distribution of data in the table, and
> can use constant values in a statement to make guesses about the likely
> result of executing the statement. Since this data is unavailable when
> planning prepared statements with parameters, the chosen plan might be
> suboptimal.

Sometimes this can be more than "suboptimal" but actually an entirely
different index and orders of magnitude worse performance.

------
chris_wot
This guy isn't just a tester, he's the test data.

------
icedchai
Prepared statements are hard. They've only been around for decades.

------
raubitsj
Good ole Little Bobby Tables [https://xkcd.com/327/](https://xkcd.com/327/)

~~~
wagglycocks
I feel like that joke is so widely known that it can be retired, or at least,
referenced without needing the link.

