
Hello, I’m Mr. Null. My Name Makes Me Invisible to Computers - BerislavLopac
http://www.wired.com/2015/11/null/
======
carlio
10 years ago I worked for Jagex on RuneScape, an RPG. There are rules for that
game about conduct and so on - no swearing, no scamming, blah blah. If you see
someone doing something "bad" you can report them and the customer support
team reviews the event and can allocate "black marks" on players who infringe.

One particular user accumulated thousands of black marks. We thought they were
doing something incredibly bad. A black mark gets you banned for a short
period of time; this player was banned for a millennium. But they kept sending
messages saying that it was unjust, they did nothing wrong.

Turns out their playername was 'null' and half of other players' offences were
attributed to them due to null in Java being cast to 'null' during
concatenation...
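
A rough sketch of how this kind of misattribution can happen, written in Python for brevity. Java's string concatenation renders a null reference as the text "null"; Python would render None as "None" instead, so the hypothetical helper `java_concat` below mimics the Java behaviour. The log format, the player names and the tallying function are all assumptions for illustration, not Jagex's actual code.

```python
from collections import Counter


def java_concat(prefix, value):
    """Mimic Java's `prefix + value`, where a null reference becomes "null"."""
    return prefix + ("null" if value is None else str(value))


def tally_black_marks(reports):
    """Count marks per player by parsing the name back out of each log line."""
    marks = Counter()
    for offender in reports:
        # Once the value is flattened to text, a failed lookup (None) and a
        # player literally named "null" can no longer be told apart.
        line = java_concat("offender=", offender)
        marks[line.split("=", 1)[1]] += 1
    return marks


# Hypothetical reports: None stands for an offender lookup that failed.
reports = ["alice", None, None, "null", None]
marks = tally_black_marks(reports)
print(marks["null"])  # 4: one real report plus three failed lookups
```

Keeping the offender as a real reference (or an explicit "unknown" value) until the last possible moment avoids the collision.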

~~~
ben_jones
What was it like working with Andrew Gower? He strikes me as a technical
genius with some of the stuff Jagex was doing with Java circa early 2000's
[1].

[1]:[https://www.youtube.com/watch?v=yrVUegwSKlY](https://www.youtube.com/watch?v=yrVUegwSKlY)

~~~
rhaps0dy
I couldn't quite understand the video. Why did they clone Maya and maintain
feature parity and use the same formats?

~~~
hollander
They didn't clone Maya, but the features it has. Being able to build
everything, including all 3D material, in their own environment, made it easy
to link everything together. And when something changes, that requires an
update to the 3D model, they don't have to go back to Maya to edit the model
there.

------
modernerd
Tony Hoare, null's creator, regrets its invention:

“I call it my billion-dollar mistake. It was the invention of the null
reference in 1965. At that time, I was designing the first comprehensive type
system for references in an object oriented language (ALGOL W). My goal was to
ensure that all use of references should be absolutely safe, with checking
performed automatically by the compiler. But I couldn't resist the temptation
to put in a null reference, simply because it was so easy to implement. This
has led to innumerable errors, vulnerabilities, and system crashes, which have
probably caused a billion dollars of pain and damage in the last forty years.”

[https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retra...](https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retractions)

[https://www.infoq.com/presentations/Null-References-The-
Bill...](https://www.infoq.com/presentations/Null-References-The-Billion-
Dollar-Mistake-Tony-Hoare)

~~~
brobinson
Including "null" is one of the most unfortunate things about Go. I'm glad to
see that other modern languages (Swift, Rust, and so on) are avoiding it.

~~~
biztos
I too find it annoying in Go, though I'm not sure what the default value of a
reference in a struct would be otherwise.

However, I do see the value of NULL in a database context even though it makes
database interfaces harder -- especially in Go, where the standard marshaling
paradigm means anything NULLable has to be a reference and thus have a nil-
check every time it's used.

The conceptual match is so awkward that when I write anything database-related
for Go, if I have the option then at the same time I make everything in the
database NOT NULL; even though that screws with the database.

Ah, NULL. When I think about the pain it causes, balanced against its utility,
I sometimes wish I'd never heard of it.

And I'm sure learned people said the same thing about Zero, once upon a time.

~~~
TylerE
What's wrong with using an Option/Maybe type? Can't Go do that?

~~~
brobinson
Go effectively has Option types for database query results:
[https://golang.org/pkg/database/sql/#NullBool](https://golang.org/pkg/database/sql/#NullBool)

~~~
unscaled
This is not a generic option type, but rather a tri-state bool, or an
Option<bool>. Go has no user-defined generics, so you can't have a generic Option type.
It does have built-in "magical" generics, namely arrays, slices, maps and
channels, but no option/maybes. Language-level option types are not unheard of
(C#, Swift and Kotlin all have a notion of this sort, although they all
support proper user-defined generics as well).

~~~
masklinn
> Language-level option types are not unheard of (C#, Swift and Kotlin all
> have a notion of this sort)

Swift's Optional is a library type:
[https://github.com/apple/swift/blob/master/stdlib/public/cor...](https://github.com/apple/swift/blob/master/stdlib/public/core/Optional.swift#L122)

Though the compiler is aware of it and it does have language-level support
(nil, !, ?, if let)

------
eyelidlessness
Whatever you may think of `null` (ahem
[https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retra...](https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retractions)),
it is not a string type. It is horrifying to me that all of the problems
described in the article are cases of allowing a user to execute arbitrary
code by string input, and the article fails to point this out.

~~~
mcherm
It is also security policies put together by overly cautious people. They say
"I can't trust our developers to write properly secure code, and it won't hurt
anyway so I'll just institute a blanket policy to block everything that might
appear in an SQL injection attack. Let's see... we'll block quotes, ";", the
word "null"... maybe a few more things."

~~~
eyelidlessness
This is also horrifying, as it indicates that they're taking wildly
destructive precautions but still putting user input into SQL queries.

~~~
kedean
I don't think the suggestion was that the policy is created because the data
is going into sql queries. The policies are often in place because the people
in charge of the policy flat out don't trust the developers. Say the policy
comes out of the CTO's office, they have two options: 1) liberal password
policy, risk a crappy developer screwing up, or 2) conservative policy, not
have to worry at night.

~~~
Buttons840
Slight change of topic here. Your comment made me realize that once a system
gets too complicated the only change that can be made is to add more
complexity.

In this case, it would be better to ensure all SQL is properly escaped, but
because that isn't a trivial task, instead you end up adding another layer of
complexity.

------
aurelian15
Two weeks ago, when I applied for a new passport (in Germany), I was asked to
make sure that my personal information on a printed out form was correct.

Immediately the line saying "Artist's/Religious name: null" caught my eyes. I
told the officer: "Well, the info about my artist's name is wrong. I don't
have one."

As I was expecting beforehand, I was told: "That's exactly what 'null' means
here." To which I replied: "And what if my artist's name was 'null'?". To
which I got no satisfactory response. I guess these problems are only apparent
to programmers...

So, if anyone wants to cause confusion, now you know how.

~~~
city41
Reminds me of a tweet I saw a while back where someone was told their four
digit pin couldn't be a year. To which they replied "aren't they all years?"

~~~
jaytaylor
[https://twitter.com/mralancooper/status/652532677979959296](https://twitter.com/mralancooper/status/652532677979959296)

and my "solution":
[https://twitter.com/jtaylor/status/772607233829969920](https://twitter.com/jtaylor/status/772607233829969920)

~~~
BerislavLopac
Well, technically, any PIN starting with one or more zeroes is not a year.

------
BinaryIdiot
> For those of you unwise in the ways of programming, the problem is that
> “null” is one of those famously “reserved” text strings in many programming
> languages.

It's not a text string but there are some systems that present it as such and
those are the issue. Lazy programming. I remember back working for a financial
institution and was told to add some validation to a service that handled
adding investment plans to someone's account. The third party service we had
to use was using SOAP and NULL values were represented by, you guessed it,
simply putting the NULL value in the XML instead of simply omitting the item
itself.

No amount of protest would make the third party service change the way it
worked. "It's always worked like that and we haven't had an issue" and the "if
we changed it thousands of our customers would be negatively affected! We
can't do that!" were the primary excuses.

~~~
colejohnson66
Pretty much any language that evaluates this as true:

    
    
        "null" == null

~~~
eeZi
Python doesn't, and any language that evaluates this as true is crazy. Not
even JavaScript does this.

~~~
eyelidlessness
Not even PHP (it took me about 10 minutes to remember how to PHP before I
could test it to be sure).

~~~
ht85
What's the difference between yourself and a Wordpress plugin?

None, doing things in PHP makes you really insecure.

------
wglb
One more item for the list: [https://www.kalzumeus.com/2010/06/17/falsehoods-
programmers-...](https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-
believe-about-names/)

~~~
ryandrake
I love the "Falsehoods programmers believe about X" articles. They are
basically ready-made failure test cases. Whenever a junior co-worker would say
something like "I know! I'll just write our own date and time handling code."
I'd point him at [1], let him stew on that for an hour or so, and then, sure
enough, go back and lo and behold he's convinced to just use a ready-made
third party library instead.

1: [http://infiniteundo.com/post/25326999628/falsehoods-
programm...](http://infiniteundo.com/post/25326999628/falsehoods-programmers-
believe-about-time)

~~~
ryankupyn
Poor time handling also has big consequences. In Nevada, for example,
overtime calculations use a rolling 24-hour period, and as a result many
workers are incorrectly paid each year because their employer's system doesn't
account for DST changes.

------
chiph
We had a customer named Echo. He couldn't make a payment because our credit
card processor looked for common Unix shell commands and filtered them out. I
think we ended up creating a discount code just for him so that he'd get the
item for free to make up for the annoyance (zero amount due meant no call to
the card company).

~~~
david-given
...filtering out... common Unix commands? By the credit card processor?

I would ask 'why', but every time my brain goes there it comes up with
terrible, terrible scenarios that I just don't want to think about.

~~~
masklinn
Do you mean you're not supposed to have a system(3) call with a command built
by concatenating user-provided strings as part of your validation process?

~~~
david-given
DON'T WANT TO THINK ABOUT.

------
brian-armstrong
The comments on Wired shed some light on what might be happening: legacy
systems used the string 'NULL' to represent null and now the legacy data has
intermingled and can't be cleaned up. Gross!

This guy is basically a walking fuzzer on forms and input systems. He could
find vulnerabilities without actually doing anything illegal ;)

~~~
mikeash
Lots of Irish people are walking SQL injection testers, and don't even know
it.

~~~
darkhorn
And people from The Hague.

~~~
mrsuprawsm
as in "'s Gravenhage"?

~~~
darkhorn
Yes.
[https://en.m.wiktionary.org/wiki/%27s-Gravenhage](https://en.m.wiktionary.org/wiki/%27s-Gravenhage)

------
beached_whale
When it comes to email validation: wow, why? The real check is whether the
recipient gets the email, and you will have to do that either way. If you
still want to do more, validate the hostname against DNS, as that can be quick
and, compared to a typist, is not a performance issue (don't forget IDN). If
you still want to do more, you are only really left with size checks and a few
control characters.

[https://github.com/beached/validate_email](https://github.com/beached/validate_email)
is about what I think is the most one should validate without sending an email
you have to send anyways

------
coding123
For those thinking that only a complete idiot would consume the string null vs
the value null when processing form input, you're pretty much correct. Most
languages, even scripting ones, don't rely on the value of the string, but of
the value of the reference. And while there are probably many web-front ends
that happily work with "null" as a field, that's likely not the issue here.

The issue most likely to strike is all of the assumptions that your
front-end team doesn't care about: downstream systems. This typically comes from
software that is packaged for explicit purposes. In Healthcare we have things
called MPIs (master patient index) and as a rule we typically have data
scrubbers because in the MPI world we HAVE to work with bad data, self-entered
data. We don't have the luxury of relying on primary keys for everything.
Names like "Baby Smith" often come in as the yet-to-be-named baby of the
Smith family. I won't go into HealthIT 101 here, but suffice to say, we must
do a lot of data scrubbing and/or annotation.

What compounds the issue even more is that most of these software components,
be it healthcare or other industries, are typically resold and therefore have
customers that keep piling on more and more use-cases. There are solutions to
the problems, but many of them involve breaking human-end workflows (how do
you tell a customer that you'll now require every patient to remember their
patient ID, for example, before you admit them?). Those workflow changes tend
to be non-starters, and from there the chain of never-ending bad coding
compromises enters the scene.

Now I'm not defending the practice of treating "baby" or "null" or any other
string as a special case, but being part of some of these workflow issues for
years has led me to believe that not only are coders around the world going to
continue these practices, there's not much they can do about them until more
and more of our software gets more built-in intelligence. Maybe in the future
we can actually detect it better when the name "baby" actually is their first
name! (And actually with all of the deep learning going on lately, I'd imagine
someone is building something like this now!)

All I'm saying is that it takes a broader understanding of the issue here to
fully get why it's not just a front-end coder's error, but rather a deeper,
as-of-now hard to solve problem. Think about when you start integrating with
outside systems that don't follow your organization's understanding of a name.
What about when you get into analytics? When you connect to external systems
that perform mailings? Do YOU know the assumptions and use-cases that went
into the platforms you're about to hop on board with? Probably not all of
them.

------
woliveirajr
This is kind of finding that "little Bobby tables" exists in real life :-)

[1] [https://xkcd.com/327/](https://xkcd.com/327/)

~~~
ams6110
Mr. Null's second cousin.

------
harry8
wired, don't ever change. I'm not running ad blockers but wired blocked me
claiming I am. Made me remember nothing in wired is worth reading and I should
do something more useful or enjoyable.

~~~
mnw21cam
Agreed. No ad blockers here, but the web site claims there are.

~~~
laurent123456
Ironically, I have an adblocker but Wired doesn't complain about it, and lets
me view the content just fine.

------
DanBC
The other version of this is the stack exchange question.

An employee, whose last name is Null, kills our employee lookup:
[https://news.ycombinator.com/item?id=3900224](https://news.ycombinator.com/item?id=3900224)

We have an employee whose last name is Null. He kills our employee lookup
(2012):
[https://news.ycombinator.com/item?id=6140631](https://news.ycombinator.com/item?id=6140631)

------
Normal_gaussian
Type coercion and a misguided approach to removing character counts is a pain.

It is embarrassing how little of the code I have seen uses type unions either
explicitly or implicitly. How often crucial checks and balances are removed
for example or test code. How often copy and paste is used over abstraction.

This is an embarrassment to all of us, and I am not sure that it is absent
from my work.

------
plasticchris
Wired must listen to NPR [http://www.npr.org/2016/04/02/472716929/bluff-the-
listener](http://www.npr.org/2016/04/02/472716929/bluff-the-listener)

Edit: or the other way around if you read the dates...

------
NamTaf
My friend posted a screenshot a couple of days ago of a system rejecting name
input values that were considered rude or offensive. His last name? 'Wang'.

He had to lie about his last name to progress the form.

~~~
krylon
Isn't Wang kind of a common name in... somewhere? China, I guess? (The founder
of Wang computers was of Chinese descent, IIRC...)

~~~
NamTaf
Wang is an extremely common Chinese surname, yes. It's a romanisation of a
couple of names which also sometimes become Wong and Ong.

[https://en.wikipedia.org/wiki/Wang_(surname)](https://en.wikipedia.org/wiki/Wang_\(surname\))

------
chmaynard
This is the funniest article I have ever read on HN. Christopher Null, you are
a wonderful humorist and you may be destined for immortality (at least in some
circles).

------
guessmyname
Isn't a name in a database and/or variable just a string?

How can it become the data type Null without a literal casting?

Is this just affecting scripting languages?

~~~
eddieh
I think you underestimate how much stuff relies on string concatenation or
string interpolation to build SQL statements that are also case-insensitive by
nature. Yes, people should probably use an ORM and yes they should properly
quote input, but so much code exists that doesn't. Even a strongly typed host
language doesn't preclude stringly typed SQL.

~~~
elygre
No, people shouldn't quote input. They should _bind_ input.

~~~
eddieh
Please explain! I assume you mean bind to a type, but how does that help when
you build a SQL statement? Surely whatever type you use has a 'toString' (or
language equivalent) that is explicitly or implicitly used with string
concatenation or string interpolation.

~~~
0x0
No, you put in placeholders like ? in the query string, and supply replacement
parameters out-of-band separately. See for example the bindValue api in the
PHP DBAL API: [http://docs.doctrine-project.org/projects/doctrine-
dbal/en/l...](http://docs.doctrine-project.org/projects/doctrine-
dbal/en/latest/reference/data-retrieval-and-manipulation.html)

This way, possibly-user-supplied values are never mixed with the query string,
and you don't have to worry about quoting.
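
A minimal sketch of the same idea using Python's standard `sqlite3` module rather than the PHP API linked above. The table and column names are made up for illustration. The `?` placeholders stay in the SQL text and the values are passed separately, so a surname like "Null" or "O'Brien" is stored as ordinary data and is never confused with SQL NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, last_name TEXT)")

# Hypothetical data: three awkward but real strings, plus a genuinely
# missing value (None), which sqlite3 stores as SQL NULL.
names = ["Null", "O'Brien", "'s-Gravenhage", None]
conn.executemany(
    "INSERT INTO people (last_name) VALUES (?)",
    [(name,) for name in names],
)


def find_ids(last_name):
    """Look up rows by surname; the value is bound, never spliced into the SQL."""
    rows = conn.execute(
        "SELECT id FROM people WHERE last_name = ?", (last_name,)
    )
    return [row[0] for row in rows]


null_name_ids = find_ids("Null")      # the person whose surname is "Null"
obrien_ids = find_ids("O'Brien")      # the apostrophe needs no escaping
missing_ids = [
    row[0]
    for row in conn.execute("SELECT id FROM people WHERE last_name IS NULL")
]
print(null_name_ids, obrien_ids, missing_ids)  # [1] [2] [4]
```

The string "Null" and the missing value end up in different rows and are matched by different predicates, which is the separation that string concatenation loses.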

~~~
eddieh
Haha, I didn't know that was what that was called. It has been a few years
since I did database driven development.

------
pcmoore
For those blocked by ad blockers: take the title of the article, paste it into
Google and then view the cached page.

------
koytch
Judging from the comments so far, the best name for a privacy conscious person
seems to be Null O'Bash.

------
m0nty
I once had to (try to) explain to a colleague why he couldn't have a Windows
username corresponding to his preferred initials - "PRN" (he settled for
"PN"). I imagine someone with the initials "CON" would have a similar problem.
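
For reference, Windows reserves a set of legacy device names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9) for file and directory names, which is presumably what got in the way here. A small sketch of a check for such collisions; the function name is made up, and the exact rules for usernames may differ from those for file names.

```python
# Reserved device names as documented for Windows file naming.
RESERVED_DEVICE_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def collides_with_device_name(name):
    """Return True if `name` (ignoring case and any extension) is reserved."""
    stem = name.split(".", 1)[0].strip().upper()
    return stem in RESERVED_DEVICE_NAMES


print(collides_with_device_name("PRN"))  # True
print(collides_with_device_name("PN"))   # False
```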

------
aerovistae
Any site that registers his last name as a problem is vulnerable to SQL
injection, no?

Doesn't make sense to me that he says most big companies have a problem with
his last name; they generally do not have this vulnerability.

~~~
Hondor
Big companies probably have more pieces of software that increase the chance
of one of them having trouble with null, or they're overly cautious because
they know they have that risk.

------
rcthompson
I feel like this is the real life incarnation of the classic sci-fi trope
"does not compute".

------
helthanatos
I had a math teacher named Mr. Null.

------
martinko
Meh, in what language does ('Null'==Null) evaluate to true?

~~~
dagw
It doesn't have to be at the language level. It's not unreasonable to assume
that someone, somewhere in the mists of time decided that the string "NULL"
was a reasonable choice for representing missing data in some input data and
things kind of snowballed from there.
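
A small sketch of how that snowballing can look, assuming a hypothetical CSV export in which an older system wrote the text NULL for missing surnames. Everything here (the data, the loader names) is invented for illustration.

```python
import csv
import io

# Hypothetical export: row 2 is a missing value written out as "NULL" by a
# legacy system, row 3 is an empty field, row 4 is a real Mr. Null.
RAW = "id,last_name\n1,Smith\n2,NULL\n3,\n4,Null\n"


def load_with_sentinel(text):
    """Legacy-style loader: any spelling of "null" is taken to mean 'missing'."""
    result = {}
    for row in csv.DictReader(io.StringIO(text)):
        name = row["last_name"]
        result[row["id"]] = None if name.lower() == "null" else name
    return result


def load_with_empty_as_missing(text):
    """Alternative loader: only an empty field means 'missing'."""
    result = {}
    for row in csv.DictReader(io.StringIO(text)):
        result[row["id"]] = row["last_name"] or None
    return result


legacy = load_with_sentinel(RAW)
strict = load_with_empty_as_missing(RAW)
print(legacy["4"], strict["4"])  # None Null
print(legacy["2"], strict["2"])  # None NULL
```

Neither loader gets both rows 2 and 4 right: once missing values have been written out as text, the file itself no longer says which "null" is a person.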

~~~
gtf21
I've seen this in code I've reviewed before, where people will check for `==
"empty"` instead of just using types like `null` which were designed for this
(in javascript).

~~~
kqr
A practise that may come from some shell scripting languages where `[ $x == ""
]` is an invalid command because empty strings aren't strings.

~~~
_kst_
That may be true in older shells, but in modern shells [ $x == "" ] is valid,
and "" is a single token whose value is the empty string. But I've seen old
code that does things like

    
    
        if [ x$x == "x" ] ; then ...
    

apparently because it was necessary at one time.

~~~
kqr
It depends on what you mean by "older" shells. More modern shells tend to
support this properly, but the baseline for most shell scripts is /bin/sh
which is not guaranteed to be modern – it is often a new release of Sh.

