
“(null)” is a bad idea - cperciva
http://www.daemonology.net/blog/2015-08-13-null-is-a-bad-idea.html
======
tbrownaw
_anti-pattern known sometimes as "defensive programming": If something goes
wrong, pretend you didn't notice and try to keep going anyway_

No. Defensive programming means, be predictable even in the face of nonsense.
Return an error code instead of segfaulting, define reasonable non-error
behavior for thing that don't _have_ to be errors, etc. Sweeping things under
the rug just means they'll come back to bite you later, which isn't defensive
at all.

 _and when faced by a choice between making the presence of a bug immediately
obvious or trying to hide the bug and potentially exposing serious security
vulnerabilities... well, I know which I would prefer_

This over a feature that makes "printf debugging" more usable? Is he smoking
something?

If the security of your code relies on a parameter not being null, you'd be a
fucking idiot to let "printf() should segfault" be the way you catch that.
Check it explicitly; not only is it safer but it makes broken callers easier
to debug.

~~~
delan
It can be difficult to let go of printf(3) debugging, but using a debugger
instead can seriously improve your development workflow.

~~~
jeremiep
Not for multi-threaded or networked code where breaking in the debugger
changes the program's behaviour and causes timeouts. I still prefer printf in
these cases.

~~~
nitrogen
As a Ruby and C developer, I wish there were more tools like Java's
_jvisualvm_ for realtime visualization of running systems without the overhead
and forced synchronization of a debugger.

The next best thing IMO is detecting crashes and printing a stack trace, which
reminds me I've been meaning to release some code I wrote that does so for
C/glibc programs.

~~~
jeremiep
I completely agree about jvisualvm, I'm still blown away by the quality of the
Java ecosystem. I've done a lot of server-side work on the JVM and it really
blows away every single piece of tooling for C/C++.

While I'm no fan of the Java language, you get all these benefits for other
JVM-hosted languages such as Clojure and Scala and that is fantastic. You
basically get the entire ecosystem for free.

------
__david__
Having "%s" print "(null)" is perfectly fine for debugging and actually makes
your life a lot easier since seeing "(null)" in your output is much quicker
than popping into the debugger trying to figure out what line caused a
segfault. (Yes, you can do it with a ternery but writing rigorous debug code
is never on anyone's list).

Having "%s" print "(null)" is absolutely useless in the meat of your string
manipulation code. It's pretty much guaranteed to be a bug and could easily
mask a problem (as described in the article).

So I think the real problem here is that printf() is used for both debugging
and for non-debugging string manipulation (sprintf() and friends). It's trying
to serve two masters and it can never get both cases right. Maybe we should
make "%s" not check for null and add a null-able "%S" for debugging.

------
millstone
I often use printf for debugging. An unexpectedly NULL string is exactly the
sort of thing I want it to help me debug!

~~~
cperciva
Crashing works just as well as printing "(null)" for that purpose. Possibly
even better, since you'll have a core dump. ;-)

~~~
millstone
Sometimes it does, but often it does not:

    
    
        printf("name: %s %s", first, last);
    

If it crashes, there's no easy way to know which one was NULL!

~~~
cperciva
Tell your debugger to open the core dump and print the two pointers. ;-)

~~~
SomeCallMeTim
Just try that when debugging an app on an embedded system, mobile, or a
console, where you have access to the console but no core dump.

Sorry, but I agree with all the folks who say printf (and similar) is
categorically more useful when it prints "(null)". I've had _many_ obscure
crashes on systems that were precisely because a logging printf in a seldom-
tested piece of code had a parameter that sometimes was _correctly_ null would
crash it out. If the debugger isn't attached and we just have a crash dump we
can't read (happens even on platforms where you're SUPPOSED to be able to read
them), then we have to attach the debugger and hope that we can reproduce the
obscure state that caused the crash.

Anything that makes debugging statements more robust and less crash-prone is a
_good thing._ End of story. As others have said, if you want something to
verify that a pointer is non-null, use an assert().

~~~
cperciva
_As others have said, if you want something to verify that a pointer is non-
null, use an assert()._

I said that, too. But sometimes developers don't think to put in every
possible assert. Having the C library explode violently when it detects a bug
in their code _even when they forgot to include an assert which checks for
that bug_ will help to find bugs under such circumstances.

~~~
kazinator
In the environment under question, you do not need an assert. You just have to
dereference the pointer. Unless the pointer is displaced past the unmapped
page zero (i.e you're accessing str[4096] and above, and not touching str[0]
through str[4095] on a system with 4K pages) you will get a fault resulting in
a fatal signal: SIGSEGV. Whereas an assertion will get you a SIGABRT, but the
result is the same. The assertion will be earlier, that's all, before you try
to dereference the pointer. Probably not much earlier.

Environments that do not have the hardware-assisted detection for null
dereferences can still be targetted by a compiler which generates code which
does the check on pointer dereference. That can be included in your debug
builds, or even left in production builds if the performance impact is
acceptable. I seem remember that some compilers for the Intel 8088 MS-DOS
environment had a code generation option for trapping nulls.

------
kazinator
This is nonsense. There is no reason why printf has to be the sacrificial
function that blows up on null. Those same "misguided" operating systems run
on hardware that traps null dereferences via an unmapped page. A predictable,
tidy crash is just around the corner if that same is used in the program. On
the other hand, if that printf is the _only_ place the pointer is used, then
there is no security hole. It's just undefined behavior on ISO paper, and you
don't agree with how it is locally defined by the implementation.

How do we know it's a security hole? Maybe some structure has a "char * " name
field, and null indicates "anonymous/has no name". It's an ISO C conformance
bug to try to print that with %s, but that's a portability issue. If that's
the only problem in the program (everywhere else it checks for null), there is
no real issue.

POSIX and C libraries are not entirely about allowing null pointers or not.
For instance gettimeofday lets you pass a null pointer for the "struct
timezone * " second argument, which indicates that you aren't interested in
specifying or retrieving it. On the other hand, a null argument is undefined
behavior in getgroups, even if the array size is specified as zero (meaning,
return me the size of the array actually needed and I will call you again soon
with such an array)! A non-null pointer must be passed as a POSIX-conforming
eggcorn. According to the logic of this article, if I pass null I'm committing
a security hole, and an OS which allows getgroups(0, NULL) is misguidedly
sweeping it under the rug.

Null pointers are not some "bogey man". I built a Lisp dialect in whose
implementation a C null pointer represents the nil symbol (empty list, false,
...). It prints as "nil", satsifies the symbolp function and so fort. Lexical
variables are initialized to nil if not given an initial expression. Some C
library printing a null pointer as (null) seems like nothing by comparison.

Be that all as it may, you cannot actually _fix_ this perceived issue by
changing from (null) to crash, cold turkey. Unknown numbers of nonportable
programs out there (otherwise well-tested programs) which rely on this will
just crash.

I just did a "grep -r -F '(null)' /var/log" on a system here and found a few
hits. Most of those are probably via the %s conversion, and all that code
would have to be fixed if printf were to assert this. And that's just
instances confined to logging, not all output everywhere.

~~~
rtpg
>Maybe some structure has a "char * " name field, and null indicates
"anonymous/has no name". It's an ISO C conformance bug to try to print that
with %s, but that's a portability issue. If that's the only problem in the
program (everywhere else it checks for null), there is no real issue.

I'll take 'uses of Option/Maybe' for 200 Alex

Some might say it's overkill, but if you can have a char* be null, then for
most intents and purposes you have two states to deal with on any usage of
that value.

Just because such a design can exist doesn't mean we should hobble our tools
to support it.

~~~
kazinator
It could simplify things elsewhere.

In Lisp, CAR and CDR accessors take a cons cell or NIL, which isn't a cons
cell, and has no such fields.

It wasn't always like that; according to the HOPL paper on evolution of Lisp,
InterLisp introduced this nicety and it was copied by others.

These functions are complicated with that extra state, but programs are
simplified.

A null "char *" indicating "this object has no name" is better than some
special string value doing the same, because it's different from every
possible string value.

~~~
rtpg
I'm not saying a special string value, but a wrapper object with an
"existence" bit. You can then assume, if the existence bit is flipped right,
that the char* "exists".

This is functionally equivalent, but forces existence checks when they're
needed. And by removing "null" from your vocabulary, an entire class of errors
disappears.

With C this is a bit trickier to ensure. But I think the design pattern can
still be useful. And it doesn't make code more complicated, but rather it
makes the complicated aspects explicit . Instead of testing that name == null,
you test that name.exists is False, and if not you use name.value. And you
make null a banned word in your codebase.

The result is that you _never_ have to ask "is this char* possibly null?". The
answer is always "no". So you remove superfluous checks

I think this pattern works much better when you have stronger type checking,
though...

------
jeffdavis
"This is an example of an anti-pattern known sometimes as "defensive
programming": If something goes wrong, pretend you didn't notice and try to
keep going anyway."

That's why SQL NULL is also a bad idea.

~~~
wvenable
SQL NULLs propagate; you can't get an answer from a NULL unless you explicitly
plan for it.

~~~
nitrogen
It can be a bit counterintuitive that "([NULL] < x)" and "NOT ([NULL] < x)"
both return NULL in SQL.

    
    
        MariaDB [(none)]> select null as a union all
            select 1 as a;
        +------+
        | a    |
        +------+
        | NULL |
        |    1 |
        +------+
        2 rows in set (0.01 sec)
        
        MariaDB [(none)]> select * from (select null as a
            union all select 1 as a) T where a >= 1;
        +------+
        | a    |
        +------+
        |    1 |
        +------+
        1 row in set (0.00 sec)
        
        MariaDB [(none)]> select * from (select null as a
            union all select 1 as a) T where NOT a >= 1;
        Empty set (0.00 sec)

------
pjtr
"If you accidentally pass a null pointer as the argument for a ‘%s’
conversion, the GNU C Library prints it as ‘(null)’. We think this is more
useful than crashing. But it’s not good practice to pass a null argument
intentionally."

[https://www.gnu.org/software/libc/manual/html_node/Other-
Out...](https://www.gnu.org/software/libc/manual/html_node/Other-Output-
Conversions.html)

It's documented.

~~~
cperciva
There are more C libraries than just GNU C...

~~~
kazinator
If there weren't, we wouldn't need a library clause in the ISO C standard to
know what is portable among them. That doesn't add up to an argument that
libraries shouldn't have documented extensions in areas left undefined by the
standard.

------
kazinator
From BSD errno manpage:

    
    
         14	EFAULT Bad address.  The system	detected an invalid address in
    	       attempting to use an argument of a call.
    

Is that an example of a misguided operating system? The kernel should kill the
offending process instead of sweeping the bug under the rug with an errno code
that might not be examined.

Those who live in glass houses ...

~~~
the_why_of_y
I have not checked which syscall or libc function could return that, but it's
quite possible that some applicable standard like C or POSIX or SUS mandates
such a return value.

It is better to follow the standard in such cases, so that portable code
written against the standard runs nicely.

However (claims the OP) in the case of printf, the %s null pointer case is
explicitly defined to be undefined behavior and hence the implementation
should handle this either implicitly (by crashing with null pointer
dereference) or explicitly by calling abort(), as that will teach programmers
not to rely on superficially benign implementations of undefined behavior.

~~~
kazinator
> _mandates such a return value._

And ISO and IEEE standards can't be "misguided", just competing operating
systems that are more successful.

> _explicitly defined to be undefined behavior_

Contradiction.

Undefined behavior is not defined, explicitly or otherwise. There can be
explicit wording that some _requirement is not being made_.

When ISO C explicitly states that there is an absence of a requirement, that
does not carry any more or less weight then when behavior is undefined by
omission of any wording of a requirement. All UB is equal.

> _not to rely on superficially benign implementations of undefined behavior._

An excellent example of beneficially fixing undefined behavior is to implement
strict left to right ordering on all operations and their side effects.

A language or language dialect in which evaluation order is defined is safer
and better for engineering.

~~~
the_why_of_y
> And ISO and IEEE standards can't be "misguided"

That's not a claim that I would ever make. But the right time to bring up
objections is during the drafting of the standard.

If you aim to implement a standard you should have a very good reason before
you implement some interface differently than specified. I would argue that
these return values are not a good enough reason. gets(3) on the other hand
would hardly be missed...

> Undefined behavior is not defined, explicitly or otherwise. There can be
> explicit wording that some requirement is not being made.

In C, undefined behavior has a quite specific meaning: invoking undefined
behavior is an error in the program and the implementation is at liberty to do
anything it pleases.

 _" undefined behavior: behavior upon use of a nonportable or erroneous
program construct or of erroneous data, for which this International Standard
imposes no requirements."_

Your example of evaluation order is not undefined behavior, but unspecified
behavior: there is a set of possible orderings and the implementation may pick
any one of them.

~~~
kazinator
_In C, undefined behavior has a quite specific meaning: invoking undefined
behavior is an error in the program and the implementation is at liberty to do
anything it pleases._

That doesn't match my understanding, which is that undefined behavior is about
erroneous constructs or data, or _non-portable_ constructs.

Oh look, you quoted it. Thanks!

Some undefined behaviors are important sources for useful, documented
extensions.

------
ikeboy
So is "is a bad idea" replacing "considered harmful"?

~~~
cperciva
Huh. Somehow the words "considered harmful" never even occurred to me while I
was writing that.

------
gpvos
"on some misguided operating systems" — on which well-thinking systems _does_
printf crash in this case?

------
linkydinkandyou
I think it's a Bad Idea in C/C++, but it may not be a Bad Idea in some untyped
interpreted languages, or in a special "checked build / debug" C++
environment.

------
vortico
In the context of where this would be a bad idea (e.g. flight control, rescue
rover, etc), you would not use `printf()`, or perhaps any standard output at
all. If anything, you will use lower level `write()`, where EFAULT or
something would be raised on NULL, and/or static buffers, where NULL is
impossible. `printf()` is really only good for developer-friendly programming,
where absolute rigor is not the focus.

~~~
cperciva
_`printf()` is really only good for developer-friendly programming, where
absolute rigor is not the focus._

printf gets used all the time in systems code. I count 1221 occurrences of
printf (and friends) in BIND 9 alone.

More to the point, "(null)" was just an example of a debugging-hostile pro-
vulnerability attempt to silently "fix" bad data.

~~~
jeremiep
BIND 9 doesn't have the same requirements as flight control software. Nobody
will die if BIND segfaults, at least I hope not.

The software running in your car, plane, spacecraft and the likes have very
strict regulations and you will find it radically different from systems code
found in a linux distribution, even if written in the same language!

See [http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf](http://lars-
lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf) for more details. I wouldn't want
to write code like this as it sounds very unproductive, but their reliability
and astronomically small number of bugs speaks for itself.

~~~
setpatchaddress
BIND is important internet infrastructure. We'd all be better off if it did
follow the JPL standard.

~~~
jeremiep
Important yes but BIND can also afford having crashed nodes without the
internet coming down to its knees. I'd be surprised if such a node came down
and caused damages in the hundred of millions of dollars in the process. This
is the distinction I wanted to make.

I can't remember the last time my DNS didn't work. The internet can afford to
be 'good enough', life critical software can't.

