
Making Wrong Code Look Wrong (2005) - aleyan
https://www.joelonsoftware.com/2005/05/11/making-wrong-code-look-wrong/
======
lalaithion
Most modern typed languages support encoding this information in the type of
the variable, so that the compiler catches it, instead of in the variable
name. Alexis King wrote a blog post about it that reached the front page a few
days ago.
[https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/](https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/)

~~~
hota_mazi
Did you read the article?

The correct way of doing Hungarian Notation, as demonstrated in this article,
is to encode in the name of the variable what it _does_, not its type.

No compiler can know that a string variable is a "name" or a "password".

~~~
lifthrasiir
It still might be possible to have types like `UserId` or `UserCredential`,
which get parsed at the request time and never roll back to plain strings. It
is evident that they are indeed different subtypes of strings: user
identifiers can't have a space for example.

That said, I generally agree that conventional type systems are not (yet)
capable of encoding all of that information in a type. In practice I prefer a
mix of actual types and coding conventions.
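
To make the "parse once at request time" idea concrete, here is a minimal Python sketch. `UserId`, `parse` and `greet` are hypothetical names, and the no-spaces rule is the only validation assumed; a real system would have its own rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserId:
    """Hypothetical wrapper: a string already checked to be a valid user id."""
    value: str

    @classmethod
    def parse(cls, raw: str) -> "UserId":
        # Only rule assumed here: identifiers are non-empty and contain no spaces.
        if not raw or " " in raw:
            raise ValueError(f"invalid user id: {raw!r}")
        return cls(raw)


def greet(user_id: UserId) -> str:
    # Runtime check standing in for what a static checker would enforce.
    if not isinstance(user_id, UserId):
        raise TypeError("greet() expects a UserId, not a plain string")
    return f"hello, {user_id.value}"
```

After `UserId.parse(...)` succeeds at the boundary, downstream code only ever sees `UserId` values, so the check is not repeated at every use site.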

~~~
touisteur
I'm curious what capabilities the Ada or F# type systems lack to handle this.
You can create 'new' (read: incompatible) types instead of subtypes, and they
won't be assignable to each other. So

    
    
       type User_Id_Type is new String;
       type User_Credential_Type is new String;
    

Then in your request deserializer convert once from String to the correct
type, and at the use site only allow taking the correct type?

You can also add static predicates to ensure User_Id_Type doesn't contain
spaces. You can also make the type 'opaque' and it won't be possible to
convert to a String without a bespoke interface.

I'm not sure what's missing. I almost feel like it would be a great challenge
:-)

~~~
lifthrasiir
I think in this particular case they suffice, because they are clearly
different subtypes as I've mentioned.

It is much harder, if not impossible, if you have to describe a relation
between two or more values of the same subtype. For example, imagine that you
already "know" certain indices never go out of bounds of a given array but
want to encode that information in types. Sure, there exists a Rust crate [1]
that tracks the relation between indices and their originating array via
lifetimes, but it is complicated.

[1] [https://github.com/bluss/indexing](https://github.com/bluss/indexing)

~~~
touisteur
Mmmh thanks for the link and the paper about generative types. Seems very
interesting!

I'm wondering if using type predicates or type invariants (+ proof for static
verification, otherwise the check will mostly be at runtime) would help here.

Look up
[https://blog.adacore.com/spark-2014-rationale-type-predicates](https://blog.adacore.com/spark-2014-rationale-type-predicates)
if interested.

------
stinos
Just looking at the safe/unsafe string example I wonder if another usable
approach would be to ditch plain strings and instead use thin wrappers like
UnsafeString and SafeString and a bunch of operations on them. But not
assignment, for instance.

There wouldn't be much need to care about whether the code looks wrong or not
(well, with regard to safe vs. unsafe strings :), because the compiler would
check it for you (or the runtime, I guess, depending on which language it gets
implemented in). I think all the examples Joel writes (the ones 'xxx is always
ok' and 'xxx is not ok') are covered by it. It does mean you need a Write which
only takes SafeString, I guess, and it probably doesn't mean you can't still do
something wrong, but it should be much harder.
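
A rough sketch of what such wrappers could look like in Python, where the checks happen at runtime. `UnsafeString`, `SafeString`, `encode` and `write` are hypothetical names chosen to mirror Joel's example; `html.escape` is the standard-library escaping function.

```python
import html


class UnsafeString:
    """Hypothetical wrapper for text that came straight from the user."""
    def __init__(self, raw: str):
        self.raw = raw

    def encode(self) -> "SafeString":
        # html.escape turns <, >, & and quotes into HTML entities.
        return SafeString(html.escape(self.raw))


class SafeString:
    """Hypothetical wrapper for text that is safe to emit as HTML."""
    def __init__(self, text: str):
        self.text = text

    def __add__(self, other: "SafeString") -> "SafeString":
        # Concatenation is only defined between two SafeStrings.
        if not isinstance(other, SafeString):
            return NotImplemented
        return SafeString(self.text + other.text)


def write(out: list, s: SafeString) -> None:
    # The one output routine refuses anything that is not a SafeString.
    if not isinstance(s, SafeString):
        raise TypeError("write() only accepts SafeString")
    out.append(s.text)
```

Nothing here stops someone from calling `SafeString(raw_input)` directly, which is the "you can still do something wrong" part; it just makes the mistake explicit and easy to grep for.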

~~~
_bxg1
I had a similar thought: Rust has "tuple structs", which are effectively
structs without field names. You could make single-field wrappers:

    
    
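      struct Relative(i32);
      struct Absolute(i32);
    
      // Only accepts absolute coordinates.
      fn use_absolute(coord: Absolute) -> i32 {
          coord.0
      }
    
      fn main() {
          let foo = Relative(12);
          let bar = Absolute(12);
    
          use_absolute(bar);    // OK
          // use_absolute(foo); // Err! expected `Absolute`, found `Relative`
      }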
    

and effectively get this idea enforced at the type level. Of course support
for this pattern would vary by language, and there would probably be some
small overhead, but it could be worth it.

~~~
navaati
You’ll be glad to learn that there is absolutely no overhead in execution time
or memory consumed (and a minuscule one in compilation time).

------
dang
For the curious:

2009
[https://news.ycombinator.com/item?id=472477](https://news.ycombinator.com/item?id=472477)

2011
[https://news.ycombinator.com/item?id=2912218](https://news.ycombinator.com/item?id=2912218)

Small:

2015
[https://news.ycombinator.com/item?id=8987366](https://news.ycombinator.com/item?id=8987366)

2019
[https://news.ycombinator.com/item?id=20586837](https://news.ycombinator.com/item?id=20586837)

------
esotericn
In a previous job, I wrote an autotrader which by now I believe will have
handled hundreds of millions of pounds in untyped Python 2. Certainly tens of
millions.

There were a number of approaches I used. This is a few years back now, so
apologies if anything is unclear.

You can fake a type system to some extent in py2 by using methods and copious
'isinstance' checking.

For example, Money<EUR> + Money<GBP> can be made illegal by overloading
addition operators. Strings which require some sort of meaning can be given
classes and functions which use them can use isinstance and friends to perform
runtime type checks.
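
Not the original code, just a minimal sketch of the idea in modern Python rather than Python 2. The `Money` class and its fields are hypothetical, and amounts are assumed to be stored in integer minor units.

```python
class Money:
    """Hypothetical amount tagged with a currency code such as "EUR"."""
    def __init__(self, amount: int, currency: str):
        self.amount = amount      # minor units (e.g. cents), to avoid float rounding
        self.currency = currency

    def __add__(self, other):
        # Anything that is not Money is rejected by Python itself (TypeError).
        if not isinstance(other, Money):
            return NotImplemented
        # Mixing currencies is the mistake this class exists to catch.
        if other.currency != self.currency:
            raise TypeError(f"cannot add {self.currency} to {other.currency}")
        return Money(self.amount + other.amount, self.currency)
```

With this, `Money(1, "EUR") + Money(1, "GBP")` fails loudly at the point of the mistake instead of silently producing a number.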

Another is indeed the use of 'raw_whatever' and 'whatever' in identifiers,
what I now know to be "apps hungarian" notation. 'raw_whatever' would have a
similar definition to Joel's unsafe user input. It might come from an API of
some sort that you don't truly "trust".

Similarly, that sort of variable naming approach applied to function
parameters. Passing a 'dog_id' to a 'cat_id' function _may_, in certain cases,
be possible if both were flat strings (and not objects that could be
isinstance checked), but I favoured a variable naming approach (along with
calling functions by keyword argument) that would result in this at least
being visible after problems came up (e.g. you'd see myfn(dog_id=cat_id) and
feel an urge to hold your nose).
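
In Python 3 (this was not available in Python 2) the keyword-argument convention can be made mandatory with keyword-only parameters. A small sketch, with `feed_cat` as a hypothetical function:

```python
def feed_cat(*, cat_id: str) -> str:
    # The bare * makes cat_id keyword-only, so every call site must name it.
    return f"feeding cat {cat_id}"


dog_id = "d-17"

# Still legal, because both ids are plain strings, but it now looks wrong:
message = feed_cat(cat_id=dog_id)
```

A positional call such as `feed_cat(dog_id)` raises `TypeError`, so the mismatch can no longer hide behind argument order.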

There were tons of these sorts of things all over the codebase, tests, etc,
and the system outperformed the previous ones by a significant margin. My
understanding is that it still hasn't suffered any significant losses; only
some minor API issues that were outside of our control.

Super fun project. Nowadays I'd just use a typed language for it and interface
with the py2 stuff via an API. Or at least make use of mypy. But that
autotrader was what the company needed at the time.

Details are in my bio if anyone has further interest.

------
pwdisswordfish2
Another way to solve the escaping problem is by some kind of interpolation
mechanism that takes care of escaping on its own: like JSX or template
literals in JavaScript (although you have to remember to tag the latter), or
prepared statements in SQL. Why fix a coding convention when you can fix the
language?
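
For the SQL case, a minimal sketch using Python's built-in `sqlite3` module; the table and the hostile input are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

# Hostile input: would break out of a hand-built SQL string literal.
name = "Robert'); DROP TABLE users;--"

# The ? placeholder sends the value separately from the SQL text,
# so no manual escaping is involved.
conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

rows = conn.execute("SELECT name FROM users").fetchall()
```

The string is stored verbatim and the table survives, with no naming convention needed at the call site.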

------
BrissyCoder
This will sound simplistic but just avoid using C++. You'd have to pay me
twice my current salary to go back to that language (or C).

~~~
nnq
What are you coding in right now? Haskell? F#? OCaml/ReasonML? ...maybe Go? Or
Rust?

...because in _almost all other languages_ the issues mentioned by OP are
valid! You can use "exceptions for business logic" and have invisible goto-s
in your Python or C# or Java or Scala code just fine. I don't think the advice
is language dependent at all, it applies to any language!

Sure, with a "sufficiently advanced type system" you can have the compiler
catch these kinds of errors and have wrong state unrepresentable. But in
practice you either don't have that "sufficiently advanced type system", or
you know that using it to prevent these errors will take a ton of extra work,
or end up with something so complicated that you'd spend 80% of your brainpower
wrangling abstract type algebra in your head instead of spending that
brainpower on solving the actual domain problem, and have every new developer
spend weeks before being productive in your overengineered system... variants
of stuff like "Apps Hungarian" are often a lightweight and practical solution
in any language.

Also, if operator overloading is what freaks you out, try writing code heavy
on matrix/vector/tensor algorithms without it... it will end up looking so
ugly that you'll drown in logic bugs hidden in plain sight simply because the
code looks so different from the nice math it came from that your brain is not
powerful enough to diff it correctly anymore... the zero cost operator
overloading abstraction in C++ can also be very powerful at _preventing_
algorithm bugs if used in the right domain - for "solid abstraction" with sane
origins in mathematics it _works fine!_
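
A toy illustration of the readability point, in Python rather than C++ (so it says nothing about the zero-cost aspect). `Vec2` and the two step functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec2:
    x: float
    y: float

    def __add__(self, other: "Vec2") -> "Vec2":
        return Vec2(self.x + other.x, self.y + other.y)

    def __mul__(self, k: float) -> "Vec2":
        return Vec2(self.x * k, self.y * k)


def step_overloaded(p: Vec2, v: Vec2, dt: float) -> Vec2:
    # Reads like the formula p + v * dt.
    return p + v * dt


def step_manual(p: Vec2, v: Vec2, dt: float) -> Vec2:
    # Same computation spelled out per component.
    return Vec2(p.x + v.x * dt, p.y + v.y * dt)
```

With two components the manual version is still readable; with matrices or tensors the per-component form grows quickly, and that is where typos start hiding.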

~~~
dragonwriter
> variants of stuff like "Apps Hungarian" are often a lightweight and
> practical solution in any language.

Except in an untyped language, the type system usually presents a better
solution with no more effort than “Apps Hungarian”, and for many cases even
without static typing you can enforce rules rather than do advisory
annotations with similar effort (e.g., for the specific unsafe-data case
addressed in the article, Ruby's taint system).

> Also, if operator overloading is what freaks you out, try writing code
> heavy on matrix/vector/tensor algorithms without it...

While overloading may be ideal for some subset of that, and tolerable for a
larger subset, I think what that really calls for is custom operator
definition, as supported in Haskell, Scala, Raku and some others, more than
overloading.

------
stared
I used to hate code linters. Now, when collaborating with others, I set an
aggressive one.

They make wrong code look wrong, literally.

Of course, it does not catch all wrong code, but it catches at least some of
the most glaring examples that would otherwise need manual inspection.

------
b15h0p
The checker framework
([https://checkerframework.org/](https://checkerframework.org/)) can make the
compiler understand stuff like this without introducing additional types
in Java.

I have never tried it myself, maybe someone with experience can chime in?

~~~
tschiller
I was a PhD student in the group that works on the Checker Framework.

It's actually pretty easy to use it to automatically enforce Hungarian
notation. I have a blog post/sample implementation here:
[https://toddschiller.com/java-hungarian-notation-checker.html](https://toddschiller.com/java-hungarian-notation-checker.html)

------
kwhitefoot
Most people commenting here seem to be missing Joel's point that the Hungarian
notation makes it possible to read the code without having to continually
search elsewhere for information about what it means. I'm pretty sure that he
would be all in favour of declaring sub-types in languages that support it but
an awful lot don't or don't make it convenient.

------
emilfihlman
This was a really good read! Highly recommended for everyone. I'll highly
likely incorporate it into my own code.

------
apricot
Nice. It's the most lucid explanation I've seen of what Hungarian notation
really is, what it can do, and how different it is from sticking "ul" in front
of every unsigned long variable.

------
andy9775
WRT exceptions, I'm curious what his thoughts are on the Go way of handling
them (just return an error value, don't throw). Clearly this article was
written pre-Go.

~~~
andreareina
Error values have a long history. Plenty of C stdlib functions return a
status/error code, with the result being provided via a pointer the function
was passed. Common Lisp lets you return multiple values, which can be used to
signal success/failure (it's how retrieving nil from a hash map is
disambiguated from not finding the key in the map) or include related data,
e.g. the fractional part of a truncated number.
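
The hash-map case can be imitated in Python by returning a pair. This is only a sketch loosely modelled on how Common Lisp's `gethash` returns a value plus a found flag; `lookup` is a hypothetical helper.

```python
def lookup(table: dict, key):
    """Return (value, found) so a stored None differs from a missing key."""
    if key in table:
        return table[key], True
    return None, False


settings = {"proxy": None}

value, found = lookup(settings, "proxy")        # key exists, value is None
missing, was_found = lookup(settings, "port")   # key is absent
```

The second element plays the role of the error value: the caller has to unpack it, so the failure path is visible at the call site rather than hidden in a throw.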

