
A curated list of falsehoods programmers believe - BerislavLopac
https://github.com/kdeldycke/awesome-falsehood
======
jmnicolas
> My system will never have to deal with names from China

I work in a small French town where Chinese people were quite rare until a few
years ago. Their passports have a translation of their names in Latin
alphabet, same for Russians.

However one day one of my users told me "hey I can't find some customers in
your app, but I'm 100% sure I recorded them previously. Funnily they're all
Chinese".

It turns out that to avoid querying the database too much I was starting to
look for customer names once the user had entered at least 3 chars. Most of
the Chinese names were only 2 chars like "Xi", "Wu" etc.

I clearly remember thinking at the time of writing the original code that
nobody had only a 2 letter name :)
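
A minimal sketch of one way around this, assuming an in-memory list of names and a hypothetical `search_customers` helper (neither comes from the original app): queries shorter than the threshold fall back to exact matching instead of being ignored, so the lookup stays cheap without hiding two-letter names.

```python
MIN_PREFIX_LEN = 3  # assumed threshold, matching the "3 chars" rule described above


def search_customers(names, query):
    """Return matching names; short queries match whole names only."""
    q = query.strip().casefold()
    if not q:
        return []
    if len(q) < MIN_PREFIX_LEN:
        # Short query: exact match only. In a real database this would be
        # an indexed equality lookup, so it stays cheap but still finds "Wu".
        return [n for n in names if n.casefold() == q]
    # Long enough: ordinary prefix search.
    return [n for n in names if n.casefold().startswith(q)]


customers = ["Xi", "Wu", "Wulf", "Dupont", "Jo"]
print(search_customers(customers, "Wu"))   # exact match for a short query
print(search_customers(customers, "Dup"))  # prefix match
```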

~~~
silicon2401
This happens with western names as well. Jo, Bo, Malcolm X.

~~~
pletsch
Wasn't he just known as X though? His legal name would have been Islamic and
his birth name was Little.

~~~
0x4a42
If you are Malcolm X and you create, let's say, a Facebook account, then your
last name is X.

~~~
lisper
Harry S Truman's middle name was the letter S.

~~~
dathanb82
Ditto Ulysses S Grant (though he wasn’t born with that name).

------
Enginerrrd
I read through the entire list of falsehoods about addresses and I think I've
got a few more broken assumptions that weren't listed:

* _A property will have an address_

(I live in a very rural area of CA where the roads have forked several times
for many miles to get to a property and they are all unnamed private roads.
People often just put down the APN and the nearest town (which might be
several miles away) as their address. I deal with this one all the time. I
routinely have to find their house for the first time. Good luck using GPS
here, if you can't read a map, you're S.O.L. [And no, even trying to confirm
that you are where you think you are with GPS on your phone is gonna be a
challenge. For one, you're miles from cell reception so if you didn't save the
map to your phone, you're screwed. For two, under heavy tree and cloud cover,
you may not even get a GPS fix on a phone.])

* _A recipient needs a name and address_

My mother lived for a time in a _very_ rural area nearby. She rode a BMW
motorcycle and someone she met only briefly once successfully sent her a
letter by simply drawing a BMW symbol as the name and address and affixing the
nearest town.

* _A person has a street address_

Homeless and rural people regularly fail this one.

------
mellosouls
Unfortunately - especially when dealing with non-technical ideas - some of
these "falsehoods" veer into opinion territory.

It's an interesting resource but perhaps the title of it is a little
egoistic/presumptuous in that regard.

~~~
unethical_ban
You just inspired an idea from me regarding HN comments. I think Slashdot
still has one of the most elegant commenting systems around, even if the
quality of the comments (due to the site's ownership) is worse.

I would love to tag HN comments with certain attributes and be able to filter
them out. For example, this comment and the parent could be tagged "meta-
discussion". Then those who want to discuss the content of the article could
filter them out quickly.

/. tags are pretty accurate - "funny" "informative" "insightful" "trolling" -
though trolling is a delete-level offense here.

Likewise, metamoderation (peer review, or moderation of tagging) was a
brilliant idea.

~~~
PaulKeeble
I have been thinking about this for a while with Reddit. If users could mark
why they think something is worth upvoting then everyone else could just
filter to find the insightful or informative comments and get rid of the funny
ones, or vice versa depending on what they were looking for at the time.
Distilling down the comments to stuff you are much more likely to be
interested in could improve the site quite a bit, it works on Steam fairly
well for games even with a somewhat free tag system but I can't help but think
that perhaps not many people would bother.

The simplification to up and down hurts discussion on Reddit (and probably on
hacker news too) because it promotes widely comprehended and accepted content.
This is why on Reddit pictures do so well unless they are banned, they are
very quick to review and choose to up or down vote. However a 10k document
will be reviewed by a lot less people, it could fit exceptionally well with
the content and get 100% upvotes but be completely buried by all the pictures
because a lot less people had time to view it and upvote it.

There is a deeper challenge here with how comments and upvoting and content
filtering via users is done. It isn't just the lack of information from tags
but also once a document is reviewed will an upvote occur. Much of Reddit
shows users will widely upvote on the basis of the title of say news without
ever reading it. The longer the article is the less people will return to
Reddit to upvote it. Both HN and Reddit promote popularity over fit because
low numbers of upvotes on content equates to low engagement and that isn't
valuable to them, but it might be the singularly most important piece of
content for that audience; it's just long and badly titled.

There are a bunch of unsolved issues that Reddit and HN have surfaced around
the model for content and comment voting and where these two sites differ is
mostly only in the rules to try and mitigate the problem with the chosen
model. The problem is only more complicated models can improve it and the
moment you do that you potentially lose customers, but at the same time you
potentially don't want those customers either.

That is my unfinished thought chain on this.

~~~
Enginerrrd
> The problem is only more complicated models can improve it and the moment you
> do that you potentially lose customers, but at the same time you potentially
> don't want those customers either.

Therein lies the key paradox. Statistically, content quality usually has the
shape of the lognormal distribution, with community size on the horizontal
axis and quality on the vertical. Content below a certain margin of popularity
is usually unpopular for a reason--the ideas of the community are not very
good! When content quality is high, the popularity of a community rises
because the ideas are good! After a critical threshold though quality goes
down. Once you've reached that threshold, number of users is inversely
proportional to quality. I really like HN's vetting process for voting: you have
to have spent enough time on the site, and commented in a way that gained
sufficient approval from the existing community in order to get the right to
vote in that community. Keeping barriers to entry high is critical to this,
but marketing people can only see $/user.

Reddit has become increasingly intolerable for me except for a few niche
subreddits.

------
whatshisface
From the units of measurement falsehood list[0]:

> _Heterogenous units are of no practical use. (Radar beam height formula uses
> a constant expressed in nautical miles per foot)_

Constants in formulas can be expressed in any unit system you want. What the
author means to say is that one formula they saw in one textbook used nautical
miles per foot. The "radar beam height formula" is not specific to any
particular unit system.

[0] [https://www.stevemoser.org/posts/dev/falsehoods-
programmers-...](https://www.stevemoser.org/posts/dev/falsehoods-programmers-
believe-about-systems-of-measurement.html)

------
turbinerneiter
idk

> The term "domain-specific language" has meaning.

It means "is a computer language specialized to a particular application
domain".

The fact that it is hard to draw the line between DSL and general purpose,
that DSLs grow too large, ... does not invalidate the meaning of the concept.

Or:

> It is meaningful to talk about the speed of a programming language.

It is? When I have to perform a given computation within a given time on a
given hardware, the same algorithm can be fast enough in one language and too
slow in another. Except if you want to nitpick and see _language_
independently of _implementation_, which is a valid point, but has little
practical meaning.

~~~
AnimalMuppet
Well, Haskell (say) can _never_ be as fast as C++, because of immutability.
That means that it has to make a new place to store the new value, which means
an allocation. That's not dependent on implementation.

(Except, I suppose, you theoretically could implement C++ so that it _also_
allocated every variable. Still, I think the point is valid: Haskell, by its
nature, _has_ to do things that C++ doesn't, so Haskell is going to have a
very hard time being as fast as C++.)

~~~
pas
Immutability doesn't mean no one can implement a Haskell extension that allows
safely describing and composing operations that use mutability on the inside.

Of course on current hardware with current compiler Haskell programs tend to
be slower, and also tend to be more correct compared to C/C++ (maybe even
compared to let's say C++20).

------
raverbashing
I am always appalled when systems (built in the US) have difficulties with
apostrophes in the names. I mean, have you missed all the Irish names that
exist in the US?

It would be "fair enough" forgetting about accents when computing resources
were limited, but to not deal with apostrophes? Really?

(I mean, ok, this story is not new, just look at the history of the
[https://www.reddit.com/r/AskHistorians/comments/1fu633/when_...](https://www.reddit.com/r/AskHistorians/comments/1fu633/when_and_why_did_english_orthography_stop_using_%C3%BE/)
)

~~~
bloak
Apostrophes in street names is a bit of a cause célèbre in the UK.

Also, should we treat ' and ’ as equivalent?

~~~
Akronymus
Can't forget about ` and ’
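
One possible answer, sketched under assumptions: keep the name exactly as entered, and compare on a separate key in which apostrophe look-alikes are folded to a plain `'`. The `name_key` helper and the set of look-alike characters are illustrative choices, not a standard list.

```python
import unicodedata

# Characters commonly typed in place of an apostrophe; this set is an
# illustrative assumption, not an exhaustive or standard list.
APOSTROPHE_LIKE = {
    "\u2019",  # right single quotation mark
    "\u2018",  # left single quotation mark
    "\u0060",  # grave accent (backtick)
    "\u00b4",  # acute accent
    "\u02bc",  # modifier letter apostrophe
}
_TABLE = {ord(ch): "'" for ch in APOSTROPHE_LIKE}


def name_key(name):
    """Build a comparison key; the stored name itself is left untouched."""
    normalized = unicodedata.normalize("NFC", name)
    return normalized.translate(_TABLE).casefold()


print(name_key("O\u2019Brien") == name_key("O'Brien"))
print(name_key("O`Brien") == name_key("o'brien"))
```

Whether folding is appropriate depends on the use: it helps search and duplicate detection, but the original spelling should still be what gets displayed and printed.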

------
haolez
This doesn't seem curated. It just seems like a list (a long, bloated one
at that).

~~~
matsemann
Most github list projects seem to suffer from this. For instance the infamous
"awesome list" [0]. Select a category. Most of them contain several hundred
items. As if this is anything better than searching google for the term.

[0]:
[https://github.com/sindresorhus/awesome](https://github.com/sindresorhus/awesome)

~~~
tartoran
Even if they were all in one place it would be a good overview or an easy
place to look things up, but the list is not even close to complete.

------
omginternets
I really enjoy these, but where can I find an explanation of what each
falsehood means (in particular: the network one)?

I mean, now that I know I’m wrong, how do I become right?

~~~
matsemann
Yup, I don't like some of the lists that just list falsehoods without
providing what one's supposed to do about it.

------
Tajnymag
After reading the Falsehoods Programmers Believe About Names, I wonder, what
would you recommend as a universal way to store users' names?

There seem to be two main options:

Store the name in a single field, be correct but lose the option to sort the
names reliably.

Store the name in two fields, have the users bend over our western form
standards but be able to search and sort easily.

Do you have any real life experience with these or other "falsehoods"?

~~~
unnouinceput
I've dealt with that on a project for the Kuwait embassy. The solution was to
create 2 fields (columns) in a separate table that was linked by ID to the main
"users" table, and the UI would ask the user to add whatever they wanted and
name the field. Then when searching, the algorithm would categorize the most
used ones and present them from time to time to the Admin, who would manually
add them as options for the general search UI. As time passed and more data was
entered, this got better and better, allowing finer tuning of the searchable
data.

I shit you not, there were fields added there that our Western thinking would
never dream they are needed, for example plenty of old people from Papua New
Guinea added "Grandfather initial" to their name. Or others added "tribe name"
as a meaning to somehow show that they are part of an extended family.
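
The layout described above could look roughly like this. It is only a sketch: SQLite is assumed, and the table and column names (`user_ext`, `field_name`, `field_value`) and the sample rows are made up for illustration, since the real schema is not shown in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        username TEXT NOT NULL          -- the one fixed field in this sketch
    );
    CREATE TABLE user_ext (
        user_id INTEGER NOT NULL REFERENCES users(id),
        field_name TEXT NOT NULL,       -- label chosen by the user
        field_value TEXT NOT NULL       -- free-form content
    );
    """
)

conn.execute("INSERT INTO users (id, username) VALUES (1, 'example_user')")
conn.executemany(
    "INSERT INTO user_ext (user_id, field_name, field_value) VALUES (?, ?, ?)",
    [
        (1, "Given name", "Example"),
        (1, "Grandfather initial", "K"),
        (1, "Tribe name", "Example tribe"),
    ],
)

# Fetch every user-defined name part for one user, in insertion order.
rows = conn.execute(
    "SELECT field_name, field_value FROM user_ext WHERE user_id = ? ORDER BY rowid",
    (1,),
).fetchall()
print(rows)
```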

~~~
0xffff2
I too would love to read more about this. It sounds like a great general
purpose solution.

Are there any mandatory parts? Assuming not, how does sorting work? Was it a
matter of sort orders being pre-programmed for more common fields (e.g. sort
by last name first, then by first name) and an assumption that any name that
didn't include any of the pre-programmed parts was just stuck at the end? Was
it even an issue? (i.e. how many people even had a name consisting entirely of
"uncommon" fields?)

Edit: Another question. You said just two fields in the secondary table
(perhaps you were simplifying though...). How does display work? Presumably
the user expects their grandfather's middle initial to be printed in a
particular location relative to all of the other parts of their name.

~~~
unnouinceput
The only mandatory part, from UI point of view, was the username in "users"
table. In the 2nd table (I think it was called "userExt") everything was user generated
and not required. From law point of view the data you entered in there was
your responsibility to comply with both embassy requirements and your country
issued ID's (personal identity card, passport, driver license, marriage
license, etc).

The document was also user generated from their defined fields and was user
customizable. My implementation was to allow users move fields on the form
however they liked and also had an option to be saved as template to be used
further by same user or others.

Sorting and filtering was the biggest issue. Since we talk about Kuwait,
Unicode compliance was paramount, including Mongolian alphabet (not to mention
classic pitfalls of Chinese, Japanese, Arabic). Initially I went with UTF-16.
That was a big mistake and I've refactored that part of the code a year later
to be UTF-8. Since this was also manually picked by Admin, it had a points
system. For example if more than 10 different users would add a new field
(let's say "tribe name") then Admin would get a notification about it and
could be added as option, for future users, to pick when entering their data.
Once defined in Admin part, it could be used for sorting/filtering like any
other already defined field. As for exact sorting, I've simply let Unicode
standard and the library used at the time to figure it out, I haven't done any
special code on it. Best to let others smarter than you on that issue to do
it, you just use their defined interfaces.
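
The points system could be approximated like this. It is a sketch, not the original code: the `fields_to_review` helper, the input shape, and the way near-duplicate labels are merged are all assumptions. Only the "more than 10 different users" rule comes from the description.

```python
from collections import defaultdict

PROMOTION_THRESHOLD = 10  # "more than 10 different users", per the description


def fields_to_review(entries, threshold=PROMOTION_THRESHOLD):
    """entries: iterable of (user_id, field_name) pairs.

    Returns field names used by more than `threshold` distinct users,
    i.e. candidates an admin might promote to a predefined field.
    """
    users_by_field = defaultdict(set)
    for user_id, field_name in entries:
        # Grouping key: trimmed and case-folded, an assumption about how
        # near-duplicate labels would be merged.
        users_by_field[field_name.strip().casefold()].add(user_id)
    return sorted(
        name for name, users in users_by_field.items() if len(users) > threshold
    )


entries = [(uid, "Tribe name") for uid in range(11)]
entries += [(uid, "Grandfather initial") for uid in range(3)]
entries += [(0, "tribe name ")]  # same user again; not counted twice
print(fields_to_review(entries))
```

Counting distinct users rather than rows means one person entering the same label many times cannot push a field over the threshold alone.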

Display for Admin would show all the user defined fields, one after another.
Admin also had the possibility to view them in the user saved templates. In
case of using a template from another user to current user, the form would
simply show a red outline and empty space there if data was missing, and if
the same user had other fields not used by current view they would be
enumerated at the end of current form. Forms could be on as many pages as the
user wanted. Remember, this was used by embassy and everything entered online
would also be manually compared with the hard copy of them by the officer in
charge of your application. I don't think pranking the embassy officer with a
shitty form would be nice for you since he would simply reject your
application and you had to pay (the least amount was on thousand of dollars)
to begin the process. Only a bored billionaire would do something like that,
usual folks took great care to actually be 100% accurate.

------
encom
"Falsehoods programmers believe about life and people outside California."

~~~
Muromec
"You don't need bitcoin because you can just wire money from one account to
another"

"Okay, you don't need bitcoin if you are not sending your money to imaginary
places like Iran and N'Korea"

"Okay, you don't need bitcoin in Europe or US unless one party in your
transaction has surname that starts with Ass and ends with ange"

"What do you mean by saying that banks abroad could refuse to open you an
account if you are a US citizen"

"Capital controls and war? What are you, a character in a Neal Stephenson
novel or Python developer?"

~~~
ry_co
You do realize that it's still illegal to do all these things with Bitcoin,
right? It's not that banks can't do these things; they just have no interest
performing illegal activities on your behalf. Bitcoin solves nothing here.

~~~
Muromec
It's also illegal to smoke weed, drop acid, watch netflix or be gay or publish
suicidal jokes in some places. Everybody draws a line for themselves in regard
to which laws they have to respect or have no choice but to break today.

------
m0rc
It is not equivalent, but if someone has the time to read the list, I would
recommend instead the reading of R. L. Glass "Facts and Fallacies of Software
Engineering" [1].

[1] [https://www.amazon.com/Facts-Fallacies-Software-
Engineering-...](https://www.amazon.com/Facts-Fallacies-Software-Engineering-
Robert/dp/0321117425)

------
mcv
Lists of falsehoods can be very useful and enlightening (especially about
common things like dates, names and email addresses), but with the "falsehoods
programmers believe about falsehoods" it becomes just an exercise in pedantry.

~~~
kube-system
Knowing the falsehoods is useful, knowing when and how to apply that knowledge
is what makes someone an effective engineer.

------
ArtDev
This is kinda depressing reading for the start of the workweek. Also, I wish
each of these was more expanded.. I can think of many workarounds for each of
these situations but I am curious what other people's solutions might be:
[http://www.creativedeletion.com/2015/01/28/falsehoods-
progra...](http://www.creativedeletion.com/2015/01/28/falsehoods-programmers-
date-time-zones.html)

------
jbverschoor
Actually most of these things are not bound to programmers. In fact, I think
programmers in general are more aware of these lists.

~~~
Sebb767
Yes, but on the other hand, if you don't handle validation or storage of these
things it's rather irrelevant.

~~~
kube-system
I wouldn’t say it’s irrelevant. If you are speaking to someone over a phone,
over a counter, or through the mail, they interpret your communication through
the same lens of assumptions, before it even reaches your application.

~~~
Sebb767
That's why I said rather ;)

Human assumptions are one thing, but if you're directly speaking to a person,
you can explain your unusual name situation. There is no discussing with the
input validation on a web form, though. That's why I think it's far more
important for programmers rather than, for example, counter staff.

~~~
kube-system
> if you're directly speaking to a person, you can explain your unusual name
> situation.

Sort of. I'm more thinking about situations where the person makes
interpretive assumptions on their own without opportunity for discussion. You
don't normally see what people are actually filling out in the application
when you're talking to a customer service representative. Nor would 90%+ of
western CS reps have any clue what you're even talking about if you brought up
even a very common exception.

Example:

Form: [first name] and [last name]

CS Rep: what's your name?

Customer: "Kim Jong Un"

Resulting record: {firstname: 'Kim', lastname:'Un'}

And even if your developer was smart and made those fields "given name" and
"surname", your CS rep might still jam the values in the wrong fields because
they're a kid in Nebraska making $12/hr who barely understands the difference
between Korea and China.
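
The same assumption is easy to bake into code. A hypothetical `naive_split` helper (not taken from any real system) that keeps the first and last whitespace-separated tokens reproduces the record above:

```python
def naive_split(full_name):
    """First token becomes the first name, last token the last name."""
    parts = full_name.split()
    return {"firstname": parts[0], "lastname": parts[-1]}


record = naive_split("Kim Jong Un")
print(record)  # {'firstname': 'Kim', 'lastname': 'Un'}
```

Here "Kim" is the family name and "Jong Un" the given name, so the result is wrong in both fields and the middle token is dropped entirely.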

~~~
Sebb767
Fair point, I never thought of that. Thanks for expanding on this!

------
Mulpze15
Quite interesting. I have written software (and still do) that follows those
falsehoods. But in many of those projects, it did not matter ultimately.

So I am glad I did not sweat over those, and got something up and running, and
did not get bogged by some of those very difficult questions.

Hopefully for the projects that stick around I'll be able to do some clean-
up...

------
peterwwillis
I didn't see a falsehoods list from the perspective of a sysadmin. Time to
start writing a blog post....

------
Mikhail_Edoshin
Isn't it strange that bits of essential knowledge end up being either opaque
(some form of code that does some magic) or negative, as in these
'falsehoods'?

------
tus88
There are things more important than working software.

~~~
vosper
A fair few of these lists (which are fascinating, by the way - just pick
one at random) are ultimately about respecting people and culture. Being
bothered to correctly handle calendars or naming conventions (and lots of
other things) that may be unfamiliar is helping to include people who might
otherwise be excluded, or treated as second class.

~~~
InfiniteRand
I think this can be compared to the standard trade off of doing things the
right way or the quick way, which is not necessarily wrong but can blow up
spectacularly. Unfortunately, sometimes the work to do things the right way is
an infinite pit of edge cases and domain knowledge.

You can use libraries to paper over this trade off but then you have other
trade offs like needing to worry about bugs you have no control over.

In the end, if you don’t feel at least a little bad about the software you’ve
written or how long it took to write it, you are probably doing something
wrong

~~~
tus88
Do your users care if it's done the "right" way so long as it works the way
they want?

Cannot tooling and methodology make up for poor documentation or "code
quality"?

