List of falsehoods programmers believe in (github.com)
255 points by edward on Feb 13, 2017 | 118 comments

If we really aspire to be an information-driven culture that makes informed decisions about important topics, we really ought to curate lists like these down to actual informed discussions of the items in the list.

Printing out a list of "falsehoods" littered with personal opinions and calling them "demonstrable" [1] without a shred of evidence is not going to contribute to general knowledge. There may be some utility in getting me thinking, but for the most part I just find it self-aggrandizing. For what it's worth, I do find this [2] format much more helpful. I'd very much rather see valid counterexamples proving a sweeping statement false than yet another sweeping statement that happens to cover a few sweeping statements.

Of course, the above is my personal opinion and may not be shared by the rest of the community, but please can we do better than curating a list of lists we agree with?

[1] https://chiselapp.com/user/ttmrichter/repository/gng/doc/tru...

[2] https://www.mjt.me.uk/posts/falsehoods-programmers-believe-a...

Similarly, the famous list of falsehoods about names: http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-b...

Even aside from the politicized content, I found it jarring that this bounced between clear and 'demonstrable' statements and entries like "you do not need a postal address", which was just a Twitter link to some letter delivered without one. It's not how I'd prefer to see this stuff curated, either.

> There exists an algorithm which transforms names and can be reversed losslessly. (Yes, yes, you can do it if your algorithm returns the input. You get a gold star.)

What am I missing here? Any lossless compression algorithm works, even "string.reverse" would apply.

I think the implication is that a transformation won't be reversible if you don't know exactly what was applied?

Like, I've had the situation of seeing a poorly-marked name string and going "I think that's 'Last, First' formatting, but I'm really not certain..." Or if you do any formatting change you'll get bizarre problems like apostrophes becoming directional quotes.

But yes, the rule-as-written isn't right, and I'm not sure what was meant.

I think what is meant is, if you combine a first name and a last name, you can't break the full name back into first and last name. Because people have names with spaces in them, among other problems.
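A minimal sketch of that reading of the rule. The helper names `join_name` and `naive_split` are made up for illustration, and the "split on the last space" heuristic is just one common guess, not anything the article prescribes:

```python
def join_name(first: str, last: str) -> str:
    """Combine two name fields with a single space (this is the lossy step)."""
    return f"{first} {last}"


def naive_split(full: str) -> tuple:
    """Guess that the last space separates the first name from the last name."""
    first, _, last = full.rpartition(" ")
    return first, last


a = join_name("Mary Ann", "Smith")   # first name contains a space
b = join_name("Mary", "Ann Smith")   # last name contains a space

# Two different (first, last) pairs collapse into one stored string,
# so no splitter can recover both of them.
print(a == b)
print(naive_split(a))
```

Whatever rule the splitter uses, it is right for at most one of the two inputs, which is why storing the fields the user actually typed is safer than recombining and re-splitting.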

I think the challenge is to find a way to transform the name so that it's still legible by a human.

Like transform "Mister" to "Mr"

I'm not sure I understand either. Any one-to-one function where the name falls under the inverse image of the range works (notably, stupid things like the identity mapping along with, you know, binary representations, etc).

What if the name has spaces in it? How do you differentiate reliably between a middle name and a name with spaces? It gets even worse when there may have been previous transforms, like forcing the family name to be the last name.

The answer is: you don't.

Always ask the question you actually mean to ask.

Frequently someone's /name/ is part of a mailing ADDRESS. You can, and maybe should, limit this to individual lines. However, the only way to be SURE is to give an actual freeform multi-line input box, like a textarea.

If you want a string to greet someone as, ask for that, and store it separately.

If you want a title to display within a directory listing, accept that.

Generally, do not decompose things for your users, EXCEPT possibly client side while populating form elements for their convenience (and verification/correction).

Exactly, that's the point of the falsehoods article on names. Now if I could just convince management...

Postal addressing is actually a very interesting topic; that statement is demonstrable. Many smaller towns internationally do not require an address line; all letters to the city are delivered to a central hub and distributed based on name/other information.

My complaint was with the wording and generalization. "Not everyone has/needs a postal address" is true, "you do not need a postal address" is not - living in a big city, I don't get mail that's even slightly mislabeled. And yes, that's a nitpick, but I was mostly objecting to a sweeping statement with a Twitter link that doesn't clarify much.

(As for the specific question, postal addressing is fascinating. Someone in my family has an unusable address with an alphanumeric house number, and Amazon won't ship to it. And in the last one of these threads, someone mentioned living in a country which uses standardized directions-from-landmark as a formal address!)

There's an apocryphal story of someone writing a letter and placing it in an envelope with a picture of Alfred E. Neuman on it and mailing it (with appropriate postage). It was delivered to Mad Magazine, the intended recipient.

Similarly, a letter supposedly addressed:




Which was delivered to John Underhill, Andover, Mass. Unlikely to be true, but a great story.

>Many smaller towns internationally do not require an address line

Citation needed. I've sent mail all over the world, and the only place I've seen where you can get away with that is Ireland. It's actually famous for this. Unless you're talking about some really backwater developing nations, every nation I've sent mail to does require an address line of some kind. However, in the UK, there are a lot of funny-looking addresses in the more rural areas. They have an address line, but it's really just the name of some hamlet or house or something, then they give the name of some farther-away but larger town, and finally the name of some main town/city where the mail is first routed. But they also have a detailed postal code that can route things by itself. Out of reasonably-developed nations, Ireland is the one that really stands out for having an antiquated postal system, but it does seem to work fine for them. It's also different and more modern for addresses within Dublin, where they do have a (very short) postal code.

The only other place I've seen where you don't need a street or house number is addresses at kibbutzim in Israel.

And you illustrate the smug unhelpfulness of the falsehood lists at their worst.

Not only that, but a fair number of them are actually true statements (e.g. People have names). I can only assume someone got overzealous trying to prevent assumptions about other cultures, but I still don't really understand the train of thought that led to their inclusion.

I suppose that could be referring to "John/Jane Doe" situations - when someone turns up who may have a name, but you have no way of ascertaining it (because they're not carrying ID, and are unresponsive, dead, or otherwise unable to tell you themselves). Not really a relevant concern for most programming situations, but it could catch you out if you're writing software for medical/police reports, for instance.

Or, for a rarer subcase where the person genuinely has no name at all - consider "feral children": https://en.wikipedia.org/wiki/Feral_child

This is a specific case of the more general flaw in a lot of design work - "I have a collection of things, these things always have an X with property Y, so I'll just organize/key them that way".

This works great until property Y is violated... or until you run into one without "an X". Most times you are better off to just start with unique identifiers and avoid breaking logic.
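A rough sketch of "start with unique identifiers", assuming an in-memory registry. The `Person` record, the `registry` dict, and `register` are all hypothetical names for illustration:

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Person:
    # The name is optional data; the identifier never depends on it.
    name: Optional[str] = None
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


# Records are keyed by the generated id, never by the name.
registry: Dict[str, Person] = {}


def register(name: Optional[str] = None) -> Person:
    person = Person(name=name)
    registry[person.id] = person
    return person


unnamed = register()                # no name yet: still a valid record
alice_1 = register("Alice Smith")
alice_2 = register("Alice Smith")   # same name, different person

print(len(registry))
```

Neither the missing name nor the duplicate name breaks the keying, because nothing in the lookup logic assumes "an X with property Y".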

The most common example is actually unbaptised Scandinavian babies. It's quite common to keep the name secret until the big reveal in the church when the child is several months old. I believe that here in Denmark you aren't required by law to submit a name until the child is 6 months old.

You can go past 6 months without naming your child, but you are fined. I can't remember at which point the child will be given a name if you won't provide one.

Also of course if you give your child a name that is not recognized as a name by the name register there is a process for okaying that name, so in that case your child can have a name in the family but not an official governmental name yet.

It is fairly common for newborn babies not to have names, which can cause issues with medical record systems. In most cases a workaround is used, such as "Baby {{Lastname}}". In the US, I suppose, the baby would at least have a last name, so not entirely nameless.

Agree with everything except the last name part. A baby isn't born with a last name and there's no obligation for a baby to have the same last name as either parent.

Our daughter was "Baby Girl <Mother's Last Name>" in the EMR system for a few days even though no part of that was her name and that isn't the last name she ended up with.

That's also an example where trying to model the strict reality (having no known name) might be worse than just using a workaround in a simpler model. As long as the system acknowledges that different people can have the same name and duplicate profiles may be found and need to be merged, John/Jane Doe and Baby <Lastname> seem perfectly fine to me.

Only if you're sure you can treat the unnamed the same as the named. If there's logic anywhere in the system that treats the unnamed babies differently (e.g. delaying filing for a birth certificate), that would be a problem if and when someone actually does want their child's first name to be "Baby".
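One way to sidestep that trap, sketched under the assumption that the record can carry an explicit flag. `PatientName`, `placeholder_for`, and `ready_for_birth_certificate` are invented for illustration and do not describe any real EMR system:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientName:
    display: str                  # what staff see on screen
    given: Optional[str] = None   # None until the parents choose a name
    is_placeholder: bool = False  # explicit marker for temporary names


def placeholder_for(mother_full_name: str) -> PatientName:
    """Build the temporary 'Baby <mother's name>' entry."""
    return PatientName(display=f"Baby {mother_full_name}", is_placeholder=True)


def ready_for_birth_certificate(name: PatientName) -> bool:
    # Decide from the flag, never from the text of the name.
    return not name.is_placeholder


temp = placeholder_for("Jane Doe")
real = PatientName(display="Baby Doe", given="Baby")  # a child actually named "Baby"

print(ready_for_birth_certificate(temp), ready_for_birth_certificate(real))
```

Because the special handling keys off `is_placeholder` rather than a string comparison against "Baby", a child whose real first name is "Baby" is treated like any other named patient.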

> That's also an example where trying to model the strict reality (having no known name) might be worse than just using a workaround in a simpler model.

A bit of an orthogonal example to the thread, but I ran into an issue like that, with a solution that worked just like you described. I was using msgpack in Python 3 to send data to the browser in JSON format.

msgpack in Python 3 decodes into bytes, as (hopefully) expected by Python 3 users.

Except the data was somewhat arbitrary in terms of how it was organized or what was included (but was predictable in type), so decoding it was a bit of a pain. The problem of course was that my strings were wrapped in b'', and that was being sent to the browser, causing the JSON parsing in the browser to fail.

My first two attempts were recursive algorithms that would try and decode strings and skip everything else (we only worked with strings and numbers, or lists of strings and numbers, hence "predictable in type"). Both times it almost worked, but didn't. Even with predictable types of data, I had to try and cover a number of edge cases.

The solution that ended up working 100% of the time in my particular case was just using regex to remove the occurrences of b'' from a string version of the object, then send that as JSON. Looked like an ugly hack, and probably was, but it damn sure worked like a charm. It was much easier to just ensure I built a valid dictionary before encoding it with msgpack rather than try and "properly" decode all the strings after decoding it with msgpack.
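For comparison, here is a sketch of what the recursive approach could look like. It is not the code described above; it assumes the data contains only bytes, numbers, lists, tuples, and dicts, and that every bytes value is valid UTF-8. `decode_bytes` is a made-up helper name:

```python
import json


def decode_bytes(value, encoding="utf-8"):
    """Recursively turn bytes into str inside nested dicts and lists."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, dict):
        # Keys can be bytes too, so decode both sides.
        return {
            decode_bytes(k, encoding): decode_bytes(v, encoding)
            for k, v in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [decode_bytes(item, encoding) for item in value]
    return value  # numbers, None, str, etc. pass through unchanged


unpacked = {b"name": b"Ada", b"scores": [1, 2.5, b"n/a"]}
print(json.dumps(decode_bytes(unpacked)))
```

It may also be worth checking whether the installed msgpack version can return `str` directly through the `raw` option of `unpackb`, which would avoid the post-processing entirely.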

You could subclass the json.JSONEncoder class (and implement the default method) to decode bytes objects, along the lines of the example in the documentation:
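A sketch of that suggestion, modeled on the encoder-subclass pattern in the `json` module docs. `BytesEncoder` is a hypothetical name, and UTF-8 is an assumption about the data:

```python
import json


class BytesEncoder(json.JSONEncoder):
    """JSON encoder that decodes bytes values as UTF-8 text."""

    def default(self, o):
        if isinstance(o, bytes):
            return o.decode("utf-8")
        # Let the base class raise TypeError for anything else.
        return super().default(o)


payload = {"name": b"Ada", "tags": [b"x", 1]}
encoded = json.dumps(payload, cls=BytesEncoder)
print(encoded)
```

One caveat: `default` is only consulted for values. A dict with bytes keys still raises `TypeError`, so keys would need to be decoded before encoding.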


Does the baby have a last name, though? Think about a baby born to parents with different last names. The baby may take the father's last name, the mother's last name, one of at least two different hyphenations, no last name (maybe they think their child will be famous), or an arbitrary last name they make up. And that's assuming they're just plain vanilla "Americans" with no strong ethnic reason for an alternate naming scheme...

It's a placeholder until a name is chosen. It is temporary and expected to be revised. At the hospital I know, the mother's (who carried the baby) full name is selected, e.g. "Baby Jane Doe." Notice no name transformation is required so it generalizes. I don't know what is used in the case of a surrogate mother.

If parents get offended by this straightforward default-naming algorithm, the recommended workaround is to name the baby.

So, now your little ad-hoc substitute-naming scheme has legal ramifications. Good luck getting this to play along with actual letter of the law (at $location1, not to mention $anotherlocation2).

It's not my naming scheme, it's what the hospital I know uses. Can you explain the legal ramifications you are concerned about? The baby does not receive a SSN until it is named by the parents. If the parents were to die before the baby is named, the temporary name used "in the system" doesn't magically become a legal name.

Sorry, I misread that as "this is what I would do." I suppose that's a process descended from the local legal framework, then - as you say, SSN is assigned after name, and the name has some flag meaning "temporary". I would assume there's some handling in place for that eventuality.

(I have seen a different protocol: state ID number is assigned after birth, parents choose name independently of the process, the two are only linked ex post)

How do you think having an identity separate from the name makes things worse? Any concrete examples?

The assumption you've just made is exactly why the list of falsehoods needs to exist.

What assumption are you referring to?

Name one person who doesn't have a name.

Well I'm not trying to troll you or anything but my friends haven't named their 2 day old baby yet. That's a person, right?

And apparently you can, in some jurisdictions, re-name your baby within a few days of birth, replacing the original name (it's not even considered a rename - it voids the previous naming, as if it had never happened).

Artist names are possible. In Germany, at least, you can carry one on your passport alongside your real name.

That's one of my concerns too.

We're currently in Phase 1 of the project. Which is all about collecting, in the wild, instances of this trending writing style.

Now if it gets traction and grows beyond a gimmick, and if Falsehoods articles become a new literary form in their own right, then maybe we'll have enough resources in the community to start Phase 2, which might see more curation happen, as well as writing guidelines and more structure.

This list is an awkward mix of posts containing easily-verifiable but surprising claims about various technical specifications, and posts which just make a variety of contentious claims with no particular evidence provided (I think the economics one is possibly the worst).

Falsehoods programmers believe about gender is a pretty great example.

A great example of what? From what I read in there, each of those falsehoods is verifiable. Which one(s) (is/are)n't?

It's a pretty great example of falsehoods programmers believe.

The ideological ones are really horrible.

Which ones are ideological?

Most of the Human Identity section and the Society section.

There's a danger in mixing technical spec information (a phone number can contain non-numeric characters) and non-technical spec information (women...). Some falsehoods are provably correct/incorrect, statements about people don't fit into that mold.

All the falsehoods texts discuss the same kind of issue: the domain doesn't match the model. And computer models are imperfect a priori for modelling empirical data (natural language, human identity, relationships, continuous functions, physics calculations). We can get those closer in approximation so that the error is negligible, by testing, collecting feedback about our error, and iterating on the model. All of these - even the genders and natural language - can be tested for error, we just can't ensure 100%, and our measurements for the human things necessarily rest on sampling opinion.

Well, it is rather unfortunate that our data models need to deal with people. Without them, all those irregularities and impurities are eliminated. [lightbulb-animation-above-robot-head]

Ok, that list is awful, with all the ideological nonsense.

But besides that, for stuff that is actually empirically testable and relevant it would be nice to put them into a unit-testing library to automatically check certain functionality you build (for example to check functions that deal with time or dates or names).
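Something along these lines, as a rough sketch. The name list is a small hand-picked sample drawn from cases mentioned in this thread and the names article, and both validators are hypothetical stand-ins for whatever function you actually want to check:

```python
# Sample inputs that commonly trip up name validation.
TRICKY_NAMES = ["O", "Ng", "Madonna", "Mary Ann Smith", "O'Brien", "Núñez", "李"]


def failing_names(validator, names=TRICKY_NAMES):
    """Return the sample names that the given validator rejects."""
    return [name for name in names if not validator(name)]


def strict_validator(name: str) -> bool:
    """Typical over-strict rule: 3+ ASCII letters, nothing else."""
    return len(name) >= 3 and name.isascii() and name.isalpha()


def lenient_validator(name: str) -> bool:
    """Only rejects empty or whitespace-only input."""
    return bool(name.strip())


print(failing_names(strict_validator))
print(failing_names(lenient_validator))
```

Dropping `failing_names` into an existing test suite and asserting that it returns an empty list would turn the empirically testable parts of these lists into a regression check.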

The sociological "falsehoods" of this list ought to be renamed:

"Postulates I choose to consider as universal and objective truths as to silence any discussion or debate around them"


"List of opinions I have that are facts by virtue of being on this list"

Of course, you're wrong for disagreeing with OP.

The lists are cautionary tales, things to consider before you code. "May those who have never sinned ignore these posters, suggestions, and lists."

Ok, but I must be right about something? - "Nope, not if you disagree with me about the other thing"

Was reading the post about names... My girlfriend has a family name with two letters. In Asia, it's common enough so it's not an issue but in Europe there are some systems that refuse my girlfriend's name because it's too short.

It's kind of frustrating.

I remember hearing from a classmate of mine in college that 'O' is a legitimate Korean surname. Blew my mind! And what about people without a surname? Mononyms are common in some parts of the world, like my native state in South India.

Interesting. I'm curious, what's her usual workaround for this problem?

Her first name has two parts, so she just combines the first part of her first name with her last name. It's incorrect but still matches more or less what she has on her password.


Typo? or Freudian slip :)

Hahaha, most likely passport!

yep, was supposed to be passport :)

Most of these things are more about 'not knowing' or 'forgetting to take into account' than 'not believing'? (Not a native English speaker, but surely those are not the same semantically, right?)

Yea, as discussed in other parts the idea that programmers believe "People's names do not change" is demonstrably false. I am a programmer, I understand names change but you are right, there is a chance I don't account for it.

Semantically believing something is false != I didn't take it into account.

There is some obvious "activism" behind such list, mixing valid technical specifications, politics, opinions and gender theory. But I guess it is a political correctness guide for people who believe in a specific political line, also a way to shame publicly those who do not by using that list as an argument of authority.

You're downvoted, but I agree that the e.g. women-in-tech one -- while worthwhile to be aware of -- does not fit the general "falsehoods programmers believe" theme/meme. The meme refers to assumptions that lead to technically-inaccurate programs.

IOW, there's a difference between "everyone has two names, so you can assume that schema for a database" vs "all women in tech are designers". The latter is false, but it's not a programming error. The stuff on this list [1] does not fit the pattern of programming errors.


You seem to imply that there is a non-political, non-activistic way of doing it. How would that work?

Because from where I'm standing, if you e.g. implement gender as binary, you're not being more objective, you're just supporting the opposite side of the people you perceive as activistic.

Which is exactly the point: it's not a falsehood it's a contention. Listing your political beliefs as truisms without explanation isn't just lazy, it's annoying.

They should at least put a warning when it's opinion and not fact. Programmers with a limited world view may end up following these instructions - putting themselves at risk for taking a stance on sensitive cultural and political issues.

>Time passes at the same speed on top of a mountain and at the bottom of a valley

Woah, what? Like, they're talking about effects of relativity because someone is traveling "faster" as the earth spins at the top of a mountain?

You are going to love this:


The short version is that this guy hauled three cesium atomic clocks up a mountain while on vacation with his kids. They returned 23 ns older than the guy's wife (who stayed home to study for her nursing board exams).

Note that time dilation isn't just caused by motion (special relativity), but also by proximity to a gravity well (general relativity).
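A back-of-the-envelope sketch of the gravitational part, using the weak-field approximation that the fractional rate difference between two clocks separated by height h is roughly g*h/c^2. The 1000 m height and one-day duration are illustrative assumptions, not the figures from the trip above, and the estimate ignores velocity effects and the variation of g with altitude:

```python
G = 9.80665          # m/s^2, standard gravity (assumed constant over the height)
C = 299_792_458.0    # m/s, speed of light


def gravitational_gain_seconds(height_m: float, duration_s: float) -> float:
    """Extra time accumulated by the higher clock, weak-field approximation."""
    return G * height_m / C**2 * duration_s


# Hypothetical scenario: a clock kept 1000 m higher for one day.
gain = gravitational_gain_seconds(1_000.0, 86_400.0)
print(f"{gain * 1e9:.1f} ns")  # about 9.4 ns
```

That lands in the same order of magnitude as the 23 ns reported above, which is why atomic clocks are needed to see it.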

Time dilation in everyday life is so small that only atomic clocks can detect it, even over the course of your entire life. So the difference over 10 miles is so negligible that it doesn't even exist as an issue.

It doesn't practically exist as a problem for things (like people) for whom a meaningful time scale is around a second, and a lifetime 100yrs.

If you are operating on very small time scales, or if you're going to be around a long time, it can matter a lot.

Good luck with your GPS navigation then. "You are here. (Give or take 10 miles, no biggie, eh?)" There is a surprising number of applications in your everyday life for which Newtonian physics are not precise enough.

Why would it be relevant, then, for a programmer to know that?

A programmer working on something like the OPERA neutrino experiment would need to account for those factors [1].

GPS devices also have to account for relativity [2], so programmers working on those also have to factor that into the logic.

[1] http://www.telegraph.co.uk/news/science/science-news/8905322...

[2] http://physics.stackexchange.com/questions/1061/why-does-gps...

Mostly it's not. Charitably, someone writing software for e.g. a particle physics experiment might actually need to account for altitude to produce accurate results.

Of course, that's the sort of domain-specific demand that might not belong in a general-use list. People writing sound engineering software have to worry about all kinds of subtleties with sample rates and harmonics, but I wouldn't put that in a guide to basic data I/O - I might not even put it in a basic analog-to-digital guide.

Who knows, the next great interview question could be asking it.

The writer may have been exaggerating their point.

Also, you have to account for the different gravitational field in both situations, which affects time flow.

Yep. And it's measurable, it's just very, very small. It normally doesn't really matter for all but the most sensitive applications.

"tax - A PHP 5.4+ tax management library"


Libraries trying to solve these falsehoods are considered good candidates for inclusion in that list: https://github.com/kdeldycke/awesome-falsehood#libraries

was trying to figure this one out as well, best i could come up with is that the library is a bit ambitious (reliably meeting tax requirements for many countries).

If we are to remember the truth, it's better to list the truth rather than the falsehoods.

Example: Partner says "Don't buy the red one", then a few days later you go and buy the red one, while you should have bought the blue one. It would be better if your partner had said "Buy the blue one".

There's the rub: if trying to implement, e.g., the whole Names thing, you'll find that the requirements are impossible to satisfy - if you avoid the Scylla of one, the Charybdis of another will get you. The point is to be aware of the limitations, and know which ones you are accommodating, which you are avoiding, which you are breaking, and why. The other option is akin to "MEH EVERYTHING IS UPPERCASE ASCII CHARACTER OR A SPACE NOTHING ELSE MATTERS", which sounds a bit...backward...in 2017, and will come back to bite you.

One interesting question from http://haacked.com/archive/2007/08/21/i-knew-how-to-validate...: is “Fred Bloggs”@example.com actually a valid email address, given that the plain ascii double-quotes seem to have been converted to "fancy" ones by the blogging software?

It is a legal email address per RFC 5321.

In practice such email addresses are not possible in many server configurations, and it usually makes sense to reject such email addresses.

> In practice such email addresses are not possible in many server configurations, and it usually makes sense to reject such email addresses.

I would put more weight on closing the loop than filtering on the front end. I'd wager that the vast majority of sites that gather an email address do not send a verification email that bars further progress on their site. It's especially critical if it becomes the underlying trust mechanism for your site.

IMO you should only work on filtering fancy quotes out if you've already got a loop-closing verification email path. And yes, I recognize that it's really nice to catch these errors earlier. But the failure mode where people enter someone else's valid email address rather than their own is more common than you might assume.
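A minimal sketch of closing the loop, assuming an in-memory store and leaving out the actual mail sending. `start_verification`, `confirm`, `pending`, and `verified` are hypothetical names, and a real system would also want token expiry and persistent storage:

```python
import secrets
from typing import Dict, Optional, Set

pending: Dict[str, str] = {}   # token -> email awaiting confirmation
verified: Set[str] = set()     # emails whose owner clicked the link


def start_verification(email: str) -> str:
    """Create a one-time token; the caller would mail it as a link."""
    token = secrets.token_urlsafe(32)
    pending[token] = email
    return token


def confirm(token: str) -> Optional[str]:
    """Mark the address as verified if the token is known; tokens are single-use."""
    email = pending.pop(token, None)
    if email is not None:
        verified.add(email)
    return email


token = start_verification("someone@example.com")
print(confirm("not-a-real-token"))   # unknown token: nothing happens
print(confirm(token))                # the real token verifies the address
```

Only addresses in `verified` should be trusted for anything sensitive, which handles both malformed input and the "someone else's valid address" case.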

How is it legal per RFC 5321? The quotation marks are notionally valid, yes, but as posted it had the ASCII double-quote (legal) converted to stylized left and right quotes (non-ASCII, illegal).

I assumed that the question being asked was whether the style with ASCII quotes is legal.

If the question involves the use of the non-ASCII quoting style, the answer is more muddled. RFC 6531 generally repeats the RFC 5321 mantra of "don't interpret the local-part", prohibiting only ASCII C0 control codes explicitly [1]. RFC 6530 suggests that C1 control codes should also be prohibited, and suggests that non-NFC is highly likely to cause problems. It further suggests that NFKC-normalized and excluding punctuation and whitespace is risky.

In general, a lot of email address handling advice requires ignoring what the dictums of the RFCs state. You should treat email addresses as case-preserving (i.e., compare ignoring case but don't change the case), and it's inadvisable to have a case-sensitive email server. Similarly, quoted local parts and domain literals should be rejected by almost all software that's not in the guts of the email system. Extending similar rules to EAI is difficult because it's unclear how the system will work in practice, but my libraries start by force-converting the localpart to NFC.

[1] The actual text is "ASCII graphics or control characters." This could be interpreted to mean "(ASCII graphics) or (control characters)" or "ASCII (graphics or control characters)." Given the text of RFC 6530, assuming that C1 is forbidden should generally be a safe assumption.
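A small sketch of the "case-preserving, NFC-normalized" advice above. The helper names are invented, splitting on the last `@` is a simplification, and this is comparison logic for application code, not a validator for what a mail server will accept:

```python
import unicodedata


def is_ascii_only(address: str) -> bool:
    """True if the address contains only ASCII characters."""
    return address.isascii()


def split_address(address: str):
    """Split on the last '@' into (local part, domain)."""
    local, _, domain = address.rpartition("@")
    return local, domain


def same_mailbox(a: str, b: str) -> bool:
    """Compare ignoring case and Unicode composition; stored values stay untouched."""
    local_a, domain_a = split_address(a)
    local_b, domain_b = split_address(b)
    local_a = unicodedata.normalize("NFC", local_a)
    local_b = unicodedata.normalize("NFC", local_b)
    return (
        local_a.casefold() == local_b.casefold()
        and domain_a.casefold() == domain_b.casefold()
    )


print(same_mailbox("Fred@Example.com", "fred@example.COM"))
print(is_ascii_only("\u201cFred Bloggs\u201d@example.com"))  # fancy quotes
```

The address is still stored and sent exactly as the user entered it; only the comparison is relaxed.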

> I assumed that the question being asked was is the style with ASCII quotes legal.

My question was actually about the fancy quotes; I found it amusing that they got fanci-fied by the blog software.

Thanks, this is a good in-depth summary of some stuff I didn't know.

...assuming that you are being asked to create it. If you want to deliver to another server, keep the left-hand-side exactly the way it is.

Per RFC 5321 (Simple Mail Transfer Protocol), e-mail addresses are ASCII-only.

Per RFC 6531 (SMTP Extension for Internationalized Email), this is not a valid address either.

Wait a moment, they have a Falsehoods series on all sorts of subjects. The Fake News fact checkers are going to try globbing on to every niche they can...

> The shortest path between two points is a straight line


Depends on your situation. In non-Euclidean geometry (such as on the surface of a sphere, which the Earth approximately is), it's not as simple as taking a straight line between two points.

Also look at Manhattan/taxicab geometry. When you can only move along one axis at a time, you usually can't move in a straight line between two points.

Both of these situations have common applications in the real world. For example, in mapping and navigation.
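A tiny illustration of the taxicab case, with points given as (x, y) tuples on a flat grid. The function names are just for this sketch:

```python
import math


def euclidean(p, q):
    """Straight-line distance between two points in the plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def taxicab(p, q):
    """Distance when movement is restricted to one axis at a time."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])


a, b = (0, 0), (3, 4)
print(euclidean(a, b))  # 5.0
print(taxicab(a, b))    # 7
```

Same two points, two different "shortest" distances, depending on which moves the geometry allows.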

Triangle with 3 right angles. Geometry is weird.


Also, taxicab geometry: https://en.m.wikipedia.org/wiki/Taxicab_geometry

Right?! Even on maps it is true.

It doesn't mean that there is a viable route on that line or that that line is the preferred one.

That one is really bad, imo

Depends on the map projection.

On the surface of a sphere, the shortest distance between two points is along a great circle route (which is the equivalent of a line in spherical geometry). Polar map projections preserve straight lines as great circle routes, but, e.g., Mercator projections do not.

If you're looking at a 5 km × 5 km topo map, the difference isn't particularly significant. If you're looking at a map of Europe or the United States, it is (note that many airline routes seem to follow curved lines--it's because the shortest distance is that curved line).
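To make the great-circle point concrete, here is a sketch using the haversine formula on a perfect sphere. The 6371 km mean radius and the sample coordinates are assumptions, and the real Earth is slightly ellipsoidal. It compares the great-circle distance with the distance along a line of constant latitude, which is what a horizontal straight line on a Mercator map represents:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to guard against rounding slightly above 1 for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def along_parallel_km(lat, lon1, lon2):
    """Distance travelled by staying on one line of latitude."""
    return EARTH_RADIUS_KM * math.cos(math.radians(lat)) * abs(math.radians(lon2 - lon1))


# Two hypothetical points at 60 degrees north, 90 degrees of longitude apart.
print(great_circle_km(60, 0, 60, 90))
print(along_parallel_km(60, 0, 90))
```

The great-circle route comes out shorter than following the parallel, even though the parallel is the straight line on the Mercator map.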

It's only true on certain maps.


For regional maps it doesn't particularly matter, the map will be in a projection that has relatively minimal error, but a straight line in most projected coordinate systems will not be the shortest path between those points.

> Even on maps it is true.

That really depends on your map. Or rather, your projection. Which is sort of the point.

When it comes to a "curated" list like this, who the curator is matters more than what they curate. Who is Kevin Deldycke, and why should I value his judgement?

>"Falsehoods Programmers Believe - A brief list of common falsehoods. A great overview and quick introduction into the world of falsehoods."


>"Falsehoods programmers believe about names" >"People have names"

Is this a joke?

Newborn baby, unidentified deceased person, person in police custody who refuses to give a name, feral children.

I assume OP didn't mean to link to the #postal-addresses anchor?

Yeah, could a moderator please remove that?



Please stop posting unsubstantive comments to HN.

We detached this comment from https://news.ycombinator.com/item?id=13637838 and marked it off-topic.

Under "Falsehoods Programmers Believe About "Women In Tech"":

> We're only in tech to find a husband, boyfriend or generally to get laid.

If you flip the genders around I'm pretty sure that would be true for quite a few men (at least the last part).

You think men go in tech to get laid?

Of course. You don't?

Maybe not directly (i.e. literal rock star) but as a means to an end sure.

You can extend this logic indefinitely. From the Futurama episode I Dated A Robot:

> All civilization was just an effort to impress the opposite sex.

What about all the people in tech who are already married?

Flipping the genders around doesn't make it any less sexist.

It doesn't make it any more sexist either.

Thanks for stating the obvious.
