Hacker News: jkrejcha's favorites

The curly braces themselves are 100% irrelevant, as evidenced by the many, many successful and well-liked languages which don't use them, including Python, which is in the running for the most-used language these days. They're an implementation detail.

What's closer to innate is the Algorithmic Language, Algol for short, the common ancestor of the vast majority of languages in common use (but not, notably, Lisps).

Algol was designed based on observational data of how programmers, who had to somehow turn their ideas into assembler to run on machines, would write out those ideas. Before it was code, it was pseudocode, and the origins predate electronic computers: pseudocode was used to express algorithms to computers, back when that was a profession rather than an object.

That pseudocode could have been anything, because it was just a way of working out what you then had to persuade the machine to do. But it gravitated toward a common vocabulary of control structures, assignment expressions, arithmetic written in ordinary infix notation, subroutine calls written like functions, indexing with square brackets on both sides of an assignment, and so on. I revert to pseudocode frequently when I'm stuck on something, and get a lot of benefit from the practice.
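To make that shared vocabulary concrete, here is a small C fragment (the names are invented, purely for illustration) that reads almost like the pseudocode it descends from: a control structure, assignments, infix arithmetic, a subroutine call written like a function, and square-bracket indexing on both sides of an assignment.

    #include <stdlib.h>

    /* Illustrative only: the imperative vocabulary shared by Algol's descendants. */
    int sum_of_magnitudes(int dst[], const int src[], int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {   /* control structure */
            dst[i] = abs(src[i]);       /* call written like a function; indexing on both sides */
            total = total + dst[i];     /* assignment, infix arithmetic */
        }
        return total;
    }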

So I do think that what's common in imperative languages captures something which is somewhat innate to the way programmers think about programs. Lisp was also a notation! And it fits the way some people think very well. But not the majority. I have some thoughts about why, which you can deduce an accurate sketch of from what I chose to highlight in the previous paragraph.


I was very pleasantly surprised recently when I installed Animal Well[1] and it was only 32MB. With the way things are going I was expecting a 1GB executable or something.

[1] https://store.steampowered.com/app/813230/ANIMAL_WELL/


MSVC and ICC have traditionally been far less keen on exploiting UB, yet are extremely competitive on performance (ICC in particular). That alone is enough evidence to convince me that UB is not the performance panacea that the gcc/clang crowd think it is, and from my experience with writing Asm, good instruction selection and scheduling are far more important than trying to pull tricks with UB.
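For a concrete flavour of the kind of UB-based trick in question, a sketch (not a claim about any particular compiler version): because signed overflow is undefined, a compiler may assume the index expression below never wraps, which lets it turn i * 4 into a simple pointer that advances by a fixed stride; with an unsigned counter it would have to preserve wraparound modulo 2^32.

    /* Sketch: signed overflow is UB, so a compiler may assume i * 4
       never wraps and strength-reduce it to a pointer incremented by
       a fixed stride each iteration. */
    long sum_every_fourth(const long *a, int n) {
        long total = 0;
        for (int i = 0; i < n; i++)
            total += a[i * 4];   /* assumes a has at least 4 * n elements */
        return total;
    }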

Why oh why does "undefined behavior in C" come up constantly here? I wrote C every day for decades. I was aware there's such a thing as undefined behavior. I can't remember it ever being even a minuscule factor in my daily work. What's changed?

People tell me that the problem with this code is that casting one pointer type to another and then dereferencing is undefined behaviour. The problem is that C code seems to do this all the time. For example, fread() in the standard library takes a void* as its first parameter. So at first glance, most code that uses fread() must exhibit undefined behaviour, e.g. this function:

    #include <stdio.h>

    int foo(FILE *f) {
        int val;
        /* &val is an int *; it converts implicitly to fread's void * parameter. */
        if (fread(&val, sizeof(val), 1, f) != 1)
            return -1;  /* short read: don't return an uninitialised val */
        return val;
    }
Closer reading of the spec says something about it being fine to cast, say, an int* to a float* and then cast back before dereferencing it. In this example we're OK, because we never dereference through the void*. But the fread implementation must, right?

I'm left feeling like it is impossible to implement fread() in C and that the standard library's API is no longer considered a good example of how to construct an API.
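For what it's worth, an implementation can stay inside the rules by only ever touching the caller's buffer through unsigned char *, which is allowed to alias any object type. A rough sketch under that assumption (my_fread is an invented name, and a real fread reads in bulk from the stream's buffer rather than byte by byte):

    #include <stdio.h>

    /* The void * is only ever accessed through unsigned char *,
       and character-type accesses may alias any object. */
    size_t my_fread(void *ptr, size_t size, size_t nmemb, FILE *f) {
        unsigned char *out = ptr;
        if (size == 0 || nmemb == 0)
            return 0;
        for (size_t i = 0; i < size * nmemb; i++) {  /* simplified: no overflow check */
            int c = getc(f);
            if (c == EOF)
                return i / size;                     /* complete elements read */
            out[i] = (unsigned char)c;
        }
        return nmemb;
    }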

I guess the original motivation for it being UB to cast is to do with alignment constraints on some hardware. For example, a machine might be able to read an int8_t from any alignment but might require 4-byte alignment for int32_t. If so, then casting a non-aligned int8_t* to int32_t* and dereferencing it would indeed fail on that hardware.
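As a hedged sketch of that failure mode (the buffer and offset are invented for illustration): on hardware that requires 4-byte alignment for 32-bit loads, the cast below yields a misaligned pointer, the conversion itself is undefined behaviour, and the dereference can trap.

    #include <stdint.h>

    int32_t read_misaligned(void) {
        uint8_t buf[8] = {0};
        int32_t *p = (int32_t *)(buf + 1);  /* buf + 1 is (almost certainly) misaligned */
        return *p;                          /* UB; may trap on strict-alignment hardware */
    }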

But some would argue that it is not right that GCC 7 breaks such code on a machine where that problem doesn't exist.


Have you thought about why having a standard causes all these problems with undefined behavior? The point of a standard is to be less specific than a reference implementation. That's why it's written in English, not code. That means definitions are fuzzy, which is why every kind of English-language law, rule, or spec is vulnerable to malicious compliance.

If C had a judge, like the legal system, he could deny attempts to use fuzziness in the standard (intended to prevent specifying bugs) to introduce bugs. Unfortunately, C doesn't have a judge with common sense; it has people who think the standard is code and that anything undefined really is a license to do whatever you want.

I guess this problem was inevitable as soon as the words "nasal demons" were put to keyboard. Now the only thing that can save us is a reference implementation.


I think what really annoys me is that this looks like actual malice on the part of the standards writers, and less severe malice on the part of the compiler authors.

I know I should attribute it to stupidity instead but ...

The standards writers could have made all UB implementation defined, but they didn't. The compiler authors could have made all UB implementation defined, and they didn't.

Take uninitialised memory as an example, or integer overflow:

The standard could have said "will have a result as documented by the implementation". They didn't.

The implementation can choose a specific behaviour such as "overflow results depend on the underlying representation" or "uninitialised values will evaluate to an unpredictable result."

But nooooo... The standard keeps adding more instances of UB in each revision, and the compiler authors refuse to take a stand on what overflow or uninitialised values should result in.

Changing the wording so that all UB is now implementation defined does not affect legacy code at all, except by forcing the implementation to document the behaviour and preventing a compiler from simply optimising out code.
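To make the "optimising out code" point concrete, the usual hedged example (a sketch of behaviour widely reported for gcc and clang at -O2, not a claim about any specific version): because signed overflow is undefined rather than implementation-defined, the compiler may assume it cannot happen and delete the very check that was written to detect it.

    /* Intended as an overflow guard, but signed overflow is UB, so the
       compiler may reason that if y > 0 then x + y > x, fold the whole
       expression to 0, and drop the check. If overflow were defined as
       two's-complement wrap, the test would have to be kept.
       A well-defined guard tests before adding: y > 0 && x > INT_MAX - y. */
    int add_would_overflow(int x, int y) {
        return y > 0 && x + y < x;
    }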

