“C is how the computer works” is a dangerous mindset for C programmers (steveklabnik.com)
337 points by todsacerdoti on March 31, 2020 | 368 comments


My favorite take on the severity of the UB problem, https://blog.regehr.org/archives/1520 :

> Tools like [Valgrind] are exceptionally useful and they have helped us progress from a world where almost every nontrivial C and C++ program executed a continuous stream of UB to a world where quite a few important programs seem to be largely UB-free in their most common configurations and use cases...Be knowledgeable about what’s actually in the C and C++ standards since these are what compiler writers are going by. Avoid repeating tired maxims like “C is a portable assembly language” and “trust the programmer.” Unfortunately, C and C++ are mostly taught the old way, as if programming in them isn’t like walking in a minefield.


> Be knowledgeable about what’s actually in the C and C++ standards

... turns out to be an extremely tall order; the 2017-11-17 working draft of the C++ standard is 1,448 pages in PDF format.

At some point, I wonder when programmers who are required to care about correctness throw up their hands and say "This language isn't reasonably human-sized for a user to know they're using it correctly." At which point the immediate next question should be "Why don't you use something else?"

(In my experience, the only sane counter-answer to that last question is "I'm programming on an architecture where the {C/C++} compiler is the only one with any real expressive power that anyone has ported to this physical system." This is an answer I understand; most other answers will raise an eyebrow from me).


> most other answers will raise an eyebrow from me

Portability. A C library can be trivially linked with any other language. But one will be hard-pressed to use a Python library from Ruby, for example.


This is literally the reason I most recently wrote some C, but - do you really need linking? You can pretty easily use a Python library from Ruby by writing a little loop in Python that accepts JSON input and produces JSON output, and calling that as a Ruby subprocess.

It's fairly rare that you actually need to be in the same process. (The thing I wrote was a wrapper for unshare(), so it did strictly need to be in-process, but that's an unusual use case.) Even if you want to share large amounts of data between your process and the library, you can often use things like memory-mapped files.

And there are some benefits from splitting up the languages, including the ability to let each language's runtime handle parallelism (finding out that the C library you want to use isn't actually thread-safe is no fun!), letting each language do exception-handling on its own, increased testability, etc.


> You can pretty easily use a Python library from Ruby by writing a little loop in Python that accepts JSON input and produces JSON output, and calling that as a Ruby subprocess.

If you're a Ruby application, sure, you can use a Python library by forking off a subprocess.

Just make sure to document the installation requirements - in addition to having Ruby and the correct gems installed, you also need Python (which version?) and the correct pip libraries installed.

If you're a Ruby library, instead of an application, forking off a Python process is a nonstarter, unless you want to propagate those requirements out to every single application using your library.


Sure, but there's an equivalent problem in the C world - go back in time a couple years before we figured out how to do precompiled binary-compatible C extensions in Python and try running 'pip install numpy'.

I'm mostly thinking that if you already have a Python library you really like, chances are you already have a Python environment capable of running it. (The environment doesn't have to be related to your Ruby environment at all! If you want to get Ruby through your OS and Python through a Docker container, or whatever, that should work fine.)


Inter-linking is a superpower, as is having a very thin runtime that ships with most operating systems.

I realized recently that Python could have saved a decade by having a means to "interlink" modules, allowing the use of Python 2 modules in Python 3. This is much easier in the C world; there have been incompatible syntax changes and incompatible linkage changes, BUT not both at the same time.


Embedding Python 2 inside of Python 3 (or vice-versa) is not very hard to do on Linux. Simply `dlmopen` the .so in a new linker namespace and write a little bit of bridging code to interface the two object layouts.

Python has a very rich runtime, so there are some tricky problems to solve if you want this to be perfect. e.g., circular references between the GCs. However, there are simplifying assumptions that can dramatically reduce the difficulty, and these assumptions might not significantly hinder the language-upgrade use-case.

We've known about this approach to embedded interpreters for a couple of years, but we've not found a sufficiently compelling use-case for anyone to develop it beyond a simple proof-of-concept.

In general, it seems like the most compelling use-case for inter-language interfaces is writing core libraries in something fast, (mostly) runtime-free, and portable, especially for codes that no one wants to write twice.

Though I spend most of my time writing codes in slow, rich runtime languages, I've been on the lookout for a better technology to replace C. There are quite a few interesting options, and some have even less of a runtime than (dynamically-linked) C!


> Embedding Python 2 inside of Python 3 (or vice-versa) is not very hard to do on Linux. Simply `dlmopen` the .so in a new linker namespace and write a little bit of bridging code to interface the two object layouts.

Where can I pip install this from?

Or is there a reason that none of the huge number of Python users who delayed their transition as long as possible, until all their libraries were updated, ever built it?


> Where can I pip install this from?

A couple of signatures have changed, but the general approach looks like: https://gist.github.com/dutc/eba9b2f7980f400f6287 or https://gist.github.com/dutc/2866d969d5e9209d501a

The above will launch ("embed") a Python 2 or Python 1.5 interpreter from within a Python 3 interpreter. If you use `dlmopen` and `LM_ID_NEWLM`, the guest interpreter will have its own linker namespace. In other words, the guest interpreter (and any DSOs it opens) will be totally isolated from the host interpreter.

The above shows the use of `PyRun_SimpleString`. Since `PyRun_String` accepts a `PyObject* globals` and `PyObject* locals`, you could build a very basic bridge in <30 lines of code by using serialisation to copy and convert. (For user-defined classes, you would need something better.)
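
For the curious, a minimal sketch in C of just the embedding step (untested here; it assumes glibc's dlmopen, linking with -ldl, and a "libpython2.7.so.1.0" soname, which varies by distro; all the actual bridging between object layouts is left out):

  #define _GNU_SOURCE
  #include <dlfcn.h>
  #include <stdio.h>

  int main(void) {
      /* Load a second Python into its own linker namespace, isolated
         from whatever interpreter (if any) is hosting this process. */
      void *guest = dlmopen(LM_ID_NEWLM, "libpython2.7.so.1.0",
                            RTLD_NOW | RTLD_LOCAL);
      if (!guest) { fprintf(stderr, "%s\n", dlerror()); return 1; }

      void (*init)(void)        = (void (*)(void))dlsym(guest, "Py_Initialize");
      int  (*run)(const char *) = (int (*)(const char *))dlsym(guest, "PyRun_SimpleString");
      void (*fini)(void)        = (void (*)(void))dlsym(guest, "Py_Finalize");

      init();                                       /* guest interpreter starts up */
      run("print 'hello from the Python 2 guest'");
      fini();
      dlclose(guest);
      return 0;
  }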

I've given many talks about this at various Python and PyData conferences. (I've mentioned it a number of times on HN, both in response to complaints about the Python 2→3 transition and in response to comments like yours suggesting this solution to that problem.)

Though presented as a joke, the approach could be made to work with some effort. I can't speculate on why no one has ever followed-up on it, and I can't speculate on why the Python 2→3 transition has been so difficult for some users. Perhaps in some places, financial or organisational arguments are more influential than technical arguments.


Indeed. C++ is relatively portable too (or at the very least, can be made to hide behind a C ABI in many cases) and it seems like Rust is as well. Whereas, if you have a library written in python that you want to leverage, you're forced to include a python interpreter. This sucks.


Julia lets you run Python, R and Fortran libraries.


Fortran has a simple ABI. How does it run R and python libraries? Probably through an embedded Python interpreter?


Only if you have a complete Python/R environment installed.


> At some point, I wonder when programmers who are required to care about correctness throw up their hands and say "This language isn't reasonably human-sized for a user to know they're using it correctly."

That's a bullshit argument very close to FUD.

You don't need to know the 1,400 pages of the C++ standard to safely use a subset of it. Especially with proper tooling to help you.

Do you really think that every web frontend developer knows every W3C/ECMA standard every time they create a website? Or that they have even an idea of the complexity of the web browser they use? One tip: no, they don't.


Those standards are built to be default-sound. It shouldn't be possible to do wild memory access in JavaScript or via novel application of CSS or DOM structure.

Most importantly, if someone does find a way, it's an error in either the spec or the browser implementation. It's not flagged as "undefined behavior" and we go on with the assumption some web pages just break your browser or the OS hosting it.

I can tolerate a thousand-page spec for a system that can't crash unsafely; it's a much bigger risk for one with unchecked pointer indirection as a feature.


> It shouldn't be possible to do wild memory access in JavaScript or via novel application of CSS or DOM structure.

This is a narrow view where you focus only on memory safety and buffer overflows. Undefined behaviour goes way wilder than that.

Many problems caused by UB do not lead to any crash, just wrong results and wrong behaviour, which is even worse.

And currently, JavaScript is full of that. Almost everything that JS cannot handle (or does not know how to handle) is undefined or purely implementation-specific.


And for JavaScript, that space doesn't include buffer overruns or memory safety, unless you do something very exotic and implement your own indirected memory access in JavaScript (in which case, you've sort of dug your own grave). Because undefined behavior literally means undefined behavior, C++ gets to bring to the table all of the failure modes JavaScript brings to the table, in addition to the plethora of ways unchecked memory indirection can go wrong.

There are circumstances where you want the features C++ brings to solve a problem. But solving a problem with dynamite is still solving a problem with dynamite; there's a lot more ways it can go wrong than solving the problem with a steam drill, to torture an analogy a bit.


> But solving a problem with dynamite is still solving a problem with dynamite; there's a lot more ways it can go wrong than solving the problem with a steam drill, to torture an analogy a bit.

Except that a language is not absolute, not simply dynamite or not-dynamite. It is what you make of it, depending on the subset you use. C++ is no exception.


Agreed. But I won't reach for a language that lets me treat all my program's working memory as an undifferentiated integer-addressable array of bytes if I don't need that to solve the problems I'm trying to solve.

... which goes back to the topic of the HN post; that's only one way to look at the state of a running program, and it's a way that has strengths and weaknesses. The utility of it comes at the cost of the program failing in ways programs in other languages structurally cannot fail (unless you do something truly exotic, like implement a subset of a C++ compiler or interpreter in that language).


C++ has always been the language that "has a safe subset to use but nobody can agree what that subset is".

JS is ... well, JS

This is at least for me the first time I've seen them discussed in the same breath, which is revolutionary: C++ the language bringing a blue screen to a desktop near you; JS the language bringing your startup to its knees.

I guess I never realized that while they are polar opposites, there are probably some subtle cultural similarities that would make for some hilarious unearthing


> the 2017-11-17 working draft of the C++ standard is 1,448 pages in PDF format.

To be fair, that contains both language documentation and library documentation. The language specification portion itself is only ~400 pages, which is smaller than the Java Language Specification and JS's specification (both about 700-800 pages, although JS does include its [meager] standard library in there).


When it comes to comprehensibility, the C++ standard is orders of magnitude more difficult than the JLS due to the numerous interdependencies and layers upon layers of cruft. There is no single, cohesive model to understand. That is the fatal flaw of C++, and I guess what led to Stroustrup's couple-of-decades-too-late "Remember the Vasa!" proclamation.


As someone who has been using Java, .NET, C++, OpenGL, DirectX, Win32, Linux and the Web since their early days: you will never have a cohesive view of any of them, especially if you weren't there when decision X was discussed in the community and there are no written traces of why it was taken that way.


> I wonder when programmers who are required to care about correctness throw up their hands

I did that over 20 years ago and have never looked back.


C++ should only be used as a last resort. It's complex, development is slower, the tooling is poor, some universities have stopped teaching it. It's very hard to hire good C++ devs, and they are typically not cheap.

I manage a c++ team btw. Our app has to be fast.


As your parent said, this is also an answer I can understand.

Still, have you guys looked at alternatives -- and seriously evaluated them? Rust, Zig, Nim, D, others?

If you tell me "we can't afford to, we have too much work" then that's also a valid answer (for a while at least).


The bespoke libraries etc. have god knows how many man-hours invested; it would be possible, but gargantuan.


Our alternatives are Java and .NET languages, with native bindings to C++ libraries.

C++ alternatives still need to grow up to this kind of mixed language development, where I can have .NET code with C++ libraries and easily debug across them on the same Visual Studio session.

Same applies to Java and C++ development experience.


Not sure how fair to newcomers (as in languages) this requirement is.

F.ex. when coding in Erlang/Elixir I can use an excellent bridging library between their VM and Rust. There's also an excellent support for working with C libraries. Not sure what else can be expected.

Maybe I am not reading you correctly and I apologise if so. It just kind of sounded like "the newer languages must be compatible with 20+ different ABI standards if they want us the older programmers working in C/C++ to adopt them"?

Back on the original topic, I completely understand if a project has too much baggage and sunk cost so as to make even a partial migration (and hardening via using memory-safe languages and/or professional paid code analysis tools) unrealistic.


Naturally it isn't fair, but one cannot expect teams to drop velocity and diminish their productivity only on the basis of adopting a new language that doesn't support existing workflows, IDE tooling and libraries.

It is like telling someone delivering games in Unity and Unreal that they should use Amethyst instead, without understanding what it means to actually have a team working in those ecosystems.

Naturally Rust will get there, however it took C++ 30 years to be where it is today, and that is what any replacement attempt should take into consideration.


This is probably a much larger discussion, and not very relevant to the topic here, but I feel a lot of the newer breed of languages don't emphasise IDE tooling very much (not in the sense of Visual Studio / Eclipse and the like, at least; VS Code gets a lot of attention though). So if that's an expectation from a certain group of devs, I fear this might already be a generational difference that's impossible to overcome.

That being said, I agree -- existing teams can't be expected to just run away to the new thing, that much is true.


Depends where you look, Swift, Kotlin and TypeScript live from their tooling.


My favourite language -- Elixir -- is the same. I also quite admire Rust's and Golang's tooling.

CLI tooling is the best it ever was in history (from where I stand at least). But IDE tooling isn't a first class citizen for many languages these days, is what I was saying.


I beg to differ; I'd rather enjoy languages that pursue the dream from Xerox PARC than work like I did in high school during the late '80s.

Just for reference, my first UNIX was Xenix.

I don't see the point of people buying expensive computers just to use them like we did at the university lab with IBM X terminals.

The main difference was that instead of Slack we had xterms dedicated to run talk sessions.


> C++ has a standard that's too long and hard to understand, so let's use a language without any standard whatsoever instead!

Good lord, if that's the kind of human capital that's involved in making next-generation languages, then I have no doubt what will still be used to write serious software in 2080.

(Hint: not Rust.)


You'd be more helpful if you specified what else would you deem appropriate -- as opposed to resorting to snarky sarcasm that brings nothing interesting to the discussion except supposedly degrade people whose tech choices you disagree with.


The phrase 'undefined behavior' only makes sense in the context of an ISO standard.

For any particular combination of compiler, OS and computer architecture there are no 'undefined behaviors'; you only care about the concept if you want portability to a different compiler/OS/architecture.

So the idea that C++ sucks because of 'undefined behavior', so let's use some random language without an ISO standard or any portability guarantees at all is completely and utterly insane.


What are you comparing it to? I think C++ tooling is fairly decent. (By the way, what do you work on?)


I think java has very good tooling.

Java's main issue is the barrier to entry is lower because it has good tooling, which means really bad, inexperienced programmers can write bad code, and have more tooling to bail them out, allowing them to continue down that path until they have a huge ball of mud.

There ought to be some kind of phrase for this: "the tooling curse".

GDB isn't bad, IMHO, but the inability of IDEs to deterministically parse templates / index code without huge problems is problematic.


> the inability of IDEs to deterministically parse templates / index code without huge problems is problematic.

More like the inability of text editors to parse templates; most IDEs rely on a full compiler backend that can usually figure things out.


But it shouldn't require a full compilation.

The ultimate result, given the slow compile times of c++, is that you cannot know in your editor if your code is correct.


The IDEs I have used, or IDE-ified editors, run compilation ("indexing", or whatever) in the background to make this work.


Maybe you want to share which IDEs you've been using? It sounds like a wonder if there is one that can handle C++ in a way comparable to, say, IntelliJ for Java, or VSCode for TypeScript.

My experience is frankly more that the usual IDEs I know of can't even handle syntax coloring 100% correct when it comes to C++…


Xcode for well-structured (read: Xcode) projects. Sublime Text with clangd for random editing (not an IDE, but it gets good autocompletion and syntax checking from it).


This is problematic on large code bases. Huge mistakes were made on C++ design.


I completely forgot about this aspect of C++. I remember being surprised the first time I tried compiling a C++ library on my workstation and found I couldn't do so because my workstation ran out of memory. The library was Panda3D; I can't remember what kind of workstation I was running on at the time.

It turns out if you use enough layers of templates, the compiler has to carry around an awful lot of state to figure out what program it should actually output.

This is another downside to a 1400-page language specification; whether or not the user can hold most of it in their head, or use a reliable, safe subset to avoid sharp edges, the compiler and tools still have to be aware of all of the layers of complexity in both the code a developer is writing and, often, whatever tricks and quirks other developers chose to use in the libraries that code depends on.


How large are we talking about? I've been able to use these tools on codebases such as WebKit and LLVM…


LLVM has 2.5 million lines of code, but I think it also depends upon the number of templated methods etc, not just LOC.

I'm surprised you haven't seen this point made many times. It would be a full-time job on the internet asking people why they have code completion issues with C++ IDEs.


No, I know what you're talking about; this is an issue in general if you're not using tooling that's aware of the ins and outs of C++. The point is that the IDEs I have used essentially run the compiler on the file, so they can actually understand what the templates are doing and get through them.


Templates inherently cannot be understood. It's duck typing: 3 degrees of freedom with 3 unknowns. You need concepts, which adds 1 known to the equation (the final missing degree of freedom is the dreaded ctrl-click-into-an-interface-instead-of-the-concrete-class, which is hard to avoid). Tell me one IDE that will give any meaningful information about buzz in the example below:

    template<typename T>
    void foo(T bar) { bar.buzz(); }
Other than that, finding foo<T> is usually easy. I've had good experience with qtcreator or clangd; you just have to be very, very sure that your include paths are correct. qtcreator is good in that it uses CMakeLists as project files, so all this is done automatically.


Not in my experience. IntelliJ can't even figure out where a make_shared call goes.


Does IntelliJ actually support C++?


Probably talking about CLion, their C/C++ offering.


Yes, CLion, my mistake, sorry.


Hmm, that's a bit disappointing…might I suggest trying out a LibClang-based IDE, if CLion isn't using it already? It can do wonders even to "stupid" applications. For example, my Sublime Text (by itself, with only basic knowledge of C++ keywords and such) with clangd tells me that std::make_shared comes from this code in <memory>:

  template<class _Tp, class ..._Args>
  inline _LIBCPP_INLINE_VISIBILITY
  typename enable_if
  <
      !is_array<_Tp>::value,
      shared_ptr<_Tp>
  >::type
  make_shared(_Args&& ...__args)
  {
      return shared_ptr<_Tp>::make_shared(_VSTD::forward<_Args>(__args)...);
  }


Oh no, what I mean is that it can't help me find the constructor of the thing I'm creating. Resolving a make_shared call to the implementation of make_shared is completely useless; the only thing you ever want to do is find the constructor it's forwarding to.


Lots of languages suffer there -- eg there's no reason you _couldn't_ be a really excellent PHP developer who specializes in WordPress sites, but it's an uphill battle to prove you're a good dev because there so many bad ones in that space.

For that matter, most of the so-so dev resumes that come across my desk will list "HTML and CSS" as skills. Even among experienced devs, being legitimately good with HTML or CSS is extremely rare. I'd give my eye teeth to have a dev on staff who was an actual HTML expert (knows the w3c docs, can write clear, correct, semantic HTML using the right tags, organizes and expresses their intent, etc.), but since the barrier to entry is barely above basic literacy, finding one is tough.


I find modern c++ tooling to be great compared to even 10 years ago. I love the language. Between it and python I stay very busy with both.


Wouldn't something like Rust meet your performance concerns?


The sunk cost is huge. My main point was: if you don't have to do it in c++, for the love of god do it in something like java. Yes java is not heaven on earth, but still...


  Why don't you use something else?
Because most (all?) of those other languages fall into one of three traps: they also have a lot of undefined behavior, they define the behaviors in ways that don't reduce actual programming bugs (JavaScript!), or they have so rigorously defined the language with underlying assumptions (say, a really strong memory model, signaling NaN, overflow exceptions, etc.) that the performance is sub-optimal on any platform that doesn't exactly fit the expected machine description.

Really, C isn't hard if you burn your copy of K&R and stop trying to be so damn clever. It also turns out to be a lot more readable, if a bit more verbose, if one ignores a lot of its "features" and pretends it's Pascal, with a single statement, without side effects, per line. That includes never using pointers in any kind of arithmetic (or type casting), instead using them only as though they were C++ references (and a few other basic rules).
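
Something like this, roughly (a made-up illustration of that style, not anyone's real code):

  /* "Pretend it's Pascal": one simple statement per line, no clever
     expressions, and pointers used only like C++ references --
     no arithmetic, no casts. */
  typedef struct { double sum; long count; } stats_t;

  void stats_add(stats_t *s, double sample) {  /* s acts like a reference */
      s->sum = s->sum + sample;
      s->count = s->count + 1;
  }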

So, while I don't think Rust is a particularly good language, I'm also starting to think that everyone should be forced to use it early in their career so they are forced to consider object ownership and lifetime in a rigorous way. Then when they move to C/C++/etc. they won't be foot-gunning themselves at every turn.


Static analysis tools should be the standard part of any build process, but especially with C/C++.

I’ve mentioned this here before, but when I found a reasonable number of bugs in a critical library used throughout one of the FAANGs, written by their most senior and brilliant engineers, by running a fuzzer on it for a few hours, my opinion of those languages changed.

Integrating Valgrind and Clang’s checker, and running AFL for many CPU years was critical in unearthing some really funky crashes, hangs, overflows. Without those tools it’s not clear that those bugs would have been fixed. In fact, one of them affected me a few months prior and AFL helped uncover the codepath that was failing.


Every language has some expression weaknesses, and C/C++'s are such that the languages make static analysis tools mandatory for correct code (much like Python makes every-line-is-executed unit tests mandatory for correct code).


I don’t think that this bugginess is an inherent property of these languages, because there are other practices that could lead to reduction in total bug count and severity, apart from integrating additional tooling.

Out of curiosity, how many bugs you found by using these tools could have been avoided by using a “watertight” memory management system [0], with strong decoupling of pointer and object lifetimes?

[0] https://floooh.github.io/2018/06/17/handles-vs-pointers.html
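
For anyone who hasn't read the link, the core trick is an index plus a generation counter instead of a raw pointer. A minimal sketch of the idea in C (my own made-up names, not the post's code):

  #include <stdbool.h>
  #include <stdint.h>
  #include <string.h>

  /* A handle is an index plus a generation counter; raw pointers never escape. */
  typedef struct { uint32_t index; uint32_t gen; } handle_t;

  #define MAX_ITEMS 1024

  typedef struct { float x, y; } item_t;

  static item_t   items[MAX_ITEMS];
  static uint32_t gens[MAX_ITEMS];   /* bumped on every free */
  static bool     alive[MAX_ITEMS];

  handle_t item_alloc(void) {
      for (uint32_t i = 0; i < MAX_ITEMS; i++) {
          if (!alive[i]) {
              alive[i] = true;
              memset(&items[i], 0, sizeof items[i]);
              return (handle_t){ i, gens[i] };
          }
      }
      return (handle_t){ 0, UINT32_MAX };   /* "invalid" sentinel */
  }

  void item_free(handle_t h) {
      if (h.index < MAX_ITEMS && alive[h.index] && gens[h.index] == h.gen) {
          alive[h.index] = false;
          gens[h.index]++;                  /* stale handles now fail the check */
      }
  }

  /* Resolve on every access; a dangling handle yields NULL instead of garbage. */
  item_t *item_get(handle_t h) {
      if (h.index >= MAX_ITEMS || !alive[h.index] || gens[h.index] != h.gen)
          return NULL;
      return &items[h.index];
  }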


I'm curious who the target demographic is for this: people who think they need bare-metal performance in a language that gives them access to bare pointers, know that most humans aren't capable of following Uncle Ben's adage[1], and then voluntarily give it up while insisting they need it?

I think almost any JITed memory safe language will be faster than using handles for all object access. At least Java, .NET, JS, etc under the hood can avoid "double dispatch" of memory access. And you can use things like arenas to ensure same objects are allocated adjacently, etc.

[1] With great power comes great responsibility


The solution presented solves "memory corruption" from the point of not having undefined behavior, but it doesn't really protect against "I accidentally created an index that has nothing to do with memory I manage, but is still 'in bounds' to the code that handles lookup".


What is profoundly pissing me off is a bunch of hipsters on the internet who are using xyz language and are constantly trying to compare everything to C. We are faster, we are better, we are more portable, we have JIT, we have reflection... and so on.

Who cares what your tool for solving the problem is. Use Java, use JS, use Python, Go, Rust... whatever suits the task. Why do you want to compare yourself to C? Just stop it. No one with enough experience in coding will ever care what language you used to accomplish the task. Unless the task ends up poorly written, slow, buggy, CPU-consuming, full of security issues and so on. That can be accomplished by a half-baked (I hate the expression, but sorry) coding monkey in any language. Why do you even care about C? It doesn't matter; all that are complaining about it in whatever way possible or compare to it, will probably never be involved in a task where C excels, so why bother?

---

jeffdavis: the problem is that people would like to use the wrong tool for the job. And the fact that they can't frustrates them, so they blame everything else except the fact that they picked the wrong tool. You need experience to pick the right tool. A lot of experience, or rather being fluent in all the tools. Which makes it a problem if you don't dedicate a lot of time to this.

---

jeffdavis #2: > choosing the wrong tool really is a great way to understand your tools and the problem more deeply.

YES! Exactly! I made mistakes, a lot of them, like pragma-packing a structure and sending it to a system on the other side of the network with a different endianness. Overwriting EIP a number of times before I understood it. Or the best one, translating command.com to my language using edit :D Used the wrong calling convention. But my goal was always to understand assembler. To understand C. To understand C++. And I have written quite a lot of code in all three. Then I learnt any language that crossed my way and looked interesting (Rust is still on my list) and used each in at least one project, hobby or not. Now everyone learns one or two languages, without any background knowledge, and then starts to preach it, proving that a language written in X is faster than the language X is written in, and so on. Who cares! It makes me puke.


The "right tool for the job" is a myth (EDIT: exaggerating; clearly it applies in some cases). Languages don't work together very well, so you need huge, thick boundaries between them, such as serialization/deserialization. Data types don't match up, GCs require special data formats and don't line up with the GCs of other languages, other runtime weirdness, etc. At minimum, you need lots of copying/transformation of the data.

Have a gnarly problem in a Python program that Haskell would be perfect for? Too bad. You'll spend more time figuring out how to transform the data and get the Python code to call into Haskell than you'll save by using the right tool. And in the process, the overhead, complexity, and bugs introduced in this process turn the right tool into the wrong tool.

And that's the reason why C is so important. It actually does work with pretty much any language, because it has a defined ABI, simple data types, minimal runtime, can work on any data formats unmodified, no GC. Very few languages have this superpower.

C++ mostly has that superpower, but some features don't entirely work with unmodified structures (RTTI and virtual functions both require adding magical fields to the class structure). Rust seems to have this superpower (does dynamic dispatch a different way than C++, so can always work on unmodified structures).

The JVM is an interesting case and the ecosystem of languages built on it do work better together because they have the exact same runtime (GC, etc.).

To summarize: we'd all like to use the right tool for the job. But we can't, except for the languages that go out of their way to make this effective, and that list of languages is very short.


To be precise, C does not have a defined ABI (the spec goes out of its way to avoid this). Platforms (OS+Arch) define a C ABI for their own tools. Since this ABI is required to interop with the OS APIs, this will usually become the de-facto standard for the platform.

This might seem like nitpicking but it's an important distinction because you can't assume that C data structures can be passed between different platforms (e.g. over a network or even stored in a filesystem).


What is the right way to say that without sounding too pedantic every time? "Most platforms define a stable C ABI"?


The only platforms that define a stable C ABI are those whose OS is written in C and exposes a C API to userspace.

Symbian did not have a C ABI, for example, nor do mainframes; in ChromeOS or Android you also don't have a C ABI exposed to userspace, and in Windows a C ABI won't help you much if you need to speak COM or, now, UWP.


This really needs to be said more. For those of us not in web devel or microservices, the lower level of coupling between components really makes using multiple languages a pain. Plus, you have to learn all those languages -- sometimes I just want to actually get stuff done.

I consider it significant code smell when a project has multiple languages, particularly when each language seems tied to a particular developer.


A really thought-provoking argument. I suppose that when it comes to choosing "the right tool" the important thing is to optimize globally across the problems that your organization will be solving.

But if you want to be able to optimize both globally and locally, well that's just an engineering problem!

Two examples spring to mind:

1. Lisp. Lisp's metaprogramming facilities allow you to build the language into the specialized tools that you need, while still having the common runtime and base language.

2. Unix. There are hundreds (possibly even thousands) of little programs running on my Linux machine, written in probably dozens of languages. For the most part, I have no idea (nor do I care) what language they are written in. That's because the Unix model puts a strong emphasis on the protocols for communicating between programs.


Those programs communicate by serialization and deserialization, usually in bespoke, poorly documented data formats.

Unix doesn't put a strong emphasis on protocols. It just says "everything is text, except when it's not". It's not very helpful.


Modern unix gives you mostly low-level mechanisms, not inter-application protocols. It is better to leave it up to applications/administrators to figure out the best protocol (policy) for their use case, to make the best use of those mechanisms.

Those mechanisms aren't textual, they are byte-sequence based.

It turns out that many text-based protocols are more popular than binary/object alternatives. The penalty for textual redundancy is negligible/acceptable for many of them. Where it isn't, Unix allows people to use existing mechanisms, or create new ones, to implement specialized protocols (RPC, protobufs, ...).


> Unix doesn't put a strong emphasis on protocols. It just says "everything is text, except when it's not". It's not very helpful.

OK, I probably misspoke by claiming it put an emphasis in protocols, but I think it's fair to say that Unix does emphasize component integration by sharing data instead of trying to integrate through a common runtime.

I do think that having everything be (mostly) text is helpful though. It's true that the data formats are often poorly specified, but that is compensated for by the fact text is a format that has a very rich set of tooling. text editors, regular expressions, parser-generators, etc. all make it possible to capture, analyze, and manipulate the text data exchanged.

Perhaps a better example of protocol-based integration would be the internet. It has its flaws as well, but it has also enabled collective engineering projects on a vast scale.


You got it, yes. And unfortunately, very few software engineers understand that.

The reason C is so successful has a name: ABI.

C (and C++ to some extent) can be used to produce a library that can be used from any higher-level language in existence. And this is strictly required in the heterogeneous world we live in.

If you have to write a system component with a long lifetime, implementing a protocol / database / format / any low-level stuff: you do not have a choice, safety or not. It is going to be C or C++, because they are the only thing reusable in most other environments.

Python has pybind11, Java has JNI, Lua operates beautifully with C/C++, Node is itself C++, Go has cgo, Ruby is written in C; any proper programming language can interface with C or C++.

They are the only languages allowing that, and that is why they are still alive, very much alive, even if they are unsafe.

As long as new language authors are more obsessed with supporting fancy new features than with providing a well-defined (C-compatible) ABI, the situation will not change.

Rust might be the new comer in this area that has its chance. But for the time being, it is still too young: you have to map every Rust API to a C one manually to make it exportable.


Lisp is an interesting point. I guess the argument is that it's the right tool for every job?

Unix is a reasonable example of a polyglot environment, but at a high cost. Lots of serialization/deserialization through text. That has a high cost in terms of bugs, complexity, lines of code, and inefficiency.


> Lisp is an interesting point. I guess the argument is that it's the right tool for every job?

Yup, that's more or less the argument, though the argument need not only apply to lisp.

> Unix is a reasonable example of a polyglot environment, but at a high cost.

I'd rather say that Unix is an example of how to effectively drive down the cost of a polyglot environment to a point where it is outweighed by the benefits.

Now, of course the cost is still there, but I would argue that beyond a certain scale you cannot optimize globally on a single runtime (perhaps not even with something like lisp) and so the cost of global consistency is outweighed by cost of being unable to optimize locally.

To put it more succinctly, you could not build a system as complex as Unix in a single language[1]

[1] Although Alan Kay's work at VPRI suggests you can, if you choose a sufficiently expressive base language. But even that may have its limits.


> Rust seems to have this superpower (does dynamic dispatch a different way than C++, so can always work on unmodified structures).

This isn't really true.


Can you explain?


Rust has both static (based on monomorphization, similar to a template instantiation in C++) and dynamic (based on vtables, similar to inheritance in C++) dispatch: https://doc.rust-lang.org/1.8.0/book/trait-objects.html. Both Rust and C++ are moving in the direction of encouraging static dispatch.


It may be similar, but it is implemented very differently, in exactly the "so can always work on unmodified structures" sense.

Trait objects are a double pointer: one to the vtable, and one to the data. C++'s dynamic dispatch uses a single pointer to a structure that has the vtable and the data. The memory layouts are very different.


"The memory layouts are very different."

Clarification: the memory layouts between C++ dynamic dispatch and rust dynamic dispatch are very different.

Rust imposes no requirements on the structure layout, whereas C++ adds some magic data. The magic data means it's awkward to make C++ do dynamic dispatch on a C struct pointer; but rust can do dynamic dispatch on a C struct pointer with no problem.
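
To make the layout difference concrete, here's roughly what trait-object-style dispatch looks like if you hand-roll the "fat pointer" in C over an unmodified struct (a sketch of the idea, not what either compiler literally emits):

  #include <stdio.h>

  /* A plain C struct, no hidden vtable pointer inside it. */
  typedef struct { double w, h; } rect_t;

  /* The "vtable" lives outside the object... */
  typedef struct { double (*area)(const void *self); } shape_vtable_t;

  static double rect_area(const void *self) {
      const rect_t *r = self;
      return r->w * r->h;
  }
  static const shape_vtable_t rect_vtable = { rect_area };

  /* ...and a trait-object-style reference is a (data, vtable) pair. */
  typedef struct { const void *data; const shape_vtable_t *vt; } shape_ref_t;

  static double area(shape_ref_t s) { return s.vt->area(s.data); }

  int main(void) {
      rect_t r = { 3.0, 4.0 };                 /* unmodified C struct */
      shape_ref_t s = { &r, &rect_vtable };    /* fat pointer built on the side */
      printf("%f\n", area(s));                 /* dynamic dispatch: prints 12.0 */
      return 0;
  }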


> Use Java, use JS, use Python, Go, Rust... whatever suits the task.

A primary factor that determines whether a particular language is suitable for a task is its runtime performance.

> constantly trying to compare everything to C.

C has little runtime overhead and incredibly mature compilers, so it is an excellent target to use when measuring another language's runtime performance. "Close to C" is simply a synonym for "the language adds little runtime overhead".


> all that are complaining about it in whatever way possible or compare to it, will probably never be involved in a task where C excels, so why bother?

I don't think that's a fair assumption. You may be underestimating the number of developers who have a need for a systems programming language.


Sure, I do. I am just hooking linux kernel calls, I will never take JS for this. Pure C. Valgrind? I have a header with #define MALLOC/FREE/REALLOC/... which returns a larger buffer with guards at the beginning and the end, and after 25 years of C I hardly do anything more problematic than this. But seriously, would you have a JS developer write a kernel module? Ruby? PHP? No. So I won't compare with them, and I would expect them to stop comparing to C. It is not the same task they are made for. I am sick of the constant moaning about who is faster and who is better and who is cuter, ...

System language? Make it. I don't care; if it works better than C for the task, I will take it. If it is just new-age nonsense wasting resources for the sake of deallocating and taking care of buffer boundaries, as if it were so hard to take care of and track your memory, I will skip it. I don't care if I take a hammer or a brick to drive a nail; I will use what works best, and my requirement is that I don't need 20 people to hammer in one nail for the sake of no one getting hurt by putting their finger between hammer and nail. Someone would rather have a safety inspection. Also a fine decision as far as I care. Not my choice, but I couldn't care less.

---

Anyway, kids, downvoting me to censor what you can't handle: I am over 40 years old and have been developing my whole life. I have seen hypes, new "revolutionary" ideas, repacking of old technologies as new, stealing of ideas (I see this lately 24/7), at least 20 new "revolutionary" languages, CPUs, mainframes, clouds, every possible way of another human trying to rip you off (google, fb,...), lies, deceit, preaching, evangelism,... Do you really think I will care about it? It is just the sorry truth that you will have the fun of enjoying kernels written in JS in a browser, on a Linux kernel that no one develops any more. Have you seen the movie Idiocracy? You really should. I can't wait for it to happen. And then I will ask you if you are sorry that 20 years back you didn't decide to learn instead of copy/pasting and spitting over everything you are unable to handle. (Your parents should have told you that, but they didn't; I am sorry.)

----

saagarjha: sure, ignore the part about censoring, that is not part of the debate. About everything else: you get used to it in the same manner as using floats in JS. But this is my experience. It is completely OK to have a different one. Some people memorize decks of cards; I consider that something difficult. They probably don't. They have learned to do it. And I am sorry if I don't feel like I need to prove it to you. Anyway, thank you for the cycript.org link; the only problem is that I can't use a userspace solution, but it might come in handy some day.


I'm just going to respond to the top part of your post, as it's at least somewhat relevant rather than being a disconnected rant about the kids censoring you.

> I am just hooking linux kernel calls, I will never take JS for this. Pure C.

Why not? Hooking native code in other languages is already fairly common: http://www.cycript.org

> Valgrind? I have a header with #define MALLOC/FREE/REALLOC/... which returns a larger buffer with guards at the beginning and the end, and after 25 years of C I hardly do anything more problematic than this.

Valgrind does significantly more checks than your solution.

> System language? Make it.

"Systems language" is a very ill-defined term, often used to gatekeep people.

> If it is just new-age nonsense wasting resources for the sake of deallocating and taking care of buffer boundaries, as if it were so hard to take care of and track your memory, I will skip it.

Keeping track of your memory is hard. If you say "no, it's easy for me", I'd like to see a non-trivial sample of your code that doesn't have memory safety issues.


It is easy for engineers, because engineers are trained to make things correctly even when that is inconvenient.

Engineers make bridges that don't collapse (unless not maintained for decades) and planes that don't fall from the sky (unless overruled by management). By the evidence, it is not easy for people who just can't be bothered to take the time to make anything correctly.

So, there are languages for engineers to use to make things where it matters if they are right, and languages for everybody else, where apparently it doesn't. Now, all we need is for engineers not to need to call into libraries not written by engineers.


> it is not easy for people who just can't be bothered to take the time to make anything correctly

Laziness may make the problem more likely, but even a disciplined team working on security-critical software can make mistakes[0].

[0] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-0777


"people would like to use wrong tool for the job"

That's actually a great way to learn. I do that intentionally sometimes to force myself to understand a problem better. Sometimes it even works[1]!

I'm half serious here. Clearly when something is important and needs to be timely, choosing the right tool is important. But my comment is not sarcastic, either: choosing the wrong tool really is a great way to understand your tools and the problem more deeply.

[1] https://github.com/jeff-davis/postgres-extension.rs


C is the lingua franca of the programming world. Of course people are going to compare xyz language to C.


One of the blog posts I keep meaning to write is in the vein of "Why C is not portable assembly." Ironically, most of my points about C failures aren't related to UB at all, but rather the fact that C's internal model doesn't comport well to important details about machines:

* There's no distinction between registers and memory in C. A function parameter that is "register volatile _Atomic int" is completely legal and also makes absolutely no sense if you want C to be a "thin" abstraction over assembly.

* There's no "bitcast" operator that changes the type of a value without affecting the value. The most common workaround involves going through memory and praying the compiler will optimize that away (while violating strict aliasing semantics to boot).

* No support for multiple return values (in registers). There's structs, but that means returning via memory because, well, see point 1.

* Missing machine state in the form of vector support (which is where the lack of the bitcast becomes particularly annoying). The floating-point environment is also commonly poorly supported.

* Missing operations such as popcount, cttz, simultaneous div/rem, or rotates.

* Traps are UB instead of having more predictable semantics.

* No support for various kinds of coroutines. setjmp/longjmp is the limit. You can't write zero-cost exception handling in C code, for example. Even computed-goto (a source-level equivalent of jump to a location held in a register) has no C representation, though it is present in some compiler extensions.
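
For the computed-goto part, the compiler-extension version looks roughly like this (GCC/Clang "labels as values", not ISO C; a toy dispatch loop):

  #include <stdio.h>

  /* Computed goto: jump to an address held in a table, via the &&label extension. */
  int run(const int *ops) {
      static void *dispatch[] = { &&op_inc, &&op_dbl, &&op_halt };
      int acc = 0, pc = 0;

      goto *dispatch[ops[pc]];
  op_inc:  acc += 1; goto *dispatch[ops[++pc]];
  op_dbl:  acc *= 2; goto *dispatch[ops[++pc]];
  op_halt: return acc;
  }

  int main(void) {
      int prog[] = { 0, 0, 1, 2 };   /* inc, inc, double, halt */
      printf("%d\n", run(prog));     /* prints 4 */
      return 0;
  }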


> No support for multiple return values (in registers). There's structs, but that means returning via memory because, well, see point 1.

Return values in registers are a machine level optimization. Whether something is in a "register" is up to the CPU today on most CPUs. If you set a value in the last few instruction cycles, and are now using it, it probably hasn't even made it to the L1 cache yet. Even if it was pushed on the stack. That optimization made sense on SPARC, with all those registers. Maybe. Returning a small struct is fine.

There's an argument for multiple return values being unpacked into multiple variables, like Go and Python, but that has little to do with how it works at the machine level.

> Missing operations such as popcount, cttz, simultaneous div/rem, or rotates.

Now that most CPUs have hardware for those functions, perhaps more of those functions should be visible at the language level. Here are all the things current x86 descendants can do.[1] Sometimes the compiler can notice that you need both quotient and remainder, or sine and cosine, but some of those are complicated enough that it's not going to recognize them.

> Traps are UB instead of having more predictable semantics.

That's very CPU-dependent. X86 and successors have precise exceptions, but most other architectures do not. It complicates the CPU considerably and can slow it down, because you can't commit anything to memory until you're sure there's no exception which could require backing out a completed store.

[1] https://software.intel.com/sites/landingpage/IntrinsicsGuide...


"There's no "bitcast" operator that changes the type of a value without affecting the value."

Technically, you are supposed to use unions for that, though it's a pain. The PostgreSQL codebase is compiled with -fno-strict-aliasing so that a simple cast will get the job done, but obviously that's technically out of spec.


Actually, the guideline I've heard is that memcpy should be used, since memcpy has the magic property of copying the bytes without affecting the destination type. Unions are only legal in C99 and newer (although C99 erroneously includes this in its list of undefined behavior--this was fixed in C11); C89 and any version of C++ don't permit this behavior.
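
The idiom, for reference (a sketch; it assumes float and uint32_t have the same size):

  #include <stdint.h>
  #include <string.h>

  /* "Bit cast" a float to its raw bits via memcpy: no strict-aliasing
     violation, and mainstream compilers optimize the copy down to a
     plain register move. */
  static inline uint32_t float_bits(float f) {
      uint32_t u;
      memcpy(&u, &f, sizeof u);
      return u;
  }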


I keep hearing this rumor, but C99 did not erroneously include unions-as-type-punning in a list of undefined behavior. The normative language is unclear, but clarifications were added [1][2].

1: http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_257.htm

2: http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm


The parent comment seems to be saying that C99 does allow type punning with a union, with an explicit clarification as you say. In C89 it was ambiguous and in C++ it is explicitly disallowed (you are supposed to use memcpy).


Sorry, I misread the parent comment. Now it's too late to delete mine, please ignore.


Or better, memcpy.


Which in this case may or may not copy memory. You're only able to express changing the meaning of some specific bytes by copying them, which is the contrary of what you meant. Sure, you're not actually copying anything because the compiler will (and does) normally optimize it away, but you must say that you need the bytes to be copied.


> * Missing operations such as popcount, cttz, simultaneous div/rem, or rotates.

gcc/clang offer builtins for most of these. And if you give the compiler the right -march parameter, it'll quite likely turn your hand-rolled code for these operations into the x64 instruction you're targeting. Try it on Compiler Explorer.
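
For example (GCC/Clang builtins plus the "recognized" rotate pattern; compiler-specific, so treat it as a sketch):

  #include <stdint.h>

  /* Builtins map straight to popcnt/tzcnt with the right -march. */
  unsigned popcount32(uint32_t x) { return __builtin_popcount(x); }
  unsigned cttz32(uint32_t x)     { return x ? __builtin_ctz(x) : 32; }

  /* Hand-rolled rotate in the form compilers pattern-match into a
     single rotate instruction on x86-64. */
  uint32_t rotl32(uint32_t x, unsigned r) {
      r &= 31;
      return (x << r) | (x >> ((32 - r) & 31));
  }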


There's no "bitcast" operator that changes the type of a value without affecting the value. The most common workaround involves going through memory and praying the compiler will optimize that away (while violating strict aliasing semantics to boot).

Access through a union allows the same region of memory to be interpreted as different types.

> No support for multiple return values (in registers). There's structs, but that means returning via memory because, well, see point 1.

No mainstream compiler will ever not optimise that.


> Access through a union allows the same region of memory to be interpreted as different types.

…only in C.

> No mainstream compiler will ever not optimise that.

Actually, it's mandated by the System-V ABI for structures of an appropriate size.


Which is not supported in every OS out there.


Right, but System-V is fairly mainstream, which is why I brought it up.


Using a union still has the going-through-memory problem, though. I spent a day or two trying to coax clang into doing the efficient thing with a union a couple years back when I could have gotten the job done in 30 minutes with raw assembly (alas...)

RE: Return values, you'd be surprised. You can't assume you'll get properly optimized code for something like that in WebAssembly for example, despite the fact that you're using an industry-standard compiler (clang) and runtime (v8 or spidermonkey).


> Using a union still has the going-through-memory problem, though.

Not with modern compilers: https://godbolt.org/z/e6sRqh


That's a compile-time constant, effectively. What matters is real code.



> * Missing operations such as … simultaneous div/rem

Doesn't the div function do this?
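
(Meaning the one from <stdlib.h>:)

  #include <stdio.h>
  #include <stdlib.h>

  int main(void) {
      div_t d = div(7, 3);               /* quotient and remainder in one call */
      printf("%d %d\n", d.quot, d.rem);  /* prints "2 1" */
      return 0;
  }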

> * There's no "bitcast" operator that changes the type of a value without affecting the value. The most common workaround involves going through memory and praying the compiler will optimize that away (while violating strict aliasing semantics to boot).

> * Missing operations such as popcount, cttz, … or rotates.

Thankfully, these are coming in C++20.


Don't forget arithmetic operations that can detect various corner cases.

Writing something like:

  sum = x + y;
  if(sum < x || sum < y) {
      ...
  }
And then hoping the compiler will optimize your if statement to a single overflow CF check is a bit silly.


That would not prevent the overflow from actually happening, which would technically be UB if the sum is a signed type. It is possible to catch overflow at the last instant like this (just before executing the addition instruction), but the test is more involved. It's a pity that there are no standardized tools (AFAIK), but it's not possible to write a substantial piece of code in this way anyway.

What you would generally do is assume that x and y can't be overflow. I.e. you need to have a rough idea what quantities you will process. And put a few checks and assertions in strategic locations.
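
(There are at least compiler extensions for this, even if they're not standard C: GCC and Clang's checked-arithmetic builtins report overflow without ever evaluating the UB-invoking signed addition. A sketch:)

  #include <limits.h>
  #include <stdio.h>

  int main(void) {
      int sum;
      /* GCC/Clang builtin: stores x + y in sum and returns whether the
         mathematical result overflowed, with no UB on the way. */
      if (__builtin_add_overflow(INT_MAX, 1, &sum))
          puts("overflow");
      else
          printf("%d\n", sum);
      return 0;
  }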


Right, I was assuming x and y were unsigned types at least as large as an integer. Tests for other types are different.


If you just use `sum < x` or `sum < y` rather than both, C compilers do reliably perform that optimization. (You don't need both; `sum < x` and `sum < y` are always both false or both true.)


Is there any low level language, which is actually close to modern x86_64?


Probably the closest you'll find today is LLVM IR (or maybe there's a slightly closer IR in some other compiler). In terms of languages you'd actually like to program yourself, Rust and C++ are slightly closer, but that's mostly a factor of trying to incorporate more hardware features rather than modelling hardware better.


Does anyone code directly in LLVM IR? I've had the thought of toying with it as a better portable assembly with a crazy optimizer behind it...but since the LLVM folk want it to be an implementation detail they warn against it due to how volatile it is across versions.

I'd be really interested to hear experiences of people who have done it though, even as a toy.


I've coded directly in LLVM IR a few times (mostly for testing). It's not pleasant, primarily because you have to manage the SSA construction yourself, although the CFG can also be annoying.

There is a small, very restrictive class of LLVM IR that is going to be effectively stable, and even portable. Stripping debug information goes a long way to making your IR readable by newer versions of LLVM, and staying away from C ABI compatibility (especially structs) can make your code somewhat portable.


LLVM IR isn't exactly portable, so that doesn't really work.


The bitcode variant used by Apple for their OSes tends to be more clean in that regard.


The only language close to modern x86_64 is x86_64 itself. Well, no, not really, because assembly instructions get dispatched as micro-ops…


To be fair, even assembly language isn't how the computer works (gets translated into micro code). Not to mention other integral components like the GPU that are coded in an entirely different model.

I think what's fair to say is: the "abstract C machine" tends to have a minimal amount of concepts on top of the machines instruction set, compared to other languages, and generally provides the least friction if you need to poke a memory address or layout a structure in a very specific way. It's just easier to say it's closer to the metal because that's a mouthful.


Not really, assembly language is extremely close to microcode, much much closer than it is to C. C has a whole bunch of baggage like pointer provenance, inability to read uninitialized memory, doesn't understand cache aliasing, etc that don't fit hardware well.


No, it really isn't on anything modern, like ARM SoCs or x86 CPUs. The number of times I've hit my head to wall trying to understand what the heck the CPU is doing for a given sequence of instructions, like why the number of cycles or memory bandwidth required is way different from my expectation.

A modern CPU willy nilly reorders instructions, stores, and creates new "virtual" registers out of thin air to dissolve stall inducing dependency chains. It can also split instructions, fuse them together and even entirely remove some instructions -- as long as it doesn't affect the end result.

Generally (within limits of CPU memory model) the only thing you're guaranteed is that eventually the result is what you'd expect from sequential execution. That of course applies on the core you're running on, otherwise you naturally need to synchronize with other cores.

Luckily there are tools to find out, which work to some extent, IF you can run it on the same CPU model (and really, the whole system, incl. memory subsystem!) as what you're interested in.


Can you give an example? The most complicated assembly->microcode examples I can think of are floating-point instructions like FSIN that run CORDIC loops under the hood, and those are still pretty easy to reason about.


Pseudo-code:

  a = 10;
  b = 20;
  c = b + 30;
  bar = a + b;

  x = 1;
  y = 2;
  z = y + 3;
  foo = x + y;
... in order to remove dependency stalls might be executed as:

  b = 20;
  y = 2;
  a = 10;
  x = 1;
  c = b + 30;
  z = y + 3;
  bar = a + b;
  foo = x + y;
This would still be true even if the 'x' and 'a', 'y' and 'b', etc. variables (well, registers) had the same name. In that case the CPU would just make up new "variables" as required and do the same transformation!


Okay, you're talking about OoO execution here, and you're right that this can be hard to reason about (in terms of stalls/latencies), however that's orthogonal to microcode translation.


OoO, and the fact that CPUs really are free to do whatever transformations they deem fit, as long as the end result (within the limits of the memory model) is the same.


The CPU must conform to the strictures of the ISA, which is a much more stringent specification than the abstract machines of most language specifications. In particular, the state of registers and flags at any given time need to be preserved, even in the presence of external interrupts.

Note that several decades ago, there were processors that weren't capable of actually keeping the state correct after a processor exception, so if you got a division-by-0 error, your program counter had advanced by 30 or so. (This is, I believe, part of the reason why traps are undefined behavior in C: if your processor can't guarantee any state after a machine trap, it's impossible to implement a programming language that has even the loosest guarantees).


> The CPU must conform to the strictures of the ISA, which is a much more stringent specification than the abstract machines of most language specifications. In particular, the state of registers and flags at any given time need to be preserved, even in the presence of external interrupts.

https://en.wikipedia.org/wiki/Tomasulo_algorithm

So no, the consistent serialized ISA-conforming state might be reconstructed by replaying after the exception (interrupt) arrives.

Alternatively, interrupt might be served only after execution reaches the next checkpoint.


I don't understand how you can acknowledge the cpu is literally rewriting your code on the fly by changing the order of operations and making things massively parallel and rewriting things in a proprietary language, while holding the belief that "microcode is really not that different from assembly". At least you can inspect what a C compiler emits to assembly.


This has nothing to do with microcode. It doesn't even have anything to do with micro-ops. Reordering instructions (likely register renaming or load-store forwarding here, though the pseudo-code doesn't let me tell which) is just part of how the processor retires instructions.


The example I gave, yes. No assembler code can have anything to do with micro-ops; they're implementation specific.

On x86, a lot of instructions are close matches, yet some are removed entirely (say "xor eax, eax") or fused into one micro-op (like "cmp eax, 123 / je <address>").

Future CPUs might even do some data flow analysis to optimize code even further.

Say, speculative "constant" folding based on a runtime profile to remove chunks of code from hot inner loops.

Or replacing longer instruction patterns with a HW-optimized implementation, if that's what it takes to get some extra performance for next year's model.

https://en.wikichip.org/wiki/macro-operation_fusion


Registers are allocated (and renamed) after instruction decode (i.e. what creates micro-ops), in a separate unit. Micro-ops themselves do not have renamed registers.

See the diagram here https://software.intel.com/sites/default/files/managed/9e/bc...


True. I simplified implementation details for clarity.


You are mixing up how branch prediction works with how the processing unit works.

One of those is only worth the time to learn if you're a hardware engineer, but understanding both of them will give you an introduction to the Spectre vulnerability.


No, branch prediction is not involved with what I was talking about.


> pointer provenance, inability to read uninitialized memory

what do you mean by these two? not sure about the former. as for the latter, of course you can read uninitialized memory, unless you mean something else?


For the first, https://blog.regehr.org/archives/1621

For the second, https://www.ralfj.de/blog/2019/07/14/uninit.html

(the second one is a bit more Rust focused but the core idea is the same)


No, reading uninitialized memory in C is UB.


As far as I know (and I could be wrong), reading uninitialized memory in C is not UB, but gives an indeterminate value, which may be either an unspecified value or a trap representation. If it is a trap representation, accessing it is indeed UB, but if all possible values of your memory are valid, e.g. if you have an unpadded integer where every bit representation means something valid, then it is not UB, though it is still an unspecified value which does not even have to be consistent: you could get different values when reading the same uninitialized location twice.

Edit: Some excerpts from the C standard:

6.7.8 Initialization

10 If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.

3.17.2 indeterminate value

either an unspecified value or a trap representation

3.17.3 unspecified value

valid value of the relevant type where this International Standard imposes no requirements on which value is chosen in any instance

6.2.6 Representation of types

6.2.6.1 General

5 Certain object representations need not represent a value of the object type. … Such a representation is called a trap representation.


Not a C expert, just reading what I can find online:

This page says value is indeterminate, which is either unspecified or a trap, as you say: https://wiki.sei.cmu.edu/confluence/display/c/EXP33-C.+Do+no...

But.

This part says that reading an indeterminate value is, in fact, undefined behavior (line 11 in the table): https://wiki.sei.cmu.edu/confluence/display/c/CC.+Undefined+...


That says used, not read. And indeterminate, not unspecified. My point was on reading a value, not on acting on the value. I know it looks like nit-picking, and you should just not read uninitialized memory, but I don't think it's consistent with the standard to say that reading uninitialized memory is always UB.


What kind of read would not constitute "using" a value? And per your previous post an object that was not initialised is indeterminate, not just unspecified. So yes, reading uninitialized memory is always UB.


A read into a character type.

From 6.2.6.1¶5:

>Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. ... Such a representation is called a trap representation.

A read from uninitialized memory is not always UB.


What you quoted doesn't actually say anything about what happens when a trap representation is read into a character type. Such a read is still "using" the value at least in the everyday sense of the word, so in the absence of something explicitly to the contrary, as far as I can see the part of the standard that states that using an uninitialised value is UB still applies.

Per http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_451.htm , the current standard is unclear in some respects, but the latest committee view is that that under the current standard any library function (including memcpy) may exhibit undefined behaviour when called with uninitialized memory, even when the uninitialized memory is of character type or is propagated through values of character type.


Thanks, it seems you are right. Sorry that you got downvoted whereas my wrong statement got upvoted; such is the nature of HN.


That's the way I think it _should_ work, but sadly does not. For instance, see https://godbolt.org/z/ent-xp


Your code is

    int x;
    if(x == 0) foo();
    if(x != 0) foo();
That is reading the uninitialized value twice. Since it is unspecified, it does not have to be consistent, so you could get behavior as if it were non-zero for the first read, and zero for the second read. Changing your code to:

    int x;
    if(x == 0) foo();
    else foo();
will give different output (same if you use !=).


Unspecified values are constant (since they are values), just unspecified. This behaviour is only permitted because reading uninitialised memory is UB; you won't see the same thing if x is an unspecified value.


A good optimizing compiler will just elide the whole code block.


Only if there was UB, and the point is that there probably isn't. (I'm not really knowledgeable on the C standard, so I might have misinterpreted something.)



Making an "if" statement depend on the value of the uninitialized memory is an example of what is meant by using the value, and thus UB.

If the compiler discovers UB, the Standard places no requirements of any kind on the program or compiler. It is free to launch missiles, or (more likely) assume this code cannot be reached, and omit it from the program, along with any code that reaches it unconditionally, and any check that would send control that way. Such elision is the basis for many important optimizations.

Implementations are free to define things left undefined by the Standard. For example, "#include <unistd.h>" is UB by the ISO Standard, but defined by POSIX, which implementations also adhere to.
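To illustrate the kind of elision mentioned above (a sketch; whether a particular compiler actually does this depends on version and flags): dereferencing a null pointer is UB, so after the dereference the compiler may assume p is non-null and treat a later check as dead code.

    int deref_then_check(int *p) {
        int v = *p;        /* UB if p == NULL, so the compiler may assume p != NULL from here on */
        if (p == NULL)     /* ...which lets it drop this branch entirely */
            return -1;
        return v;
    }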


No. It is not "undefined behavior". The "behavior" here is assignment (reading memory). It always will behave precisely and consistently: it _will_ assign the "indeterminate" value of the uninitialized memory. Using such an "indeterminate value" in other operations (e.g. comparison, per the link Steve B. posted above: https://www.ralfj.de/blog/2019/07/14/uninit.html) is the UB bit.


> [reading memory] always will behave precisely and consistently

What is the point of not making it UB? You can't, AFAICS, rely in any useful way on it behaving 'precisely and consistently', so why not just make it UB anyway?


I think this lets you copy it around, e.g. when you have a struct that you have partially initialized.
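i.e. something like this (a sketch; per the DR 451 discussion above, whether even this is strictly guaranteed is itself debated):

    struct point { int x, y; };

    struct point make_point(void) {
        struct point src;
        src.x = 1;               /* src.y is deliberately left uninitialized */
        struct point dst = src;  /* the copy just carries the indeterminate member along */
        return dst;
    }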


That makes sense, finally. Thanks.


Can you write a well-defined C program that reads uninitialised memory?


Reads? I think so; I believe the bad part is when you try to use it.


No, it is undefined behaviour. Check the spec.


I couldn't find the verbiage for that. The standard seems to have this under undefined behavior:

> The value of an object with automatic storage duration is used while it is indeterminate.

But reading the value without using it seems fine?


The 'object' in this context is the storage location, not the value you read from it. You're 'using' the 'object' when you read from it.


Oh, not that kind of read. I meant "read" in the context of "without actually using its value in a way that affects your program execution", not (what I believe the C++ term is) an ODR-use. As in here: https://news.ycombinator.com/item?id=22740582


Not portably, you could of course use inline assembly in practice, but the normal way of doing it is UB.


Wouldn't this do it?

    int a; printf("%d", (&a + sizeof(int)));


...but that isn't a well-defined program, is it?


Another major way in which assembly language is closer to the hardware than C is the presence of explicit SIMD instructions.
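For instance, plain C has no direct, portable way to say "add these four floats in one instruction", whereas with intrinsics (or assembly) you write it explicitly. A minimal sketch using SSE intrinsics (the function name is made up):

    #include <xmmintrin.h>

    /* Adds four float pairs; compilers typically emit a single addps for the add. */
    void add4(const float *a, const float *b, float *out) {
        __m128 va = _mm_loadu_ps(a);   /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b);
        _mm_storeu_ps(out, _mm_add_ps(va, vb));
    }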


I'm intrigued that you make that claim when, as far as I know, most microcode is proprietary and not necessarily documented. Also, that claim is highly dependent on which CPU and manufacturer you're talking about, whether the CPU needed to be updated, etc.


The micro-architectures are reasonably well documented (e.g. we know what each of the execution ports does, and that register renaming is a thing), and if you look at e.g. Agner Fog's instruction tables [0] that map macro-instructions onto latencies and port counts, that gives a pretty reasonable picture of what's going on.

[0] https://www.agner.org/optimize/instruction_tables.pdf


Let me up the ante a bit. You also have to consider out-of-order execution, and instructions being executed in parallel, and speculative execution.

Let me quote wikipedia here:

> The Pentium 4 can have 126 micro-operations in flight at the same time. Micro-operations are decoded and stored in an Execution Trace Cache with 12,000 entries, to avoid repeated decoding of the same x86 instructions. Groups of six micro-operations are packed into a trace line. Complex instructions, such as exception handling, result in jumping to the microcode ROM. During development of the Pentium 4, microcode accounted for 14% of processor bugs versus 30% of processor bugs during development of the Pentium Pro.

Like, that's one paragraph; there are others I can cherry-pick to show just how complicated the execution of microcode is. Unless you're an actual engineer on processor development and have insider info, I find it highly suspect that you could look at a decently sized chunk of assembly code and really know what happens with the microcode.

Sure, you can have a table that says it takes ballpark about this long. OK, fine, but that's not the same thing as assembly mapping cleanly onto what the processor is actually doing.


In other words, the assembly code itself isn't really a good picture of what is happening.


Microcode is not used for the majority of operations. Are you confusing it with micro-ops?


Maybe you have a different definition in mind than I do? I'm using microcode to mean sequences of micro-ops. Wikipedia [0] seems to agree, "the microcode is a layer of hardware-level instructions that implement higher-level machine code instructions or internal state machine sequencing in many digital processing elements". My understanding is that with a couple exceptions (IIRC mov and zeroing xor are treated specially as part of register renaming) all assembly instructions get translated into microcode (i.e. a sequence of micro-ops).

[0] https://en.wikipedia.org/wiki/Microcode


I don't have the patience to go fix Wikipedia, but microcode is a patching system (it's what "processor microcode updates" means). Most of the time, that's adjusting chicken bits and other flags. Instructions can be implemented in microcode, but they are really, really slow so it's typically done for security reasons or to emulate some new features that don't require fast performance.

Micro-ops are part of the micro-architecture of the processor, and are in hardware. They are not patchable and are not software.


An example of instructions being implemented in microcode is AMD's implementation of PDEP and PEXT on the Zen and Zen2 chips, leading to shockingly bad performance of 289 cycles vs 1 on Intel:

https://twitter.com/uops_info/status/1202950247900684290 https://github.com/llvm-project/llvm/blob/master/lib/Target/...


Microcode on x86? Don't forget to also greet our friends such as RDTSCP, CPUID, RDMSR, POPCNT (on some models) etc. Also remember to check ENTER, BOUND, etc. out in the museum vitrine.

But yeah, microcoded instructions are relatively rarely executed.


I can't find any evidence that RDTSCP is microcoded. That would defeat the whole purpose of a high-performance counter. Any source?


Agner Fog's instruction tables list it as issuing ~23 fused uops (a bit more or fewer depending on generation) and a throughput of 1 per ~32 cycles. That seems like it could be microcoded.


I can't find any primary source, but I'm pretty sure about it.


I don't think wikipedia is broken, so plz don't fix. The definition of microcode matches my understanding. I've never heard it used as a patching system per se, ever.


What do you think those "processor microcode updates" are, then? They don't have anything to do with micro-ops, or really have any influence over the core micro-architecture. It would be way too slow to make that programmable.

People have this common misconception that the programmable micro-code is what your CPU is actually executing, and x86 somehow translates into instructions for it, and this was really just because of a conflation of the terms "micro-code" and "micro-op".

Admittedly, Intel isn't the best at this term either. They have several places in the Architecture Manuals where they refer to the "micro-code synthesizer" when they mean "micro-op synthesizer"; this really has nothing to do with the micro-code ROM.


They're for updating the microcode. That is tangential to their use AIUI, but useful.

Also AIUI the microcode controls the issuing of the micro-ops.

> It would be way too slow to make that programmable

Then what is the "processor microcode updates" updating? I think this may just be a terminology mixup.

Dunno if this helps, FYI from https://stackoverflow.com/questions/17395557/observing-stale...

...and I can't copy/paste it. In the above link, look for 'embarrassing' by Krazy Glew.


Back in the days of VAX practically everything was done via microcode instructions, perhaps that's what they're thinking of?


>inability to read uninitialized memory

    printf("%x", *(int*)0x12345678);


This is not a well-defined C program though.


At some point, the question of "well-defined C program" is philosophical. Let's get down to brass tacks: If someone sat down and wrote this, would it happily compile? Would it compile under some compilers and configurations but not others? Would it throw warnings instead of errors under some configurations? Which of these configurations are the default configurations? Which non-default configurations are considered best practices? How complex are the best practices, and how widespread is knowledge about them?

In short, how much does the concept of a "well-defined C program" differ from the concept of a "C program" as implemented in practice? Can we say that most C programs, or even a medium-sized percentage of C programs, are well-defined?


Most C programs are indeed not well-defined. Probably most of them.


Even casting 0x12345678 to int* is undefined behavior


It's implementation-defined, not undefined (see [0]). That means the behavior is well-defined by your implementation rather than by the C standard, so the code may work on one implementation but not on others.

[0]: https://stackoverflow.com/q/2397984/3476191


What makes you sure of this? I'm reasonably familiar with modern C, but I don't feel confident of the answer here. A search of Stack Overflow doesn't bring up anything that seems authoritative for C. The most relevant quotation I can find is in the Rationale for C99, where Section 6.3.2.3 has:

Implicit in the Standard is the notion of invalid pointers. In discussing pointers, the Standard typically refers to “a pointer to an object” or “a pointer to a function” or “a null pointer.” A special case in address arithmetic allows for a pointer to just past the end of an array. Any other pointer is invalid.

An invalid pointer might be created in several ways. An arbitrary value can be assigned (via a cast) to a pointer variable. (This could even create a valid pointer, depending on the value.) A pointer to an object becomes invalid if the memory containing the object is deallocated or moved by realloc. Pointer arithmetic can produce pointers outside the range of an array.

Regardless how an invalid pointer is created, any use of it yields undefined behavior. Even assignment, comparison with a null pointer constant, or comparison with itself, might on some systems result in an exception.

I'm not a language lawyer, but I suspect this means that even initialization to the wrong literal might well be undefined behavior. What makes you confident that it's not, and instead is merely implementation defined?


I found this: https://stackoverflow.com/questions/51083356/does-the-c-stan...

What if it's not an "invalid pointer", but a pointer to a memory-mapped IO address, ROM, etc? I grew up learning C on 16-bit machines in the early 90's. Hard coded pointer values were very, very common.


Good find! I don't think there is anything "authoritative" there, but the discussion seems high quality. My take is that a lot of smart people disagree on which parts of that example are implementation-defined, implementation-undefined(!), or undefined-behavior. Most (but not all) think that the initial assignment is implementation defined, but 'davislor' suggests in his answer that "the line void * ptr = (char * )0x01; is already potentially undefined behavior, on an implementation where (char* )0x01 or (void* )(char* )0x01 is a trap representation".

> What if it's not an "invalid pointer", but a pointer to a memory-mapped IO address, ROM, etc?

Yes, this is central to the question. And how is the compiler to know? Is it safe to presume that the compiler can't know, and thus can't presume undefined behavior? I think the answer is in the comments you linked where 'supercat' replies to 'Peter Cordes':

The Standard makes no attempt to mandate that all implementations be suitable for low-level systems programming, nor does it in any way imply that it's possible to have a quality implementation that is suitable for low-level or systems programming without it supporting behaviors beyond those mandated by the Standard (and which might not be processed predictably by implementations that aren't suitable for systems programming).

Which is to say, yes, for a compiler implementation to actually be useful for low-level programming, it must behave in a predictable manner when given literal addresses. Unfortunately, it may be possible for a C compiler to be "standards conforming" without actually being useful for this purpose. One can only hope that at least some compilers will continue to "do the right thing" despite that lack of explicit requirements.


> Unfortunately, it may be possible for a C compiler to be "standards conforming" without actually being useful for this purpose.

It is possible for a C implementation to be conforming without supporting low-level programming. Sometimes it even makes sense, like if you're running C code on GraalVM's LLVM bitcode interpreter. But there is no trend of "standard" C implementations following this route. While modern C compilers like to aggressively exploit undefined behavior, they generally make reasonable decisions for implementation-defined behavior. In this case, all major compilers will compile

    *(int*)0x12345678
to the obvious assembly, and what happens then depends on what your memory map looks like.

(Caveat: the compiler will still perform normal optimizations like removing unused loads or redundant stores. If the address actually points to memory-mapped I/O, you need `volatile` to prevent that.)
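The usual embedded idiom looks roughly like this (the address is the one from the example above; the register name is made up):

    #include <stdint.h>

    /* Hypothetical memory-mapped status register. */
    #define STATUS_REG (*(volatile uint32_t *)0x12345678u)

    uint32_t read_status(void) {
        return STATUS_REG;   /* volatile keeps the compiler from caching or dropping the load */
    }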


This happens all the time on embedded systems, when you're writing a device driver, etc. Practically speaking, it's well defined what will happen.


Except that those concepts are also available in Modula-2, Ada, Object Pascal, PL/I, and Basic, among other possibilities.


Wait until they learn that machine code is not exactly how the computer works either. The hardware is doing things to your code you might not expect.


Agreed. The example the author uses in his previous post on the topic talks about cache-unaware code, but it's perfectly possible to write cache-unaware code in machine language as well.

I'd say "C is not how the computer works ... but it's much, much closer than nearly every other language."


I feel your statement is actually much closer to the danger. C is pretty close to the hardware indeed and that misleads people into thinking that C is actually exactly how the hardware works. It's a very easy trap to fall into and I've seen many colleagues do just that.


M............................................C

M..............................................Java

Look. C is much closer to how the machine (M) works.


Java’s GC and lack of pointer arithmetic should push it a little further to the right, no?


I don't see why it would. Given what modern CPUs do to make memory fast and safe, pointer arithmetic is a ridiculous simplification that has no bearing in reality.


Given that Java is written in C++, which is in turn an extended version of C, I'm not sure that chart is entirely to scale.


That is no argument in this discussion, IMHO - what does it change? The discussion is about the language (and, in the case of Java, the runtime) and its machine abstraction, which is nearly the same for Java and C, even if the JVM were written in Lisp.


> Java is written in C++.

I think you're talking about the JVM. Anyway, some processors have native support:

https://en.wikipedia.org/wiki/Jazelle


Jazelle is very dead, FWIW.


So no disrespect to your colleagues, but they work in C and don't understand the relationships between C --> ASM --> Machine Code --> Hardware?

That's... disturbing. Hope they're not building medical devices.


My point was that many people who just now come into C fall into the trap of thinking that it [almost] exactly mimics hardware. Which is an easy mistake to make if you are new to the thing because it's taught as if it's a panacea and sadly many people believe those university courses.

As for former colleagues, oh well, we all learn and grow. I chose to bail out because something as non-deterministic as C code's stability when cross-compiled for anything more than two systems turned me off. Different strokes for different folks.


Right. But machine code is the interface that the hardware provides to the rest of the world, meaning you can ignore that the hardware is doing something different. And in fact, you don't really have access to what the machine is doing. Or maybe you can get at all those virtual renamed registers? I can't, I can just access the architectural registers.

Now if you care about performance, you might not want to ignore the real machine, but it really is the hardware's job to work like the machine code tells it to. And when it doesn't, that's considered a bug^H^H^H exploit.


It's turtles all the way down.

We tend to think of everything as an ideal model in a vacuum, forgetting that in real life the implementation details are often gory, with caches and microcode and fault tolerance and all sorts of hidden details put in there to make things easier to work with or faster.



Hence the massively upvoted SO thread ;p : https://stackoverflow.com/questions/11227809/why-is-processi...


It’s not how the hardware works.

It’s how the hardware is modeled.

Hardware works as physics allows, at energies we can’t comprehend.

A computer is designed to a spec we can (sort of) comprehend. Spectre and such being evidence we don’t fully comprehend it.

It’s a recursive back and forth of this is a structure, this is the favored algorithm for that structure.

The hardware is the structure. Physics provides the algorithms for operating on that structure.

Thinking like one can see inside the machine always struck me as absolutely ridiculous.

It exists in the world defined by physics. Crawling into our imagination, linking abstract pictures from textures together arbitrarily, doesn’t mean we’ve discovered something new.

We know all kinds of stuff can be modeled on a computer because reality already gave us the math model.

We're just brute-forcing implementations, looking for simple models for things we already have.


Even binary code isn't "how the computer works" these days.


Never has been. Deep down, 1's and 0's are still analog voltages, subject to capacitance and inductance, on a tiny wire between transistors. Without proper timing you end up reading the voltage halfway between its ascent from 0 to 1, and who knows what value you end up with.


If you violate the timing for a flip-flop circuit, you can put it into a metastable state where the "digital" output varies between 0 and 1 for a very long time.


I think the biggest point where C is more "how the computer works" vs. a lot of languages is when you write an allocator.

I've noticed in, say, an interview context, that people without a lot of C experience have a lot of trouble reasoning about how an allocator might work, or where memory comes from, or even that some constructs in their preferred language result in memory allocation. With C you can't hide these details.

Yes there are abstractions beneath you, both in software (your OS has set up the MMU) and hardware (cache, out of order execution) and also in the C standard itself (assumptions about pointer types, the precise meaning of the stack). This doesn't take away from the point that reasoning about allocations is more of a thing.
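To make that concrete, here is a deliberately tiny bump-allocator sketch (fixed static arena, no free, crude alignment); it's meant to show where the memory comes from, not to be a real malloc:

    #include <stddef.h>

    static unsigned char arena[1 << 16];   /* the "memory" this allocator hands out */
    static size_t next_free = 0;           /* offset of the first unused byte */

    void *bump_alloc(size_t size) {
        size_t rounded = (size + 15) & ~(size_t)15;   /* keep 16-byte alignment */
        if (rounded < size || rounded > sizeof(arena) - next_free)
            return NULL;                   /* request too large or arena exhausted */
        void *p = &arena[next_free];
        next_free += rounded;
        return p;
    }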


"Real hardware" is also an abstract machine. Until you get to the level of each individual transistor, it's hardware abstractions all the way down. Caches, reorder buffers, load and store buffers, instruction decoding are all abstract concepts sitting at a much higher level than the individual transistor.

Those abstractions are there mostly for the purpose of speeding up code execution, but also for enabling hardware designers to think about the machine in terms of real-world concepts and not just as an incomprehensible mind-blowing collection of a few billion transistors.


And I bet you physicists argue that transistors are just abstractions of semiconductors


Which is, in turn, merely an abstraction over the actual quantum physics that governs what's really happening on the silicon. And quantum physics might itself one day be found to be a higher level abstraction of what's really really going on, as happened with Newtonian physics before it.

It's abstractions all the way down. I don't know why they're so maligned by programmers, they're the only way any work gets done.


heh and it's even worse than that, because near the quantum level, we're mainly finding reflections of our own psychology, psyche and societal norms.

This is why saying something is "not how the machine works" is a fallacy of black and white thinking and not useful, unless one is ready to be very specific about how it resembles the machine and how it doesn't.

wrt C, the smartest thing to do is use it wisely and avoid dragons, the same way we e.g. use chemistry to make useful products without performing operations in unstable environments. And surround it with an appropriate layer of cultural expectations - keeping humans in the loop. You don't just do what the computer tells you, you use it to help you make judgments, but ultimately take responsibility yourself.

One can see this in Rust's efforts to not only develop a language, but a conscious culture around it. This will be successful until, like a 51% attack, it is outmaneuvered.

Anyway...


You don't need to go to physics for that; a datasheet for any real transistor will tell you that.


Actually, I look forward to enough momentum being created in the community to allow new C standards to change the way undefined behaviors are conceived, and to make them as specified as possible; and even where they cannot be specified in a reasonably unique way, to provide a set of reference behaviors to choose from, so as to have the least unexpected outcome for the programmer. Especially now that low-level programming is being abstracted away, not least because of Rust itself, we need a C that is more C than what it became in recent years. After all, we don't really need to change the language, just the new specifications and the compilers' implementations of UB.


It would also be nice if Rust had a specification. Without a spec it's impossible to build a correct alternative Rust implementation.


We're still working on it. In the meantime, there is at least one alternative implementation. It doesn't implement the borrow checker, but can successfully bootstrap a bit-identical rustc. https://github.com/thepowersgang/mrustc


That's not true, as mrustc proves. The C and C++ specs are incredibly ambiguous and frankly I don't know if they add that much value over good documentation.


When it comes to high-level stuff especially in C++, like the exact rules for template deduction, I think the specs are reasonably clear and do add value. But for the low-level memory model and undefined behavior, the specs are indeed extremely ambiguous. In fact, they're often outright 'wrong', in the sense that they have too little undefined behavior to support common compiler optimizations (e.g. [1] [2])... while making other things undefined for no benefit [3].

[1] https://fzn.fr/readings/c11comp.pdf

[2] https://www.cs.utah.edu/~regehr/oopsla18.pdf

[3] https://www.imperialviolet.org/2016/06/26/nonnull.html


Just pains me that the Rust team is so oblivious to the fact that they need a spec if they ever hope to have a chance to replace C.


We are not oblivious. First of all, we are not really trying "to replace C." Second, a tremendous amount of work has been put into specifying Rust. It's just also a tremendous amount of work. Like, "the EU has given grants of millions of Euro towards working on it but we're still not done" amounts of effort.

C did not have a standard for 18 years, it was first created in 1972, and ANSI C didn't appear until the spring of 1990. I'm not suggesting that it took them 18 years to write a spec, it's that it wasn't really needed for a long time. Expecting Rust to have one after five years is aggressive. Some languages have done this! I think we'll end up somewhere between what C did and what C# did, in this regard.


Go has a spec, which has been used to make gccgo. I'm not very familiar with other specs: is it fair to say that the Go spec is less detailed and so was able to be completed before 1.0? Is it "not a real spec" compared to ANSI C, ECMA, etc.?


I have not read the Go spec, and so I can't really comment.

However, you do also bring up something that's a good point when it comes to this discussion: a lot of people treat "a spec" as a binary thing: a language has it, or does not have it. But specs are written by humans, and have their own mistakes, bugs, errata, etc.

One of the reasons that we haven't wanted to declare "this is the spec for Rust" is that we want to have a good, possibly even formally verifiable spec. The reference is basically at a fairly reasonable informal spec level.


And yet C, which had no spec at the time, beat out Algol, which had one.


You had to pay for Algol compilers, while C came for free with the UNIX tapes that landed in university labs out of Bell Labs.


You are never going to get any guaranteed sensible behaviour for a double free or a use after free [1]. In comparison to those giant issues, all other instances of UB are minor.

[1] well, you can today by swapping malloc for a GC-enabled implementation, but the fact is that almost nobody does.


Surely signed integer overflow is also a big problem. And reads from uninitialized memory.


I don't think integer overflow itself is a big deal. Unbounded array access, especially local array access, is though, and I forgot about it.


> well, you can today by swapping malloc for a GC-enabled implementation

You don't need GC for this (at least, in the sense of "we'll call free for you"), you just need to verify that things passed to free were handed out via malloc.
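A crude sketch of that idea (all names are made up; a real version would want a faster lookup and thread safety):

    #include <stdlib.h>

    #define MAX_LIVE 1024
    static void *live[MAX_LIVE];            /* pointers we have handed out and not yet freed */

    void *checked_malloc(size_t n) {
        void *p = malloc(n);
        if (p == NULL) return NULL;
        for (int i = 0; i < MAX_LIVE; i++) {
            if (live[i] == NULL) { live[i] = p; return p; }
        }
        free(p);                            /* tracking table full; refuse the allocation */
        return NULL;
    }

    void checked_free(void *p) {
        if (p == NULL) return;
        for (int i = 0; i < MAX_LIVE; i++) {
            if (live[i] == p) { live[i] = NULL; free(p); return; }
        }
        abort();                            /* double free, or a pointer we never handed out */
    }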


> allow new C standards to change the way undefined behaviors are conceived,

Of all the potential issues that might be attributed to C, undefined behavior (UB) is the one that creates few to no problems at all.

To me, complaining about UB is like complaining about the quality of a highway once they intentionally break through the guard rails and start racing through the middle of the woods.

The reason why some behavior is intentionally left undefined is to enable implementations to do whatever they find best in a given platform. This is not a problem, it's a solution to whole classes of problems.

https://en.wikipedia.org/wiki/Undefined_behavior


"like complaining about the quality of a highway once they intentionally break through the guard rails"

Your metaphor brings to mind an image of someone who can see the guard rails, and then deliberately chooses to expend quite non-trivial effort to break them. This is not an accurate metaphor.

As I'm not in the metaphor fix-up business, because I believe they tend to mess us up in precisely this sort of way, let me simply directly point out this gets it backwards. The default is that you will write a program shot through with undefined behavior, and you must make quite substantial, intellectually-challenging efforts above and beyond the intellectual challenge of writing something in C at all that at least sometimes works, to write your program in a way that doesn't have UB in it. It has no resemblance to driving down a road in a normal fashion and then bashing guard rails down.


This is like saying there's nothing dangerous about guns, because it takes a human to make them dangerous.

C is not unsafe if you only write safe code. But humans are fallible, and "just don't write unsafe code" is not a solution.


> This is like saying there's nothing dangerous about guns, because it takes a human to make them dangerous.

If you want to go with that analogy, UB is akin to you intentionally pointing your gun at your foot, taking your time to aim it precisely at your foot, and in spite of all the possible warnings and error messages that are shouted at you... You still decide that yes, what you want to do is to shoot yourself in the foot, because that is exactly what you want to achieve.

And then complain about the consequences.


> If you want to go with that analogy, UB is akin to you intentionally pointing your gun at your foot, taking your time to aim it precisely at your foot, and in spite of all the possible warnings and error messages that are shouted at you…

Yeaaaah not really:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static char a[128];
    static char b[128];

    int main(int argc, char *argv[]) {
      (void)argc;
      strcpy(a + atoi(argv[1]), "owned.");
      printf("%s\n", b);
      return 0;
    }
compiles with no warning using clang (8), even with -Weverything

And that’s a super easy case; it’s not a use-after-free where a dynamic callback somewhere freed a pointer it did not own, or a function 5 levels deep (or worse, in a DLL) that didn’t check its input pointer, which only became occasionally null years later.


And many (myself included) think that the compiler should not warn about this code, even if it can potentially exhibit undefined behavior, because it would make it impossible to write code at all without warnings. (A good static analyzer or code reviewer should call it out, though.)


That's completely fair. I'm not criticising the compiler not complaining about this code, I'm providing a counter-point to rumanator's assertion that:

> UB is akin to you intentionally pointing your gun at your foot, taking your time to aim it precisely at your foot, and in spite of all the possible warnings and error messages that are shouted at you…

Because they could hardly be more wrong: the average UB is completely invisible to static analysers, because it's a dynamic condition which the type system is not smart enough to turn into a static one. That's why it's UB and not, say, a compile error.


OpenBSD has almost a singular focus on security, and even they have vulnerabilities. If they can't write safe C 100% of the time, what hope does anyone else have?


So... Is there an OS with zero vulnerabilities written in Rust/Go/Ada/anything that I should know about? If nobody's written safe Rust/Go/Ada/whatever 100% of the time, what hope do I have?

Now, I think that particular argument is poor. The more defensible version would be to look at the proportion of errors that occurred in C that would not occur in your language of choice, versus the number of errors that occur in this other language that would be unlikely to occur in C. And for completeness it would be good to do some sort of evaluation as to whether there were other gotchas, like performance of the resulting code and programmer productivity. I expect that in the grand scheme of things, C is not the best option, but 100% perfection is an absurd goal post and makes it a lot easier for people to dismiss your argument.


Yes, Unisys ClearPath MCP, written in NEWP, a secure systems language 10 years older than C.

It is still in use, because when governments want safety above anything else, they buy such OSes, not UNIX clones.

https://www.unisys.com/offerings/clearpath-forward/clearpath...


Redox[0] actively wants to hear your vulnerability reports.

[0]: https://www.redox-os.org/


What hardware does it run on?

I looked through the documentation and cannot find what hardware it runs on and how to set it up. The Getting Started for real hardware returns a 404 page: https://doc.redox-os.org/book/real_hardware.html


All the other low-level stuff in C can be handled by a professional programmer in different ways: using tools like Valgrind, with careful coding, but especially with higher-level libraries, so you don't have to write much low-level code. Any language is dangerous and bug-prone if the standard library has no dynamic strings and basic data structures. If you have the right tools, C becomes a lot more high level, a lot more safe, and it has a long history of solving very important problems reliably and in a portable way.


Yeah I agree.

But even more than that, I think a lot of people would like to overlook that programming is fundamentally unsafe. Even if you're working in a safe language, that language's runtime is almost certainly in an unsafe language, making syscalls through a library written in an unsafe language, interacting with an OS written in an unsafe language, interacting with hardware running on firmware written in an unsafe language.

Sure there should be a boundary somewhere. But whether that boundary should be a type system or valgrind/cppcheck/etc. differs case-by-case.


While this is the ideal theory I think we've all seen how it simply doesn't pan out in practice.

The numerous and regular terrifying security bugs that are being found in ancient tools (some 30+ years old) are a living testament that the philosophy you quoted just doesn't translate that well into the reality of humans programming the current breed of computers.

Hence the existence of languages like Rust and the recent general strive towards more correctness and more compile-time catching of potential problems (especially in light of GCC 10's new `-fanalyzer` flag).


Your wording only works for "implementation defined", not "undefined". Both exist in the standard, and the former is quite sane to use. E.g., on any normal architecture, specifying signed integer overflow to be implementation defined would have been great, though on e.g. the JS-style machine that abuses IEEE floats in limited range as integers, addition gets weird once the point floats to the left of where you'd normally write it by hand (in non-scientific notation).

  y = x;
  x += 1;
  x -= 1;
  true == (x != y);
But I don't know of any actual integer machine that would want integer overflow to actually be UB.


Implementation defined behaviour is not the same as undefined behavior.


And unspecified behaviour is (slightly) different again.


> To me, complaining about UB is like complaining about the quality of a highway once they intentionally break through the guard rails and start racing through the middle of the woods.

Sure, if those guardrails are 1mm tall and so barely visible, then that's a valid analogy to C. I hope you can see why this is not an ergonomic design that will lead to safe programs (or safe highways).


What happens when you reach outside of an array or access memory after it was freed falls under UB and those are the most serious problems about the C family of languages.


WG 14 doesn't seem that interested in improving C's security story.

Right now I see more efforts coming out of Microsoft and static analysis tool vendors than out of any WG 14 mailing.


Last I checked UB was undefined by the standard, but compilers still need to define their behavior on each architecture.

Dictating that all architectures implement UB the same way would have significant overhead, since it would force programs that don’t rely on UB to run in emulation on all but one architecture (at best).


Undefined behavior does not need to be consistent, either across different programs, or within the same program. It does not need to be consistent within different times that the program is run, and it does not need to be documented anywhere. It does not need to be sensible, does not need to be predictable, and does not need to give any diagnostics to warn the programmer. Undefined behavior means that you have passed outside of the realm governed by the standard, and that HERE BE DRAGONS.

Implementation-defined behavior works as you say, where each compiler can define how it responds to certain circumstances.
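A concrete contrast, as a minimal sketch:

    /* Implementation-defined: each implementation documents what
       right-shifting a negative value yields. */
    int id_example(void) { return -1 >> 1; }

    /* Undefined if x == INT_MAX: the standard imposes no requirements
       at all on what happens then. */
    int ub_example(int x) { return x + 1; }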


that's implementation defined


I've been coding since C was a new language, and surfed multiple computer revolutions in all and sundry phases, and for personal interests have recently returned to the 80's era for a broader attack at the 8-bit platforms of the time.

This has required a fair bit of assembly language for old machines with a much more minimal profit motive - in fact, non-existent - just so that I can keep my machines alive - which is the reason I'm digging into things as a bit of a challenge to myself.

There is so much great code to read, when you can disassemble. And, even when you can't.

I have a veritable fleet of old systems, many of which are still getting new titles released at astonishing levels of proficiency, yearly now .. CPC6128, ZX (Next!), C64, Atmos, BBC, etc., all have new tools being written for them, including C compilers.

One thing I note is that the tools are everything, because all tools get 'degraded' by new hardware releases. Hardware isn't driven by the compiler designer - the compiler designer is nothing until the hardware guy prints something.

Returning to older computers has meant understanding their architecture, and surprisingly enough after 30 years of mainstream programming, going back to this world is rewarding. There is a lot of joy in writing in C for the 6502, or even Z80, in real mode.

But, always disassemble.

And, in each case, its quite easy to disassemble - get an understanding of registers and memory usage and code inefficiencies, and so on.


Is the world ready for lower-level languages? e.g. cache-aware. (assembly isn't)

In a way, GPU shader languages "fit" parallel GPU architecture (though not cache-aware).

Maybe... limited loop code length to fit in cache; no pointer chasing (though you can work around anything in a TM).

Some Java subsets for very limited hardware might be instances.


GPU languages have some cache aware constraints, especially if you're working with the graphics pipeline vs compute where you have more constrained inputs and outputs. Shared memory is also often used as a user managed cache.

Cell SPUs had explicit user managed cache and host DMA, and Intel ISPC and Unity Burst compiler/ECS (variants of C and C#) are good examples of programming environments that try to more explicitly encourage parallelism and/or optimal cache utilization.


> Cell SPUs had explicit user managed cache and host DMA

just a nitpick, but explicitly user-managed caches aren't really caches. They are better known as scratchpads. A cache is supposed to be mostly transparent except for the performance implications.


You may want to read about Java Card: https://en.wikipedia.org/wiki/Java_Card


> Assembly isn’t

What you mean is that the popular Instruction Set Architectures (x86, ARM) don’t expose these pragmatics. (Though the microcode architectures of the processors executing these ISAs probably do.)

There are other ISAs (e.g. Itanium, most Very Long Instruction Word ISAs) that do expose these pragmatics, constraining what instructions can appear when in an instruction stream. Which in turn means that assembly code for these ISAs needs to be written with awareness of these pragmatics.


People simply have to come to terms that modern computers are nothing like their beloved PDP-11.


There was an ACM article with a similar sentiment recently: "C Is Not a Low-level Language - Your computer is not a fast PDP-11."

https://queue.acm.org/detail.cfm?id=3212479

HN discussion:

https://news.ycombinator.com/item?id=16967675


I don't think anyone suggests using C as an alternative for microarchitecture classes. C still remains one of the best ways to access hardware relative to other software languages. No language is "how computers work" unless you're writing Verilog, but C is the closest to the metal relative to other software paradigms while still being more convenient than assembly.


A good way to think of it is that C is how computers worked 40 years ago.

A lot has changed since then, but computers put up a lot of effort to pretend that they still work similarly, and most other programming languages have put up a lot of effort to figure out how to make things work for a "C computer".


40 years ago was the home computer revolution. A lot of computers did not work like that. They had things like segmentation, zero pages, workspace register files, shadow registers, BCD arithmetic, and software interrupts. A few years later they had things like heterogeneous multiple processors, PAD, and trap/interrupt gates.

The big iron of the time (and a few years later) had some things that did not work like C, too, such as single-level storage addressing, tagged architectures, and capability-based addressing.

Thinking that the C language is a representative model for computer history, or indeed for computer architecture, is a bad way of thinking of this. During the 1980s and 1990s there was a lot of shoehorning all of that architectural stuff into C compilers, with extra keywords and the like. (e.g. __interrupt, _Far, _Huge, __based, _Decimal, _Packed, digitsof, precisionof, and so forth)


Well, the original C was pretty much exactly how a PDP-11 worked. And the PDP-11 was a source of design inspiration for the original Intel x86 and Motorola 68000 chips.

So there were computer chips sold in the 1970s that worked a lot like C thinks. You are right that there were also ones that didn't.

Going forward from close to 1980 you added more and more features to all architectures that diverge.


Do C programmers really think this? I hear it a lot, but only from non-C programmers who hold C in awe. I think they may have heard it from a different cohort of C programmers, though.


Some time ago, I saw an article that described a very crazy hypothetical architecture that still fits the C standard and uses the undefined behaviors in various insane ways. It was named like "Computer platform from hell" or so. Unfortunately, I was not able to find it again. Does anyone know about something like that?

I think that the hegemony of C for low-level programming is harming us significantly because it pushes the other models into obscurity. Try to write in C for something like the F21 chip or the J-Machine...


The Death Station 9000, designed by the Top Men from highly rated government contractors, built to meet and exceed all mandated specification documents.

Undefined behavior or use of anything not strictly included in POSIX.1c may result in the launch of nuclear weapons.


A test for this is to code some relatively simple (but not too simple) program in both C and Rust using idiomatic code (don't knowingly pick some corner case for either language), compile with -O0 -g0, and be able to _understand_ the resulting output with regard to the program written. I'm not championing the use of either Rust or C, and in my opinion languages should be for people, not for computers.


Interesting this is less of blog post and more of an announcement that a blog post will not be written.


Yes, I did not expect this to get posted here, for this reason. The first post got a ton of attention, the second one not so much.

The self-quote for the thesis for this article comes from a conversation with tptacek in the HN thread from the first post.


I consider myself a Steve Klabnik fan and even I was surprised to see this at the top of HN haha. Just goes to show you can never really predict how these things will go.


Hehe :)

I mean, I guess you could argue that it's produced an interesting discussion. There's actually a pretty interesting question here: is the purpose of a site like this to distribute interesting news, or foster interesting discussion? I personally feel like it's more the former, but maybe the latter matters too.


[flagged]


Steve Klabnik, of Rust fame, has a good history of writing insightful and interesting articles.

If you feel he isn't delivering value then feel free to output your own content instead of contributing to drive down the signal/noise ratio of HN's comments section with inane comments on other people's tastes.


[flagged]


Don't try to weasel your way out. Your comment was just to whine about how others enjoyed something you didn't. Don't be that guy.


Is someone thinking so? Really? THIS is how the computer works: https://software.intel.com/en-us/articles/intel-sdm

I remember we started our university course on low-level programming by writing in machine code. No joke: we took the binary representation of assembly instructions from that manual and wrote them in a hex editor. Then we disassembled simple C applications, understanding how our high-level instructions unfold into x86 assembly. Passing parameters through registers, through the stack, the two models of stack restoration (yeah, cdecl and stdcall, that's the moment I realized what they mean), and other stuff you won't ever be bothered to think about when writing in C/C++. And only after that did we start writing in actual assembly. THIS is knowledge of how the computer works. If you know only C, you know nothing.


"C is JQuery for the computer."

-- Micha Niskin


Which is why God(bolt) gave us Compiler Explorer.


I know this article has been up a few times already on HN. What I think Steve is missing is that all modern computer hardware is designed around how C works! 40 years ago you might have had hardware in which there was a real mismatch between how it worked and C worked.

Not so much these days. How C works and how x86-64 assembly works is very similar. Of course, that is not the whole story and you could argue that how x86-64 assembly works is not how the "computer" works because it is a high-level abstraction of the CPU... But then what does it mean to talk about how the computer works?


Which language, other than assembly, most accurately represents how modern computers work?

What high-level language could I use that gives me the most performance because it accurately matches the underlying platform?


C is a high level language.

C is close to how a computer works.

Assembly is the next step down, and is not nearly as hard as the internet would lead you to believe.

Step down from that; read a book like "Computer Architecture: A Quantitative Approach", and learn about CPUs and memory (also Agner Fog's site, official AMD and Intel docs, many others).

Down from that is electronics (logic gates, 8bit adder, etc., down to transistors and capacitors).

And beyond is physics.

Everything else is either computing theory or buzzwordy bullshit.

To repeat: C is a high level language, no matter what anyone says.


Assembly is how the computer works. Wait no - Binary is how the computer works. Wait no, NAND gates and ALUs are how the computer works. Every level of abstraction is just that. A layer of abstraction. You can pick any layer, then go down and say "learn that".


While the presence of UB is annoying but necessary, the real issue is that compiler writers use it as a means of performing optimisations instead of issuing warnings. So you end up with a really efficient, but broken, program.
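The classic illustration (a sketch; whether a given compiler actually does this depends on version and flags): because signed overflow is UB, the optimizer is allowed to assume it never happens, so it may fold the whole comparison away instead of warning you.

    int always_true(int x) {
        return x + 1 > x;   /* may be optimized to "return 1;", since overflow "cannot happen" */
    }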


Reposting this classic, "C Is Not a Low Level Language"

https://queue.acm.org/detail.cfm?id=3212479


C has never been "how the computer works"; its abstract model is way too… well, abstract.

But there is an interesting catch to it: because C is so important, and programmers at large know so little "real C", vendors do their damndest to make it work for them.

A well-known microcontroller forum in Germany is constantly pushing that you need to make your code safe in the case of concurrent access to a variable by a main loop and an interrupt. By making it volatile.

Strictly speaking that's plain wrong. You need some kind of mutex.

Practically speaking, no embedded C compiler vendor would dare seize that "optimizing opportunity" and surprise their customers there.


"Concurrent access" isn't actually an accurate description of what's going on. Microcontrollers are generally single-core, single-threaded devices. The most common pipeline goes

main thread -> interrupt processing -> resume main thread.

If you're modifying a variable in the interrupt, you should declare it volatile so the compiler doesn't optimize away accesses to it and introduce unexpected behavior. Adding a mutex would just be worthless overhead.
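
Roughly the pattern being described, as a hedged sketch (the names are made up, and a plain function stands in for the vendor-specific ISR declaration):

    #include <stdint.h>

    /* Shared between the main loop and the interrupt handler.
       volatile forces the compiler to re-read it on every access
       instead of caching it in a register. */
    static volatile uint8_t data_ready = 0;

    /* Stand-in for an ISR; real code would use the vendor's
       interrupt attribute or pragma to hook it to a vector. */
    void uart_rx_isr(void) {
        data_ready = 1;
    }

    void main_loop(void) {
        for (;;) {
            if (data_ready) {   /* without volatile, this read could be hoisted out of the loop */
                data_ready = 0;
                /* handle the received byte ... */
            }
        }
    }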


The C language until very recently didn't know about threads or other kinds of concurrency. So the compiler was technically free to assume that the interrupt is never called. It's nowhere in the call tree that originates at main().

Of course, it didn't do that. Because vendors imparted their compilers with additional knowledge about common programming patterns and uses that aren't standard C.


So there are vendors out there that alter the behavior of volatile to act like a mutex in multi-threaded environments but don't tell anyone? Ugh.

Well, TI doesn't do that thankfully. At least not for the MSP430.


For a while Microsoft gave volatile acquire/release semantics, which implied a barrier on non-x86 CPUs. They recognized it was a mistake and backtracked on it around the time they started their effort to run Windows on ARM.


How did you draw that conclusion? People have been warning about volatile for years


Yes, except it's "less abstract" than any other language in mainstream use...


You're right. But that wasn't my point. My point was that C isn't the close-to-the-machine language that many programmers think it is.

That other languages don't qualify, either, is beside the point. Ruby or Java aren't mistaken to be close-to-metal.

Case in point: you don't even see anything about Harvard vs. von Neumann architecture in portable C sources, to say nothing of microcode or anything like that.


> You're right. But that wasn't my point. My point was that C isn't the close-to-the-machine language that many programmers think it is.

It is, but what happened is that assembly/machine language is no longer as close to the real machine as it was in the past: it is itself an abstraction on top of microcode.

The high-level languages that you mentioned are further away still, abstracted even further from the assembly.


Ruby and Java aren't mistaken to be close-to-metal because they aren't. C is primarily about memory access and structure at a relatively low level. How do you explain that most operating system kernels and device drivers are written in C? Again, C is much closer to the machine than anything else in mainstream use. The C "abstract machine" is mostly a myth.


Again, I don't dispute that C is much closer to the metal than Java.

I'm disputing that the C execution model adequately and exhaustively models the idiosyncrasies of real existing modern hardware, where "modern" means "almost anything in my lifetime".


Someone told me back in high school that "C is just shorthand for assembly", which I think about a lot. Does anyone else feel that way?


> "Someone told me back in high school that 'C is just shorthand for assembly'"

That was true for the PDP-11 that the article series mentions, and it was true for the early home computers of the '70s and '80s, but it became increasingly less true as processors became more and more complicated under the hood. That's kind of the point of the article series, but it's not as big a deal as the author makes it out to be, because it was already something that people writing in C were aware of.


As other people have pointed out, rather than C becoming more unlike assembly, processors have become more unlike assembly. As such, the abstractions of C map pretty well to a theoretical assembly, even if optimization changes that precise mapping. A possible exception is that the multicore semantics that are a part of each architecture aren't particularly well represented.


I really like your first point, paraphrasing, that 'C is how the PDP-11 worked, then'. But as for now, I don't agree that most people writing C are in any way really aware of this shift. Instead, I think "C is how the computer works" is their abstraction level. Unless your homepage is Godbolt's Compiler Explorer [1], you see C, you think C, more so if it's C++.

[1] https://godbolt.org/


C lets you manipulate memory close to the level assembly allows, and C function pointers, unions, and strings are a very assembly-like thing.

But assembly doesn't have the following concepts built into it, you have to manage those yourself if you want them:

- The notion of functions and return values. CALL/RET implements subroutines, not functions.

- Linkage that is more complex than jumps or call/returns. The ABI between units of code is something you have to define and implement yourself.

- The notion of associating types with variables. Variables in assembly are simply labels for memory locations; they don't constrain what you can do with them or tell you how many memory locations the variable occupies. Types, including valid operations, data widths, etc., are completely up to you and must be handled manually in assembly (see the sketch below).
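
A small sketch of that last point, under the usual assumptions about how big an int is: in C the type tells the compiler how wide each element is, so pointer arithmetic is scaled automatically, whereas in assembly you would scale the index by the element size yourself:

    #include <stdio.h>

    int main(void) {
        int nums[3] = {10, 20, 30};
        int *p = nums;
        /* "+ 2" here means "advance by 2 * sizeof(int) bytes";
           in assembly, that scaling would be your job. */
        printf("%d (element size %zu)\n", *(p + 2), sizeof *p);
        return 0;
    }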


Yes, folks do. I personally do not.

I like this article on this topic: https://raphlinus.github.io/programming/rust/2018/08/17/unde...


I think that goes way too far. Assembly is a completely different beast from C and way more complex. There is a lot of stuff you can do with assembly that C can't.


I agree that they are different beasts, but is there anything that assembly can do with data which C can't do? Or are there only control efficiencies which assembly can provide which C can't?


x86 has a lot of instructions that aren't directly available in C and provide huge speedups. Often the optimizer will use them anyway, but there are plenty of cases where handcrafted assembly is much faster than anything C can achieve.
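
Population count is a common example; a hedged sketch (the builtin is a GCC/Clang extension, not standard C, and the function name is made up) of the portable loop next to the intrinsic that can map to the POPCNT instruction:

    #include <stdio.h>

    /* Portable bit-counting loop; clears the lowest set bit each pass. */
    static unsigned popcount_portable(unsigned x) {
        unsigned n = 0;
        while (x) { x &= x - 1; n++; }
        return n;
    }

    int main(void) {
        unsigned v = 0xF0F0u;
        /* __builtin_popcount is a GCC/Clang builtin that the compiler can
           lower to a single POPCNT instruction when the target supports it. */
        printf("%u %u\n", popcount_portable(v), (unsigned)__builtin_popcount(v));
        return 0;
    }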


C is not shorthand for assembly, but there are two reasons why it feels like it.

1. You do lots of pointer arithmetic. But I don't think being practiced at pointer arithmetic is important for developing as a programmer. Just learn what it is.

2. Local variables in C exist on the stack, not the heap. But I don't think programming in C is a great way to learn about stacks. Except when you try to return a reference to a local variable from a function, everything feels the same as in other languages.
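
That exception is the classic way the stack leaks through the abstraction; a hedged sketch of the mistake (names invented):

    /* Sketch of the classic mistake: `buf` lives in bad()'s stack frame,
       which is gone by the time the caller looks at the pointer. */
    #include <stdio.h>
    #include <string.h>

    static char *bad(void) {
        char buf[16];
        strcpy(buf, "hello");
        return buf;            /* dangling pointer: using it after return is UB */
    }

    int main(void) {
        char *p = bad();
        (void)p;               /* dereferencing p here would be UB */
        return 0;
    }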

If you're making a list of languages you want to try out, I wouldn't put C above 5th place. I say this as someone who spent 12 years primarily coding in C.


Yes, quite a few do, but I personally do not think that is a useful way to think about C. C is much more than just shorthand for assembly, while at the same time it is missing a lot of features that assembly has.


Does that mean Rust is more suitable for the study of low-level programming?


It's always dangerous to mistake the abstraction for the underlying Reality.


UB = Undefined Behaviour


Should I learn Verilog to understand “how the computer works”?


In short, no. Verilog is very nearly a general purpose programming language, with some syntactic sugar for doing userspace/green/lightweight threads (that it confusingly calls processes). It is used to model digital hardware, but the language itself doesn't really give you any insight into how to construct those models. Creating the model is still up to you.


MIT 6.004 has students construct a minimal CPU (basically a simplified Alpha processor) out of logic gates, precisely to teach them "how the computer works".


No; you should learn Verilog to build a computer. It's a common aspect of Computer Engineering curricula, very educational, and ultimately … not all that useful for most software work, even when doing direct hardware interface.


Equally dangerous as swimming with piranhas. Programming is so exciting if not self indulgent.

Edit: Oh I get it--your manager, who doesn't believe in leaky abstraction or UB, will beat you if the program fails.


[flagged]


> Basically, the overall thrust of this series has been this: C is not the hardware, it’s an abstract machine. But that machine runs on real hardware, and abstractions are leaky. If you go too far into “purely only the abstract machine,” you may not be able to accomplish your tasks. If you go too far into “C runs directly on the hardware,” you may be surprised when a compiler does something that you didn’t expect.

Seems pretty uncontroversial for “propaganda.”


But is Rust a more faithful model of HW?

I'm happy to accept that C sucks (it does) and that Rust and others are superior (they are). But if you stop at the quoted statement, you've not said much. I upvoted the submission blindly, then read it, then un-upvoted it. TFA is not useful. It would have been better to not write TFA at all. There already are many many very good articles about how Rust is better than C.


I don't really understand why Rust is in this discussion at all. This isn't a post about Rust.

(And Rust is not really any better than C in this regard. At least, in a vacuum. I guess you could make the argument that it's better because fewer people have false beliefs about it, but that's not an argument I'm making, either in the blog post or here.)


Probably the obvious reason: that anything you write that has even the most tangential relevance to Rust means that people will think you're talking about Rust…


> But is Rust a more faithful model of HW?

As far as I know Rust doesn't claim to be, or have legions of people claiming it is.


No, it isn't but I don't see anyone making the claim that Rust is a faithful model of the hardware.

That's quite a common take on C, though, and it's not that wrong if you go back far enough in history. It's quite wrong today, though.


> C is not the hardware, it’s an abstract machine

Not sure whether it is controversial or not, but C is very much not an "abstract machine".

There are languages which define an abstract machine, and the implementations then have to map that abstract machine onto the hardware, making sure to faithfully reproduce the semantics of that abstract machine. Squeak Smalltalk is an example of this, the VM provides the abstract machine and thus images work identically on different underlying hardware.

C isn't like that at all. While there is an abstract machine of sorts defined in the standard, a lot is left out of the standard in UB and IDB that effectively says "just do whatever the hardware does", and the things that are defined are mapped closely to the hardware.

And of course, some people say that "the hardware" doesn't work like that any longer, but that is the interface the hardware presents to its users via its machine language.
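
To make the UB/IDB distinction concrete, a hedged pair of examples (the values and the commented-out line are just illustrations): the first is implementation-defined, meaning each compiler must pick and document a behaviour; the second is undefined, meaning the standard says nothing at all:

    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        int x = -8;
        /* Implementation-defined: right-shifting a negative signed value
           (C11 6.5.7). The compiler must document what it does. */
        printf("%d\n", x >> 1);

        /* Undefined: signed overflow. Uncommenting this line means the
           standard places no requirements on the program at all. */
        /* int y = INT_MAX + 1; */

        return 0;
    }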


> While there is an abstract machine of sorts defined in the standard, a lot is left out of the standard in UB...

This is exactly why, on modern C compilers, you have to start thinking of C as an abstract machine -- because that's what the compiler writers think they're doing.

But unlike abstract machines for older languages designed to be mathematically pure, or new abstract machines for newer languages designed to keep you from making stupid mistakes, the C abstract machine is full of bear-traps, such that if you ignore the "C is an abstract machine" warning, nearly any simple-but-obvious program is probably vulnerable to a dozen UB vulnerabilities.
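
One of those bear-traps, as a hedged sketch (function names invented): a check that looks obviously fine under "C is the hardware" thinking, but that the optimizer is allowed to delete because signed overflow is UB in the abstract machine:

    #include <limits.h>
    #include <stdio.h>

    /* "Obvious" overflow check that relies on wrap-around. Because signed
       overflow is UB, the optimizer may assume `x + 1 < x` can never be
       true and fold the whole test to 0. */
    static int will_overflow(int x) {
        return x + 1 < x;
    }

    /* A well-defined alternative: compare against the limit instead. */
    static int will_overflow_safe(int x) {
        return x == INT_MAX;
    }

    int main(void) {
        printf("%d %d\n", will_overflow(42), will_overflow_safe(INT_MAX));
        return 0;
    }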


Hard agree on ‘that’s what compiler writers think they’re doing.’ I don’t even think the concept of C being the ISA for some abstract machine is a bad concept. However, and I think it gets to the heart of your point, the compiler writers only do this or adopt this attitude when it allows them to hand wave. There isn’t the consistency that should be present in an abstract machine implementation.

Now that I’ve rambled, I may make the most of this Stay-at-Home situation and write something up using the C as abstract machine concept.


> because that's what the compiler writers think they're doing.

Yep. And that's why modern compilers are such a shit show.

> any simple-but-obvious program is probably vulnerable to a dozen UB vulnerabilities.

I was shocked (well not really, but it is shocking) when a Googler and member of the C++ standards committee said in a talk (probably some C++ conf) that he works with the best C++ engineers in the world and none of them can write even small amounts of correct code according to the standard.

Head explodes.

Maybe, just maybe, there's something wrong with the standard and how it is interpreted? But apparently the thought never entered his mind.


C does define an abstract machine, but you are expected to map that abstract machine to the underlying hardware rather than to simulate it. The mismatch between the abstract machine and the hardware is where the problems come from.


Then C is an ill-defined abstract machine.


UB is bullshit so this is all wrong.


What is the output of Klabnik? Has he written anything substantial or does he just evangelize?



There's the computer, and then there's the computer, and then there's the computer and then there's the computer:

The computer is the thing that has the web browser running on it.

The computer is the thing that has the code editor running on it.

The computer is the thing that has the compiler running on it.

The computer is the thing that provides a virtual hardware interface for running programs.

The computer is the thing that the virtual hardware interface interacts with.

The computer is the thing inside the hard drive/graphics card/other component that the computer talks to.

The computer is the thing inside the processor that pre-processes, jits, or otherwise transforms commands from the computer for the computer so the computer ... ... ...

https://xkcd.com/722/


There's the c1(physical computer), and then there's the c2(os runtime), and then there's the c3(silicon) and then there's the c4(turing machine):

The c1 is the thing that has the web browser running on it.

The c1 is the thing that has the code editor running on it.

The c1 is the thing that has the compiler running on it.

The c2 is the thing that provides a virtual hardware interface for running programs.

The c3 is the thing that the virtual hardware interface interacts with.

The c3 is the thing inside the hard drive/graphics card/other component that the c2 talks to.

The c4 is the thing inside the processor that pre-processes, jits, or otherwise transforms commands from the c2 for the c3 so the c1 ... ... ...

Is this correct?


Sure, but you can also run non-trivial code in the web browser.

I wasn't being super-precise with my comment, and my point is that there is no one thing that "is the computer". There are many computers in your computer, and many definitions of what a computer is.

I've never had to go below the HAL, or inside the JIT, but to some people that's where "the" computer really is. If that's where the computer really is, then I'm not a computer programmer (HINT: I'm not, but I do write code sometimes and get paid for it).

So if I write code, but it doesn't tell the computer what to do, what am I even doing?

Well, I write code that gets consumed by pre-processors and lexers and parsers and compilers and linkers and build agents, and I don't know how the soup works, and you probably don't either; you may know more or less of the alpha-bits in the soup, but it's still soup. The soup is turned into jello and fed to a virtual machine, which is a fake computer, but it works well enough.

---

If I were to be more precise, I would say that there's a von Neumann machine C0 that is represented by some hardware abstraction layer C1 provided by an operating system kernel. There may be a virtual machine C2 running in the C1 context.

But, inside that C0 von Neumann machine there are a bunch of microprocessors that may be called computers that may have Turing complete instruction sets.

---

Or you could say, anything that can be Turing complete is a computer, in which case your web browser is a computer (among many others).


The trick is to expand the definition of "tell the computer what to do."

You can tell the computer what to do by writing a program that passes through six levels of abstraction before the fate of any electrons is affected by what you wrote.

You can also tell the computer what to do by pushing a lot of keys on a keyboard. We're lucky; our keyboards have like a hundred keys on them. Some of our predecessors had eight switches and a ninth key or switch to "ACCEPT" the current switch-bank state.

You can also tell the computer what to do by causing patterns to be stored on magnetic or optical media and read back later.

You can also tell the computer what to do by plugging a wire into the back of it and varying voltages on that wire using another machine a hundred miles away.

Part of the art of programming is knowing that there are no bright, solid lines between these ways to tell a computer what to do (but for a given task, some are clearly more applicable than others ;) ).


Rust this, Rust that. Tiresome.


What do you mean? The post doesn't contain anything about rust.


This may be a little off topic but it is something that until now I’ve never questioned in my entire career: Is the c in C supposed to stand for “computer“?

If not, what is it?


AFAIK the established history is that C was the successor to a language called B, which was a successor to a language called BCPL (which means "Basic Combined Programming Language"). So if anything, the "C in C" would mean "Combined". But most likely it doesn't mean anything at all, except "C comes after B" ;)


According to Wikipedia, C is named C because it resulted from work done by Dennis Ritchie to improve the language B, and C comes after B. B came from Ken Thompson making a cut down version of the language BCPL, so he just kept the first letter.

https://en.wikipedia.org/wiki/C_(programming_language)


It almost implies that at the time, they realized a plethora of languages would emerge but were hoping everyone would stick with their alphabetic convention. Instead, someone skipped straight to S, someone else stepped back to R, and then everyone said screw it and started making up their own funny names & acronyms.

I've read some intriguing things about D, though (which came along more recently, interestingly enough). Apparently it's in production at several large companies.


That creation myth rather falters on the fact that a fair number of languages had already emerged. This was the 1970s, not the 1950s. No-one expected language namers to follow some universal alphabetic convention beginning with BCPL.


It's the successor to the B programming language (also written by Ritchie and Thompson), which was derived from "BCPL": Basic Combined Programming Language.



It's far past time to put C to bed. D, Go, Java, Zig, etc. are all languages that one can use to do approximately C-like things without as much danger. I am tired of seeing a literal Twitter feed of memory unsafety security bugs [0]. We can do better.

[0] https://twitter.com/LazyFishBarrel


Don't forget modern C++ and the well-designed but forgotten Clay language. The reality is that Go and Java aren't systems languages; they are just fast.

More than that though the C and C++ competitors don't have the tools and libraries.

When all these new languages get made, they never ease back on the language part and prioritize the next rough area. The language is argued about, debated, and added to with features for years or decades. Meanwhile, build systems, GUIs, text editing with suggestions and syntax checking, etc., all land in a pile of half-supported tools pieced together, which is an exercise in frustration for the brand-new curious user.


Good luck rewriting all the programs that are written in C without introducing new security problems.


Feel free to peruse the Twitter feed that I linked; if most bugs are due to memory corruption, then even a literal in-place rewrite may still come out ahead on total bug count! This is the main argument of Zig [0]; they offer technologies to incrementally rewrite C projects in Zig without losing compatibility. Their compiler can compile C code [1] just like GCC or Clang can.

[0] https://ziglang.org/

[1] https://andrewkelley.me/post/zig-cc-powerful-drop-in-replace...


C is like water. Atomically it is the DNA of all other programming languages. There's a reason why C is generally the fastest language with smaller binaries, and there's a reason why language like Python fallback to C for doing anything processor intensive. C will still be here when the latest fad languages of today are long forgotten.


Well, it will still be here in the same way Fortran is still here. I.e., while it does not seem too bad now, there will come a time when people wonder why anyone put up with coding in C, when it was so prone to bugs in things that could have been automated away, in exactly the same way we now frown on goto statements. But, underneath it all, there will be this layer of C code on top of this Fortran code on top of this assembler code.


Not a word of what you just wrote is true.


I think you know better.


than to write something like "Atomically [C] is the DNA of all other programming languages"? I do, yes


In clojureland we fallback to java for doing anything processor intensive.


The problem with the “C is an abstract machine” nonsense is that it’s just not how it works. I get that it’s what the spec says. But C compilers are super careful to compile structured assembly style code without breaking it, even though they don’t formally promise to do so. That’s because lots of large C code bases assume structured assembly semantics and probably always will.


> But C compilers are super careful to compile structured assembly style code without breaking it, even though they don’t formally promise to do so.

You have never worked on a C compiler, I take it. All production compilers will at some point turn even straight-line code into a DAG and relinearize it without care to the original structure of code. Any semantics beyond that which is given in the C specification (which are purely dynamic, mind you) are not considered for preservation. In particular, static intuitions about relative positions (or even the number!) of call statements to if statements or other control flow are completely and totally wrong and unreliable.

This is the main reason why people keep harping on the "C is an abstract machine." When people think of machine code, there is an assumption about the set of semantics that are relevant to the machine code, and that set is often much, much broader than the set of semantics actually preserved by the C specification.


I work on compilers for a living.

Structured assembly semantics are what C users in a lot of domains expect and production compilers like clang at -O3 will obey the programmer in all of the cases that are necessary to make that code work:

- Aliasing rules even under strict aliasing give a must-alias escape hatch to allow arbitrary pointer casts.

- Pointer math is treated as integer math. Llvm internally considers typed pointer math to be equivalent to casting to int and doing math that way.

- Effects are treated with super high levels of conservatism. A C compiler is way more conservative about the possible outcomes of a function call or a memory store than a JS JIT for example.

- Lots of other stuff.

It doesn’t matter that the compiler turns the program into a DAG or any other representation. Structured assembly semantics are obeyed because of the combination of conservative constraints that the compiler places on itself.

But bottom line: lots of C code expects structured assembly and gets it. WebKit’s JavaScript engine, any malloc implementation, and probably most (if not all) kernels are examples. That code isn’t wrong. It works in optimizing C compilers. The only thing wrong is the spec.
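
To ground the aliasing point in code: a hedged sketch (function names are made up) of the usual type-punning escape hatch via memcpy, next to the pointer-cast version that strict aliasing technically forbids but that compilers generally compile the way the programmer expects:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Sanctioned escape hatch: copy the bytes. Compilers recognize this
       pattern and typically emit a single register move, not a real call. */
    static uint32_t float_bits(float f) {
        uint32_t u;
        memcpy(&u, &f, sizeof u);
        return u;
    }

    /* The "structured assembly" version: commonly works in practice,
       but as written it violates strict aliasing. */
    static uint32_t float_bits_cast(float f) {
        return *(uint32_t *)&f;
    }

    int main(void) {
        printf("0x%08x 0x%08x\n",
               (unsigned)float_bits(1.0f),        /* 0x3f800000 on IEEE 754 targets */
               (unsigned)float_bits_cast(1.0f));
        return 0;
    }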


> I work on compilers for a living.

…as I take it, for languages without undefined behavior.

> Structured assembly semantics are what C users in a lot of domains expect and production compilers like clang at -O3 will obey the programmer in all of the cases that are necessary to make that code work

No, they will not, and the people working on Clang will tell you that they won't.

> Llvm internally considers typed pointer math to be equivalent to casting to int and doing math that way.

LLVM IR is not C, but I am fairly sure that it is still illegal to access memory that is outside of the bounds of an object even in IR.

> A C compiler is way more conservative about the possible outcomes of a function call or a memory store than a JS JIT for example.

Well, yes, because JavaScript JITs own the entire ABI and have full insight into both sides of the function call.

> That code isn’t wrong. It works in optimizing C compilers. The only thing wrong is the spec.

The code is wrong, and the fact that it works today doesn't mean it will work tomorrow. If you think that all numbers ending in "3" are prime because you've only looked at "3", "13" and "23", would you say that "math is wrong" when it tells you that this isn't true in general?


There is a broad pattern of software that uses undefined behavior that gets compiled exactly as the authors of that software want. That kind of code isn’t going anywhere.

You’re kind of glossing over the fact that it is rare for the compiler to perform an optimization that is correct under C semantics but not under structured assembly semantics, because under both sets of rules you have to assume that memory accesses have profound effects (stores have super weak may-alias rules) and that calls clobber everything. Put those together and it’s unusual that a programmer expecting proper structured assembly behavior from their illegally (according to spec) computed pointer would get anything other than correct structured assembly behavior. Like, except for obvious cases, the C compiler has no clue what a pointer points at. That’s great news for professional programmers who don’t have time for bullshit about abstract machines and just need to write structured assembly. You can do it because either way C has semantics that are not very amenable to analysis of the state of the heap.

Partly it’s also because if the optimizations went further, they would break too much code. So it’s just not going to happen.


You say that this doesn't happen, and yet we have patches like this in JavaScriptCore: https://trac.webkit.org/changeset/195906/webkit. Pointers are hard to reason about, but 1. undefined behavior extends to a lot of things that aren't pointers and 2. compilers keep getting better at this. For example, it used to be that you could "hide" code inside a function and the compiler would have no idea what you were doing, but today's compilers inline aggressively and are better at finding this sort of thing. And it isn't just WebKit: other large projects have struggled with this as well. The compiler discarded a NULL pointer check in the Linux kernel (which I can't find a good link to, so I'll let Chris Lattner paraphrase the issue for me): http://blog.llvm.org/2011/05/what-every-c-programmer-should-... ; here's one where it crippled sanitization of return addresses in NaCl: https://bugs.chromium.org/p/nativeclient/issues/detail?id=24...
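
The kernel pattern Lattner describes looks roughly like this; a hedged reconstruction, not the actual kernel code (the struct and function names are invented):

    #include <stdio.h>

    struct dev { int flags; };

    /* The dereference happens before the NULL check. That dereference is UB
       when d is NULL, so the optimizer may assume d != NULL and delete the
       check entirely, which is the failure mode described in the blog post. */
    static int read_flags(struct dev *d) {
        int v = d->flags;
        if (d == NULL)          /* likely removed at -O2 */
            return -1;
        return v;
    }

    int main(void) {
        struct dev d = { 0x2a };
        printf("%d\n", read_flags(&d));
        return 0;
    }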



