Let's set e-graphs apart from e-matchers. An e-graph is a vector-of-sets, where each set contains equal representations of the same thing. What makes it space efficient is that your representation refers to things by index number in the vector, so if you have "3 * (x + 1)" as an expression, you store "x + 1" as entry #1 and "3 * #1" as entry #0. That way, if "x + 1" is later also known to be equivalent to "load ptr", you add that to #1 and don't need to add anything to #0. In e-graphs, #0 and #1 are e-classes, the sets holding the equivalence classes. There's no limit to what transforms you can apply; the e-graph holds all the resulting representations efficiently.
E-matchers are an algorithm over e-graphs for efficiently searching for a pattern in an e-graph. For instance, if you want to find "(load ptr) * (load ptr)", you look for "#x * #x" for all x in all e-classes, then check whether "load ptr" is a member of e-class #x. This is where you run into limits on what you can match. "Pattern matching" style transforms like "x * 2" -> "x << 1" are easy and fast using e-matchers, but beyond that they don't help.
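Roughly, in toy C (the enum names, fixed-size arrays, and hand-built expressions below are made up for illustration; this isn't any real e-graph library):

    #include <stdbool.h>
    #include <stdio.h>

    /* Toy e-graph: a vector of e-classes, each e-class a small set of e-nodes.
     * E-nodes refer to operands by e-class index, never by concrete node. */
    enum op { OP_VAR_X, OP_CONST_1, OP_CONST_3, OP_ADD, OP_MUL, OP_LOAD_PTR };

    struct enode  { enum op op; int lhs, rhs; };          /* lhs/rhs are e-class ids (-1 if unused) */
    struct eclass { int count; struct enode nodes[4]; };  /* "set" of equivalent representations    */
    struct egraph { int count; struct eclass classes[8]; };

    static void add_node(struct egraph *g, int cls, struct enode n) {
        g->classes[cls].nodes[g->classes[cls].count++] = n;
        if (cls >= g->count) g->count = cls + 1;
    }

    static bool class_has_op(const struct egraph *g, int cls, enum op op) {
        for (int i = 0; i < g->classes[cls].count; i++)
            if (g->classes[cls].nodes[i].op == op) return true;
        return false;
    }

    int main(void) {
        struct egraph g = {0};

        /* "3 * (x + 1)": class #1 holds "x + 1", class #0 holds "3 * #1". */
        add_node(&g, 1, (struct enode){OP_ADD, 2, 3});      /* #1: #2 + #3          */
        add_node(&g, 2, (struct enode){OP_VAR_X, -1, -1});  /* #2: x                */
        add_node(&g, 3, (struct enode){OP_CONST_1, -1, -1});/* #3: 1                */
        add_node(&g, 0, (struct enode){OP_MUL, 4, 1});      /* #0: #4 * #1          */
        add_node(&g, 4, (struct enode){OP_CONST_3, -1, -1});/* #4: 3                */

        /* Learn that "x + 1" is also "load ptr": add to #1 only, #0 is untouched. */
        add_node(&g, 1, (struct enode){OP_LOAD_PTR, -1, -1});

        /* "(x + 1) * (x + 1)", stored as "#1 * #1" in class #5. */
        add_node(&g, 5, (struct enode){OP_MUL, 1, 1});

        /* E-matching "(load ptr) * (load ptr)": find "#x * #x", then check
         * whether "load ptr" is a member of e-class #x. */
        for (int c = 0; c < g.count; c++)
            for (int i = 0; i < g.classes[c].count; i++) {
                struct enode n = g.classes[c].nodes[i];
                if (n.op == OP_MUL && n.lhs == n.rhs && class_has_op(&g, n.lhs, OP_LOAD_PTR))
                    printf("class #%d matches (load ptr) * (load ptr)\n", c);
            }
        return 0;
    }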
There's an optimizer problem where you have "load ptr", you trace ptr and figure out it's pointing to a constant value, and you replace the load instruction with the constant. Later, you get to code emission and realize that you can't encode your constant in the CPU instruction: there aren't enough bits. You now need to take the constant, stuff it into a constant pool, and emit a load for it. If you had kept both representations in an e-graph, you could simply have chosen to reuse the load you already had.
Suppose you wanted to do sparse conditional constant/constant-range propagation but your IR uses an e-graph. You could analyze each expression in the e-class, intersect their ranges, and annotate the resulting constant-range on the whole e-class. Then do SCCP as normal, looking up the e-class for each instruction as you go.
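Sketch of that intersection step, with a made-up range struct and made-up per-expression ranges (every expression in an e-class computes the same value, so the value must lie in the intersection):

    #include <limits.h>
    #include <stdio.h>

    /* Inclusive integer range: the lattice value SCCP would track per e-class. */
    struct range { long lo, hi; };

    static struct range intersect(struct range a, struct range b) {
        struct range r = { a.lo > b.lo ? a.lo : b.lo,
                           a.hi < b.hi ? a.hi : b.hi };
        return r;  /* an empty result (lo > hi) would mean the e-class is unreachable */
    }

    int main(void) {
        /* Hypothetical e-class with three equivalent expressions, each analyzed on
         * its own: "x & 0xff" -> [0,255], "y % 100" -> [-99,99], "load ptr" -> unknown. */
        struct range per_expr[] = {
            { 0, 255 },
            { -99, 99 },
            { LONG_MIN, LONG_MAX },
        };

        /* Intersect them; the result is annotated on the whole e-class, and every
         * instruction that maps to this e-class sees the tightened range during SCCP. */
        struct range cls = per_expr[0];
        for (int i = 1; i < 3; i++)
            cls = intersect(cls, per_expr[i]);

        printf("e-class range: [%ld, %ld]\n", cls.lo, cls.hi);   /* [0, 99] */
        return 0;
    }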
> I was bitten by one that injected timing attacks into certain integer operations by branching on the integer data in order to optimize 32-bit multiplications on 8-bit microcontrollers.
FWIW, I think this should be considered a language design problem rather than an optimizer design problem. Black box optimizer behaviour is good for enabling language designs that have little connection to hardware behaviour, and good for portability including to different extensions within an ISA.
C doesn't offer a way to express any timing guarantees. The compiler, OS, CPU designer, etc. can't even do the right thing if they wanted to because the necessary information isn't being received from the programmer.
To a large degree, constant-time programming is hampered by the fact that even hardware is often unwilling to provide constant-time guarantees, let alone any guarantees that the compiler would care to preserve. (Although, to be honest, constant-time guarantees are the sort of thing that most compiler writers prefer explicitly not to guarantee under any circumstances whatsoever.)
Your comment makes me wonder about the idea of building a superscalar, out of order, speculative implementation of the 6502 or 8080 instruction sets. Might make a good educational project.
Few languages provide such guarantees. But there really was no way with this particular compiler to pass a hint to generate constant-time code.
Black box designs work until the knob or dial you need to control it isn't there. I would have taken a pragma, a command-line option to the compiler, or even a language extension.
This is one example among many of why I think that user-guided code generation should be an option in a modern tool suite. If I build formal specifications indicating the sort of behavior I expect, I should be able to link these specifications to the output. Ultimately, this will come down to engineering, and possibly to overriding or modifying the optimizer itself. An extensible design that makes it possible to do this would significantly improve my work. Barring that, I have to write assembler by hand to work around bad assumptions made by the optimizer.
> No, literally nobody has ever wanted to be directed to "foundem", which is a scam.
Could you defend the statement that "foundem" is a scam? If it really was then that adds important context to the ruling, but I'll need something more concrete than hyperbole.
It turns out it's really hard. Let's take an easy-to-understand example, signed integer overflow. C has unsigned types with guaranteed wrap-around (modulo 2^N) arithmetic, and signed types with UB on overflow, which leaves the compiler free to rewrite the expression using the field axioms if it wants to. "a = b * c / c;" may emit the multiply and divide, or the compiler can eliminate the pair and replace the expression with "a = b;".
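For example (whether a particular compiler actually performs this fold is up to it, but it's allowed to):

    /* Signed: overflow is UB, so the compiler may assume b * c doesn't overflow.
     * In every execution without UB the result equals b (division by zero is UB
     * too), so it may fold the whole thing to "a = b;". */
    int f_signed(int b, int c) {
        int a = b * c / c;
        return a;
    }

    /* Unsigned: arithmetic wraps modulo 2^N by definition, so b * c / c is NOT
     * always b (e.g. b = 3, c = 0x55555556 with 32-bit unsigned gives 0), and
     * the compiler has to keep the multiply and divide. */
    unsigned f_unsigned(unsigned b, unsigned c) {
        unsigned a = b * c / c;
        return a;
    }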
Why do we tie interpreting the top bit as a sign bit to whether field-axiom-based rewriting should be allowed? It would make sense to have a language which splits those two choices apart, but if you do that, either the result isn't backwards compatible with C anyway, or it is but doesn't add any safety to old C code even as it permits you to write new safe C code.
Sometimes the best way to rewrite an expression is not what you'd consider "simplified form" from school, because of the availability of CPU instructions that don't match simple operations, and also because of register pressure limiting the number of temporaries. There's real-world code out there that has UB in simple integer expressions and relies on being run in the correct environment, either an x86-64 CPU or an ARM CPU. If you define one specific interpretation for the same expression, you are guaranteed to break somebody's real-world "working" code.
I claim without evidence that trying to fix up C's underlying issues is all decisions like this. That leads to UBSan as the next best idea, or at least, something we can do right now. If nothing else it has pedagogical value in teaching what the existing rules are.
I'm not sure I understand your example. Existing modern hardware is AFAICT pretty conventionally compatible with multiply/divide operations[1] and will factor equivalently for any common/performance-sensitive situation. A better area to argue about is shifts, where some architectures are missing compatible sign extension and overflow modes; the language would have to pick one, possibly to the detriment of some arch or another.
But... so what? That's fine. Applications sensitive to performance on that level are already worrying about per-platform tuning and always have been. Much better to start from a baseline that works reliably and then tune than to have to write "working" code you then must fight about and eventually roll back due to a UBSan warning.
[1] It's true that when you get to things like multi-word math that there are edge cases that make some conventions easier to optimize on some architectures (e.g. x86's widening multiply, etc...).
The wording change from Permissible to Possible and making it non-normative was an attempt to clarify that the list of behaviors that follows is a false range and not an exhaustive list.
It's a submarine change because in the eyes of the committee, this is not a change, merely a clarification of what it already said, to guard against ongoing misinterpretation.
This is the most normal case though, isn't it? Suppose a very simple compiler: it sees a function, so it writes out the prologue; it sees the switch, so it writes out the jump table; it sees each return statement, so it writes out the code that returns the value; then it sees the function's closing brace and writes out a function epilogue. The problem is that the epilogue is wrong, because there is no return statement returning a value; the epilogue is only correct if the function has void return type. Depending on the ABI, the function returns to a random address.
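Something along these lines (a made-up example, not from any particular codebase):

    /* A non-void function where some path reaches the closing brace without a
     * return statement. The naive one-pass compiler described above emits a
     * plain "void-style" epilogue here; whether that happens to leave a sane
     * return value -- or even a sane stack -- depends entirely on the ABI. */
    int day_number(char code) {
        switch (code) {
        case 'M': return 1;
        case 'T': return 2;
        case 'W': return 3;
        }
        /* falls off the end for any other code: undefined behavior if the
         * caller uses the result */
    }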
Most of the time people accuse compilers of finding and exploiting UB and say they wish the compiler would just emit the straightforward code, as close to writing out assembly matching the input C code expression by expression as possible. Here you have an example where the compiler never checked for UB, let alone proved the presence of UB in any sense; it trusted the user, it acted like a high-level assembler, and yet this compiler is still not ignoring UB for you? What does it take? Adding runtime checks for the UB case is ignoring? Having the compiler find the UB paths to insert safety code is ignoring?
> > the epilogue is only correct if the function has void return type
> That's a lie.
All C functions return via a return statement with expression (only for non-void functions), a return statement without an expression (only for void functions) or by the closing of function scope (only for void functions). True?
The simple "spit out a block of assembly for each thing in the C code" compiler spits out the epilogue that works for void-returning functions, because we reach the end of the function with no return statement. That epilog might happen to work for non-void functions too, but unless we specify an ABI and examine that case, it isn't guaranteed to work for them. So it's not correct to emit it. True?
Where's the lie?
> > Adding runtime checks for the UB case is ignoring? Having the compiler find the UB paths to insert safety code is ignoring?
> Don't come onto HN with the intent of engaging in bad faith.
Always! You too!
The text you quoted was referring to how real compilers handle falling off the end of a non-void function today with -fsanitize=return from UBSan. If I understand you correctly, in your reading a compiler with UBSan enabled is non-conforming because it fails to ignore the situation. That's not an argument as to whether your reading is right or wrong, but I do think UBSan compilation ought to be standard conforming, even if that means we need to add it to the Standard.
To the larger point: because the Standard doesn't define what "ignore" means, the user and implementer can't use it to pin down whether a given UB was ignored or not, and thus whether a given program was miscompiled or not. A compiler rewrites the code into its intermediate representation -- it could be a Z3 SMT instance or a raw Turing machine or anything -- then writes code back out. Can ignoring be done at any stage in the middle? Once the code has been converted and processed, how can you tell from the assembly output what's been ignored and what hasn't? If you demand certain assembly or semantics, isn't that just defining undefined behaviour? If you don't demand them, and leave the interpretation of "ignore" to the particular implementation of a compiler, such that any output could be valid for some potential design of compiler, why not allow any compiler to emit whatever it wants?
> The continuity of physical processes is not something that can be proved.
I agree with you, but is there any peer-reviewed publication that can be cited? The idea makes sense to me: firstly, Reals \ Inaccessible Reals = Computable Reals; secondly, you can't ever feed an inaccessible real into an experiment nor retrieve one out of an experiment -- but then I'm not completely certain about concluding that no experiment can be devised which shows that inaccessible reals exist in physical space.
I am concerned about this in the field of complexity analysis of quantum computers too; I think that the use of reals in physics is leading to mathematically correct but non-physical results about the complexity theory of quantum computers. Having a paper to point at and say "look, stop assuming your Bloch spheres are backed by uncountable sets, it's leaking non-computable assumptions into your analysis of computation" would be helpful.
From the abstract, "simulations of such systems are usually done on digital computers, which are able to compute with finite sequences of rational numbers only."
Not at all! Digital computers can use computable reals, which are defined as any function from a rational value (the error bound) to a rational value within the supplied error bound of the true value. Do not mistake this for computing in the rationals: these functions which perform the described task are the computable real numbers. There are countably infinitely many of these functions, one for each computable real. For instance, note that you can always compare two rationals for equality, but you can't always compare two computable reals for equality, just like reals.
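A toy version of that definition in C -- the function-pointer encoding, the choice of e as the example, and the long long rationals are all just for illustration; a real implementation would use arbitrary-precision arithmetic:

    #include <stdio.h>

    /* A rational p/q with q > 0. */
    typedef struct { long long p, q; } rat;

    /* In this toy encoding, a "computable real" is any function that takes a
     * rational error bound eps > 0 and returns a rational within eps of the
     * true value. The function itself is the number's representation. */
    typedef rat (*creal)(rat eps);

    /* e = sum_{k>=0} 1/k!; the tail after the 1/k! term is < 2/(k+1)!. */
    static rat approx_e(rat eps) {
        long long fact = 1;   /* k!                 */
        long long num  = 1;   /* (partial sum) * k! */
        for (int k = 1; k <= 19; k++) {              /* capped: no overflow handling */
            fact *= k;
            num = num * k + 1;                       /* partial sum is now num / fact */
            if (2 * eps.q <= fact * (k + 1) * eps.p) /* 2/(k+1)! <= eps: close enough */
                break;
        }
        return (rat){ num, fact };
    }

    int main(void) {
        creal e = approx_e;
        rat r = e((rat){ 1, 1000000 });              /* ask for e to within 1/10^6 */
        printf("e ~= %lld/%lld ~= %.9f\n", r.p, r.q, (double)r.p / (double)r.q);
        return 0;
    }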
Since it is UB to dereference a null pointer, the compiler can assume that ptr isn't null after it is dereferenced[1]. Therefore, the if condition will always be false.
In fact, if ptr is null, unless the foo field has a very large offset, the behavior you would probably expect is for the dereference to segfault before reaching the if, so it doesn't really matter whether it is optimized away.
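Concretely, the shape of code being discussed is something like this (the names are made up):

    struct widget { int foo; };

    int get_foo(struct widget *ptr) {
        int val = ptr->foo;   /* dereference: from here on the compiler may assume
                                 ptr != NULL, since otherwise this line was already UB */
        if (ptr == NULL)      /* therefore this condition folds to false and the
                                 whole branch can be deleted */
            return -1;
        return val;
    }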