
Tis-interpreter – find subtle bugs in programs written in standard C - osivertsson
https://github.com/TrustInSoft/tis-interpreter
======
quotemstr
Many of these bugs exist only because compiler writers take unjustified
liberties with the C standard. That the standard permits compilers to
interpret memset(0, 0, 0) as __builtin_unreachable() does not justify
compilers actually doing so and violating programmer expectations.

I'm sick of victim-blaming here. Packages like tis-interpreter are very clever
ways to solve problems that shouldn't exist in the first place. The C standard
needs to be changed to redefine a bunch of currently undefined behavior as
unspecified or completely defined.

~~~
PeCaN
I'm gonna rant for a bit. I see posts like this from time to time, and while
they're certainly well-meaning, there's a good reason why you're not writing
the C standard. :-)

> compiler writers take unjustified liberties

How are implementations wrt undefined behaviors unjustified? It's spec'd that
way, it's documented that way, there's a good reason why it is that way.

> That the standard permits compilers to interpret memset(0, 0, 0) as
> __builtin_unreachable() does not justify compilers actually doing so

__builtin_unreachable() is a compile-time hint to the compiler, so for (say)
GCC to interpret it that way you'd have to literally write memset(0,0,0) with
all constants. Weird example.

Oh by the way, C11 has memset_s, where you can be explicit about what you
want. C is good for being explicit. memset() should not try to guess what you
mean in an error condition.

> violating programmer expectations.

It's hard to violate expectations that a programmer doesn't have.

While we're talking about “expectations,” let's redefine floating point so
that 0.1+0.2==0.3.

> The C standard needs to be changed to redefine a bunch of currently
> undefined behavior as unspecified or completely defined.

I'd be interested to know how you define, say, signed integer overflow (keep
in mind, most DSP architectures don't use two's complement) and invalid
pointer dereferences (many embedded CPUs don't have MMUs, please remember
them). How about shifting a uint32_t by 32 bits? (x86 and POWER differ on what
this does, and I imagine others do too)
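
For what it's worth, a program can pick its own answer to these cases without relying on undefined behavior. Below is a minimal sketch, assuming two hypothetical helpers, `checked_add` and `shl32` (neither name comes from this thread or from any library). The "shift by 32 or more yields 0" policy is just one possible choice, and the shift helper assumes `int` is no wider than 32 bits.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: adds a and b, detecting overflow before it happens.
   Returns true and stores the sum in *out on success, false otherwise. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;           /* a + b would overflow, so refuse */
    *out = a + b;
    return true;
}

/* Hypothetical helper: left shift that treats a count of 32 or more as
   producing 0 instead of leaving the result to the hardware. */
uint32_t shl32(uint32_t x, unsigned count)
{
    return count >= 32 ? 0u : x << count;
}
```

The cost is an extra comparison per operation, which is exactly the kind of trade-off the standard leaves to the programmer.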

C runs on a lot of stuff. Undefined behavior is undefined explicitly _because_
there's not a good solution for every architecture. C gives the programmer the
freedom to do what makes the most sense on their architecture and for their
use case. Build different higher-level abstractions for desktops and the
little signal processor in your phone's modem. There isn't gonna be a solution
that works for both. C is for writing fast code in a moderately high-level
language. It's not here to make your assumptions for you.

> I'm sick of victim-blaming here.

Alright, I'm sick of this shit on _Hacker_ News. Say something intelligent or
don't say anything. Don't use loaded terms like “victim blaming” to cover for
your ill-informed opinions on how C should be. If anything, that only
downplays _real_ victim-blaming. I don't give a shit what you think C should
be. _You're not a victim, you're ignorant_.

~~~
kibibu
You got so worked up about specific examples that you missed the point
somewhat.

Consider this mini example:

    void myfunc(int *p) {
        int t = *p;
        if (p == NULL) { return; }
        *p = t + 1;
    }

Dereferencing p before the null check is undefined behavior when p is NULL,
so the compiler may deduce that p is never NULL and that the check isn't
required. The check can therefore be stripped, which most programmers would
find very surprising.
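
For contrast, here is a sketch of the same function with the check moved ahead of the dereference. The name `myfunc_checked` is made up for illustration. Because nothing is read through `p` before the test, the compiler has no basis for assuming `p` is non-null and the check has to stay.

```c
#include <stddef.h>

/* Same logic as myfunc, but the null check comes before any dereference. */
void myfunc_checked(int *p)
{
    if (p == NULL) {
        return;         /* nothing has been read through p at this point */
    }
    int t = *p;         /* safe: p is known to be non-null here */
    *p = t + 1;
}
```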

Edit: yes, the replies are right, I meant to dereference *p rather than just
copy the pointer. Fixed and somewhat simplified.

I was trying (badly) to remember this example:
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html

~~~
makomk
One that often gets people is this:

    #include <stddef.h>

    struct foo {
        int bar;
    };

    void myfunc(struct foo *p) {
        int *t = &p->bar;
        if (p == NULL) return;
        *t = 5;
    }

Is this undefined? Probably. Absent compiler optimizations, it won't cause a
crash on any system you're ever likely to encounter because it's not actually
dereferencing the pointer, but some versions of gcc have caused security
issues in Linux by optimising away NULL checks like this.

~~~
PeCaN
I'm not sure that's undefined though. It's equivalent to

    ...

    void myfunc(struct foo *p) {
        int *t = (int *)((char *)p + offsetof(struct foo, bar));
        ...
    }

right? In the C abstract machine taking the address of an offset of a pointer
is always defined, I think.

~~~
__s
It is not always defined. Pointer arithmetic is defined only when the result
points within the same object or one past its end. Subtracting two pointers to
distinct objects is undefined as well. This is to allow for segmented memory
architectures. It's also to allow implementation of garbage collectors by the
compiler.
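
A small sketch of the cases that are defined, assuming two hypothetical helpers, `sum_ints` and `index_of`: forming and comparing against a one-past-the-end pointer, and subtracting two pointers into the same array.

```c
#include <stddef.h>

/* Sums n ints. The end pointer is one past the last element: forming it and
   comparing against it is defined, but dereferencing it would not be. */
int sum_ints(const int *a, size_t n)
{
    const int *end = a + n;
    int total = 0;
    for (const int *it = a; it != end; ++it)
        total += *it;
    return total;
}

/* Index of p within a. Both pointers must point into, or one past the end
   of, the same array; otherwise the subtraction is undefined. */
ptrdiff_t index_of(const int *a, const int *p)
{
    return p - a;
}
```

Anything outside those bounds, such as `a + n + 1` or the difference between pointers into two separate arrays, falls back into undefined territory.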

