Hacker News new | past | comments | ask | show | jobs | submit login

Double-free's can be tracked by doing data flow analysis on the function. This is how D does it in its nascent implementation of an Ownership/Borrowing system. It can be done without DFA if getting it only 90% right and having numerous false positives is acceptable.

I've used many static checkers in the past, and the rate of false positives was high enough to dissuade me from using them. This is why D uses DFA to catch 100% of the positives with 0% negatives. I knew this could be done because the compilers were using DFA in the optimization pass.

In order to get the tracking to work, one cannot just track things for a function named "free". After all, a common thing to do is write one's own custom storage allocators, and the compiler won't know what they are. Hence, there has to be some mechanism to tell the compiler when a pointer parameter to a function is getting "consumed" by the caller, and when it is just "loaned" to the caller (hence the nomenclature of an Ownership/Borrowing system).

One of the difficulties to overcome with D in doing this is there are several complex semantic constructs that needed to be deconstructed into their component pointer operations. I noticed that Rust simplified this problem by simplifying the language :-)

But once done, it works, and works satisfyingly well.

Note that none of this is a criticism of what GCC 10 does, because the article gives insufficient detail to draw any informed conclusions. But I do see this as part of a general trend that people are sick and tired of memory safety bugs in programming languages, and it's good to see progress on all fronts here.

> This is why, for D, I was determined to use DFA to catch 100% of the positives with 0% negatives. I knew this could be done because my compilers were using DFA in the optimization pass.

Is this really true? I thought this was impossible due to Rice's theorem

Fortunately, I am unaware that it was impossible and did it anyway :-)

But it is possible I made a mistake.

It is also true that for it to work, one has to change the way one writes code, like Rust does. This is why D requires and @live attribute be added to functions to enable the checking for just those functions, so it doesn't break every program out there. It will enable incremental use of the checking at the user's option.

You're probably using a different definition of 100% than any impossibility proof would use.

Consider some code:




if (hashfn(first_factor(huge_static_rsanum1))&1){needfree=false;free(a);}

if (hashfn(first_factor(huge_static_rsanum2))&1){needfree=false;free(a);}



The decision if this has a double free or not depends on the factorizations of two huge difficult to factor constants. It either double-frees or not depending on those constants.

Surely your software cannot decide that...

What you probably mean is something like "100% on real programs rather than contrived cases". Of course, in that case, your definition of 'real programs' is the catch. :P

Sometimes things that seem like they should always work except on contrived junk like the above example actually run into limitations in practice because macros and machine code generation produce ... well ... contrived junk from time to time.

> Surely your software cannot decide that...

The D implementation would reject such code. The DFA assumes all control paths are executed. For example,

    if (c) free(p);
    *p = 3;
is rejected as use-after-free.

    if (c) free(p);
    if (!c) *p = 3;
is also rejected as use-after-free. If the DFA is done properly, you will not be able to trick it.

Then that doesn't mean "0% of the negatives".

No, it means you have false positives. But no false negatives.

I think you are both right. It's confusing because the original claim uses different terminology than usual. It said

> 100% of the positives with 0% negatives [are treated as positives]

* The "100% positives" are actual positives that show up as positive, so it's saying that there are no positives that show up as negatives i.e. "no false negatives" (even though it uses the word "positive")

* The "0% negatives" are actual negatives that show up as positives, so it's saying that there are "no false positives" (even though it uses the word "negative").

So UncleMeat's comment 'Then that doesn't mean "0% of the negatives"' and your comment 'it means you have false positives' are actually in agreement.

And that's a "negative" in a practical sense.

An abstract interpretation that outputs Top for all programs is sound but useless. In practice, most sound static analyses for complex problems aren't too far from that.

It's not a "negative", it's a disadvantage. "Negative" has a specific meaning that should not be used in this context.

Safe Rust is also in the same boat: it has a lot of false positives that are rejected by the borrow checker even though they would be okay, and yet it's being used just fine. Think of doubly linked lists which are pretty much impossible to implement in safe Rust unless you replace pointers with integer IDs which basically disables borrow checking. Non-lexical lifetimes are an example of downright changing the definition of the language in order to remove some of these false positives.

I didn't it that way. I read it as "downside" rather than "false negative", especially because a sound static analysis is trivial and not something to be proud of in the abstract.

"Output Top" is sound for all non-inverted lattices and takes constant time. Woohoo! But it is also useless.

Ok. And if you call into non-D code, like C or assembly?

I think this is just moving potential problem under rug, which is best it can do, given Rice Theorem

Calling non-D functions means the compiler cannot check them, and will rely on the user having correctly specified and annotated the function signature.

I.e. you can have "unsafe" annotated code which the compiler doesn't check, which is indeed where the dirty deeds get swept. For example, I doubt someone could write a guaranteed safe implementation of malloc().

The idea, naturally, is to specifically mark such sections so that your crack QA team knows what to focus on, and to minimize the amount of such code.

The trick is to make the safe sections expressive enough that it is practical to write the bulk of the code in safe sections, thereby minimizing the "threat surface".

I personally would want code like this flagged by static analysis. I would fix the code by not calling free in the second 'if' body if needfree was already false (just like the call at the end of the function is guarded).

You might want it flagged-- I would--, but it would still be a false positive if it did so (assuming the constants didn't result in a double-free). :)

I think the argument is that static analyzers don't have to work on completely unchanged code, but rather rely on humans to structure the code in a way to avoid issues.

Many years ago people regularly wrote things like "if (retval = f())" but after compilers started to complain about it people changed to write "if ((retval = f()))" instead, if they want an assignment and truthfulness testing at the same time.

There’s definitely a double free, provided that nowhere you proved that your conditions can’t be simultaneously true.

Imagine this pseudocode:

    void main() {
        check_goldbach(); // iterates over every integer > 2
                          // to see if it is an exception to 
                          // the goldbach conjecture, and 
                          // returns if so
This program may be memory safe. It's possible that there is no exception to the goldbach conjecture, so check_goldbach will never halt, so use_after_free will never be called. It also may not be memory safe. If the goldbach conjecture is not true, check_goldbach will eventually return and then we'll call use_after_free.

You say your compiler accepts all programs which are memory safe, and rejects all that aren't. If so, you can use it to prove or disprove the goldbach conjecture. Since proving or disproving that conjecture comes with a one million dollar prize, it is surprising that you can solve it with your method

edit: I didn't mean to disparage WatlerBright's work, I only was taking issue with the idea that any method of static analysis can accept 100% of correct programs and reject 100% incorrect programs. This is an unavoidable consequence of the halting problem and being aware of it allows you to decide where you want the tradeoff to be. it sounds like his implementation has no false negatives at the cost of a few false positives that are rare enough for them not to really be an issue, which is still really cool

> You say your compiler accepts all programs which are memory safe, and rejects all that aren't.

It does it by implementing O/B rules, which force changes in the way one writes code. This may sidestep the goldbach conjecture, which I don't understand.

It is possible that the compiler simply rejects any definition of a function that has a use after free, which would both uphold the compiler's claim and also not prove or disprove the goldbach conjecture.

Which it does, if such use is on any control flow path after the free(). For example,

   bool first = true;
   while (f()) {
     if (first) free(p);
     first = false;
is rejected as double-free.

What about

    while (f()) {
        p = NULL;

It'll compile because although free(p) consumes p and p is undefined immediately afterwards, p then becomes defined again via the assignment and so the while loop always enters with a defined value for p.

If that is the case then his approach has false negatives, because a function that is never called cannot introduce memory unsafety but would incorrectly be flagged as unsafe

> incorrectly be flagged as unsafe

It depends on what that means. For example, one can write perfectly correct code in a block of code marked "unsafe", it's just that the compiler cannot check it. In D, these are marked with the @system attribute.

We actually had a discussion about what to call such unsafe blocks. Other languages call them "unsafe", but I didn't care for that as it implied the blocks were actually broken. Hence we call such code "system" code.

There aren't false negatives by the design of the rules. This is different from what linters do, as linters issue warnings for code that is acceptable by the language rules.

> Other languages call them "unsafe", but I didn't care for that as it implied the blocks were actually broken.

Climbing a ladder on a table is unsafe. Does that imply the ladder or table is broken? No.

Now you're entering a bit of a philosophical or subjective area in my opinion - I personally would want any code which is incorrect by construction to fail to compile (if possible) even if it was impossible to reach the code in question at runtime. A "false negative" like that is a feature IMO.

having your static analyser solving million dollar problems seems like an unnecessary high bar.

how is your comment, under any reasonable interpretation of what Walter wrote, not pointless pedantry?

What Walter meant in his comment was that his static analyzer has no false positives (it never allows a memory unsafe program to compile). He's provided a few simpler examples in this thread of cases where his analyzer produces false negatives. I knew that was going to be the case because no program can flag 100% of negatives without also flagging a few positives by accident (as a consequence of Rice's theorem). I just meant to demonstrate that, but I don't know anything about D so I wanted to pick something that I was certain his analyzer would fail one way or the other.

That doesn't mean that his analyzer isn't useful. I find rust extremely useful because it catches almost me whenever I accidentally write memory unsafe code without burdening me too much otherwise. But almost every rust coder has run into a case where they wrote some code that was safe, but not allowed. That's just a fact of life and it serves no one to make incorrect claims about one's software

> (it never allows a memory unsafe program to compile

Those would be false negatives, not false positives. FP = flagging what should not be flagged, FN = not flagging what should be flagged.

> I don't know anything about D so I wanted to pick something that I was certain his analyzer would fail one way or the other.

What he is talking about is a proposed implementation of the solution. It isn't actually in the D language yet. No one's tested it (except for Walter). So I would definitely take its' validity with a grain of salt.

> It isn't actually in the D language yet.

It is in the released compiler as a preview feature.

Oh that buggy mess got merged huh.

> This is why D uses DFA to catch 100% of the positives with 0% negatives.

For example if we have a graph data structure and we are visiting it and freeing nodes, how can your static analysis know that nodes will not be visited twice by my logic and freed twice?

Because the DFA follows edges, not logic. If free appears twice along any sequence of edges, the compiler rejects it.

I think that's why you're getting the response you are in this thread - if it really did '100% of the positives with 0% negatives', it'd have to consider the logic. There's an impossibility proof for that which has already been cited in the thread. It's usually illustrated using a function that solves the halting problem rather than the Goldbach conjecture.

If the compiler rejects things that have free appear twice on branching control flow paths but aren't actually broken due to dynamic control flow (like your looping example given elsewhere) then that's a false positive, isn't it?

And people thought '0% negatives' was meaning 'no false positives'.

I see your point, and it's a valid criticism of my statement. I meant it more in the sense that the static checkers I have used would use hackish ad-hoc heuristics which did not work very well, and gets constantly tweaked by the implementor.

The D implementation, in contrast, uses a well-defined rule outlined above, and it works predictably and reliably (absent implementation bugs, rather than ad hoc rule shortcomings).

Another way of putting it is that the rules of the language are changed to make 0%/100% results workable. The compiler never says "there may be a problem here". It passes or is rejected. If it is passes, it is memory safe.

That approach is a really good point in the design space IMO. It is also the tradeoff that people have to make when designing type systems. Many programs are correct but rejected by their language's type system, but if the language is well designed it was probably a really weird and unintuitive program to begin with, so it's okay to reject it anyway.

I think the terms normally used for this is "completeness" and "soundness", as in, the D type system is "sound" because it rejects all programs with use-after-free, but it is "incomplete" because there are some programs that cannot exhibit use-after-free for any input values, but that the type system rejects.

Given these more precise definitions, the analogous theorem would be that there is no type system (or static analysis) for use-after-free which is both sound and complete.

For example, consider:

    for (i = 0; i < 10; ++i);
some C compilers issue a warning for the trailing ; as do some static checkers. If the programmer intended that, the warning is a false negative because the C language rules allow it.

However, such a construct is a hard error in D. Whether you meant it or not, it is not allowed. This is what I meant.

A construct in D that would pass is:

    for (i = 0; i < 10; ++i) { }

That would be a false positive, if I am understanding things correctly. The warning is about code that is correct, yet it warns, hence false positive. If the warning was missing for incorrect code that would be a false negative. At least that is the terminilogy used for static analysis.

> I noticed that Rust simplified this problem by simplifying the language :-)

That's a bit of an open-ended qualifier. Would you say that is a positive simplification (an increase in elegance) or one that has a trade-off in reducing its expressive power?

I mentioned it as another way of solving the problem. Which way you prefer is a matter of professional taste, something I'd really prefer not diverging into in this thread.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact