
Catching use-after-move C++ bugs with Clang's consumed annotations - akling
https://awesomekling.github.io/Catching-use-after-move-bugs-with-Clang-consumed-annotations/
======
matthewbauer
Perhaps the future of software isn't "rewrite everything in Rust", but instead
we end up annotating existing C/C++ with borrow checking information. This
would be similar to how there is a push in JavaScript to add TypeScript
annotations everywhere. It gives you the flexibility of the existing language
and ecosystem while still allowing you to use new techniques in programming
languages.

~~~
rubber_duck
My problem with C++ isn't the lack of borrow checker - this is the feature I
like the least in Rust (I know it's their core design goal but frankly the
inconvenience and limitations it imposes don't seem worth it for my use case,
and then there's the compile times).

C++'s lack of modules and package management, on the other hand, is a huge PITA
and I'm not optimistic that either of those, bolted on so late in the language's
lifecycle, will provide a useful solution.

It's a pity D took it too far in the other direction with GC and runtime - I
really could use a C with classes and modules.

~~~
abraxaz
I think Conan has real potential when it comes to package management.

Here is an example of what can be done with it:
[https://gitlab.com/xadix/xonan](https://gitlab.com/xadix/xonan)

It is kind of like Gentoo Portage or nix pkg and can also be used to manage your
toolchain.

~~~
plq
I don't get conan. Portage can already create a linux root in your homedir
(see the prefix project) and has a huge package repo. What am I missing?

~~~
pjmlp
Portage doesn't work across Windows, QNX, IBM i, IBM z, Android, iOS, macOS,
mbed, RTOS, ClearPath, Aix, HP-UX, INTEGRITY, FreeBSD, NetBSD, OpenBSD, PS4,
Switch, XBox, Tizen,...

------
sudeepj
With C++ even if your project follows this, there is no way to _enforce_ this
across your project's dependencies & wider eco-system.

With Rust, the entire ecosystem (however nascent) is subject to the same
strict rules. That will be a _big_ plus once the Rust ecosystem matures
functionality-wise.

~~~
akling
My project doesn’t have this issue ;)
[https://github.com/SerenityOS/serenity](https://github.com/SerenityOS/serenity)

~~~
logicprog
Wait, we enforce borrow checking rules on SOS?

~~~
akling
At the moment it only shows up as an on-screen warning in my Qt Creator thanks
to its clang integration. The default Serenity toolchain is using GCC-8.3.0
which doesn't support this trick at all, so we don't get any compile-time
enforcement. :( I keep thinking about switching over to a clang toolchain to
be able to do more stuff like this.
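
For anyone who hasn't seen the annotations in question, here is a minimal sketch of what they can look like. `Handle` and `demo_handle` are hypothetical names made up for illustration, not code from Serenity, and the macros expand to nothing on non-Clang compilers. The analysis is opt-in via Clang's `-Wconsumed` flag; as far as I understand it, the checker treats the source of a move construction as consumed.

```cpp
#include <cassert>
#include <utility>

// Clang-only attributes; other compilers see empty macros (assumption: the
// build should still succeed on GCC, just without the extra diagnostics).
#if defined(__clang__)
#  define CONSUMABLE(state) __attribute__((consumable(state)))
#  define CALLABLE_WHEN(...) __attribute__((callable_when(__VA_ARGS__)))
#  define SET_TYPESTATE(state) __attribute__((set_typestate(state)))
#else
#  define CONSUMABLE(state)
#  define CALLABLE_WHEN(...)
#  define SET_TYPESTATE(state)
#endif

// Hypothetical move-only handle used only for this illustration.
class CONSUMABLE(unconsumed) Handle {
public:
    explicit Handle(int value) : m_value(value) {}
    Handle(Handle&& other) noexcept : m_value(other.m_value) { other.m_value = 0; }
    Handle(const Handle&) = delete;

    // Calling this on an object the checker believes is consumed should be flagged.
    CALLABLE_WHEN("unconsumed") int value() const { return m_value; }

    // Explicitly puts the object into the consumed state.
    SET_TYPESTATE(consumed) void release() { m_value = 0; }

private:
    int m_value;
};

int demo_handle() {
    Handle a(42);
    int before = a.value();   // fine: a is unconsumed here
    Handle b(std::move(a));   // a is moved from
    // a.value();             // expected to be flagged by clang with -Wconsumed
    return before + b.value();
}
```

Without `-Wconsumed` (or on GCC) this compiles silently, which matches the situation described above: the annotations only help where a Clang-based tool is looking at the code.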

------
haberman
Why isn't a moved-from object _always_ considered "consumed"?

~~~
jdsully
Because it's still a legal and valid object according to the standard.

A moved-from vector might get reused to store new things, for example.
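
A small sketch of that reuse pattern, with the hypothetical function name `reuse_after_move`. The moved-from vector is in a valid but unspecified state, so the sketch calls `clear()` (which has no preconditions) to get back to a known empty state before reusing it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical example: hand a batch off, then reuse the same vector.
std::size_t reuse_after_move() {
    std::vector<std::string> batch{"a", "b"};

    // Ownership of the two elements moves to sink.
    std::vector<std::string> sink = std::move(batch);

    // batch is valid but unspecified; clear() brings it to a known state.
    batch.clear();
    batch.push_back("c");

    // Encode both sizes in one number: 2 elements in sink, 1 in batch.
    return sink.size() * 10 + batch.size();
}
```

A blanket "moved-from means consumed" rule would reject this code even though it is well defined.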

~~~
ndesaulniers
Are you sure? I thought a moved from object was left in an indeterminate state
and further use was undefined behavior. Move constructors and move assignment
operators that zero out the rhs members try to prevent duplicate references
from an object likely to be destructed soon, which might lead to dangling
pointers and then use after free in lhs. Can someone cite the spec and prove
me wrong, please?

~~~
jdsully
A move constructor is supposed to leave the donor object in a valid state. All
STL objects with one will do so.

If you write a poor move constructor that doesn’t do this you will get exactly
what you asked for. However the compiler has no way to know you half assed it
and must support a legal valid case.

See this Stack Overflow question:
[https://stackoverflow.com/questions/9168823/reusing-a-moved-container](https://stackoverflow.com/questions/9168823/reusing-a-moved-container)

To quote Office Space, “Your moved-from object only needs to be destructible.
But some people choose to allow more and we encourage that.”

~~~
comex
"if (x = y)" is also a legal valid use case (it sets x to y, and then tests
whether it's nonzero), but that doesn't stop compilers from complaining if you
write it, on the grounds that it's most likely a mistake. They should do the
same with use-after-move. There could then be some workaround or annotation to
disable the warning if you really did want a use-after-move, as there is for
the warning I mentioned (adding extra parentheses).
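
For reference, a sketch of the parentheses workaround for the assignment-in-condition warning. `assign_and_test` is a hypothetical helper; the warning in question is `-Wparentheses` in GCC and Clang.

```cpp
#include <cassert>

// Hypothetical helper: assigns y to x, then branches on the assigned value.
int assign_and_test(int y) {
    int x = 0;
    // Writing `if (x = y)` can draw a -Wparentheses warning.
    // The extra parentheses tell the compiler the assignment is intentional.
    if ((x = y)) {
        return x;
    }
    return -1;
}
```

An opt-out for an intentional use-after-move could work the same way: flagged by default, silenced by an explicit marker at the use site.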

------
innot
> Once you std::move an object and pass it to someone who takes an rvalue-
> reference, your local object may very well end up in an invalid state.

As far as I remember, move constructors/assignments must leave the moved-from
object in a valid state - just that the standard doesn't say anything about
what that state is for standard classes.

Also, I have seen code where some kind of "result" can be moved from an object
and then re-generated from scratch with different inputs. In that case it was
perfectly valid to use it after move. But that's nitpicking, anyways.

~~~
foota
I believe the standard says that a moved-from standard library object is left
in a "valid but unspecified" state, but it does not generally guarantee that
moved-from objects of other types must be valid.

~~~
ori_b
The destructor gets called on them. They need to be valid enough for that.

~~~
foota
Hm, technically I don't think this would be required. Take for instance:

    
    
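      auto no_destroy = new MoveNoDestroy();
    
      // The target must be a live object; assigning through an
      // uninitialized pointer would be undefined behavior.
      auto moved_into = new MoveNoDestroy();
    
      *moved_into = std::move(*no_destroy);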
    

This wouldn't call the destructor of the moved-from MoveNoDestroy.

~~~
0xffff2
Yes, if you leak resources, their destructors won't be called. That really has
nothing at all to do with move semantics though. I think the important point
is that move semantics don't alter the lifetime management of the moved-from
object.
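
To make that concrete, a minimal sketch with a hypothetical `Tracked` type that counts destructor calls. With automatic storage, both the moved-from object and the moved-to object are destroyed at scope exit.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts how many instances have been destroyed.
struct Tracked {
    static int destroyed;
    Tracked() = default;
    Tracked(Tracked&&) noexcept {}
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

int count_destructions() {
    Tracked::destroyed = 0;
    {
        Tracked a;
        Tracked b(std::move(a));  // a is moved from but still alive
    }                             // both b and a are destroyed here
    return Tracked::destroyed;
}
```

So the moved-from object has to remain at least destructible, unless it is deliberately leaked as in the example above.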

------
saagarjha
Looks like a lightweight borrow checker, although I wonder how well it fares
in places where lifetimes are difficult to track. Or is there a way to
annotate methods with this information as well?

~~~
cesarb
To me, it looked more like Rust's move semantics: in Rust, when an object is
moved it's "consumed" and cannot be used anymore. The borrow checker is for
when the object is not "consumed", only temporarily borrowed by some other
code.

~~~
saagarjha
I'm using the term "borrow checker" to encompass Rust's whole memory model,
but yes: this only seems to provide information about when something's been
"moved" (or "dropped") rather than "borrowed" in the temporary sense.

------
gumby
Very nice!

------
pjmlp
Cool idea.

------
rsp1984
I've been writing C++ for 21 years now (started when I was 14). To be honest,
I have never seen a solid case where move semantics provided added value (in
terms of code readability and maintainability) over just passing object
references as function parameters.

That big ugly object that would get copied on function return -- just create
it before the function call and pass it in as a reference! No copy required.
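
For comparison, a sketch of the two styles side by side. `fill_squares` and `make_squares` are hypothetical names. In the second form the local vector is eligible for copy elision or an implicit move on return, so it does not involve a deep copy either.

```cpp
#include <cassert>
#include <vector>

// Out-parameter style: the caller creates the object and passes a reference.
void fill_squares(std::vector<int>& out, int n) {
    out.clear();
    for (int i = 0; i < n; ++i) {
        out.push_back(i * i);
    }
}

// Return-by-value style: the local is moved or elided on return.
std::vector<int> make_squares(int n) {
    std::vector<int> result;
    for (int i = 0; i < n; ++i) {
        result.push_back(i * i);
    }
    return result;
}
```

Both produce the same contents; the difference is mostly in how the call site reads and whether the result can be initialized in one expression.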

~~~
jcranmer
C++ has a few major flaws with respect to move and copy semantics. The biggest
one is that copy semantics are default and silent: it requires less work to
copy something than it does to use it by reference, and there is no visual
indication if the value is being copied or accessed via reference. This means
that it is way too easy to accidentally copy large objects (such as
std::vector) without realizing you're copying them.

Most newer languages have realized that implicit copy semantics is usually a
bad thing, and duplicating objects requires explicitly saying that you're
duplicating them (such as calling a .clone() method). Of course, some types
are value types, where copy is cheap and better than reference, but such
mechanisms are opt-in. C++ automatically generates these mechanisms, and gives
you nasty error messages when you opt-out of default copy.

Move semantics are almost always better, but in C++, with its historical
baggage of opt-out-of-implicit-copy-semantics, it means that constructing
move-only types requires a lot of excessive calls to std::move. Compilers do a
good job of telling you when you put one too many calls to std::move in, but
the code is definitely verbose compared to Rust, to the point that it tends to
strongly weigh against actually using C++'s ability to annotate move-only
methods. Furthermore, without something like the mechanism in the blogpost
here, compilers don't give any indication of API misuse, so you can't leverage
move-only types to construct safe-by-construction APIs.

This is something I've been tripping over a lot recently, as I have a type
system where calling most methods makes the original object unusable.
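
A rough sketch of that kind of API using rvalue-ref-qualified methods. `Draft` and `build_greeting` are hypothetical names, not the actual type system mentioned above. Each method consumes the object it is called on, which is where the extra `std::move` calls come from.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical move-only builder whose methods consume *this.
class Draft {
public:
    explicit Draft(std::string text) : m_text(std::move(text)) {}
    Draft(Draft&&) = default;
    Draft(const Draft&) = delete;

    // The && qualifier means this is only callable on rvalues.
    Draft append(const std::string& more) && {
        m_text += more;
        return std::move(*this);
    }

    std::string finish() && { return std::move(m_text); }

private:
    std::string m_text;
};

std::string build_greeting() {
    Draft d("hello");
    // d.append(", world");  // does not compile: d is an lvalue
    return std::move(d).append(", world").finish();
}
```

Note that nothing here stops later code from touching `d` again after `std::move(d)`; that is the gap the consumed annotations from the blog post are meant to cover.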

~~~
rsp1984
I appreciate your reply. But frankly, why would someone "accidentally" copy
something by value?

I don't mean to sound arrogant but if someone is writing C++ at a level where
he/she does accidental copies then that's a _very_ clear sign to me to stay
the hell away from move semantics and other advanced features.

I personally find things being copied by default a nice feature. It's more
consistent than "PODs by value, objects by reference" such as in Java.

~~~
damnyou
There is no such thing as a bad programmer, only bad tooling. C++ is bad
tooling.

~~~
patrick5415
A craftsman doesn’t blame his tools.

~~~
damnyou
A great craftsman recognizes how important good tools are.

