Hacker News new | comments | show | ask | jobs | submit login
Modernize C++: Sink Functions (bfilipek.com)
51 points by joebaf 4 days ago | hide | past | web | favorite | 33 comments





I didn't see any valuable information here besides "Use unique_ptr instead of raw pointers". The rest is noise, and a contrived example. It seems like a profile post written to impress recruiters.

I learned about `.release()`, so that's something.

a shared_ptr would probably also be a better choice if the ptr should go through several nodes of the pipeline

A shared_ptr is a reference counted allocator, which has nontrivial runtime overhead. It's not what you want when you just want to pass down a reference to some object allocated as part of an enclosing stack frame.

The problem in the article is that the code has a firm RAII promise that the higher level object will be destroyed correctly (i.e. never before any called functions use it). But the author wants to add safety by ensuring that the called functions aren't able to store or copy the pointer or otherwise muck with that allocation promise. And the chosen syntax with unique_ptr works well enough I guess (though it will break if you need the same object used across two disjoint subtrees of the calll tree).

Personally? If this is really what you want then Rust is probably a better proposition as it doesn't require extra syntax for this case. Is it really what you want? Dunno. A simple application of "const" to a few places can get you maybe 70% of the way to the goal here in designs like this.


> though it will break if you need the same object used across two disjoint subtrees of the calll tree

Yes, while I prefer to use plain values or unique_ptrs when I can, I will use C++ references (or plain pointers) in this sort of situation. Basically in C++ I think an of old-style pointers/refs as Rust-like "borrowed" references.

This is an example of how the Rust safety model is really about compiler enforcement of good practices from C and C++ (as opposed to, say, GC which is a distinct practice). I appreciate what Rust is doing, but it is also true that good coding conventions in C++ will get you 90% of the way there.


> good coding conventions in C++ will get you 90% of the way there

For those that need it, SaferCPlusPlus conventions will get you even further.

> Basically in C++ I think an of old-style pointers/refs as Rust-like "borrowed" references.

They key safety feature of Rust references is that they have "scope lifetime" and their target is guaranteed to be alive for the duration of that scope lifetime. You can use SaferCPlusPlus' "scope pointers"[1] to achieve this in C++.

Scope pointers can be explicitly converted to raw pointers, so they can be used with existing "legacy" functions. But you get even more benefit when you change your function interfaces to support scope pointers directly[2]. For example, it would prevent the function from (inappropriately) putting a copy of the scope pointer into "long term storage" (like Rust does).

[1] shameless plug: https://github.com/duneroadrunner/SaferCPlusPlus#scope-point...

[2] https://github.com/duneroadrunner/SaferCPlusPlus#safely-pass...


Not if shared ownership is never actually required, though -- it's possible that ownership could transfer when moving through those different stages.

I'd think of smart pointers as function parameters as a code smell.[0] Heap allocations should be managed with an RAII class (of which smart pointers may be an internal implementation detail), and "sink functions" should take an rvalue reference (which semantically means "take ownership of this parameter").

[0] Whether a resource is allocated on the heap or elsewhere, or, whether a heap-allocated resource has owned or reference-counted semantics, are implementation details that should be abstracted away from your top-level APIs, so you have the flexibility to change them later.


Honestly, everyone reaches for shared_ptr way too quickly/easily. In the majority of cases it's a sign that you don't have a clear owner of a component and you've just deferred the decision to whatever (non-)deterministic behavior your code has.

I've found using unqiue_ptr + proper move semantics tend to lead to cleaner, single-owner architectures that are much more sane to maintain/scale.


Agreed; probably upwards of 90% of the uses of shared_ptr I've seen (and 70% or so with unique_ptr) are by people who clearly don't understand the C/C++ memory model and are trying to write Java in C++, and half the remainder are by people who've lost track of ownership and lifetimes and paper over it by randomly sprinkling smart pointers throughout the codebase until it sort of works.

I'm not against smart pointers where they make sense, it's just whenever I see them I start bracing myself for the horrors that seem to inevitably follow.


Yeah, it's also one of my big gripes about Java or any managed language. You never know if some jerk library grabbed a reference to your object when you passed it to a function and decides to leak it via holding a ref.

It's also why I really like that Rust decided to go RAII/single-ownership and add a ton of friction(via RefCell) to make sure that you really want shared ownership. It forces you to get your house in order up-front which I've found almost always leads to a better design overall.


RAII really is a brilliant design, it both (mostly) solves the problem of manual memory management and is easier (for my brain at least) to reason about than deterministic (shared_ptr, Obj-C/Swift) or nondeterministic (managed langs) garbage collection schemes. Once I grokked it I've never looked at managed langs the same way again.

In C++'s defense it's not like the language or the compilers encourage you to use shared_ptr, but it's also true (mainly for legacy C-interop cruft) it doesn't do enough to nudge you to write code in a proper RAII style, nor to enforce correctness WRT ownership.

I like that Rust forces you to explicitly opt out of RAII and/or single-owner semantics. (Those are the "defaults" in C++ too, but too easy to implicitly or accidentally opt out of). On the other hand you can write Java in any language, and from the little Rust I've read, newcomers from managed langs will often daisy-chain enough smart pointers in a row to put any 13-year-old who just downloaded Unreal Engine to shame.

The last time I took a serious look at Rust I thought it took things too far in the other direction; you weren't able to use two (nonoverlapping!) data members of the same struct in a function, and the suggested workaround was to wrap it in a smart pointer. Has that issue been fixed now?


> The last time I took a serious look at Rust I thought it took things too far in the other direction; you weren't able to use two (nonoverlapping!) data members of the same struct in a function, and the suggested workaround was to wrap it in a smart pointer. Has that issue been fixed now?

I think that's largely been addressed. Are you talking about doing something like this?

https://play.rust-lang.org/?gist=cf5835d28946df6c5093d57b7ca...


It's long enough ago I can't remember exactly what I was trying to do except that it was trivially correct, but yeah I think that was it. Anyway, I'm glad that's been solved; looks like I've got some new weekend plans! :)

Awesome :).

Something else to keep in mind that while it's nice to have the borrow-checker validate everything it's okay to step around it if you've got a case that's gnarly and you know is valid. I've had some C callback stuff where I run into that occasionally, I can't encode the lifetime semantics to the calling function so I have to side-step it.

In that case things like RefCell(which are separate from (A)Rc/Box) are a great "escape hatch".

Obviously you should try and keep it to a minimum but don't let it be a complete roadblock.


Closures still borrow whole structs from the captured scope when they could only borrow the fields they actually need:

https://play.rust-lang.org/?gist=26b2efa859589a8172cad982f0d...

This is especially frustrating when it's self that is being captured. There's an accepted RFC for fixing this but I'm not sure when it's going to be implemented.


You can work-around that by capturing the fields you need individually. It's a little awkward but does let you express what you're trying to do there:

https://play.rust-lang.org/?gist=2463ff559d7f96f916d246272b1...

Agreed that it's not the best default behavior though. Generally I tend to try to keep unrelated things in separate structs for this reason(with the nice side benefit of keeping concerns separate).


I agree the example looks "obviously" correct. But my paranoia worries that an aggressive optimiser might write to bytes next to the actual fields in ways that render the code incorrect.

While that worry sounds insane, it is exactly the kind of thing C compilers have retroactively defined as Undefined Behaviour. So kudos to Rust for having the compiler tell you up front where it reserves the right to do stupid stuff in future.


> everyone reaches for shared_ptr way too quickly

I dont't think it's as bad as it sounds, and I have an explanation: C++ has evolved to the point when the ease of coding in it is comparable with that of Java; so, more people use C++ as if it was Java, and shared_ptr is the closest thing to garbage collection that C++ standard library provides.


Oh no, it really is that bad.

You haven't lived until you've root-caused a ref cycle 3 libraries deep in a 200kloc+ codebase.

This problem also isn't exclusive to C++, I've had nasty libraries in Java add things to the root set silently to the same effect.


I find smart pointers super valuable for scenarios where you pass memory through a pipeline. A while I ago had to pass measurement data though a pipe of customizable filters and it was really nice to know that the data would be freed once the last filter was done without writing tracking code. But I agree that they tend to get overused.

I had a similar conundrum when I was trying to learn modern C++ by writing a network stack. On one hand just wrapping my packet objects with a {shared,unique}_ptr meant that I could freely pass these objects through a pipeline and not worry about lifetime or accidentally freeing somewhere I shouldn't have.

Although it seems to me that most people seem to be proponents of actually sitting down and working out object ownership of your classes rather than just being lazy and using smart pointers everywhere.

I'm still experimenting with this approach, definitely slows me down right now but I imagine that's because I haven't done it enough. Not going to lie, coming from Java/C#, it's really convenient just being able to 'new' an object and not worry much about its lifetime.


I don't think it's lazy to pass around smart pointers. Obviously you need to understand the consequences but in many cases like pipelines it's a pragmatic solution that saves a lot of code.

> "sink functions" should take an rvalue reference (which semantically means "take ownership of this parameter").

Doesn't that only work when the parameter is movable? For older classes that don't yet have move semantics, unique_ptr is an easy wrapper to give them efficient move semantics, is it not?


I'd argue that classes with a single field are a worse code smell. A using or typedef would be better, at least then your code isn't littered with unneeded field accesses.

I agree that these methods should just take rvalue references, though. Doesn't really matter for unique_ptr, but for other things like shared_ptr it adds up quick.


>I'd argue that classes with a single field are a worse code smell. A using or typedef would be better

Sometimes, but not in this case, because

- Smart pointers are nullable, which should be opt-in and not opt-out behavior.

- Smart pointers have pointer-like interfaces, which do not read like idiomatic modern C++. A wrapper class `GadgetHandle` can define conversion operators for `Gadget&` and `const Gadget&` so their interface reflects their semantic meaning (a `Gadget`) and hides irrelevant or unsafe implementation details (the operators and methods belonging to the smart pointer class).

- As a corollary to the above, you can create several classes `UniqueGadgetHandle`, `SharedGadgetHandle`, `CustomGadgetHandle` and so forth, which enable you to very easily and transparently intermix or alter allocation strategies (even on the stack or statically) for your `Gadget`s, usually without needing to alter any public APIs or more than one section of your codebase.

- When your heap-allocated data [structures] don't semantically correspond to the containers provided by the STL or your other library dependencies, I find that chances are you have or may add some custom constructor/destructor logic beyond just new and delete.


You cannot use a typedef if you intend to create a new type (e.g. one that provides a specific set of methods but otherwise identical to a built-in type).

> Heap allocations should be managed with an RAII class

Isn't this what a `unique_ptr` is (a RAII class that manages Heap allocation)?

> sink functions" should take an rvalue reference (which semantically means "take ownership of this parameter").

Why not take by value?


>Isn't this what a `unique_ptr` is (a RAII class that manages Heap allocation)?

See my reply to kllrnohj above: https://news.ycombinator.com/item?id=15872509

>Why not [write sink functions that] take by value?

That could work in the case where the parameter is a class which is movable but not copyable, but I'd still prefer an rvalue reference in that case because it makes your meaning obvious without having to refer to the class definition or documentation.


> "sink functions" should take an rvalue reference

You mean using move semantics?


Yes, two phrases meaning the same thing. ("Takes an rvalue reference" is slightly less ambiguous because by-value parameters can also in certain circumstances be move-constructed).


Isn't that just Rust's move semantics + `Box`?

Instead of improving a detail, you can just improve everything and move to Rust... :>




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact

Search: