Wrong by Default (kevincox.ca)
105 points by nalgeon on May 16, 2022 | 80 comments



The problem with RAII is that it is semantically coupled to destruction in most (all?) of the languages it is implemented in, which is tied to lexical scope.

In GC'd languages where lifetime is not meaningfully coupled to scope, RAII cannot work the way you expect it to. The most trivial way to handle it is to couple to finalizers, which is tricky for two reasons: firstly it is not deterministic when (or if, in case of an exception/fatal error) the finalizer runs, and secondly it means the GC must keep track of user defined finalizers to execute on collection (what if they trigger an error? What if they take a long time? What if they block?).

The solution is to find some way to tie RAII to lifetimes, which may require exit analysis and can be extremely difficult, if not impossible, to implement in the language. The "least bad" way to deal with the problem is defer-like semantics.

I'm not a huge fan of defer, but it's important to understand why it exists. It's hardly "wrong by default" and more "understandably different."


> which is tied to lexical scope

This isn't really true. It is kinda true in C++ but fairly well fixed with moves. In Rust it isn't true at all because moves are the default so any time you return a value or put it in a data structure the lifetime is no longer lexically bound.

> It's hardly "wrong by default" and more "understandably different."

I think it is both. What I mean by "wrong by default" is that if you write

    f = open(name)
The code is now wrong. This is the default behaviour. To make the code correct you need to add a second bit of code (a close call). Defer is a powerful mechanism to do that, but it is a tool to solve the core wrongness.
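
For contrast, a minimal Rust sketch of the same acquisition (file name invented): dropping the handle closes it, so no second bit of code is needed.

    use std::fs::File;
    use std::io::{self, Read};

    fn read_config() -> io::Result<String> {
        let mut f = File::open("config.toml")?; // acquire
        let mut contents = String::new();
        f.read_to_string(&mut contents)?;
        Ok(contents)
        // `f` is dropped here, and on the early-return paths above,
        // which closes the file without any extra call.
    }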


Note that they said the destruction is tied to lexical scope, in general, not necessarily the lexical scope of creation.

Their point is mostly that once you "drop" an object in a GC'd language, it will be collected at some later time independent of lexical scope.

Which I don't think is much of an issue, technically the language could "drop" (close) such objects lexically and leave memory reclamation for later. Not a huge issue.

The bigger issue is that GC-based objects have a strong tendency to share, a lot. That makes RAII way more challenging, when using the objects directly for starters, but even more so when composing RAII objects and non-RAII objects.


In both C++ and Rust the destruction is tied to the value's lifetime. In the two languages that everybody remembers when they talk about RAII, destruction is not tied to lexical scope.

(Although, on C++ that requires extra code for your class to behave correctly. The resource managing class is wrong by default, but the code that uses it is not.)


In Rust what you wrote is just how it works because it's how move was defined in the language, but I think in C++ at most you could say that this is how people conceptually try to arrange for things to happen, and at that point we're back to "Wrong by Default".

All the C++ compiler actually gives you is destruction at end of lexical scope, and then it's your job as programmer to do what people actually wanted instead.

Historically what you do in C++ is copy things a lot, or work with pointers. Modern C++ offers a move, but for backwards compatibility reasons it had to preserve that behaviour where the destruction happens precisely when you leave scope, so, move "hollows out" objects, leaving something which can be destroyed later.

Thus it's your job, when implementing a C++ type, to ensure that the "real" work happens whenever a programmer would expect, leaving just some shell to be properly "destroyed" later when scope ends. Whereas for Rust you just implement Drop correctly and the compiler arranges for it to happen promptly.
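
A small Rust sketch of that division of labour (names invented): the type implements Drop once, and moves decide where the drop actually happens.

    struct Connection;

    impl Drop for Connection {
        fn drop(&mut self) {
            // The "real" work: runs exactly once, wherever ownership ends up.
            println!("closing connection");
        }
    }

    struct Session {
        conn: Connection, // moving a Connection in transfers responsibility
    }

    fn make_session() -> Session {
        let conn = Connection;
        Session { conn } // moved out, so NOT dropped at the end of this scope
    }

    fn main() {
        let s = make_session();
        // ... use s ...
    } // `s` (and the Connection inside it) is dropped here, not in make_session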


> All the C++ compiler actually gives you is destruction at end of lexical scope, and then it's your job as programmer to do what people actually wanted instead.

`new` and `delete` are built into the language (and smart pointers into the standard library). The statement that object lifetime is tied to lexical scope is plainly untrue.

> Thus it's your job, when implementing a C++ type, to ensure that the "real" work happens whenever a programmer would expect

The auto-generated (`= default`) copy and move constructors do the right thing by default if you compose your type from other (correct) types. Yes, in the rare case that you write a type that has to implement RAII, the compiler can do the wrong thing by default ("copy just copies the pointer"), but then you write a wrapper class once and you're done. If that's even necessary at all, since unique_ptr can have custom deleters. (You wouldn't violate Single Responsibility by messing with RAII in a class that also does other things, right?)


This is only technically true, in practice GC languages have workarounds. C# is almost entirely GC but has the IDisposable pattern and using statements to make up for this.


Even in GC languages, you almost always want to free the resource at the end of a lexical scope, usually the same one where it was acquired. What if, instead of requiring special syntax (defer, try-with-resources, with etc.) for the normal case where you want it freed at the end of scope, you use special syntax to specify it shouldn't be automatically cleaned up, because it is persisted somewhere else?


Also, `defer` is a patch applied to lame languages like C. It's fine as such -- the problem is the language, and that can't be fixed because the installed base is large.


Go and Zig have had defer pretty much from the beginning.


I've been so annoyed with people falsely equating defer and RAII that it didn't occur to me this is yet another way it's inferior!

The thing that usually comes to mind for me is that defer only works for local scope. RAII is more general because RAII objects can be put inside other objects that have different lifetimes. I can have an OpenFile inside a struct inside a reference counted smart pointer which will make it so that when the last user using that file disconnects the file is automatically closed. Defer can't do that.
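
A rough Rust sketch of that exact shape (names invented): a File inside a struct inside a reference-counted pointer, closed when the last user lets go.

    use std::fs::File;
    use std::rc::Rc;

    struct LogFile {
        file: File, // closed by File's own Drop when the LogFile is dropped
    }

    struct Client {
        log: Rc<LogFile>, // every client shares the same open file
    }

    fn main() -> std::io::Result<()> {
        let log = Rc::new(LogFile { file: File::open("app.log")? });
        let a = Client { log: Rc::clone(&log) };
        let b = Client { log: Rc::clone(&log) };
        drop(log);
        drop(a);
        drop(b); // last owner disconnects here, so the file is closed here
        Ok(())
    }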


Indeed `defer` quickly gets real complicated when you need to return an object but you also may need to clean it up e.g.

    f, err := os.Open("file.go")
    if err != nil {
        return nil, err
    }
    
    data := make([]byte, 100)
    count, err := f.Read(data)
    if err != nil {
        return nil, err
    }
    
    _, err = f.Seek(0, 0)
    if err != nil {
        return nil, err
    }

    return f, nil
You can't just `defer f.Close()` after the first condition (regardless of flushing / write loss issues, we'll assume read-only) because you want to return it. You need to either:

* defer a callback and use a flag

* or specifically not use defer, remember to close the file on every path, and hope your linter supports that pattern

RAII? RAII just does the right thing there (disregarding the same flushing issue on writes). In case of error you drop the file and that closes it. In case of no-error you return the file object, which automatically makes it the responsibility of the caller.
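
For comparison, a loose Rust equivalent of the Go snippet above (buffer size arbitrary): every early `?` return drops `f`, closing the file, while the success path hands the open file, and the duty to close it, to the caller.

    use std::fs::File;
    use std::io::{self, Read, Seek, SeekFrom};

    fn open_and_peek(path: &str) -> io::Result<File> {
        let mut f = File::open(path)?;

        let mut data = [0u8; 100];
        let _count = f.read(&mut data)?; // on error, `f` is dropped and closed
        f.seek(SeekFrom::Start(0))?;     // same on this path

        Ok(f) // on success, ownership (and the duty to close) moves to the caller
    }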


Some languages (like D and Zig) distinguish between "error defer" and "success defer", which solves the issue above.

Not as automatic as RAII, but better than the single defer (which sucks).

EDIT: My personal preference is actually something like Python's "with" or C#'s "using", since they rely on "destructors" rather than explicit calls to close. But they have the exact same issue as simple defer. If you ever need to store or return the thing, you need to handle it manually instead.

In my own usage I almost never keep stuff open for long or store it in complex data structures where this is an issue in practice, simple "using" is almost always enough. Rust and C++ need RAII because they also manage memory like any other resource, otherwise it would get incredibly painful.

(Zig uses arena and pool allocators to manage memory, if you had to manage all memory with defers I think people would go nuts)


The issue I run into with Python's "with" is that you can't turn a class into a context manager without changing the calling scope. Suppose you have a class that initializes itself from a file, reads the entire file into memory, then accesses a few bytes here and there of the contents as needed. Somebody rightfully notices that the memory usage is way too high, so you instead open the file, and on demand seek to the location and read those few bytes.

The class now stores a file handle, which it should close. So the class should be changed to a context manager. So every call site needs to change from "x = MyFoo()" to "with MyFoo() as x:" in order to work correctly. With RAII, everything is already lifetimes, so the calling scope doesn't need to change if you add a destructor.
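
A small Rust sketch of that last point (MyFoo and the paths are made up): swapping the in-memory contents for an open handle adds a field and, implicitly, a destructor, while the call site stays a plain construction.

    use std::fs::File;
    use std::io;

    // Before: held a Vec<u8> of the whole file and needed no cleanup.
    // After: holds an open handle instead. Call sites do not change.
    struct MyFoo {
        file: File,
    }

    impl MyFoo {
        fn new(path: &str) -> io::Result<MyFoo> {
            Ok(MyFoo { file: File::open(path)? })
        }
    }

    fn main() -> io::Result<()> {
        let x = MyFoo::new("data.bin")?; // still no `with MyFoo() as x:` needed
        // ... read a few bytes here and there via x ...
        Ok(())
    } // x is dropped here, which now also closes the handle it holds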


I commented something relating to this in another post, but since in Python/C# the resource and the container of the resource don't actually have tied lifetimes, you can do a different pattern where MyFoo() is a perfectly normal class, with a method within it that actually creates the context manager that loads the resource temporarily.

If you do this, you can move and store MyFoo's around, but in order to access the resource without getting an exception to the face you need to use with, e.g.

    myfoo = MyFoo()
    # the actual resource is only acquired for the duration of the with block
    with myfoo.acquire():
        myfoo.read_stuff()
    myfoos = [myfoo]

    # later
    with myfoos[0].acquire():
        myfoo.read_stuff()
Lifetimes are loose in GCed languages, so you shouldn't really think of some object's lifetime and the resource's lifetime as being tied in any way, nor should APIs be designed with that assumption, even if the object represents the resource (confusing, I know, especially since the languages themselves get this wrong).


> My personal preference is actually something like Python's "with" or C#'s "using", since they rely on "destructors" rather than explicit calls to close. But they have the exact same issue as simple defer. If you ever need to store or return the thing, you need to handle it manually instead.

Python's "with" doesn't. That is to say, when the with-block closes, it doesn't call any kind of destructor against the variable. It calls the __exit__ method, but it still leaves the variable in-scope. And if the __exit__ method tries to call an actual destructor against the variable, you'll hit a double-free bug when the GC runs later.

You can safely keep using the variable after the block, and do anything it supports after closing. Including returning the variable.

    def retIOWrapper(x):
        with open(x) as f:
            pass
        assert(f.closed == True)
        return f
Python's with isn't introducing a scope lifetime.


C#'s using is also not calling the "destructor", it's calling .Dispose()

C# and Python have actual destructors "nobody" uses because they are GCed languages which make this discussion confusing (well, in Python they do sometimes, because it is reference counted and more predictable, making the discussion even more confusing).

I'm treating __exit__() and .Dispose() as "destructors" here, in the sense that they are implicitly called methods that free the resource (like drop in Rust), vs defering a call to close, which is explicit and "configurable" (you have to call the right method).

You get to keep the variable around, but the resource itself is gone after the with/using block, you have to reacquire it. With some changes to the way the APIs were written, it would make a bit more sense (IMO):

    files = [File("a.txt"), File("b.txt"), File("c.txt")]
    for f in files:
        with f.open('r'):
            print(f.read())
    return files
I personally extremely dislike the fact that "as f" is not scoped.


This can sometimes be useful, though I haven't yet decided if I think it's a dirty hack or not. At one point, I made a timer whose __enter__ method started the timer, and whose __exit__ method stopped the timer. After the "with", the elapsed time could be queried.


I feel like that would be better as

    t = timer()
    with t:
        ...
(I've done this exact thing for this exact use-case)


That is used by several context managers, mostly those related to intercepting exceptions and warnings.


RAII is superior, but there's also another go pattern for that which reads better than your two alternatives sometimes:

    // You _must_ 'defer closeFunc()' (with closeFunc being the second return value) after this function,
    // even if this function returns an error.
    // You do not need to call File.Close() on the returned file, only closeFunc()
    func openFileAndPeakHeader(path string) (*os.File, func() error, error) {
        closeF := func() error { return nil }
        f, err := os.Open(path)
        if err != nil {
            return nil, closeF, err
        }
        closeF = f.Close
        // ...
        data := make([]byte, 100)
        if _, err := f.Read(data); err != nil {
            return nil, closeF, err
        }
        return f, closeF, nil
    }
Unfortunately, it relies on someone reading docs for them to determine that this isn't an Either<(File, func), error>, but rather an Either<File, (func, error)>, but that's how go works anyways. You have to read doc comments to know what types an 'error' might be, you have to read docs to know if return values are Either or not (i.e. file.Read returns n, err, and both are meaningful, not just one of them).

It's not good, but I think it's better in some cases than the options you presented.


> You can't just `defer f.Close()` after the first condition (regardless of flushing / write loss issues, we'll assume read-only) because you want to return it

You would never include Close in this function, period. In this example above, it would always be the caller closing the file:

    package main
    import "os"

    func hello() (*os.File, error) {
       return os.Open("file.go")
    }

    func main() {
       file, err := hello()
       if err != nil {
          panic(err)
       }
       defer file.Close()
    }


Now, though, this extra work (the thing I got has to have deferred work done when I don't need it any more) is given to your caller.

For a File Handle this might feel natural, but what about if I hand you a... 3D printer? a Bank Account? a Youtube Video? a Package Delivery? Which of these need some sort of "deferred close" action ? You probably need to read the docs and arrange to close or not close things accordingly right?

You're going to be Wrong by Default again.


The example above has an operation that might fail between the Open and the return. Leaving that out of course makes the example useless, but if you keep that part in, you can't use go's simple defer pattern to help you close the file if that intermediate operation fails.


The problem is, the programmer (caller) forgets and that’s the whole point of having a more or less automated way of releasing/closing/etc.

Apart from that, it’s bad design IMO (leaky abstraction).


> In this example above, it would always be the caller closing the file:

You missed the entire point, literally. Maybe try reading the original comment again?


What's so "complicated" about sticking a f.Close() before those latter two nil, err returns?


Getting it right once, in an example, where everything is visible within ten lines? Not complicated at all.

Getting it right every time, across thousands of lines of code, interspersed with other functionality? Very complicated.

If you design a system that requires multiple locations to be in sync with each other for correctness, that's a system that will quickly become incorrect.


If you're the 1% building wrappers around syscalls and IO, you're gonna have to think carefully about all these failure conditions.

Meanwhile, the other 99% of IO-invoking code gets to benefit from the syntactic sugar of `defer`.


Not the main point here, but I think you could almost fix the defer approach with types. Instead of:

    File f = File.open(...);
    defer f;
You could have:

    PleaseDeferMe<File> p = File.open(...);
    File f = defer p;
If you want to actually access that file you've created, your only path is through using defer.

One case this doesn't solve is creating an object but never using it. When writing new code, you probably won't make that error, but when ripping out old code you might miss a piece you should have deleted. If the compiler warns about unused variables (for every possible code path), it could catch that too.

(This could be generalized a bit. Call it CleanMeUp<T>, have a CleanMeUp.takeResponsibility() method that returns T, and maybe defer isn't the only thing that can call takeResponsibility().)


Java’s `throws` keyword comes to mind. It forces you to handle exceptions or explicitly declare that you expect the exception to be handled by a method higher up the call stack.

Your strongly-typed suggestion seems better. But you’d have to somehow allow for cases where a function opens a file, does something with it, and also expects the file to be closed by a higher function. Perhaps something like:

    PleaseDeferMe<File> BeginWrite() {
      PleaseDeferMe<File> p = File.open(...);
      // unsafe! Make sure to return p to the calling function so this gets called with “defer” at some point 
      File f = unwrapDefer p;
      f.Write(some header)
      return p
    }


Yeah, or I guess you could pass some object in:

   File BeginWrite(DeferTrustee t) {
     PleaseDeferMe<File> p = File.open(...);
     File f = t.defer p;
     f.Write(some header);
     return f;
   }
And then the caller would look something like:

   void WriteWholeFile() {
     DeferTrustee t = defer; // defers to the end of this function
     File f = BeginWrite(t);
     f.Write(the body);
   }
Then you can still defer before anything else and avoid unsafe situations.


I can’t believe that people still use opening/closing files as examples of RAII when that is one of the cases that RAII can’t handle correctly. A call to close is the last chance the operating system has to signal that an error occurred when writing. You can’t easily signal errors in destructors. In C++, you basically can’t throw an exception and there is no return value from a destructor. Rust suffers the same kind of problem. The only way to properly close a file is to use regular old explicit control flow with if statements. Then you can handle the errors in the normal way.


Handling an error from close is a bad way to handle the error, because the file is already closed, so you are pretty much stuffed.

You want to call fsync() instead, and grab the error there, which will prevent you from being able to get an error from close() - see "Dealing with error returns from close()" in https://www.man7.org/linux/man-pages/man2/close.2.html

If you get it on close, you have ~no idea which operation failed.

Obviously doing it on the operation allows you to wrap it up in the handling of the operation and allows you to do RAII (or DIRR as it probably should be known).

(The EBADF option is nonsense in a properly developed application, and should cause a panic. The EINTR position is just a mess.)


Error-on-close is basically another form of error-on-destruct, and like you say it doesn't make much sense to raise an error about something that no longer exists. It's not RAII that's wrong here, but the messy semantics of close() and its use as an ad-hoc asynchronous error reporting mechanism.


Of course, other flow control options are available which also prevent this mess - e.g:

  int failed = with_file("foo", |file|{
       write(file, "bar");
  });

  fn with_file(char* foo, lambda: Fn(File)){
    let f = open(foo);
    if !f { return f; }
    let err = lambda(f);
    let cerr = close(f);
    if err {
       return err;
    }
    return cerr;
  }


Is this Rust or a different language?


Looks like a mix of Rust and C. Like a French person trying to speak English, but still saying half of the words in French.


I was aiming for pseudo-code. You can write it in C, but it's a little less readable due to the lack of first class lambdas.


There is a right way to do this using RAII, and it works for files and for any other resource that might fail on close: you simply explicitly do the close yourself, but only along the success/happy path. That way you get the best of both worlds: automatic clean-up (ignoring errors) along error paths, and proper close-and-check-for-errors along the happy path.

Of course it'd be even nicer if the implicit RAII close could somehow add its error to the error you are currently propagating, but that's beside the point for your example.
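
A Rust sketch of that pattern (file name invented; std's File has no error-returning close, so sync_all stands in for the explicit happy-path check): errors are surfaced explicitly on success, and every error path falls back to the silent close in Drop.

    use std::fs::File;
    use std::io::{self, Write};

    fn write_report(data: &[u8]) -> io::Result<()> {
        let mut f = File::create("report.txt")?;

        f.write_all(data)?; // any failure: early return, Drop closes f silently

        // Happy path only: push the data out and check for errors explicitly,
        // rather than relying on the silent close inside Drop.
        f.sync_all()?;
        Ok(())
    } // f is dropped (closed) on every path; only the happy path checked errors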


Sure, but defer still fares worse than RAII does even in this example. In this particular example, yes, you'll need to call flush yourself and handle the error yourself: which you'd still need in a GC'd lang/lang w/ "defer", in addition to remembering to close the file (which the RAII example will have your back on).


> The only way to properly close a file is to use regular old explicit control flow with if statements. Then you can handle the errors in the normal way.

Other than crashing the program, what could you possibly do?

This is what portable software has to work with [1]:

       If close() is interrupted by a signal that is to be caught, it shall return -1 with errno set to [EINTR] and the state of fildes is unspecified. If an I/O error occurred while reading from or writing to the file system during close(), it may return -1 with errno set to [EIO]; if this error is returned, the state of fildes is unspecified.
And it's not just the standard being dumb. Here's what Linux's manual page says about it [2]:

       Retrying the close() after a failure return is the wrong thing to
       do, since this may cause a reused file descriptor from another
       thread to be closed.  This can occur because the Linux kernel
       always releases the file descriptor early in the close operation,
       freeing it for reuse; the steps that may return an error, such as
       flushing data to the filesystem or device, occur only later in
       the close operation.

       Many other implementations similarly always close the file
       descriptor (except in the case of EBADF, meaning that the file
       descriptor was invalid) even if they subsequently report an error
       on return from close().  POSIX.1 is currently silent on this
       point, but there are plans to mandate this behavior in the next
       major release of the standard.

       A careful programmer who wants to know about I/O errors may
       precede close() with a call to fsync(2).

       The EINTR error is a somewhat special case.  Regarding the EINTR
       error, POSIX.1-2008 says:

              If close() is interrupted by a signal that is to be
              caught, it shall return -1 with errno set to EINTR and the
              state of fildes is unspecified.

       This permits the behavior that occurs on Linux and many other
       implementations, where, as with other errors that may be reported
       by close(), the file descriptor is guaranteed to be closed.
       However, it also permits another possibility: that the
       implementation returns an EINTR error and keeps the file
       descriptor open.  (According to its documentation, HP-UX's
       close() does this.)  The caller must then once more use close()
       to close the file descriptor, to avoid file descriptor leaks.
       This divergence in implementation behaviors provides a difficult
       hurdle for portable applications, since on many implementations,
       close() must not be called again after an EINTR error, and on at
       least one, close() must be called again.
And, just to finish it off, Windows is completely silent on this matter. It mentions EBADF, and that's it [3]. The CloseHandle() function, which POSIX close() (a.k.a. _close()) uses under the hood, also doesn't seem to say anything about whether CloseHandle() failing on a file leaves the handle opened, closed, or in limbo.

And since all of these documentations reserve the right to not actually flush the file when it's closed, any warnings or errors are going to be "best-effort" anyway. If you really want to make sure the file actually gets written, call fsync().

[1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/c...

[2]: https://www.man7.org/linux/man-pages/man2/close.2.html

[3]: https://docs.microsoft.com/en-us/cpp/c-runtime-library/refer...


> Retrying the close() after a failure return is the wrong thing to do, since this may cause a reused file descriptor from another thread to be closed

Imagine how tricky it could be to debug this or try to reproduce it. Quite the nasty, subtle boundary-case bug, especially in Golang where cleanup deferment is encouraged and touted in lots of example code.

It's easy to imagine a segment of naive language users completely unknowingly risking creating environments where such a bug can thrive.

It goes like this: first deferring the call to close after opening a file and not receiving an error, then calling close again at the end of the happy path "just to be sure" or because "now I'm done with it". And then blundering along their merry way, believing there is nothing wrong with this and "go has my back".

(1) Common file API abstractions that taste more or less like OOP, combined with (2) defer-capable languages (like Go) where this is the "best practice": this class of bug should be considered common and expected.

The tricky part lies in the gap between being newbie-friendly vs. correct and safe. Language makers can't have it both ways.

Maybe static code analysis could detect it and refuse to compile, as it should be considered a foot-gun type of error.

p.s. notriddle: thank you for inlining complete-ish relevant detailed information in your post. It's nice of you.


> ... especially in Golang where cleanup deferment is encouraged and touted in lots of example code

Go's os.File is not a plain file descriptor. It can be safely closed more than once and will close the underlying FD only the first time the .Close() method is called.


Eh, I get why people often don't like things like defer, but I like how versatile it is. In general, my ideal language would have both something like defer as well as proper RAII. In addition to being easier to write one-off cleanup statements, defer can be more readable if the cleanup is simple since it doesn't require looking at any data structure's code.

Finally, defer can be advantageous in certain scenarios since, at least in Go, it is somewhat dynamically scoped. Therefore, you can defer something in an if statement or in the middle of a loop and the defer will still be executed at the end of the function. This can also be a disadvantage with loops since if you do something like "defer C.free(tempMem)", the memory won't actually be freed until the end of the function.


The thing is, you can trivially implement a form of defer with RAII. Where I work we have a decades old C++ macro called ON_BLOCK_EXIT that people use to clean up C-style things that don't have a destructor. Today it's much better with C++ having lambdas.

It won't be "function-scoped" like Go, but (IMO) that's fairly useless feature if you have destructors anyways (you could just collect everything in a vector at function-scope, if you really want that behavior).
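
A minimal Rust sketch of the same trick (the scopeguard crate offers a polished version of this idea): a guard type stores a closure and runs it from Drop.

    // A hand-rolled "defer" built out of RAII.
    struct OnBlockExit<F: FnOnce()>(Option<F>);

    impl<F: FnOnce()> OnBlockExit<F> {
        fn new(f: F) -> Self {
            OnBlockExit(Some(f))
        }
    }

    impl<F: FnOnce()> Drop for OnBlockExit<F> {
        fn drop(&mut self) {
            if let Some(f) = self.0.take() {
                f();
            }
        }
    }

    fn main() {
        let _guard = OnBlockExit::new(|| println!("cleanup runs when the block ends"));
        println!("doing work");
    } // _guard is dropped here and the closure fires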


It depends on what you are working on, and then even "not wrong by default" might be... wrong.

For low level code, destructors/RAII are much harder to notice than a very explicit defer. Even the "kernel C" style "goto done;" to a unified exit point (sometimes multiple labels that fall into each other to clean up according to how far you got) wins out by explicitly telling you what will happen.

If you are working on projects where you often spend much more time reading code than subsequently writing code, and if the thing you are working on is not a high level application where the meat is in some business logic, then anything that reduces readability in the sense of "I can see everything that is going on" is very bad.

In low level code, the potential for icky problems is also much higher: You don't want to find out that some stray destructor did something that should not have happened in e.g. an interrupt context, or while the current thread's memory management is in an intermediate state. That is not correct by default, it is "wrong and you didn't even tell me".

In high level application software development, e.g. CRUD SaaS or a smartphone app, the opposite may be the case, and decluttering might well win out.


I don't get why you're getting down voted. This seems like a well balanced approach. I feel like so many programmers these days focus on "pretty" code and not readable code. Hidden complexity is hidden complexity. RAII hides complexity, and it is not like a function call as another user tried to say.

RAII implicitly happens at the end of an object's lifetime. In other words, there is no code that shows where or if this happens. Unless you look at the documentation or the code for the object you're using and manually check and see if it actually has a non default destructor. Whereas you can see exactly where a function is called and then inspect the details if you want.

I don't get why programmers so vehemently oppose a mix of good ideas. Why does everything _have_ to be RAII? Or everything _has_ to be OOP? I think we're finally making strides away from this mentality, but more people need to be thinking about the benefits of mixing several different styles. There are pros and cons to RAII and there are pros and cons to defer.

There is no silver bullet. I hope people quit pretending that there is a silver bullet. And instead embrace the good ideas that come out of different styles of coding instead of shoehorning everything into the patterns that we're already used to.


> then anything that reduces readability in the sense of "I can see everything that is going on" is very bad.

Nonsense. So called "kernel C" still uses functions, and lots of them, scattered over thousands of files. You never "see everything that is going on." And "kernel C" also uses function pointers and macros, further preventing you from "seeing everything that is going on."

Having written and read lots and lots of C++, RAII being "hidden" is hugely overblown by the diehard C crowd. It's really not hidden at all, first of all (it's often right there in the type - pretty hard to forget that unique_ptr is going to call delete, for example). And second it goes so so far in preventing resource leak bugs, which is also something kernels and low level code care a lot about. In fact, they almost always care far more about runtime correctness on this than readability. The benefits RAII provide here so massively outweigh the "hidden" concerns it's almost laughable.

> You don't want to find out that some stray destructor did something that should not have happened in e.g. an interrupt context, or while the current thread's memory management is in an intermediate state. That is not correct by default, it is "wrong and you didn't even tell me".

RAII isn't just randomly injected. It's governed by the types you use. If a given type cannot be safely cleaned up in a given context, then it probably also wasn't safe to get in that context in the first place. Which the type should be communicating. So no it's not "wrong by default" since the initial (explicit!) acquisition was the "wrong" part, not the cleanup.

Regardless, it's not like C helps here at all. You still have to call the cleanup function for your given data, and that cleanup function may change without your interrupt handler knowing about that change.


> Having written and read lots and lots of C++, RAII being "hidden" is hugely overblown by the diehard C crowd.

I work on low level projects which use both at the same time.

> If a given type cannot be safely cleaned up in a given context, then it probably also wasn't safe to get in that context in the first place.

Not true. As a quick example, the destructor might try to take some statistics, which might not be safe in an interrupt context, or when a thread is (internally) in an intermediate state. There might also be multiple ways to get rid of the type. This was an "obvious" example, the ones that bite you are the non-obvious ones of course.

And yes, I have anticipated both the "but you still call functions in C" and the "functions may change" comment. True, but you call those functions explicitly, at a very explicit point (which you can also surround with other code), and anyone who works on a cleanup/destructor-like function can very easily figure out all call sites.

My point is precisely that you tend to run into these icky issues when deep in the bowels of a system, not when writing a smartphone app.

> And second it goes so so far in preventing resource leak bugs, which is also something kernels and low level code care a lot about.

Kernels do care about that, true. But if you exchange resource leak bugs for invalid state, which translates much more happily into security bugs, you are definitely worse off.


> Not true. As a quick example, the destructor might try to take some statistics, which might not be safe in an interrupt context, or when a thread is (internally) in an intermediate state.

Which is a problem you also have in C. If someone is going and adding, say, lock duration statistics they aren't auditing all the C call sites just like they wouldn't be doing that with C++ destructors. Which you can search for, fwiw. You seem to be under an incorrect assumption that you can't find those usages. You can.

> There might also be multiple ways to get rid of the type.

This is not incompatible with RAII. See for example unique_ptr's optional custom deleter.

Of course, you're always free to just not use RAII in those specific contexts.


> Which is a problem you also have in C.

Personal opinion will differ, but after reading and working on both C and RAII-heavy C++ low level code for many years (though I don't do the C++ one anymore), the low level C code just feels like fresh air when auditing.[1] It's all so explicit. For high level, as mentioned initially, I feel the opposite, as the boilerplate gets in the way.

> Of course, you're always free to just not use RAII in those specific contexts.

That's my point, is it not? RAII in high level code, not in low level code. I did not claim they have to be in different code bases.

And I am really not afraid of abstraction, my favorite language is Haskell. But that's so high level, and so far from the metal, that every function looks pure, and every value is lazy by default.

[1] Personally, I think Rust would be my choice.


This actually brings me to one of the few (current?) limitations of rust that drive me nuts! In all the other modern languages I work with, my big projects end up with a utility RAII class that wraps a closure to execute when the scope expires. It’s great for cleaning up logical operations as opposed to what is traditionally done by RAII destructors in oop languages.

Eg if I’m creating a temporary file or directory in a function to handle some IO logic, I’ll create a RAII wrapper around a closure that deletes that folder when the scope expires - this might be a one-off thing so instead of creating an actual RAII “TemporaryFolder” object that wraps around a path, creates the folder in its constructor, and deletes the folder and its contents in the destructor (and have to reference the inner path instead of the object itself when needing it, etc), I can just stick a closure to delete the folder in my wrapper and call it a day.

You can’t really do that in rust because you’d have to give ownership of the file/folder/whatever to the closure - meaning you can’t use it yourself afterwards! Technically the compiler just has to guarantee that the object persists to the end of scope (isn’t destroyed or moved out earlier) and then there would be no borrow conflicts (since it’s only run after everything else has finished and can’t be used for anything else in the RAII wrapper) but it can’t really know that, so you’re disallowed from sharing it between your actual code and the cleanup closure.


> You can’t really do that in rust because you’d have to give ownership of the file/folder/whatever to the closure - meaning you can’t use it yourself afterwards!

The way to do this is pretty much what C++ has in unique_ptr(p, deleter): have the utility RAII class own both the object and the closure, dereference to the wrapped object, and on drop pass the wrapped object to the closure.

Although

> Eg if I’m creating a temporary file or directory in a function to handle some IO logic, I’ll create a RAII wrapper around a closure that deletes that folder when the scope expires

There are crates for lots of those situations.

https://crates.io/crates/tempfile for instance.
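
A rough Rust sketch of the wrapper described above, in the unique_ptr(p, deleter) spirit (the Cleanup name and the temp-directory example are made up): it owns both the value and the closure, derefs to the value, and hands the value to the closure on drop, so there is no borrow conflict while you use it.

    use std::ops::Deref;
    use std::path::PathBuf;

    struct Cleanup<T, F: FnOnce(T)> {
        value: Option<T>,
        on_drop: Option<F>,
    }

    impl<T, F: FnOnce(T)> Cleanup<T, F> {
        fn new(value: T, on_drop: F) -> Self {
            Cleanup { value: Some(value), on_drop: Some(on_drop) }
        }
    }

    impl<T, F: FnOnce(T)> Deref for Cleanup<T, F> {
        type Target = T;
        fn deref(&self) -> &T {
            self.value.as_ref().unwrap()
        }
    }

    impl<T, F: FnOnce(T)> Drop for Cleanup<T, F> {
        fn drop(&mut self) {
            if let (Some(v), Some(f)) = (self.value.take(), self.on_drop.take()) {
                f(v); // the closure only takes ownership at the very end
            }
        }
    }

    fn main() {
        let dir = Cleanup::new(PathBuf::from("/tmp/scratch"), |p| {
            let _ = std::fs::remove_dir_all(&p); // best-effort cleanup on drop
        });
        println!("working in {}", dir.display()); // usable as a plain PathBuf
    }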


> have the utility RAII class own both the object and the closure

Yeah, but the ergonomics suck. Especially for random cleanup tasks that aren’t necessarily one-object, one-task.

The temp file thing was just an example, actual use case was more domain-specific.


> Yeah, but the ergonomics suck. Especially for random cleanup tasks that aren’t necessarily one-object, one-task.

Doesn’t seem much more of a problem? Put a tuple in, get references out, cleanup gets the entire tuple.

If lifetimes get complicated, use refcells and / or Rcs.


If it's only a temp file (as opposed to a dir) and it's Linux, you may want to use O_TMPFILE (possibly O_EXCL too) when open()ing the file instead of the RAII wrapper. That ensures the OS will delete the file even if the program is killed.


An additional (and more important) constraint to being on Linux and creating a single file is that this assumes you are the one creating the file, which is oftentimes not the case. e.g. in my rust ffmpeg wrapper, I have pipelines that invoke ffmpeg and through it create multiple intermediate files that need to be cleaned up at the end of the run in all cases (success and failure).



I'm not sure what your performance constraints are, but you could try using RefCell<T> for objects which need to be shared with the closure. It's a fairly low overhead abstraction. Also, you can try creating a reference explicitly that gets used in the closure for objects which don't need mut references.

    let helper_x = &x;
    let ycell = RefCell::new(y);
    let yref = &ycell;
    // a closure (not a fn) so that it can capture the surrounding references
    let helper = || {
        helper_x.whatever();
        yref.borrow_mut().ystuff();
    };
In this case, RefCell checks at runtime that you don't have an outstanding borrow in your regular code (which you can avoid via explicit drops or by carefully using blocks). It's essentially a very low overhead RW lock and unlike Rc, it doesn't even allocate on the heap.


> Even simply passing references to multiple places creates ambiguity on who is responsible for cleaning it up.

Learning rust, realizing how value ownership (borrow checker stuff) overlaps with conceptual ownership has been enlightening.


Some people criticize RAII for introducing hidden destructor calls into close-braces. I don't find this a problem in my experience (aside from having to make destructors do nothing in C++'s moved-from objects), but nonetheless I want to try using a language where instead of RAII as an automation system, it serves as a compiler check rejecting code which does not call a destructor exactly-once when needed (which may even return a value, as in linear types or fclose or Vale's Higher RAII). Linear types work better in a language without exception/panic unwinding (but even then you could use the standard destructor in case of crashes), and relegating RAII to purely a static analyzer is conceptually incompatible with having the presence of destructors affect stack unwinding. I'm fine with that since I don't like stack unwinding and don't find it useful in most (not all) cases.


This concept is almost the same as

    trx = new DbTransaction();
    ...
    trx.commit() or trx.rollback();

It's easy to miss, and hard to detect unless you remember to put the commit/rollback in the correct place.


It's the same concept.

RAII can (and I have seen DB libraries that do) model a DB transaction as a resource. In the cases that I know of, they take the safe default of rolling the transaction back if the code fails to commit it. But the transaction is never leaked nor is the connection left in some unspecified state.
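
A sketch of that rollback-by-default shape in Rust, with made-up Connection and Transaction types standing in for a real driver: commit consumes the guard, and anything never committed is rolled back by Drop.

    struct Connection;

    impl Connection {
        fn execute(&mut self, sql: &str) {
            println!("executing: {sql}"); // stand-in for sending the statement
        }
    }

    struct Transaction<'a> {
        conn: &'a mut Connection,
        committed: bool,
    }

    impl<'a> Transaction<'a> {
        fn begin(conn: &'a mut Connection) -> Self {
            conn.execute("BEGIN");
            Transaction { conn, committed: false }
        }

        fn commit(mut self) {
            self.conn.execute("COMMIT");
            self.committed = true; // Drop below sees this and skips the rollback
        }
    }

    impl<'a> Drop for Transaction<'a> {
        fn drop(&mut self) {
            if !self.committed {
                self.conn.execute("ROLLBACK"); // the safe default
            }
        }
    }

    fn main() {
        let mut conn = Connection;
        let trx = Transaction::begin(&mut conn);
        // ... do work ...
        trx.commit(); // comment this out and Drop issues the ROLLBACK instead
    }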


The best case is linear types + defer, so you can clean up however you want, but if you forget to do so it's an error.

(Also, hi! Kevin Cox also does good Nix things, folks. Be like Kevin and do so too.)


defer seems completely unnecessary there. It’s at best a trivial convenience, since the type system should flag any path where the resource is not cleaned up.


Defer is still nice to avoid verbose boilerplate.


Defers are verbose boilerplate. Instead you allow relaxing linear types to affine.


You are assuming the thing can drop itself. When that is fine, sure, do RAII. But not all cases can be done that way.

For example, the Rust allocators working group has been contorting itself about deallocation. The right thing to do is to not store a pointer to a (de)allocator in e.g. every Box<T, A>, but just pass the (de)allocator when it comes time to deallocate a box. But Rust is completely incapable of expressing this and so we will get some crazy epicycle thing instead, or no good support for custom allocators with non-flat data structures.

Even if RAII is the 90% solution, I think it is still good to layer the features in this way:

1. Linear types. The fundamental building block: for some types you cannot just forget the values.

2. Defer: sugar for running, at the end of the scope, some code that eliminates a linear variable, avoiding an error.

3. RAII: a way to provide a default defer command if none was manually provided. The default is "type driven" --- we associate a default defer command with the linear type being dropped.

From a Rust perspective, we would want to distinguish Clone (an arbitrary convention for cloning) from Copy (an indication that we can in fact copy for free). Saying types that need a destructor and types that don't are both affine is like adding some .clone() sugar and erasing the Copy--Clone distinction --- it erases a lot of essential nuance/complexity, and I don't think that is a good idea.


There's a lot to be said for the semantics of the Python "with" statement as a way to do scope-based closeout. You get to handle errors at closeout. Nesting works. Nested errors work. Interaction with exceptions works. It's the only construct I've seen which handles the hard error cases.


> But the problem with most garbage collectors is that they only clean up memory!

Is this an inherent limitation of garbage collectors, or could they be made to clean up arbitrary resources?


It is obvious if you think of what a garbage collector is; it's a simulation of having infinite memory. So objects in a GC world live forever, and thus are never destroyed. (Of course, behind the scenes, the garbage collector will _deallocate_ them, but only when your program can no longer care because it doesn't have any way of reaching them.)

Of course, a garbage collector can be made to run finalizers, but then you either have to do a pure refcounting GC, at least for those objects (which is highly suboptimal for speed), or live with unpredictable finalization (which is incompatible with using RAII for e.g. locks, as the article points out).


Actually he later goes on to say the garbage collector can clean other resources by using the finalise method. The problem is that it doesn't necessarily clean them up at the right time. The garbage collector isn't triggered by having too many open file descriptors, it runs when you are low on memory. In theory it could look at other resources too, but for certain resources you don't want to wait at all.


There's also generally no guarantee that finalizers ever run, period.


That really depends a lot on the language. You could have a language which guarantees all finalizers are run so long as the process terminates normally (i.e. not by a power outage).


Some people consider reference counting a form of garbage collection, and in such case you can have deterministic destruction of any kind of resource (PHP and Swift use that).


It's just as easy to not write a destructor as it is to not write a defer.


Except the defer stuff is in addition to the "writing the destructor" part.

When you have defer, you still need to write code you'll be deferring. That's what the destructor is, no more. `defer` is strictly additional overhead.

Furthermore:

1. If you're just composing in destructible objects the compiler does the work for you: if I create a struct which contains a File, I don't have to do anything, rustc will make my struct implicitly Drop and release the `File` (see the sketch at the end of this comment). If you do that in Go... you have to remember to write a deferrable method which'll close the underlying file.

2. You're only defining a type once, and you're thinking about its invariants right there, it's a lot easier to remember those invariants when creating the type than every time you're using an instance.

Unless you have linear structural subtyping and the compiler forces you to do it. Then it's fine.
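
A tiny Rust sketch of point 1 (names invented): composing in a File is enough; no destructor is written anywhere.

    use std::fs::File;

    // No Drop impl appears here, yet dropping a Logger still closes the
    // file, because the compiler runs the destructors of the fields.
    struct Logger {
        output: File,
    }

    fn main() -> std::io::Result<()> {
        let logger = Logger { output: File::create("out.log")? };
        drop(logger); // File's own drop runs for the field; the handle is closed
        Ok(())
    }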


But you only need to write the destructor once.


Fair point, but on the other hand, the destructor has to work for all contexts. The usual argument is that the creator of the class has the best vision for what should be in the destructor, but I don't agree with that either.



