Ownership and Borrowing in D (dlang.org)
283 points by systems 33 days ago | 112 comments



I've been working on a method for adding ownership and borrowing to the D programming language, and people kept asking me to explain how it works, so I wrote this article about it. I had originally thought it would be impossible to add this to D, but the more I thought about it, the more tractable the problem became.


Not to go too far off topic, but I just wanted to say how much I've been initially pleased with D.

For a long time I've wanted a language that I could use with just a text editor and the command prompt or terminal.

I've been able to get programs to compile easily on both my Linux home machine and my Windows work machine without any complex install magic, as it should be in my opinion. The programs produced were binary executables and tiny to boot (especially on Windows, where HelloWorld was ~367 KB iirc, versus ~1.1 MB on Ubuntu).

It is nice to not need an in-depth understanding of the JVM or CLR. The hardest part for me (I usually favor languages like Python) is not having a REPL, but I'll live.

The one place where I could use some help (similar to most of the advanced languages... although Rust is good here) is in the documentation which is more meant for power users than onboarding new folks. Compare and contrast the Nim section on types with D's classes/structs/unions. The D documentation is more like a reference manual and Nim is more tutorial. I think I'll try Andrei's book and see if that gets me far enough to where the technical doc becomes straightforward.

On another note, I see D has some functional programming support. If possible, I'd like to avoid having to model my own domain logic with OO classes. Is the proper way to do this with structs and the stdlib's map!, fold!, & filter! constructs?

Sorry for rambling!


Hey Walter, I really admire your work. I'm curious though, do you think D is potentially in danger of trying to do too many things? No one wants another mess like C++.


That's an ever present danger. We've been a bit more willing than C and C++, however, to discard some bad decisions.


I do not think it is in danger; it already does too many things, which makes it a complex language. The same goes for Object/Free Pascal, to some extent C#, and certainly others I cannot think of right now.

These languages are like horses running on a straight line towards maximum complexity. It is just that C++ is at the front right now, but no horse has any inclination of stopping.


The Oberon language is an exception, it goes the other direction.

https://www.miasap.se/obnc/oberon-report.html


Not an exception really: there are many simple languages, and Oberon-07 (the dialect linked) is indeed an extremely simple one. Go is another language that is simple. And of course C (compiler developers abusing undefined behavior to win artificial benchmark games, and making life miserable for everyone who uses C for practical purposes, is not a problem with the language itself but with the compilers that do such abuse).

EDIT: Scheme too, at least up to R5RS.


What sets Oberon apart is that it has become smaller and simpler over time, something which is not true for the languages you mention.


This depends on whether you see Oberon-07 as a different "version" of Oberon or as a different dialect belonging to the same family. Personally I see it as a different dialect rather than the same language, exactly because it removes functionality and isn't compatible with it.

It may not seem like a big difference, but it becomes more apparent when you consider how different all the Pascal dialects that are out there - some even from Niklaus Wirth himself - are.

Also IMO languages should not break backwards compatibility.


For me the best version of it was Active Oberon, but I guess Wirth wouldn't agree given his pursuit of minimalism.

Also I doubt he would bother with any further updates to Oberon-07 (2016 rev). He is most likely busy with other matters nowadays.


Wirth — at the age of 85, no less! — is actively working on Oberon stuff, like the compiler. Here's his "news" file, which is up to date as of May 31, 2019:

http://people.inf.ethz.ch/wirth/news.txt


Oh! Thanks for the heads up.


Isn't Go inspired by Oberon? That would explain some things.


Yes, the method syntax comes from Oberon-2, unsafe package is similar to SYSTEM on Oberon.

The rest comes from Limbo (Inferno's userspace language).

Unfortunately it also left a couple of other nice things from those languages.


Are there shops that will take advantage of this stuff by limiting what features they use and following a certain style? For example, would a C++ team use some functional approach or a Rust style approach to ownership etc? Or is it always just a mishmash?


Nice article as always. I don't think I fully grok this stuff (in other languages I've read about too). Perhaps you can help me have some clarity of thought.

1. What should I do when I want to make a data structure like a doubly linked list or a graph - ie one where multiple pointers point to the same object?

2. Memory is only one kind of resource where ownership needs to be tracked. Others include file handles, network sockets etc. Can you comment on whether the OB mechanism solves the problem in general, or is mainly for memory tracking?


1. This is, of course, a well-known problem with an O/B system. The usual resolution is to a) use another data structure or b) hide the doubly linked list in @system code and use a safe interface to it.

2. It's a general solution (not limited to memory).
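For comparison, here is a rough sketch of how the "multiple pointers to the same object" problem from question 1 is often sidestepped in safe Rust, whose ownership rules are similar in spirit: forward links own the next node through reference counting, and back links are weak references that do not own anything. This is Rust, not D, and `Node`, `new_node`, `link`, `next_value` and `prev_value` are names made up for this illustration.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical node type: `next` shares ownership of the following node,
// `prev` only observes the previous one and never keeps it alive.
struct Node {
    value: i32,
    next: Option<Rc<RefCell<Node>>>,
    prev: Option<Weak<RefCell<Node>>>,
}

fn new_node(value: i32) -> Rc<RefCell<Node>> {
    Rc::new(RefCell::new(Node { value, next: None, prev: None }))
}

// Link `b` after `a`: a strong reference forward, a weak one backward.
fn link(a: &Rc<RefCell<Node>>, b: &Rc<RefCell<Node>>) {
    a.borrow_mut().next = Some(Rc::clone(b));
    b.borrow_mut().prev = Some(Rc::downgrade(a));
}

// Value of the successor, if there is one.
fn next_value(n: &Rc<RefCell<Node>>) -> Option<i32> {
    let node = n.borrow();
    let next = node.next.as_ref()?;
    let v = next.borrow().value;
    Some(v)
}

// Value of the predecessor, if there is one and it is still alive.
fn prev_value(n: &Rc<RefCell<Node>>) -> Option<i32> {
    let node = n.borrow();
    let prev = node.prev.as_ref()?.upgrade()?;
    let v = prev.borrow().value;
    Some(v)
}

fn main() {
    let a = new_node(1);
    let b = new_node(2);
    link(&a, &b);
    println!("{:?} {:?}", next_value(&a), prev_value(&b));
}
```

The cost is runtime bookkeeping (reference counts and `RefCell` borrow checks) in exchange for staying inside the checked subset.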


There's an article that focuses on number one:

https://rust-unofficial.github.io/too-many-lists/

I propose doing anything super hard and rare outside the general model. Make the doubly-linked list in whatever tool supports easy verification. Things like Why3 and solvers for separation logic. Run that object code through lots of testing with runtime checks for the specified behavior. If it passes, it's probably earned the right to be wrapped for use in the memory-safe code.

This use case is also why I prefer more research funding to go to verified solvers than to manually-verified algorithms. They'll get more real-world use on hard problems. Well, that and test generators. They complement each other. The testing tools are more cost-effective, too.


You could make the entire thing `const` and operate in a purely functional fashion - though this approach might be absurd in the context of systems programming (I can't claim to know as I've no background in the area whatsoever).


How do you make an immutable doubly linked list?


Build it, then cast it to immutable in @system annotated code.


From the article: use 'system' annotation (the equivalent of unsafe in Rust)


I will forever remember D as the book that was on the main character's bookshelf in Office Space.


If D manages to implement sound borrow checking this will be huge! Brings it a step closer to matching Rust's feature list.

- Zero-cost abstractions: Yes!

- Move semantics: Yes!

- Guaranteed memory safety: Yes! (in code marked with the appropriate annotations)

- Threads without data races: Yes!

- Trait-based generics: No!

- Pattern matching: No! (Although like C++, it can be done as a library: https://code.dlang.org/packages/sumtype)

- Type inference: Yes! (Plus unlike Rust, you can also mark function return types auto, and let them be inferred too).

- Minimal runtime: Yes! (the GC could be disabled if using exclusively borrow-checked code)

- Efficient C bindings: Yes!

D also brings some features of its own to the table that Rust doesn't yet match.

- Higher-kinded types (template template params): it's possible to implement Haskell-style monads etc. in D, if for some reason you felt the need.

- Pure functions: these allow you to statically guarantee that a piece of code doesn't perform any IO and is a completely deterministic function of its inputs. This makes reasoning about a large codebase much easier, compared to Rust where there's no way to statically guarantee a function isn't doing any IO.

- Fast compile times: the reference DMD compiler is almost as fast as the Go compiler, at least for code that doesn't do a bunch of compile-time calculation.

- Compile-time compute: Rust now has const fn, but that still only supports a limited subset of the language. D allows almost the entire language to be used at compile time.

- Variadic templates: Something C++ users might miss when coming to Rust, they allow for the creation of things like tensors of arbitrary dimension (myTensor<6, 3, 2, 5, 6, 20>) on which operations are checked at compile time to ensure sizes are compatible (so it's a compiler error if you try to multiply two tensors of the wrong size or wrong number of dimensions).


The problem with D is that it is so non-uniform. People talk about the betterC mode, but not nearly all D code written can run in this mode. Now people will talk about this ownership mode, but very little code will really use it. To become an expert in D or C++, you really need to understand what feels like a dozen different languages... and that’s not what I’m looking for in a single language.

D’s concept of ownership (for now) doesn’t seem to have any plan to support partially moved structs, which are a major ergonomic feature.

D has organically grown just about every interesting feature ever conceived, but I would much rather use a purpose-built language, which is what Rust feels like.

I don’t want a C++ that’s better at being C++ than C++ is. I really enjoy how consistent and expressive Rust is. I also appreciate how intentional Go is about being Go.

A lot of people enjoy C++, and some people enjoy D. I enjoyed C++ back when Rust wasn’t an option... then something better came along. D and C++ try to be everything to everyone. I like D better than C++... but I have seen no compelling reason to trade Rust for D.

Variadic templates are much better expressed by either tuples or const generics. A fixed length array of a constant generic length is the Rustic solution to tensor-like problems, and it is being rigorously developed.
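A rough sketch of the const-generics approach for the two-dimensional case, assuming a compiler with stable const generics. `Matrix` and `mul` are made-up names for this illustration, not an existing library API; the point is only that mismatched dimensions become a type error.

```rust
// Hypothetical fixed-size matrix type; the dimensions are part of the type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Matrix<const R: usize, const C: usize> {
    data: [[i64; C]; R],
}

impl<const R: usize, const C: usize> Matrix<R, C> {
    // The right-hand side must have exactly C rows, or the call does not type-check.
    fn mul<const K: usize>(&self, rhs: &Matrix<C, K>) -> Matrix<R, K> {
        let mut out = [[0i64; K]; R];
        for i in 0..R {
            for j in 0..K {
                for k in 0..C {
                    out[i][j] += self.data[i][k] * rhs.data[k][j];
                }
            }
        }
        Matrix { data: out }
    }
}

fn main() {
    let a: Matrix<2, 3> = Matrix { data: [[1, 2, 3], [4, 5, 6]] };
    let b: Matrix<3, 1> = Matrix { data: [[1], [0], [2]] };
    let c = a.mul(&b); // inferred as Matrix<2, 1>
    println!("{:?}", c.data);
    // let bad = a.mul(&a); // would not compile: a 2x3 matrix cannot multiply a 2x3 matrix
}
```

This fixes the number of dimensions at two; an arbitrary-rank tensor like the `myTensor<6, 3, 2, 5, 6, 20>` example would need something beyond this sketch.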

Arbitrary compile time computation should be separated from the body of the program. If you need to download files or read the file tree at compile time, that should happen in a build script. Rust provides build.rs for that. Precomputing values without I/O is conceptually just an advanced kind of compiler optimization, and that’s why Rust is basically seeking to make const fns only pure functions.

The fast compile times are a double edged sword, since those binaries are noticeably slower than Rust binaries, from what I’ve seen. I hope that Rust will one day have a much faster compiler for development builds.

I also think every sign is showing that Rust is starting to get some real traction, so I don’t think I’m alone in these opinions... but they are mostly just that: opinions. You’re welcome to your own.


> Arbitrary compile time computation should be separated from the body of the program.

Here's some code that initializes an array at compile time with a non-trivial computation:

https://github.com/dlang/dmd/blob/master/src/dmd/backend/var...

This used to be initialized by a separate program that generated some .d files. It was nice to get rid of that. Even better, if you check the generated object file, there's no trace of the code used to generate the array - just the data for the array.


> > Precomputing values without I/O is conceptually just an advanced kind of compiler optimization, and that’s why Rust is basically seeking to make const fns only pure functions.

I agree that generating an array at compile time is perfectly fine. I was saying that I/O should be restricted in compile-time computation, which the person I was responding to made me believe was entirely unrestricted in compile-time computation in D.
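For what it's worth, the I/O-free case looks roughly like this in Rust. This is a minimal sketch, not a translation of the DMD code linked above; `N`, `squares` and `SQUARES` are names invented for the example.

```rust
const N: usize = 8;

// A const fn can be evaluated by the compiler when it is used to
// initialize a const or static; it cannot perform I/O.
const fn squares() -> [u32; N] {
    let mut table = [0u32; N];
    let mut i = 0;
    // A `while` loop is used because it is accepted in const fn on stable Rust.
    while i < N {
        table[i] = (i * i) as u32;
        i += 1;
    }
    table
}

// The initializer is computed at compile time.
static SQUARES: [u32; N] = squares();

fn main() {
    println!("{:?}", SQUARES);
}
```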


>Arbitrary compile time computation should be separated from the body of the program. If you need to download files or read the file tree at compile time, that should happen in a build script. Rust provides build.rs for that.

This is an area we'll have to agree to strongly disagree. For me having support directly in the language is a huge improvement in ergonomics, compared to having to run a separate script. Compare e.g. generics in Go1 (which can only be done with separate build scripts at compile time) vs in any other language that has built-in generics. Also consider F# type providers, which guarantee at compile time the correctness of code querying a database (generate the schema/structs by querying the database at compile time). Certainly the same thing could be achieved by running a separate script as part of the build process, but I've never seen a language actually implement that, probably because it'd be much more painful.

>The fast compile times are a double edged sword, since those binaries are noticeably slower than Rust binaries, from what I’ve seen.

Note that D also has an LLVM-based compiler backend that is slower to compile but produces much faster code, which can be used for release builds.


A major difference is that the Go compiler will not run build scripts when you build the code. You have to manually run codegen scripts, which is error prone since people are not good at remembering to do manual steps. Rust will automatically run the build script before building your code. It just allows you to logically separate build time I/O from runtime I/O, which I find is nice for reproducibility of builds and maintainability of code. The build script is still written in Rust, and you can import Rust code from wherever into that build script. The plan is to support arbitrary non-I/O computation in constant functions anywhere in Rust, since those are just pure functions.

As far as your comment about databases, you should look at Diesel. It handles this very nicely in Rust, so... that’s not a problem at all.

> Note that D also has an LLVM-based compiler backend that is slower to compile but produces much faster code, which can be used for release builds.

Is it compatible with 100% of the code that the current DMD release is compatible with? That’s all I ask for in alternate compiler implementations... but it seems like a lot, sometimes.


>A major difference is that the Go compiler will not run build scripts when you build the code.

Go actually has a built-in "Go generate" command that will look for magic "//go:generate <command>" comments in the code and automatically run code generation based on them, although I'm not sure how ergonomic it is.

>It just allows you to logically separate build time I/O from runtime I/O, which I find is nice for reproducibility of builds and maintainability of code.

This is a reasonable argument. For what it's worth, compile-time IO in D is more limited than in e.g. Haskell, as it seems the only compile-time IO D supports is reading files (or at least I can't find any documentation suggesting otherwise). Most of the power of its compile-time code execution comes from non-IO code (e.g. generating an optimal lookup table for a dictionary of values known at compile time, hard-coded in the code, or automatically generating fast JSON serialisation/deserialisation code for a struct without need for macros). While Rust also plans to support this, full support could be years away, while D supports it now.

>As far as your comment about databases, you should look at Diesel. It handles this very nicely in Rust, so... that’s not a problem at all.

Definitely shows it's possible to get the benefit of type providers without language support for them, hats off to the Diesel team!

>Is it compatible with 100% of the code that the current DMD release is compatible with? That’s all I ask for in alternate compiler implementations... but it seems like a lot, sometimes.

It's eventually compatible, sometimes lagging a few versions behind. Using LDC for release builds is somewhat comparable to only using stable Rust for release builds, except D has a slower pace of new feature additions than Rust since it already has so many features.


D deliberately restricts what compile time code can do that involves the operating system, restricting it to simply reading files from the current directory and down.

This is to prevent the use of the D compiler as a malware target by sending it specially crafted source code to compile.


Unfortunately, these restrictions make it hard to implement F# style Type Providers. Though honestly the filesystem one is not nearly as annoying as the inability to reinterpret-cast.


That's fine with me. I don't want to make the cross site scripting mistake.


Which happen to be currently broken on modern .NET runtimes, apparently because it wasn't done in a clean way.


Couldn't you just link to a C file in the same dir, then use that to read up the file tree? I would be interested to know how D would prevent this.


D files don't #include .c files.


> Go actually has a built-in "Go generate" command that will look for magic "//go:generate <command>" comments in the code and automatically run code generation based on them, although I'm not sure how ergonomic it is.

But it's still not run automatically at build time, you have to go out of your way to run `go generate`, which isn't gonna happen if you, as is customary, fetch and build third-party code with `go get`.


> which isn't gonna happen if you, as is customary, fetch and build third-party code with `go get`.

Except that the convention is to check in the files generated by go generate, so in practice, it's not actually a problem.


> It just allows you to logically separate build time I/O from runtime I/O, which I find is nice for reproducibility of builds and maintainability of code.

You can get the same effect, if you really need it, with a pre-build command in dub (the package manager/build tool) or in other build tools such as reggae. But compilation is not supposed to change the state of the system other than by producing build artefacts: you can read files at compile time with `import("filename")`, but you can't write to files, open sockets, etc.

Things like mixing in a data file is incredibly useful and not something I would want to use a rebuild step for at all.

> Is it compatible with 100% of the code that the current DMD release is compatible with? That’s all I ask for in alternate compiler implementations... but it seems like a lot, sometimes.

Not strictly.

It is usually pickier about what it deems valid (which is usually corner cases that DMD really ought to reject). They share the same frontend so there is very little difference.

It has some of its own bugs, though nothing too major.

It generally has way fewer bugs, for two reasons: its release cycle is based off of DMD stable, so it lags one cycle behind (~six weeks) and new features aren't available immediately; and all the backend and optimisation work is handled by LLVM, though we occasionally hit LLVM bugs that clang for some reason does not seem to trigger.

(I develop both DMD and LDC (the LLVM based D compiler))


> Things like mixing in a data file is incredibly useful and not something I would want to use a rebuild step for at all.

It’s not directly part of const functions, but Rust has compiler built-ins to import files as values at compile time, which makes it easy to declare global static variables that contain that data.

This is different from being able to walk a directory and read random files.

https://doc.rust-lang.org/std/macro.include_str.html

https://doc.rust-lang.org/std/macro.include_bytes.html

> They share the same frontend so there is very little difference.

That’s excellent.


> This is different from being able to walk a directory and read random files.

D doesn't allow you to do that either. You have to specify what files you are importing via a compiler switch: https://dlang.org/spec/expression.html#import_expressions


Yes. That's another barrier I put in to prevent a carefully crafted source code from being a malware vector.


> Is it compatible with 100% of the code that the current DMD release is compatible with?

For practical purposes: yes. We use DMD for development and LDC for release builds, and it works really well in tandem.


Becoming a C++ expert is impossible in a lifetime, becoming a D expert is a matter of months.

It is, from the very core, a way simpler language than C++. I'd argue it's simpler than Rust, even. IMO acknowledging a few function attributes is way simpler than learning all the flairs of ownership and lifetimes.

As for metaprogramming, I love D's introspection capabilities, while I kind of want Rust's ability to create unquoted DSL.

The only "non-uniform" part of D that I can think of is the oddness of `shared`, and a few outdated stdlib modules. Everything else (betterC, pure, live, etc.) is just a subset of a powerful, coherent multi-paradigm language.


I like D, but if you spend some time in D forums, you would get that coherency is something that even D could improve upon, especially attribute usage, inline assembly compatibility among its implementations and some constructor/destructor corner cases.


> To become an expert in D or C++, you really need to understand what feels like a dozen different languages... and that’s not what I’m looking for in a single language.

Well, the alternative is to learn an actual dozen of language - one for the high-performance simd part of your app, one for GUI layouts, one for writing parsers, one with strong typing semantics for your domain objects (but how are you going to keep them aboard in the interfaces with your other languages !), one for network communication...

I much prefer (as a C++ dev, but the same applies to D) the ability of the language to make eDSLs adapted to each task.


Rust’s macro system is incredibly powerful — much more powerful than C++’s, which is just text substitution. It’s easy to make DSLs in Rust. That’s not what I’m talking about.

> Well, the alternative is to learn an actual dozen of language

That’s a false dichotomy.
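To make the DSL point concrete, here is a small sketch using a declarative `macro_rules!` macro. The `config!` macro and the `build` function are hypothetical names invented for this example; procedural macros are a separate, heavier mechanism that this does not show.

```rust
use std::collections::HashMap;

// Hypothetical mini-DSL: `config! { "key" => value, ... }` expands to a
// HashMap built from the listed pairs. The macro matches on token structure,
// not on raw text.
macro_rules! config {
    ( $( $key:expr => $value:expr ),* $(,)? ) => {{
        let mut m = HashMap::new();
        $( m.insert($key, $value); )*
        m
    }};
}

fn build() -> HashMap<&'static str, u32> {
    config! {
        "threads" => 4,
        "retries" => 2,
    }
}

fn main() {
    let cfg = build();
    println!("{} entries", cfg.len());
}
```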


C++ also has template metaprogramming and compile time execution nowadays.

Common wisdom is to leave macros for conditional execution, legacy code and the few cases that still aren't fully designed, namely reflection and metaclasses.


Rust macro system is much more powerful than D mixins.

I personally find Rust macro system infinitely easier to use than D mixins as well. In D, I need to learn a new pseudo-language to work with mixins, but Rust macros are just normal Rust code that gets executed at compile-time on other Rust code, and this code can do anything that any running Rust program can do.

The structure of the two fundamental Rust libraries around Rust macros is super intuitive to me (maybe it's the CS background?). `syn` is a Rust parser from tokens -> AST, and it supports doing AST->AST folds and other common operations. And `quote` gives you quasi-quoting.

With `syn+quote` most macros end up as 10 liners. Tokens->AST->AST fold->Quote->Tokens.


Huh? D metaprogramming is much closer to "normal code executed at compile time".

Rust procedural macros are like external tools, manipulating the AST as a structure. In D, compile time code is seamlessly interleaved with other code – you just have `static if`, `static foreach` etc. in your code.

I'm not sure what you mean by "new pseudo-language to work with mixins". The term "mixin" is unfortunately overloaded: the `mixin()` call just splices a string into the code, while `template mixin` is a way to expand a template where you want it. Neither introduces new complex structures.


> Rust procedural macros are like external tools, manipulating the AST as a structure.

Isn't viewing them as like external tools just a C centric view, based on C's (and C++'s) capabilities? Didn't Lisp have macros that manipulated the language as an AST prior to C even existing, inside the language?


Well, Rust is closer to C++ than Lisp :)

If you look at the API https://blog.rust-lang.org/2018/12/21/Procedural-Macros-in-R...

it works on the level of raw tokens, not even the AST. So you have to construct your own AST via the parser if you want to manipulate AST.


I don't know Rust macros, but D had an old example that generates an image, using raycasting, at compile time.


> Well, the alternative is to learn an actual dozen of language

I would say that this is actually almost always preferable, for two reasons:

1. I believe it's generally good for programmers to know multiple languages, to be exposed to different ideas, ecosystems etc. This broadens worldviews, makes you think outside the box more often, etc.

2. Each of those languages will probably be better at what they do than D's subset. For example, D's GC isn't really one of the best ones out there. Its pure isn't the same thing as FP pure. Its move semantics is crap compared to Rust (probably even C++ does them better). The quality of the borrow-checking remains to be seen but frankly I'm skeptical, it'll probably follow the same pattern as the other features.


If you don't mind, I'm curious about what "partially moved structs" are? Structs have always felt like a chore to me, complicating things more when compared to maps in dynamic langs. Maybe Rust has something for dealing with the "explosion of types" issue and dealing with partial information in structs that I don't know about. Skimming through the Rust book, it looks like Option<T> is the solution to missing data in structs.


Partially moved just means that the compiler tracks which fields have been “moved out” of the struct and prevents you from either accessing those fields or attempting to reuse the partially moved struct as if it were still a whole struct, since it is no longer whole... it has given up some of its members, conceptually.

In the article linked by this discussion, the author specifically mentions how fields can’t be moved out in D’s implementation.

Option is the correct approach for dealing with information that might be missing, but that’s not really related to partially moved structs.

Structs document what is available in a given value, to both the compiler and to you. If that’s not convincing, then I’m not going to try more here... it’s off topic. But I will say you can use maps in Rust or other statically typed languages. They’re just not meant to be an alternative to well-defined structs.
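A minimal sketch of what a partial move looks like in practice. `Person` and `split` are names made up for the example.

```rust
// Hypothetical struct with two owned fields and no Drop implementation.
struct Person {
    name: String,
    email: String,
}

// Moves `name` out of the struct, then keeps using the field that was not moved.
fn split(p: Person) -> (String, usize) {
    let name = p.name;             // `p.name` is moved out; `p` is now partially moved
    let email_len = p.email.len(); // fine: `p.email` was never moved
    // let whole = p;              // would not compile: use of partially moved value `p`
    (name, email_len)
}

fn main() {
    let p = Person {
        name: String::from("Ada"),
        email: String::from("ada@example.com"),
    };
    let (name, email_len) = split(p);
    println!("{} {}", name, email_len);
}
```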


Can Rust now handle partial move semantics on structs? I remember reading it couldn't do it before, and there were threads about how you should make small structs because of that, or that you should not use methods, because they take ownership of the full struct and not just the fields that they actually use, etc.


It can, with the big caveat that it only works well if you don't cross a function boundary.
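The same field-level tracking shows up with borrows, which may be the easier way to see the function-boundary caveat. In this sketch (`Stats`, `add` and `sum_inline` are invented names), two different fields can be borrowed at once inside one function body, but a `&mut self` method borrows the whole struct.

```rust
struct Stats {
    samples: Vec<u32>,
    total: u32,
}

impl Stats {
    // Borrows the whole struct mutably, even though it only touches `total`.
    fn add(&mut self, n: u32) {
        self.total += n;
    }
}

// Inside one function body the compiler tracks the two fields separately.
fn sum_inline(s: &mut Stats) -> u32 {
    let samples = &s.samples; // shared borrow of one field
    let total = &mut s.total; // mutable borrow of a different field: accepted
    for n in samples {
        *total += *n;
    }
    *total
}

// The equivalent loop through the method is rejected:
// fn sum_via_method(s: &mut Stats) {
//     for n in &s.samples {
//         s.add(*n); // error: `*s` is already borrowed through `s.samples`
//     }
// }

fn main() {
    let mut s = Stats { samples: vec![1, 2, 3], total: 0 };
    let t = sum_inline(&mut s);
    s.add(4); // fine here: no other borrow of `s` is live
    println!("{} {}", t, s.total);
}
```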


Ah, so passing a struct as an argument to a function will still take ownership or borrow the entire struct?


Sure, my intention was not to get into a dynamic vs static lang argument. When seeing "partially moved structs" and "ergonomics" in the same sentence, my mind jumped to thinking that there is something in Rust that can make it feel closer to a dynamic lang regarding types ;) Thanks.


Rust doesn't have anything for that. Partial move means that if a function only really needs to access one field on the struct, it would only take ownership of that field, even though it took the full struct as an argument. So two functions could have ownership of different fields of the struct at the same time.

One thing with Rust is that it is really focused on performance and safety. It also cares about ergonomics, but not over performance and safety. Unfortunately, type explosion does make for faster code that can be validated more easily for safety by the compiler. That said, Rust has good support for vectors, those are used a lot, but are homogeneous. It also supports tuples with nice literal syntax and pattern matching on them. While it favours structs over maps, they are pretty lightweight compared to objects, so very small syntax to define them, you can create one out of another, and have a nice literal syntax to create them. Still, this is not the magical world of dynamic data-structures, so don't try and think you can just map/reduce over your structs and tuples, compose and restructure them freely, print out their content, or merge and split them, etc..


So the only thing that is "moving" is the ownership of elements of a structure.

It seems "partial move" actually means "partial transfer".


> So the only thing that is "moving" is the ownership of elements of a structure.

Yup

In Rust, you can move, copy or borrow a value.

Move - means that the responsibility of freeing the allocated memory is moving to someone else. Rust says the ownership has moved.

Copy - means that the full value is copied, and a copy is passed, thus now you have two instances of the value, and each can act independently.

Borrow - means that the access permissions to the value are temporarily transferred to something else, but will come back when done, and the responsibility of freeing the memory is still up to the owner (lender).

That's why it is called "partial move", because it's using Rust's terminology. Transfer isn't in the Rust lexicon, so it wouldn't be as appropriate.
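The three cases side by side, as a small sketch. The function names `consume`, `double` and `peek` are invented for the example.

```rust
// Move: the callee takes ownership and becomes responsible for freeing the String.
fn consume(s: String) -> usize {
    s.len()
} // `s` is dropped (freed) here

// Copy: i32 is a `Copy` type, so the callee gets its own independent value.
fn double(n: i32) -> i32 {
    n * 2
}

// Borrow: the callee gets temporary read access; the caller still owns the data.
fn peek(s: &str) -> usize {
    s.len()
}

fn main() {
    let text = String::from("hello");
    let n = 21;

    let seen = peek(&text);    // borrowed: `text` is usable afterwards
    let doubled = double(n);   // copied: `n` is usable afterwards
    let taken = consume(text); // moved: `text` cannot be used from here on
    // println!("{}", text);   // would not compile: borrow of moved value `text`

    println!("{} {} {} {}", seen, doubled, n, taken);
}
```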


There are no plans at the moment for D to implement partial struct moving.


One problem with modern languages is that when complexity grows as more and more features are bolted on, old languages like C will look like a viable option. The hardest part for a maintainer of any software is probably to keep features out.


> - Higher-kinded types (template template params): it's possible to implement Haskell-style monads etc. in D, if for some reason you felt the need.

Because the templates aren't typed at all. I'm not sure you can call untyped templates "Haskell-style".


While the templates are technically untyped (in the sense that there are no type-classes or traits in D), you can use template constraints to achieve something similar: https://dlang.org/concepts.html. Essentially you'd write a compile time isMonad predicate that only returned true if certain functions were defined for a type.

Even without that, the key thing is it's still possible to implement monads in the C++ style (https://stackoverflow.com/a/2565191/2553416), whereas it's impossible to write code to do the same thing in current Rust. It just gives much worse compiler errors if you use it wrong compared to if the language had built in type-classes / traits.


Template constraints aren't the same thing as typed generics. The key issue is that typed generics influence the typechecker. For example, you can write "let x = Default::default(); f(x);" and the compiler can determine the type of x from the type of f.
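Spelled out as a runnable sketch (the functions `f`, `g`, `via_f` and `via_g` are placeholders invented here): the same `Default::default()` call produces a different type depending on which function later consumes the value.

```rust
fn f(x: u64) -> u64 {
    x + 1
}

fn g(s: String) -> usize {
    s.len()
}

// `x` has no annotation; its type is inferred as u64 from the call to `f`.
fn via_f() -> u64 {
    let x = Default::default();
    f(x)
}

// Same expression, but here it is inferred as String from the call to `g`.
fn via_g() -> usize {
    let y = Default::default();
    g(y)
}

fn main() {
    println!("{} {}", via_f(), via_g());
}
```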

Also, there is the "higher" crate for higher-kinded types if you need them, so it is possible to write a monad typeclass in Rust. It's not exactly something you'd want to use, though.


>For example, you can write "let x = Default::default(); f(x);" and the compiler can determine the type of x from the type of f.

That would be less helpful in C++/D because their compilers don't do Hindley-Milner style bidirectional inference, so even if they had typed generics they wouldn't necessarily be able to do that inference. But I suppose that's an argument in favour of Rust, stronger type inference.

>It's not exactly something you'd want to use, though.

Ergonomic HKT can make writing various kinds of utilities easier. E.g. imagine I want to write a function Singleton that takes a collection C (vector, list, stream...) and an item of type T, and returns C<T>, a collection containing a single instance of that type. I can write that in Haskell, D or C++, but not in Rust. Or perhaps more usefully, a function Fill, that returns a container C containing n items of type T. For Rust it seems I'd need to write a separate implementation of Fill for each container type, I couldn't write one parameterised over any C?


You can do that without HKT, using FromIterator (which is also used by the widely used iterator method "collect").

Here is an implementation: https://play.rust-lang.org/?version=stable&mode=release&edit...

Fill is also possible, using the same trait and iterators.
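Since the playground link above is truncated, here is a rough sketch of what such functions could look like. This is not necessarily the linked implementation; `singleton` and `fill` are illustrative names, and the only library pieces used are `FromIterator`, `iter::once` and `iter::repeat`.

```rust
use std::iter;

// Build any collection C that knows how to collect items of type T.
fn singleton<C, T>(item: T) -> C
where
    C: iter::FromIterator<T>,
{
    iter::once(item).collect()
}

// Same idea for n copies; T must be Clone so the item can be repeated.
fn fill<C, T>(item: T, n: usize) -> C
where
    C: iter::FromIterator<T>,
    T: Clone,
{
    iter::repeat(item).take(n).collect()
}
```

The caller picks the container through the expected type, for example `let v: Vec<i32> = singleton(7);` or `let s: String = fill('a', 4);`. Note that C is always a fully applied type such as `Vec<i32>`, never a bare `Vec`.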


I stand corrected. Could the same technique also be used to have a struct parameterised by C and T that had a C as a member?


You can't partially apply type parameters in Rust; that is HKT. But traits and associated types in traits get you pretty far.
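One way to read "pretty far", as a sketch under assumptions: a struct can hold a container as a member if it is generic over the fully applied container type rather than over a type constructor. `Bag` and `filled` are hypothetical names.

```rust
use std::iter;

// Generic over the complete container type (e.g. Vec<u8>), not over `Vec` itself.
struct Bag<C> {
    items: C,
}

impl<C> Bag<C> {
    // The element type T only appears here, tied to C through FromIterator.
    fn filled<T>(item: T, n: usize) -> Self
    where
        C: iter::FromIterator<T>,
        T: Clone,
    {
        Bag {
            items: iter::repeat(item).take(n).collect(),
        }
    }
}
```

Usage would look like `let b: Bag<Vec<u8>> = Bag::filled(0u8, 4);`. What this cannot express is a `Bag` that takes `Vec` alone and applies it to several different element types internally; that is the partial application the comment refers to.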


"wouldn't necessarily be able"

It goes completely against the most (perhaps) fundamental part of C++ philosophy of overloading functions on argument types.


Overloading does not conflict with type inference: it just means that type annotations are sometimes necessary to get down to concrete types for execution. From "f(x)" we can infer that the type of x is one of f's overloaded argument types.


> - Pure functions: these allow you to statically guarantee that a piece of code doesn't perform any IO and is a completely deterministic function of its inputs.

Is this actually true? As far as I know D's pure isn't the same as eg. Haskell pure, it only prevents modifications of global state, but you can still touch data referenced by arguments. I haven't done much D though so I'm not sure what the semantics really are...


If the parameters to a pure function are marked immutable, then you cannot change data referenced by those arguments. It's functionally pure.

Ironically, sometimes people complain that the purity checks are too stringent :-)

One thing interesting about pure functions in D is the purity restrictions can be relaxed by using a `debug` prefix:

    pure int test(int i) {
      printf("%d\n", i); // error, printf isn't pure
      debug printf("%d\n", i); // ok
      return i;
    }
This makes it very, very handy to debug your functional code.


> Guaranteed memory safety: Yes! (in code marked with the appropriate annotations)

Really? So if a thread passes a pointer to an object to another thread which frees it, the sending thread can no longer reference it? Genuine question.


According to the linked article, by "passing a pointer" the sending thread relinquishes ownership of the object and, yes, cannot access it anymore after the point where it was sent.

Of course this assumes all code in these threads use the ownership+borrowing mode.
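For comparison, this is how the same "sender relinquishes ownership" rule looks in Rust, where it is already enforced. This is a sketch of the Rust behaviour, not of the D proposal; `send_to_worker` is an illustrative name.

```rust
use std::thread;

fn send_to_worker(data: Vec<u8>) -> usize {
    // `move` transfers ownership of `data` into the spawned thread.
    let handle = thread::spawn(move || {
        let n = data.len();
        drop(data); // the receiving thread frees the buffer
        n
    });
    // Touching `data` here would be a compile error (use after move),
    // so the sender cannot reference the freed object.
    handle.join().expect("worker thread panicked")
}
```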


The borrowing scheme works for pointers that are passed in to functions, but not ones returned from functions. As long as you don't go against the grain of the scope, you are fine. It is a natural and simple restriction that can be addressed by local data flow analysis.

The problem, as I see it, is going against the grain of the scope, a problem faced by iterators. For example, consider

    for (p = f(s); p; p = p.next) {} 
won't be allowed, since f may not be able to return a borrowed pointer (like s.ptr). Rust solves this using parametric lifetimes.

In general, I am saying that returning a non-owned pointer from a function is common, and required. Am I wrong in this assumption? If not, is there an alternate architecture to get around this problem? Note that marking the pointer as const is not an option.
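For readers unfamiliar with the Rust side of this, a minimal sketch of the loop above using a parametric lifetime. The `List`/`Node` types and the functions `first` and `sum` are assumptions made up for the example; `first` plays the role of `f`.

```rust
struct Node {
    value: i32,
    next: Option<Box<Node>>,
}

struct List {
    head: Option<Box<Node>>,
}

// The lifetime 'a ties the returned reference to the borrow of `s`:
// the result is a non-owned pointer that cannot outlive the list.
fn first<'a>(s: &'a List) -> Option<&'a Node> {
    s.head.as_deref()
}

fn sum(s: &List) -> i32 {
    let mut total = 0;
    // Equivalent of: for (p = f(s); p; p = p.next) {}
    let mut p = first(s);
    while let Some(node) = p {
        total += node.value;
        p = node.next.as_deref();
    }
    total
}
```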


The solution is to have the function not return a pointer but return a ref. A ref is a non-owned pointer.


Would this be similar to passing argument(s) and return pointers/variables to a C function - or similar with assembler, reserving some registers for return values (aka caller manages memory)?

But with the (D) compiler helping enforce correctness?


Not sure what you mean, but sounds like it is.


Thanks. References work for the iterator example.

I am a bit uncomfortable with not being able to move a pointer out of a structure, but it may be a restriction that's worth it; I can't think of a counterexample off the top of my head.


Oh, nice. It's hopeless for C/C++ for legacy reasons, but D - that could work.

Two memory safety related things to consider. Rust needs unsafe code to do these things.

1) Backpointers. Rust's ownership system doesn't allow backpointers, at least not without reference counts. So some standard data structures, such as doubly linked lists and various types of trees, can't be coded in safe Rust.

A backpointer is defined by two objects which have an invariant locking them together. If you could talk about a backpointer in the language, and tell the compiler where its forward pointer is, you could enforce the necessary invariants. If A contains a pointer to B, B's backpointer must point to A. Any code which modifies either of those pointers must maintain the consistency of that relationship. That's checkable at compile time.

This fits well with ownership. A owns B. B's backpointer is a slave. All that's needed is type syntax which says "backpointer b.reff always points to my owner."

There are lots of interesting cases to be worked out, especially around destructors. Those are worth checking at compile time, because people get them wrong. It's really nice if you can be sure that a data structure always tears itself down cleanly when you drop the link to its root.

2) Partial initialization. Collection classes which allow growth are hard to do in safe code, because they require arrays which are partially initialized. It should be possible to talk about partially initialized arrays in the language. Maybe some predicate such as "valid(tab,n,m)" indicating that array tab is valid from n to m inclusive. You can't access an element outside the n..m range. The checker for this needs to know some simple theorems, like "valid(tab,n,m) and valid(tab,m+1,m+1) implies valid(tab,n,m+1)". Then it can track through a loop and verify that the array is initialized for the given range.

Somebody will complain that they want sparsely initialized arrays containing types that really need initialization. Tell them no.
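As a point of reference for item 1, this is roughly what the "not without reference counts" route looks like in safe Rust today: the forward pointer is a strong `Rc`, the backpointer a `Weak`. The types `Owner` and `Child` and the helper functions are hypothetical; nothing here enforces the A-owns-B invariant at compile time, which is exactly the gap the comment describes.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Owner {
    name: String,
    // Forward pointer: strong, so the owner keeps the child alive.
    child: RefCell<Option<Rc<Child>>>,
}

struct Child {
    // Backpointer: weak, so it does not keep the owner alive.
    owner: Weak<Owner>,
}

fn link() -> Rc<Owner> {
    let owner = Rc::new(Owner {
        name: "A".to_string(),
        child: RefCell::new(None),
    });
    let child = Rc::new(Child {
        owner: Rc::downgrade(&owner),
    });
    *owner.child.borrow_mut() = Some(child);
    owner
}

// Follow owner -> child -> backpointer and read the owner's name.
fn owner_name_via_child(owner: &Rc<Owner>) -> Option<String> {
    let child = owner.child.borrow().clone()?;
    let back = child.owner.upgrade()?; // checked at run time, not compile time
    Some(back.name.clone())
}
```

Dropping the last strong reference to the owner tears down the child link as well, and any surviving `Weak` backpointer then fails to upgrade instead of dangling.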


Backpointers can be more than one step away, for example with a circular linked list, or with a directed (not-acyclic) graph. It's harder to see how to enforce these more complex invariants at compile time (but maybe not impossible).


That's where the language designer has to say "no". Trying to make ownership semantics work for arbitrary graphs is too much trouble.

I had to deal with this once writing a 3D collision detection system. Objects were convex hulls, with vertices, edges, and faces, all pointing to each other. I had an implementation in C to look at, and it was a mess. Ownership was so tangled that they couldn't delete objects.

So I rewrote it in C++, with each object having collections of vertices, edges, and faces. All links between them were indices, not pointers. Lots of asserts to validate all the consistency rules. Easy to delete, and much better cache coherence. Chasing pointers to tiny objects all over memory belongs to the era when cache misses didn't dominate performance.

So that's the answer. Some data structures are too complex for ownership semantics. So do them another way and have all the parts be owned by some master object.
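A small sketch of that indices-instead-of-pointers layout, written in Rust rather than the C++ of the original rewrite. The `Hull`, `Vertex` and `Edge` types are invented for illustration; faces are left out to keep it short.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vertex {
    x: f32,
    y: f32,
    z: f32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Edge {
    from: usize, // index into Hull::vertices
    to: usize,   // index into Hull::vertices
}

// The master object owns every part; parts refer to each other by index.
#[derive(Default)]
struct Hull {
    vertices: Vec<Vertex>,
    edges: Vec<Edge>,
}

impl Hull {
    fn add_vertex(&mut self, v: Vertex) -> usize {
        self.vertices.push(v);
        self.vertices.len() - 1
    }

    fn add_edge(&mut self, from: usize, to: usize) -> usize {
        // Consistency rule checked with an assert, as in the comment above.
        assert!(
            from < self.vertices.len() && to < self.vertices.len(),
            "edge refers to a missing vertex"
        );
        self.edges.push(Edge { from, to });
        self.edges.len() - 1
    }

    fn edge_length(&self, e: usize) -> f32 {
        let Edge { from, to } = self.edges[e];
        let (a, b) = (self.vertices[from], self.vertices[to]);
        ((a.x - b.x).powi(2) + (a.y - b.y).powi(2) + (a.z - b.z).powi(2)).sqrt()
    }
}
```

Dropping the `Hull` frees everything at once, and no borrow checker question ever arises between vertices and edges because they never hold references to each other.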


> Oh, nice. It's hopeless for C/C++ for legacy reasons, but D - that could work.

What in particular do you think would be impossible in C++ for legacy reasons?

From a very high level it seems to me that all the necessary ingredients of this solution are available in C++ (> C++11) as well. There are pointers, smart pointers, references, constness and move semantics for the ownership and borrowing. There are [[attributes]] for the method-selective opt-in.

There is of course always the argument of not making C++ compilers even more complex. But is there some actual technical limitation in fitting the same idea into C++?


You need to prevent copy-paste compatibility with C-like code, which is kind of possible with static analysers, but even then only over the source code you have access to, which rules out any 3rd party libraries shipped as binaries.

So unless you get the buy-in from the whole team, it hardly works across big projects.


> Maybe some predicate such as "valid(tab,n,m)" indicating that array tab is valid from n to m inclusive.

That can be done with a Vec in Rust (n is vec.len() and m is vec.capacity()).
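A quick sketch of what that looks like from the outside, assuming the reading that elements below `len()` are the initialized part and the space up to `capacity()` is reserved but not accessible. The function name is illustrative.

```rust
fn initialized_and_reserved() -> (usize, bool) {
    // Reserve room for at least 8 elements, but initialize only 3.
    let mut v: Vec<u32> = Vec::with_capacity(8);
    v.extend([10, 20, 30]);

    // Index 5 is inside the allocation but outside the initialized range,
    // so safe code cannot read it: `get` returns None.
    let reserved_slot_is_hidden = v.capacity() >= 8 && v.get(5).is_none();
    (v.len(), reserved_slot_is_hidden)
}
```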


Vec is an unsafe built-in. If you can talk about partially initialized arrays, you can write Vec as safe code.


> I’ve been using malloc and free for 35 years, and through bitter and endless experience rarely make a mistake with them anymore. But that’s not the sort of thing a programming shop can rely on, and note I said “rarely” and not “never”.

Speaking as someone who often chooses the bitter path (because that's how I like it), this is an honest look at the state of things that I rarely find in articles about programming.


This seems to be all about heap-allocated objects. Can the ownership machinery reason about lifetimes of stack-allocated objects at all?

For example, i would like to declare a buffer on the stack, read some data into it, make const pointers into the buffer, put those pointers in heap-allocated structs, put those structs in a vector, and do some manipulation. As long as all the heap-allocated structs expire before i return from the function where the buffer was declared, this is entirely safe, but if they escape, i'm in trouble.


> Can the ownership machinery reason about lifetimes of stack-allocated objects at all?

It already does. If you compile with the -dip1000 switch, you'll be unable to escape references to the stack for @safe code.
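For comparison, the scenario from the question (stack buffer, const pointers into it, stored in heap-allocated structs inside a vector) can be sketched in Rust as follows. This shows Rust's checking, not the behaviour of `-dip1000`; `Field` and `count_nonempty_fields` are made-up names.

```rust
// A heap-allocated struct that holds a read-only view into someone else's buffer.
struct Field<'buf> {
    bytes: &'buf [u8],
}

fn count_nonempty_fields(input: &[u8]) -> usize {
    // Buffer declared on the stack, then filled with data.
    let mut buf = [0u8; 64];
    let n = input.len().min(buf.len());
    buf[..n].copy_from_slice(&input[..n]);

    // Boxed structs in a vector, each pointing into the stack buffer.
    let fields: Vec<Box<Field<'_>>> = buf[..n]
        .split(|&b| b == b',')
        .map(|bytes| Box::new(Field { bytes }))
        .collect();

    let count = fields.iter().filter(|f| !f.bytes.is_empty()).count();
    // Returning `fields` instead of `count` would be rejected, because the
    // structs would outlive `buf`.
    count
}
```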


Does this mean that you can't take unique/mutable references to the inside of reference-counted objects? Is there no escape hatch, like Rust's RefCell?


You can, it's just that the RC object itself cannot be implemented with the @live checking on. The RC object can then present itself with a @live compatible interface.
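For context on the Rust escape hatch mentioned in the question, a minimal sketch of `Rc<RefCell<T>>`: the reference-counted handle is shared, and `borrow_mut` hands out a temporary unique reference to the inside, checked at run time. The function name is illustrative.

```rust
use std::cell::RefCell;
use std::rc::Rc;

fn push_through_shared_handle() -> Vec<i32> {
    let shared = Rc::new(RefCell::new(vec![1, 2]));
    let other = Rc::clone(&shared); // second handle to the same object

    {
        // Unique, mutable access to the inside of the reference-counted object.
        let mut unique = other.borrow_mut();
        unique.push(3);
    } // the unique borrow ends here

    // The change is visible through the first handle.
    let snapshot = shared.borrow().clone();
    snapshot
}
```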


So you can call code that's not annotated with borrow checking annotations from code that is?


Yes.


So how do you guarantee that the caller invariants are upheld, e.g., that the unannotated code does not leak a pointer to a different thread, etc.?


> So how do you guarantee that the caller invariants are upheld

You can't, you have to trust the user. That's why one tries to minimize calling non-@live code.


How do you do that in unsafe Rust?


What does that look like? Do I annotate it as @undead?


While I like your style, in D we do it with @trusted annotations.


I still don't understand how that works. How do you avoid having two unique, mutable references to the same value in that case?

For example (pseudocode):

    void f(shared_ptr<Foo> a, shared_ptr<Foo> b) {
        g(&mut *a.field, &mut *b.field);
    }
This is unsafe because a and b might point to the same object.


That is the part of this idea that Walter is tackling first. See https://github.com/dlang/DIPs/blob/master/DIPs/DIP1021.md

The short of it is that the compiler will enforce that the same pointer is not passed to a function more than once at the call site.
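For contrast, here is how the pseudocode above can be made sound in Rust, where the aliasing check on shared handles happens at run time rather than at the call site. This is a sketch of the Rust approach, not of DIP 1021; `Foo`, `f` and `g` mirror the pseudocode names, and the error strings are invented.

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Foo {
    field: i32,
}

// Needs two genuinely distinct mutable references.
fn g(x: &mut i32, y: &mut i32) {
    *x += 1;
    *y += 1;
}

fn f(a: &Rc<RefCell<Foo>>, b: &Rc<RefCell<Foo>>) -> Result<(), &'static str> {
    let mut a_ref = a.try_borrow_mut().map_err(|_| "a is already borrowed")?;
    // If `b` is the same object as `a`, this second borrow fails
    // instead of producing two unique references to one value.
    let mut b_ref = b
        .try_borrow_mut()
        .map_err(|_| "a and b alias the same object")?;
    g(&mut a_ref.field, &mut b_ref.field);
    Ok(())
}
```

Calling `f(&a, &a)` returns an error and leaves the object untouched, while two different handles go through normally.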


Very interesting!

One of the key points here:

> This means that [Ownership/Borrowing] can be added to D code incrementally, as needed, and as time and resources permit. It becomes possible to add OB while, and this is critical, keeping your project in a fully functioning, tested, and releasable state. It’s mechanically auditable how much of the project is memory safe in this manner.


Is the proposed embedding of ownership/borrowing into D complete? Rust has these ugly lifetime annotations which is one of the reasons I do not like its syntax. Why is there nothing like this in the proposal for D? Or why is it not needed?


Most languages with some form of GC (Swift, D, C#) that are looking into adopting some form of ownership look at it from the productivity point of view, where you would only make use of it in the hot path, the very last mile so to speak.


There are already `scope` and `return` annotations for function parameters. My proposal does not require more annotations.


I think this looks very good. The main thing though is that it is opt-in where you want it. If it became the default, or required then it would make the language difficult and unsuitable for many things, but as an option it's great :)


So, if i have a mutable pointer, i can make multiple const pointers which borrow from it - but i'm not allowed to use the mutable pointer while those borrows exist. Makes sense.

Can any of the const pointers outlive the original mutable pointer?

I would guess not, because you're not allowed to let an owning pointer fall out of scope, and you can't pass the owning pointer to a freeing function while the const pointers are alive.

So is there any way to start with only a mutable pointer, and end up with only a const pointer? Say i am gradually building up some object, i want to use a mutable pointer to do that, but once i'm done, i want it effectively frozen, with only a const pointer to it.


The article says that const pointers can't release their memory, so that would lead to memory leaks. (Obviously in some cases you might want to keep that const pointer until the program is finished, but it's not a generic solution to the "I want a pointer that can't be mutated" because often you do want to drop the value at some point)


That's a very good point!

I was coming from a Rust mindset, where ownership and mutability are somewhat separated. Although you can't own an object through an immutable reference, you can own one through an immutable binding. But i am slowly learning that D's model is sufficiently different that there isn't a useful comparison here.
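To spell out the Rust side of that comparison, a small sketch of "build mutably, then freeze" using an immutable binding that still owns the value. The function name is illustrative, and none of this says anything about how D would express it.

```rust
fn build_then_freeze() -> (usize, u32) {
    // Build phase: mutable binding.
    let mut v: Vec<u32> = Vec::new();
    for i in 1..=3 {
        v.push(i * i);
    }

    // Freeze: move the value into an immutable binding. `frozen` still owns
    // the vector, but `frozen.push(16)` would no longer compile.
    let frozen = v;

    // Any number of shared borrows are fine from here on.
    let a = &frozen;
    let b = &frozen;
    let total: u32 = b.iter().sum();
    (a.len(), total)
    // `frozen` goes out of scope here and the owner frees the memory,
    // so freezing does not leak.
}
```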



