The Periodic Table of Rust Types (mearie.org)
150 points by lifthrasiir on Jan 15, 2014 | 85 comments



I think this is a great way of presenting type features that interact! Props to the author for cooking it up


> I think this is a great way of presenting type features that interact! Props to the author for cooking it up

It's a great way to present it, but like Scala's infamous "Periodic Table of Dispatch Operators", stuff like this is rather off-putting. Why use crude sigils when you could just as well use easily understood keywords (like ref, out, unsafe, etc. in C#)?


It's nice to have very common features be syntactically lightweight, and sigils are incredibly lightweight compared to two- or three-character keywords. Rust uses these pointers in all but the most trivial code, so they'll be showing up a lot; in contrast, I suspect as a non-C# programmer that ref and out are not as widely used in most C# codebases as ~ and & are in Rust. There are also not that many sigils, all told—three sigils for pointers plus a few other operators that appear in this table.

Additionally, this table is different from the tables of Scala dispatch operators, Perl operators, &c because of its regularity—it has two axes and a good deal of the information presented is a straightforward function of a cell's position in the table[1]. For example, the entry in the Owned column and the String row is going to be an... owned string, which is going to be prefixed with ~ (like every cell in the Owned column) and have a base type of str (like every cell in the String row). This is a far cry from the table of Scala's dispatch operators, in which there's no consistent indication of what a given operator will necessarily do aside from a general grouping of like operators.

[^1, edit]: That isn't to say that ALL the information is a (predictable) function of a cell's position in the table. Some of that irregularity comes from arguably expected consequences of Rust semantics (e.g. of course you can't have a bare string, because you can only have bare values whose sizes are known in advance, and you don't know how long a string will be) while some is genuinely arbitrary, like the syntactic sugar for functions.
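
For instance, a couple of cells written out as declarations, in the sigil syntax of the table (from memory, so the details may be slightly off):

  let owned: ~str = ~"hello";  // Owned column, String row: an owned string
  let slice: &str = "hello";   // Borrowed column, String row: a string slice
  let boxed: ~int = ~5;        // Owned column applied to a plain value type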


And one of those sigils has effectively been deprecated; it's just & and ~ now. And ~ might turn into `box`.


My understanding was that `box` would be for allocation, but type definitions still need to have the ~ sigil. But I could be mistaken.


Funnily enough, the ~ sigil will probably be retired in favor of box.

    let x = ~"Thing";    // old code
    let x = box "Thing"; // new code, probably
Note: I said "probably" because:

A) Not sure it will be implemented

B) Not sure if that will be the syntax

--- PS: Any link to the dispatch operator periodic table?

Also, I take your Scala and raise you a Perl: http://glyphic.s3.amazonaws.com/ozone/mark/periodic/Periodic...


Re: the dispatch periodic table. This is from a Scala library (an HTTP client called Dispatch) that was someone's first project in Scala and went very overboard on the use of operators. It is generally considered bad design.

The Dispatch periodic table is here:

http://www.flotsam.nl/dispatch-periodic-table.html


Just to avoid any misunderstandings: the linked page is a (very nicely done) joke. These are not Scala operators, but (by now retired) operators of a particular third-party library for HTTP dispatch.


That's a shame; I happen to like the use of sigils to denote pointer types. To my eyes it makes code a lot easier to read.

Out of curiosity, why are they making the change? And will they add the `let x = box Thing` syntax as an optional addition to the language, or will it fully replace the old `let x = ~Thing` syntax?


I think the idea is to make heap allocation taxing to type. Basically, the sigil is too easy to type, so it's easy for an application to just allocate a whup-ass amount of memory. This way, it might force you to think about each allocation.


Aww, I thought this was going to be a table of metal oxides and their chemical & physical properties.


Ah, you're looking for Hacksaw News.

So… would a dialect of Ruby without side effects be called Corundum?


Haha you aren't alone on that one.


This is great, and just what I needed to organise Rust's type system in my head. This should definitely be a part of the official documentation.


can we just make it "table"? There are no periodic trends that I can see.


There are some trends. Types to the right are "subsets" of types to the left; e.g. a &mut T can coerce to a &T; similarly, you can box a T into a ~T, or take references to get an &mut T or &T.
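
A minimal sketch of those relationships, written with the later Box<T> syntax that ~T eventually became (so take the spelling with a grain of salt):

  fn read(x: &i32) -> i32 { *x }

  fn main() {
      let mut a = 1;
      let b: Box<i32> = Box::new(a);  // "box a T into a ~T" (later spelled Box<T>)
      let m: &mut i32 = &mut a;       // take a mutable reference
      let r: &i32 = m;                // &mut T coerces to &T
      println!("{}", read(r) + *b);
  }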


Periodicity originally referred to trends being cyclical with respect to atomic mass; we now know it's with respect to atomic number, and that the period length increases (2, 8, 18, and more), but nonetheless between rows there is an increasing valence that repeats. Hence there is a repeating aspect, the 'period', driven by a connection between row n and row n+1 (row n+1 immediately follows row n in atomic number). There's no such connection between row n and row n+1 in this table of Rust types. The vertical organization is completely arbitrary, there is no ascending value across the table, and therefore it's not periodic.



That was my direct inspiration, though I suck at graphic design. Fortunately, the periodic table of Rust types was much more concise than that of Perl 6 operators...


Structure types?


They're of type T.


This all seems like a mis-feature. Smells of C++.


It's actually correcting the mistakes of C++:

- Ownership: no unsafely shared mutable state, checked at compile time with no runtime overhead. Also, no explicit deallocations!

- A real type system: Rust's type system is inspired by the Hindley-Milner type inference algorithm used in languages like ML and Haskell.
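
A minimal sketch of what that buys you, assuming the later spelling where the owned vector is Vec<T>:

  fn main() {
      let v = vec![1, 2, 3];   // element type inferred, no annotation needed
      let w = v;               // ownership moves to `w`; `v` can no longer be used
      // println!("{:?}", v);  // error: use of moved value `v`
      println!("{:?}", w);     // `w` is freed automatically at the end of its scope
  }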


The thing is, I've used lots of languages with full H-M inference that are much less complicated than this.


> lots of languages with full H-M inference

These are the ones that come to my mind:

- ML family (Standard ML, Caml, OCaml, F#)

- Haskell

None of these are competing in the same space as Rust: fine-grained memory control (read: perf on par w/ C++) with zero-cost abstractions for safety.

You can certainly argue that all of those languages are as safe as Rust, with the lack of nulls and explicit mutability (taken to a new level in Haskell), but you can't say they expose a memory model that actually reflects the underlying system (or is as tunable) to the extent that Rust does.


Rust implements a kind of region typing or linear logic, too, right? That's significantly beyond anything you'll see in garden variety Haskell/ML (though you can embed linear logic in the Haskell type-class machinery with a final encoding [0]).

[0] http://okmij.org/ftp/tagless-final/course/#linear


Yes to both. Lifetimes are basically regions, and unique types are basically affine types. (We usually stick to the C++ terminology, though, for familiarity's sake.)
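
For the region part, a tiny sketch (types spelled the modern way): a lifetime parameter names the region a borrow is valid for, and the compiler checks that the returned reference doesn't outlive it.

  fn first<'a>(v: &'a [i32]) -> &'a i32 {
      &v[0]
  }

  fn main() {
      let xs = [1, 2, 3];
      println!("{}", first(&xs));  // the returned &i32 can't outlive `xs`
  }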


Although OS research is being done with OCaml by the Xen Foundation and Cisco:

http://www.openmirage.org/


This is true. However, Rust is intended to be a systems language, which among other things means no, or at most optional, garbage collection and types for unboxed values.

Like Java generics and subtyping, any given part may be simple on its own, but the combination is not.

Rust tracks lifetimes for stack- and dynamically-allocated values as part of the type system; hence "owned" pointers. Which are horrible by themselves; hence "borrowed" pointers and the resulting lifetime complications. Rust includes traits, which are similar to Haskell type classes and are very nice. However, they come with a heapin' helping of their own complexity.

And so forth. Rust is aiming for a sweet spot somewhere between a relatively-simple Hindley-Milner[-Damas] parametric polymorphism and full-on dependent typing. So far, I think it hits a pretty nice local optimum.
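
To give a flavor of the traits bit, a minimal sketch in roughly current syntax (this is the part that maps most directly onto a Haskell type class and instance):

  trait Area {
      fn area(&self) -> f64;
  }

  struct Circle { radius: f64 }

  impl Area for Circle {
      fn area(&self) -> f64 { std::f64::consts::PI * self.radius * self.radius }
  }

  fn main() {
      let c = Circle { radius: 2.0 };
      println!("{}", c.area());
  }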


There's already plenty of people talking out of their asses here. No need to add to it.


Could you elaborate as to what you'd prefer instead?


Don't forget there are two types of languages:

1. Languages that people moan about

2. Languages that people don't use

I'd love it if a magical syntax fairy would make Rust as easy to read as Python, but I think the complexity fits the domain. Safe manual memory handling is probably gonna look weird because of the guarantees it must make.


A simpler language.


And how do you propose to get a simpler* language with equal control and safety?

*I actually think Rust is relatively simple: once one "gets" ownership, then most of the other hard things follow from it, including lifetimes.


Look at Nimrod (http://www.nimrod-lang.org).


Nimrod is GC'd (reference counting is a GC, and DRC has pauses), or unsafe if you don't use GC. It's not in quite the same space as Rust.


I'm a massive Nimrod fan, but those safety guarantees are void once you use manual memory management. What Rust gives you is safety with manual memory management. Nimrod's GC and type system are both really powerful, but Araq and I were chatting about Rust's type system on IRC for a reason...


I'm probably in the minority here, but I think the answer is "so use a GC". The number of people on earth that I would trust with my life to correctly manage memory in a complex application is very small.


You don't need to trust people using Rust, you just trust the type system.

Admittedly, the memory safety features of the type system haven't been formally verified, but this is a goal, and there is a rather large piece of in-source documentation: http://static.rust-lang.org/doc/master/rustc/middle/borrowck... (I haven't read it, so I have no idea if it will make any sense to someone who doesn't know Rust.)


> The number of people on earth that I would trust with my life to correctly manage memory in a complex application is very small.

The Rust compiler checks it for you, so you don't have to trust anyone. :)


What kind of guarantees does the type system provide? Can I say that no terminating/productive Rust program leaks memory, full stop?


Rust's type system statically guarantees that memory will never be accessed after it is freed, so no segfaults, no use-after-frees, no double-frees, no iterator invalidation, etc. It also statically eliminates data races between concurrent tasks.

In the past Rust has attempted to use the type system to prevent memory leaks in certain cases, but the features that attempted to do so were deemed overly restrictive to use for practical purposes. Nowadays I'm sure it's possible to leak memory if you try. Honestly I've never heard of a Turing-complete language whose type system can provide such a guarantee.
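
For example, this is the kind of program the compiler refuses to build (a sketch; the exact error wording varies):

  fn main() {
      let r;
      {
          let x = 5;
          r = &x;         // error: `x` does not live long enough
      }                   // `x` is freed here...
      println!("{}", r);  // ...so this would be a use-after-free; rustc rejects it
  }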


> Honestly I've never heard of a Turing-complete language whose type system can provide such a guarantee.

SPARK (a dialect of Ada) can, I believe. But it does so by forbidding allocation :)


Doesn't it make a stronger guarantee, that you cannot cause an invalid dereference? In addition to what you mentioned, this would also cover bounds-checking, trying to dereference a pointer that was never allocated, etc.

Also does it enforce that memory is consistently used as a single type? Can you allocate a byte array and then cast it to an appropriately sized array of integers?


  > Doesn't it make a stronger guarantee, that you cannot 
  > cause an invalid dereference?
I'm not knowledgeable enough to answer that question precisely.

However, I can tell you that Rust's type system is not strong enough to obviate bounds checking. I hear you'd need something like Idris' dependent types for that. Rust bounds checks arrays dynamically (there are `unsafe` functions available to index an array without bounds checks), and avoids bounds checking on arithmetic by guaranteeing that fixed-sized integers will wrap on overflow (which is gross, but might be changed to fail-on-overflow if it doesn't hurt performance too much).

  > Can you allocate a byte array and then cast it to an 
  > appropriately sized array of integers?
You can't do this in safe code, but you can in `unsafe` code via the `std::cast::transmute` function, which does still enforce that both types are the same size.
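
A small sketch of the bounds-checking story (using `get_unchecked` for the unsafe indexing function mentioned above):

  fn main() {
      let v = [1u8, 2, 3, 4];
      let i = 2;
      let a = v[i];                            // safe indexing is bounds-checked at runtime;
                                               // an out-of-range `i` panics rather than reading past the end
      let b = unsafe { *v.get_unchecked(i) };  // the unchecked variant requires an `unsafe` block
      assert_eq!(a, b);
  }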


> However, I can tell you that Rust's type system is not strong enough to obviate bounds checking.

That's a bummer. It seems doable, but maybe it is too complex.

> avoids bounds checking on arithmetic by guaranteeing that fixed-sized integers will wrap on overflow (which is gross, but might be changed to fail-on-overflow if it doesn't hurt performance too much).

That would be nice as a default, but I'd be afraid it would hurt performance too much for numerical code. You'd definitely want a way to express that arithmetic should be allowed to overflow (i.e. one that omits the check).

On a related note, one thing that is sorely missing in C and C++ is a way to test whether a value will overflow when converted to a different type (I wrote a blog article about this point: http://blog.reverberate.org/2012/12/testing-for-integer-over...)


> That's a bummer. It seems doable, but maybe it is too complex.

In general, eliminating runtime bounds checking is solving the halting problem.

  let v = [1, 2, 3];
  if halts(some_program) { v[1000] } else { v[0] }
Of course, this doesn't mean that it's impossible in a subset of cases; e.g. people are doing work on fast range analysis for LLVM, which would be able to remove some of the bounds checks some of the time: http://code.google.com/p/range-analysis/ (that analysis also applies to the arithmetic, and apparently only makes things a single-digit percentage slower, 3% IIRC).
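
(As an aside, a lot of checks never happen in the first place because idiomatic code iterates rather than indexes; a rough sketch:)

  fn sum(v: &[i32]) -> i32 {
      let mut total = 0;
      for x in v.iter() {  // no per-element bounds check; the iterator knows the length
          total += *x;
      }
      total
  }

  fn main() {
      println!("{}", sum(&[1, 2, 3]));
  }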


> In general, eliminating runtime bounds checking is solving the halting problem.

Your example does not convince me that this follows. In cases where the compiler cannot statically prove which branch will be taken, I would expect it to require that either path can be executed without error (so in your example compilation would fail). But you could use static range propagation to allow code like this (given in C++ syntax since I'm a Rust novice):

  void f(int x[], unsigned int n) {
    n = min(len(x), n);
    for (int i = 0; i < n; i++) {
      x[i];
    }
  }
Maybe not the greatest example since I would hope that Rust+LLVM would hoist its internal bounds-check to emit something very much like the above anyway. I guess intuitively I just hope that there's more that can be done when it comes to static bounds-checking.


Well, if you throw the halting problem at it then all of your static analysis goes away since you can use general recursion to write arbitrarily typed expressions. That's why things like Idris have termination checkers.


Yeah, there's definitely some fuzziness around the edges here since general recursion can break invariants in hard to detect ways.


You can't even say that about languages _with_ a GC. At least not as long as you use the practical definition of "leaks memory", which is that the memory remains alive until the application shuts down. Here's a simple example of a memory leak in JS in that sense (modulo nontrivial manual cleanup, obviously):

  window[Math.random()] = new Array(100000);
A much more interesting question, in some ways, is what guarantees you have about not accessing no-longer-alive objects. That's where Rust has some serious advantages over C++, say.


I think you could in raw linear logic—you wouldn't be allowed to introduce a type unless you eliminated it later.


There are many problems where a GC is too intrusive (one notable one is: writing a GC). Until you acknowledge this, Rust will make little sense to you (nor will C or C++, for that matter).

The whole problem Rust is trying to solve is that a programmer can do manual memory management without anyone needing to trust that they have gotten it right. The compiler can automatically check correctness (except for unsafe blocks, which are kept to a minimum).


I think the answer is "don't use a language that you don't want to use". If Nimrod works for you, that isn't a reason to bash Rust.


Different use cases, Rust isn't for you it seems.

The whole point of Rust is to make memory management safer.


So why not let those small number of people have access to a language which makes it much easier to correctly manually manage memory?


It doesn't satisfy the safety or control criteria, e.g. it has a compulsory GC.


  > And how do you propose to get a simpler* 
  > language with equal control and safety?
Easy - write your own compiler! :)


Simpler or easier?


Perhaps more words and fewer symbols.


Symbols have been systematically stripped from the language. There really aren't a significantly large number of them anymore, at least in comparison to other curly-brace languages.


You can make a similar table with C++, complete with pointer, pointer-to-member, function pointer, member function pointer, reference, r-value reference, array, std::auto_ptr, std::unique_ptr, std::shared_ptr, const and volatile. :)


Except C++ doesn't have, for instance, a way to mark data as immutable, so the table's going to be a bit lacking...


Err, I'm a big fan of Rust and all, but I'm pretty sure C++ does have const and so on. What C++ lacks is a difference between owned and borrowed pointers, except by convention and so on.


Ah, but C++ "const" doesn't do what it says on the tin! What "const" means is not "constant", but "read-only". Something that's const to you might not be const to something else, so you can never depend on it staying the same.

I may be wrong, but my understanding is that Rust's constants are actually constant, and proved as such by the compiler, which is a major difference over C++.


That's right. If you have an `&` reference to something, the language enforces that it will never be mutated as long as that reference is alive. (Also, if you have an `&mut` reference to something, the language enforces that you're the only one who can mutate it while that reference is alive; that's how iterator invalidation is prevented.)
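
A tiny sketch of the first point, in roughly current syntax:

  fn main() {
      let mut x = 1;
      let r = &x;  // shared reference: `x` is frozen while `r` is alive
      // x = 2;    // error: cannot assign to `x` because it is borrowed
      println!("{}", r);
  }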


I've often wished for this in C and C++, particularly from an optimization perspective. I always hated that the optimizer cannot assume that the contents of a struct to which I have a const pointer won't change out from under me when I call another function. It means that it cannot cache members of this struct in callee-save registers; it has to reload them every time.

I don't know how much of a speed difference it would actually make in practice, but it bothers me that I cannot express this in C++.


Rust's "immutable" references ensure that it cannot be changed from either the reference holder or other safe code that has a reference to the same memory region. Note that (as the periodic table suggests) "mutable" references can be downgraded to the immutable references, but the mutable references are locked while the immutable references are active. Once the immutable references are gone (this is checked by the compiler, see the lifetime guide for details) the mutable references can be used again.


Not only is it "read only," it's not even necessarily read only:

  #include <iostream>

  void f(const int* var)
  {
    *var = 42;
  }

  int main()
  {
    int* ptr = new int(78);
    f(ptr);
    std::cout << *ptr << std::endl;
    return 0;
  }
Compiling produces:

  [scott_s@local Code] g++ const.cpp 
  const.cpp: In function ‘void f(const int*)’:
  const.cpp:5: error: assignment of read-only location
But if we make it:

  #include <iostream>

  void f(const int* var)
  {
    int* sneaky = const_cast<int*>(var);
    *sneaky = 42;
  }

  int main()
  {
    int* ptr = new int(78);
    f(ptr);
    std::cout << *ptr << std::endl;
    return 0;
  }
We get:

  [scott_s@local Code] g++ const.cpp 
  [scott_s@local Code] ./a.out 
  42
C++ gives us escape hatches all over the place. I think that the modern approach of compartmentalizing all escape hatches into explicitly regions of "unsafe" code, and providing no escape hatches outside of such regions, is much better.


> Ah, but C++ "const" doesn't do what it says on the tin! What "const" means is not "constant", but "read-only".

Yes, and that's a huge trap, which is why I'm so annoyed that D has copied 'const' from C++ instead of renaming it to 'view' or 'read'.


C++ const doesn't actually provide constness for purposes of concurrent access unless you follow a bunch of other rules (no const_cast, no use of "mutable", no use of mutable class statics, no use of mutable file-level statics, no use of mutable globals) that in practice people violate all the time even with const objects.


[deleted]


The main purpose of Rust is to be a practical, safe alternative to C++. Go and D are NOT such languages, because of the mandatory garbage collection. From time to time there have been efforts to replace C++ with safer garbage-collected languages such as Java or C#, and every time it has basically failed. Direct control over memory has proven pretty much essential in high-performance software. Thus, we're still stuck with C++, and we're still suffering for it, with nightmarish bugs and myriads of security holes and whatnot.

Rust at least tries to make an honest effort to change that. If and when it gets really fleshed out it'd be a boon to all those poor developers. And please forget your awfully subjective qualms about syntax (seriously, go and program in Haskell if you're all for pretty syntax and cohesiveness - though I can't guarantee you won't find it "disgusting"). Memory management in the C++ way is bloody hard, and it's a dirty affair. It makes no sense to be repulsed by all the complexity there, because it's unavoidable, and it's a job that has to be done. Of course there's going to be a smattering of syntactical constructs to make memory management less tedious, just like there is extra syntactic baggage in bigger Haskell programs to accommodate the fine control over side effects.

Having a memory safe high-performance language with C++-like memory management is a HUGE thing. A while ago I would have dismissed such a thing as a pipe dream, and when I became convinced that Rust could work it was a moment of great rejoicing. Even if Rust ends up on the dumps, there has been at least an effort, and maybe someone will make a better Rust some day.

So please try to understand the purpose and rationale behind Rust before dismissing it in a superficial way.


> Thus, we're still stuck with C++, and we're still suffering for it, having nightmarish bugs and myriads of security holes and whatnot.

There have been alternatives in the form of Ada, Modula-3, Oberon, Cyclone and a few others.

Except for Ada, they were left in the research labs.

Looking at the history of systems programming languages, the only ones that have become mainstream have an OS vendor to thank for it.

So which OS vendor will push for a C++ alternative?


Mozilla is an OS vendor.


So when are we expected to be able to write FirefoxOS Apps in Rust?


Rust has the beginnings of emscripten targeting[1], so in the near future?

[1] https://github.com/Yoric/Mozilla-Student-Projects/issues/33


My idea was for them to run 100% at native speed.

For running inside the browser I'd rather stay with JavaScript and friends.


[deleted]


> D is not a garbage collected language. If the standard library is improved one will be able to ignore the GC.

D isn't safe if you don't use the GC.


It is memory safe as long as you use SafeD mode (@safe).


"When you enter SafeD, you leave your pointers, unchecked casts and unions at the door. Memory management is provided to you courtesy of Garbage Collection."


Stupid me, I was thinking about something else when I commented.


I don't see how you make the leap from a syntax you don't like, to the language lacking cohesion. In my experience Rust code looks perfectly cohesive, and once you have the sigils in your head you simply parse them as what they mean.

To give an example, I don't believe that this code:

https://github.com/chris-morgan/rust-http/blob/master/src/ex...

is any less cohesive than this:

https://gist.github.com/thisishugo/8433714

Obviously each to their own, but I'm finding Rust a very enjoyable language to work with.


If you don't need a systems language then for Christ's sake don't use or learn one. Go use your Go. Shoo. Also, you care about syntax; others care about safety and correctness. It's not your cup of tea, but there's no need for all this "this is a horrible language" crap. I dare you to have a face-to-face discussion with anyone on the development team and see if you'll talk so big then.


[deleted]


> Go is also a systems programming language so I don't know what you mean. I would like to use Go, yes, and I do when I have a chance.

Go is a fully garbage collected language that doesn't offer a lot of support for manual memory management (and that isn't intended to attack Go—it just wasn't one of its goals). Rust is a low-level systems language that allows full control via zero-cost abstractions, but is safe. Achieving that goal requires some new machinery and syntax. Systems programming often entails writing for environments in which you can't use a GC (or even any runtime at all) for performance or other reasons.

> Also your point about syntax vs safety and correctness makes no sense, since it implies that those are mutually exclusive so I'm just going to ignore that one.

Rust has a unique feature—manual memory management with safety—so it needs syntax for it. It's not just syntax for its own sake—it's what gives the language a niche all its own.


>Go is also a systems programming language

No it isn't.



