Hacker News new | comments | show | ask | jobs | submit login

Because you can build things like:

* A faster grep ( https://github.com/BurntSushi/ripgrep )

* A GPU accelerated terminal emulator ( https://github.com/jwilm/alacritty )

* A web browser ( https://servo.org/ )

* A containerization system ( https://github.com/tailhook/vagga )

* An operating system ( https://github.com/redox-os/ )

* An extremely fast text editor ( https://github.com/google/xi-editor )

And be faster and safer than C/C++.

Offtopic: using symbols make it hard to read for an outsider


These symbols are in fact very important in Rust and reading the book makes their usage very clear.

I get that it may be hard to read if you're not familiar with the language, but so is * and & if you're not familiar with them in the context of pointers. Sometimes however, a language feature calls for a special symbol as is the case here.

The usage of apostrophes in English also doesn't make much sense for an outsider, but they're very much a necessary part of the language and very easy to use if you're an English speaker. Same applies for Rust.

To add to that point...

I'm a beginner to Rust and this was my exact reaction (Too much complex syntax!!), but one conclusion I have come to since, is that one of the really nice thing languages like Python do, is that they just don't deal with a whole bunch of CS issues (eg: everything is a reference to an object) or are very opinionated about it (ownership/lifetime is bound to scope, no way to extend/change it). By doing this, not only are the language simpler, but they also just need less symbols.

Is there a reason why all the above software cannot perform as "fast" or "safe" as Rust when written in other programming languages? After all, every program compiles down to machine code/assembly.

As soon as you need to be able to access low-level pointers for performance, you run into a problem: you can easily end up holding onto a reference to memory that's been cleared, or otherwise invalidated, by some other piece of code. This type of bug is insidious, and can easily be missed by the most stringent unit tests and even fuzzing. Of course you can try to get around this with abstractions on top, as every higher-level language does, and as you can do in C++ if you're willing to build reference-counted pointers around absolutely everything... but these are not zero-cost abstractions.

What Rust does is track reference lifetimes at compile time, giving you certainty about who can safely "own" or "borrow" memory in every single line of code, without any runtime pointer indirections or other slowdowns. The language is built around this feature at every level, with "lifetimes" being a syntactic construct at the level of type and mutability.

Imagine if you wanted to safely parse user-submitted JSON, maybe cache derived products from that JSON, and then make sure that when the input buffer is released, you weren't holding any handles into strings inside it. The only safe way to do that in any other language is to proactively copy the strings, or rigorously reference-count the input buffer. But Rust has your back here. If you use zero-copy deserialization from Serde ( https://github.com/serde-rs/serde/releases/tag/v1.0.0 ) then the compiler will refuse to compile if you are using any of that data longer than the lifetime of the original buffer, and do so without needing to add any runtime bookkeeping.

Yes, it's an annoying language to learn because of that "fight with the borrow checker." I LOVE that the language designers and steering committee are so open to quality-of-life improvements for newbies, like that string warning. The language will only get easier to learn over time. It may never be what you use to make your next app, but if you're doing systems programming, it's the wave of the future.

So, I recently saw a study claiming that memory management-related issues constituted 25% of attacks, the rest was other vectors. I may be misremembering, but in any case, simply getting rid of memory faults does not get rid of all attack vectors.

Given this, I wonder if the (seemingly) added complexity of Rust could result in more attack surfaces of other kinds.

I don't know anything about Rust mind you.

Rust is "fast" because it runs close to the bare metal, like C or C++. It doesn't feature garbage collection, many of which stop the world to clear memory.

Other non-garbage collected languages (ie those with manual memory management) lack Rust's memory safety semantics and are this subject to segfaults, buffer overflow exploits, etc. Rust is extremely "safe" since it prevents these types of errors at compile time.

So why does C, for example, lack Rust's memory safety semantics? Is it something to do with the design of the language itself? Can Rust predict user input, whereas C cannot?

Because Rust (the language) makes you reason about and define the concepts of ownership, borrowing and lifetimes of variables, and enforces this to the degree that certain classes of bugs are not possible. C does not require (or natively support) that, so this information is not available to the compiler.

It's just like typed and untyped languages. Typed languages require more up front work in that you must define all the types and data structures and which functions can accept them. This is more work than just creating them ad-hoc and using them as needed, but it prevents certain types of errors by catching them at compile time. The ownership and lifetime information for variables is loosely equivalent to that. It prevents certain types of problematic usage. It isn't perfect, and sometimes you have to work around its limitations, but the same could be said of most type systems.

There are plenty of primers on this feature of Rust, I advise you to take a look, you might find it very interesting.

There is a lot of "undefined behavior" (UB) in C, including straightforward stuff like overflowing signed integer addition. More insidiously, multithreaded code can be quite hard to write in C, because it's very very very easy to trigger UB in your multithreaded code. For example, if you have a shared variable that's protected by a lock, it's pretty easy to accidentally forget to lock the lock (or lock the wrong lock) before accessing the variable, and now you've invoked undefined behavior. Rust doesn't allow you to make those mistakes.

To be clear, Rust's model of locking data rather than locking code is really lovely, but that doesn't mean that it's not possible to mess up locks: Rust only prevents data races, not race conditions in general.

(However, you're correct in that it's not undefined behavior to mess up locking in Rust, at least not without an `unsafe` block involved.)

True, you can certainly deadlock in Rust or do other logic mistakes. What I was trying to get at is you cannot access data protected by a lock without holding the lock, and you cannot leak any references to that data past the unlock either, so you cannot stray into undefined behavior by accessing the same data from multiple threads without appropriate locking/synchronization (like you can do oh so easily in C). At least not without an `unsafe` block.

You know all this, of course. I'm just commenting for others' sake.

They're different languages with different philosophies. C provides a set of very powerful and potentially dangerous tools (direct memory access and management, for instance), and does not police how you use them. Rust want you to carefully explain what you want to do with those tools via its ownership system, unless you opt for "unsafe".

Rust is like the safety mechanism on a sawblade that shuts off once it realizes it's cutting into your finger.

Not sure which C compilers you're referring to. If you mean Clang which is also based on LLVM. https://clang.llvm.org/

Clang is a competitor to GCC.

I did not mention compilers in my comment. Do you mean that if I use LLVM compile a C program then I get the same assurances as when I compile a Rust program?

LLVM doesnt speak C, it is an optimization and codegen layer. Both rustc and clang output to LLVM.

Rust's main benefit is in the compiler itself, not optimization and codegen.

LLVM is a compiler construction, not a compiler.

If I understand, no languages offer the same assurances, I remember GodBolt is a nice way to explore how it's compile to assembly code you can compare.

https://rust.godbolt.org https://gcc.godbolt.org https://go.godbolt.org

Help me out. What is compiler construction?

Most compilers can be broken up into two steps, which is what we call a front end and back end. [1] The front end of a compiler does syntactical (parsing + lexing) and semantic analysis (type checking, etc). The back end of a compiler takes in an intermediate representation of the code, performs optimizations, and emits the assembly language for a target CPU. Clang is an example of a front end and LLVM is the back end for Clang. Clang and rustc both share LLVM as a back end, meaning they both emit LLVM IR.

[1] Many compilers have much more than two stages. For example, Rust has another intermediate representation called MIR.

Many thanks, kind stranger.

If your team of programmers tried to build something like Servo in assembly, you would never complete the task because the task is so laborious and error prone. Finding and correcting all the bugs in your hand-written assembly isn't feasible.

So it's theoretically possible to express such a program in assembly language, but it's not something humans could realistically produce without tools such as Rust.

It might be able to, but there are no guarantees. That is the real benefit of strong and expressive type systems: It can prove properties of the program at compile time. An equivalent program could be written in c, brainf*ck, or even a Turing machine, but it gets harder and harder to prove properties like memory safety the less structured the language.

Let's say you are using LLVM to compile a Rust program, and an "equivalent" C program. You can compile both of them down to IR, and then enforce type safety at the IR level. Doesn't that ensure that you can prove properties about the program at compile time?

Possibly, but types can encode far more than just the structure of data. Rust, for example, uses types to encode lifetime and ownership information. Haskell uses the IO monad to encapsulate non-determinism. Neither of those have equivalent concepts at the IR level.

It's not a set law, but more expressive type systems almost always increase the class of properties that can be "easily" proved in a language. I work on a verification tool for C/C++ programs and we constantly struggle with the languages. Pointer arithmetic and aliasing dramatically complicate any possible analysis, and these problems are only exacerbated at a lower level IR/ASM level.

You can't enforce type safety at the IR level. LLVM IR has very little in the way of type information.

I've been manually writing some LLVM IR recently to prepare for a project involving JIT compilation, and LLVM's type system is actually shockingly expressive. The majority of the problems I run into are the fact that you have to copy-paste more often and that leads to errors.

I wouldn't recommend anyone write real code using LLVM IR, but it's not as bad as you'd expect.

The hard part is getting the "equivalent" C program. :)

Just to the same links to your question.

https://gcc.godbolt.org https://rust.godbolt.org

> Is there a reason why all the above software cannot perform as "fast" or "safe" as Rust when written in other programming languages? After all, every program compiles down to machine code/assembly.

Yes, the reason is things like garbage collection and language runtime. Every program does ultimately run some form of machine code, but the amount and type of code generated can vary very widely, not even considering things like VM where you have another couple of layer of abstraction that slow things down.

Yes, in theory all programmes in all languages eventually run machine code. However some programming languages (e.g. Python) will do things that you can't get away from, like reference counting. So you'll be running extra machine code and you can't "turn that off". The designers of that language have made valid trade offs to result in that.

The GP comment mentioned Ruby, Python and JS which are not suitable for this kind of things.

Of course you can use some other compiled language instead of Rust, I think the choice boils down to productivity and ecosystem.

EDIT: where I said "some other compiled language" I should have really said "some other compiled and non-garbage-collected language"

Rust is based on LLVM just like Swift, Pony, Crystal, etc to compile to native code. Java uses bytecode which will need to translate to machine code in JVM.

Node.js, Java, Go use Garbage Collection for use case where programmers do not have to manage memory.

You could use LLVM to compile any language to LLVM IR, and then to machine code using a backend. Does that mean every language has the same properties as Rust?

If Node/Java/Go use GC (or VMs), then aren't they more safe than Rust?

> You could use LLVM to compile any language to LLVM IR, and then to machine code using a backend. Does that mean every language has the same properties as Rust?

    Nope, it's largely depend on the language design.
> If Node/Java/Go use GC (or VMs), then aren't they more safe than Rust?

    Yes and No, memory allocation is one of the issue in GC. Go is generally more safe on networking but not security where Rust can manage it securely.
This is a good read on Java vs Rust: https://llogiq.github.io/2016/02/28/java-rust.html

https://news.ycombinator.com/item?id=14173716 https://github.com/stouset/secrets/tree/stack-secrets

In fact, we don't have to bother much with GC because it's depend on programmers and job availability. GC was created based on the idea that manage memory is hard for large scale project. It work well for Azul Systems, and they have recently advertised for LLVM engineer to bring more performance where JVM could not.

Parts are completely untrue. The JVM has the advantage to selectively jit-compile hot or small functions, whilst Rust has to compile all at once. Rust binaries are thus big, whilst you don't ship JVM binaries, you ship small bytecode.

Go has the advantage of memory safety (via GC), plus better concurrency safety, which is lacking in Rust. There are concurrency safe languages, but not mentioned in this thread.

> Rust binaries are thus big, whilst you don't ship JVM binaries, you ship small bytecode.

This is not borne out by practical experience. While we've not experimented much with Rust, our Java deployments are significantly larger than other native languages like Go (and I expect Rust would actually be a bit smaller than that since it requires less runtime than Go).

JVM deployed binaries are large, not especially because the bytecode is large, but because you have to ship all the bytecode for all your code and all its transitive dependencies; there's no linker and the semantics of the language make it essentially impossible to statically prove that individual functions or classes aren't needed. You can trim it down with tools like Proguard, but that's a non-trivial undertaking and prone to error, which again you won't know until runtime.

Plus the drawback that you need a relatively large VM to run a JVM binary, but you can run Rust binaries completely standalone (out of a scratch container if you want).

> Go has the advantage of memory safety (via GC), plus better concurrency safety, which is lacking in Rust.

I'm curious what you mean by "better concurrency safety". My understanding is that Rust attempts to statically prove that concurrent accesses are safe (e.g. https://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.h..., especially the section on locks). Go does nothing of the sort - it provides some nice concurrency primitives and tooling to detect data races, but the compiler does nothing to actively prevent it.

Go, in my understanding, is not memory safe even with its gc, and also doesn't have "concurrency safety" because of this. That's why https://golang.org/doc/articles/race_detector.html exists.

Rust however prevents these kinds of errors at compile time.

Which things were you thinking of that Rust was lacking here?

The go race detector is still miles better than enforcing manual mutexes and locking in concurrent Rust code.

Much better would be a proper type system to get rid of those at compile-time of course. Look at pony. And a better memory-safety system than RC.

You don't always need those; it depends on what you're doing. And furthermore, that _is_ the compile time system that enforces things. Go's race detector can only prove the presence of things, not their absense.

As for Pony,

> Pony-ORCA, yes the killer whale, is based on ideas from ownership and deferred, distributed, weighted reference counting.

That's it's GC.

What you mentioned JIT is exactly my thought when IBM explained why they like about Swift over Java in JVM.

More technical explanation: https://www.ibm.com/support/knowledgecenter/SSYKE2_7.0.0/com...

"In practice, methods are not compiled the first time they are called. For each method, the JVM maintains a call count, which is incremented every time the method is called. The JVM interprets a method until its call count exceeds a JIT compilation threshold. Therefore, often-used methods are compiled soon after the JVM has started, and less-used methods are compiled much later, or not at all. The JIT compilation threshold helps the JVM start quickly and still have improved performance. The threshold has been carefully selected to obtain an optimal balance between startup times and long term performance."

> Go has the advantage of memory safety (via GC), plus better concurrency safety, which is lacking in Rust.

    Are there any examples you could list?
One of Go GC issue is based on this stouset's discussion https://news.ycombinator.com/item?id=14174500

Java bytecode looks rather large to me.

Its only fair to also write "Also be slower but definitely more safer than C++".

Rust should not be slower. If so, it's a bug. Please file them.

Just like with Javascript, everything must be rebuilt again!..

Except Rust has incredible memory safety, without cumbersome and slow GC, and runs close to C++ speeds, unlike the heavy, wasteful, and slow frameworks you mention like Electron or HTML5.

Rust is a modern C/C++ replacement, or at least tries to be. That's no mean feat.

Plus alacritty looks great and is much faster and lighter than most other terminal emulators, ripgrep is a great replacement for grep and can be a serious improvement when you are searching through millions of lines on thousands of giant files. To mention just the two of them...

Alacritty is full of render issues and missing features. It's fast because it barely does anything, and doesn't even do it correct. Just check the ever growing laundry list of issues it has on Github.

With that said, it's a very interesting project. Especially with regards to its rendering engine.

I only use ripgrep these days.

Except that there are other languages like Go, Crystal, D, Nim that all offer memory safety ( thanks to there GC ), with a light GC ( like reference counting ). And based upon the same benchmarks that Rust in participates, they are close or even faster at times, with close or better memory usage.


GCs are nondeterministic amongst other issues. Baking one into the core of a systems language likely isn't a good idea.

It worked out alright for Algol 68RS, Mesa/Cedar, Modula-2+, Modula-3, Oberon, Oberon-2, Active Oberon, Component Pascal, Sing#, System C#.

Those systems failed market adoption mostly due to politics and company acquisitions than technical hurdles.

Some applications are easier without a GC. Embedded work comes to mind. Realtime guarantees for systems with GC are also somewhat tricky (though not impossible to achieve).

True, but just because a systems language has a GC doesn't mean one has to use it everywhere, it is the only mechanism to allocate memory or that it has to run all the time.

Modula-3 is a good example of how to support all scenarios required by a systems programming language with GC, unfortunately Compaq buying DEC followed by HP buying Compaq, killed all the work the SRC group was doing.

For embedded work check the Oberon compilers sold by Astrobe for ARM Cortex-M3, Cortex-M4 and Cortex-M7, as well as, Xilinx FPGA Systems.

Baking unsafe memory languages with limited type systems and blocking libraries for the hard parts is neither a good idea for systems.

Try these benchmarks instead: http://benchmarksgame.alioth.debian.org/u64q/rust.html

Rust should be beating the majority of those languages in well-implemented comparisons.

> light GC ( like reference counting )

Refcounting is often one of the slower ways to implement a GC. It also has other issues, like long GC pauses when a large structure goes out of scope.

I hope you do realize that a good GC is usually faster then refcounting, and comparing it to slow GC's does not prove anything.

For fairness you need to compare Rust to D or Pony or SBCL, which are also close to C++ (even faster), plus added concurrency safety or memory safety, which can be circumvented in Rust.

> I hope you do realize that a good GC is usually faster then refcounting, and comparing it to slow GC's does not prove anything.

In idiomatic Rust code, you typically have very few reference increments/decrements, making it faster and more efficient than both, GC and traditional RC-based approaches.

The reasons for this are that objects are often allocated on the stack, passed by reference (a safe pointer), and directly integrated into a larger structure, requiring fewer heap allocations and very few – if any – refcounted objects. In Rust, unlike C or C++, you can do this safely and ergonomically because Rust enforces clear and well-defined ownership semantics.


You haven't actually demonstrated any problems, just repeated that they exist.

Unsubstantiated criticism usually gets downvoted on HN, Rust or not.

> I do know rust, their RC problems, their unsafe problems, their concurrency and locking problems and their type system problems.

Please elaborate. In particular, what RC and concurrency problems?

Yeah, hard to imagine that things can improve over time!

try ripgrep, it'll blow your socks off.

What makes ripgrep fast (AFAIK) is mainly using mmap() instead of open()/read() to read files, and relying on Rust's regex library that compiles regexes to DFAs which can run in linear time. Those are things that you can do just as well in C.

To witness, https://github.com/ggreer/the_silver_searcher (aka "ag") is about as fast as ripgrep, but written in C.

This is not to say that Rust does not have benefits. But the benefit is not "speed", but "speed plus security".

Here's a much more thorough post on why it's fast. mmap is only a small part of it (a bunch of the grep implementations mmap)


That's a great blog post. The `linux_literal` section is particularly interesting re: mmap.

Ripgrep only uses mmap() when it is searching very few files. For anything beyond that it uses intermediate buffers.

I switched from ag to rg a while ago. The speed benefits are noticeable.

> Those are things that you can do just as well in C. [..] But the benefit is not "speed", but "speed plus security".

But nobody said otherwise? I don't understand your point. Speed + safety is indeed precisely the point. I would implore you to do your own comparative analysis by looking at the types of bugs reported for these search tools. (I can't do this for you. If I could, I would.)

> What makes ripgrep fast (AFAIK) is mainly using mmap() instead of open()/read() to read files,

I think you're confused. This is what the author of the silver searcher has claimed for a long time, but with ripgrep, it's actually precisely the opposite. When searching a large directory of files, memory mapping them has so much overhead that reading files into intermediate fixed size buffers is actually faster. Memory maps can occasionally be faster, but only when the size of the file is large enough to overcome the overhead. A code repository has many many small files, so memory maps do worse. (N.B. My context here is Linux. This doesn't necessarily apply to other operating systems.)

> and relying on Rust's regex library that compiles regexes to DFAs which can run in linear time.

There's no confusion that such things can't be done in C. GNU grep also uses a lazy DFA, for example, and is written in C.

The "linear time" aspect doesn't show up too often, and none of my benchmarks[1] actually exploit that.

There's a lot more to the story of how ripgrep beats the silver searcher. "About as fast" is fairly accurate in many cases, but to stop there would be pretty sad because you'd miss out on other cool things like:

    - SIMD for multiple pattern matching
    - Heuristics for improving usage of memchr
    - Parallel directory iterator (all safe Rust code)
    - Fast multiple glob matching
In many cases, this can make a big difference. Try searching the MySQL server repository, for example, and you'll find that the silver searcher isn't "about as fast" as ripgrep. (Hint: Take a peek at its .gitignore file.[2] This has nothing to do with memory maps, SIMD or linear time regex engines.)

And yes, I could have done all of this in C. But it's likely I would have given up long before I finished.

[1] - http://blog.burntsushi.net/ripgrep/

[2] - https://github.com/mysql/mysql-server/blob/5.7/.gitignore

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact