The part of GC which causes the most latency issues is compaction rather than me...

gwbas1c · on Aug 5, 2020

IMO, the main advantage of Rust is that it doesn't require an extensive runtime in order to have memory safety. This allows you to write a library like an image processor or embedded database without using C / C++.

Otherwise, if you wrote your image processor in C# or Java, it becomes hard to call your library from Python or Node because you have to require the entire VM. Likewise, you can ship an application binary that has no requirements for a runtime. (Your application binary doesn't require a JVM, CLR, Mono, Python, Node, or some other runtime.)

I've been through the Rust book twice but I'm just getting to the point of trying to write something in it. The mental model is very different. Coming from C# / Java / Javascript / Objective C; I'm wondering how many hours I need before I can get my head into Rust?

pizza234 · on Aug 5, 2020

I'm learning Rust as well; in my opinion, starting with small guided projects is the most stimulating and incremental approach, although unfortunately, I find that starting practicing Rust - differently from other languages - requires "having read the whole reference".

The resources I've reviewed are:

- Rustlings: I personally don't like the project; it's exercises for the sake of exercising, which may be good or not, depending on the person/approach.

- Ferrous systems exercises (set of mini projects): very small and simple, but interesting to work on. I think they're a very good resource to try immediately after reading the book.

- Rust Programming By Example (book): if one finds one or more projects interesting to work on, this is a very fun book. it's "not-perfectly-produced" though, if you know what I mean.

- Hands-On Data Structures and Algorithms in Rust (udemy): even if one doesn't like algorithms, I think working with data structures implicity develops an understanding of "getting the head into Rust".

- Build your own Jira with Rust (https://github.com/LukeMathWalker/build-your-own-jira-with-r...): An exercise-based approach to practice; probably a good alternative to the previous resource, for those who really don't want to work with A./D.S..

- Talent Plan (https://github.com/pingcap/talent-plan): like the previous, but more technical.

Have fun :-)

gwbas1c · on Aug 5, 2020

> I find that starting practicing Rust - differently from other languages - requires "having read the whole reference".

And that's the problem that I have. In high school I was handheld into C / C++ with weekly lessons. By the time I started my career I abandoned C because the things I worked on professionally had no benefit from manual memory management.

Now the thing that I want to write, an embedded database, requires manual memory management and no runtime. I could, in theory, go back and do it in C. It'd be slow working in a language that I haven't done anything in since 2002, but at least I'm familiar with all the conventions.

Do I basically need to spend 40-80 hours doing silly exercises just to ease into the new conventions and mental model?

pizza234 · on Aug 5, 2020

It's not really clear what you mean with "silly exercises". One of the resources is a course for building "a [...] networked, parallel and asynchronous key/value store", which is far from being a "silly exercise".

Even ignoring that, it's a matter of big picture.

If learning Rust is only for this project, or only to be "quickly proficient in a new language", then I don't think it fits the specific case. The may be alternatives; I don't have experience (somebody else can surely advice better) something like C++ with smart pointers or memory safe D, I guess, could fit.

In the big picture of a career, or even in the context of a single company, spending 40/80 hours to be proficient in a language is essentially an insignificant time.

gwbas1c · on Aug 6, 2020

> It's not really clear what you mean with "silly exercises"

At this point in my experience, if I want to learn a language, I write something "easy" in the language that I want to write for my enjoyment.

For example, when I was between jobs I wrote a personal blog engine in NodeJS so I could get up to speed in modern Javascript and the node ecosystem: https://github.com/GWBasic/z3

"Silly exercises" implies a programming exercise that has little point outside of instructing a basic concept: The kind of exercises I did in high school when I learned C are an example; there was no outside purpose to the code itself. IE, there's no tangible use to the code when it's complete.

What I did 3 years ago was write a small program in Rust that opens links listed in a text file. (I've written many versions of this program over the last 18 years, mostly for self-education.) When I first wrote the program, it was mostly copy & paste, but it compiled even though I didn't understand most of it.

Last night I decided that I was going to recompile it on Windows as my first exercise. I had to change the "open a link" library because it only compiled on Mac, which required changing some code: https://github.com/GWBasic/open_links

Now I'm going to try porting my in-browser Javascript in Z3 to Rust + WebAssembly. Let's see how far I can get!

pjmlp · on Aug 5, 2020

On the other hand it requires jumping through hoops to make borrow checker friendly architecture designs, or fiddle your code base with Rc<> types everywhere.

And when their count reaches 0, you have your stop the world, unless you move the destruction into a background thread, thus manually emulating a tracing GC.

pkolaczk · on Aug 5, 2020

When RC drops to zero it is not STW. It is stop the current thread only. And even that can be trivially solved by background deallocation. Solving latency problem of GC is far from trivial.

pjmlp · on Aug 5, 2020

As if stop the current thread isn't the world the world in single threaded programs.

I explicitly mentioned how moving into a background thread is poor man's tracing GC.

Languages like D and C# offer mechanisms for deterministic deallocation, while keep the productivity of a tracing GC.

orneryostrich · on Aug 5, 2020

But if your program is single-threaded, you don’t need an `Rc` type. The whole point of `Rc` and `std::shared_ptr` is so an object can have one owning pointer per thread, in cases where you’re not sure which thread will finish using the object last.

pjmlp · on Aug 5, 2020

Sure you do, try to implement a Gtk-rs application without the Rc<RefCell<>> dance for widgets state.

pkolaczk · on Aug 5, 2020

Finalizers (IDisposable) or try-with-resources are not equally strong as deterministic destruction in C++ or Rust. Or did you mean a different feature for deterministic destruction in C# that I don't know of? I'm quite curious.

pjmlp · on Aug 5, 2020

You can stack allocate objects, so those are alive just until the end of the stack.

Then there native memory allocation and safe handles.

IDisposable and Finalizers aren't the same thing, actually, although they happen to be used together as means to combine deterministic destruction alongside GC based destruction.

You can also make use of lambdas or implicit IDisposable implementations via helper methods, that generate code similar to memory regions or arenas in C++, but in .NET.

Finally, many tend to forget that .NET was designed to support C++ as well, so it is also possible to generate MSIL code that ensures deterministic destruction RAII style, naturally this falls into a bit more advanced programming, but it can be hidden away in helper classes.

algorithmsRcool · on Aug 5, 2020

> You can stack allocate objects, so those are alive just until the end of the stack.

Well, not really. You cannot stack allocate anything but primitive buffers. So no objects or even strings. So it cannot replace heap allocation for anything but smallish "arrays" of simple types like char and int. This also means you can't use normal structs/value types, only primitives.

> You can also make use of lambdas or implicit IDisposable implementations via helper methods, that generate code similar to memory regions or arenas in C++, but in .NET.

> Finally, many tend to forget that .NET was designed to support C++ as well, so it is also possible to generate MSIL code that ensures deterministic destruction RAII style, naturally this falls into a bit more advanced programming, but it can be hidden away in helper classes.

I'm pretty sure there is no way in C# or in MSIL to explicitly free/deallocate a heap allocated object.

MSIL defines a Newobj opcode, but no Freeobj or anything like it that I have ever seen. You can use custom allocators or unmanaged memory to deterministically allocate and free buffers of structs/values types. But only those that do not include references to managed object references, otherwise you would need to pin and the references yourself and keep the GC aware that there were non-tracked references to those objects. It gets messy fast.

pjmlp · on Aug 6, 2020

It is so easy, write the code that you want in safe mode C++/CLI, get the code template, then implement the helper classes to generate the same MSIL on the fly.

As for stack allocation, apparently you missed structs.

Here is your string allocated on the stack.

    unsafe struct cppstring
    {
        const int BuffSize = 1024;
        fixed char data[BuffSize + 1];
        int current;
        
        public cppstring(System.ReadOnlySpan<char> buffer)
        {
            for (int i = 0; i < System.Math.Min(BuffSize, buffer.Length); i++)
                data[i] = buffer[i];
            current = 0;
        }
    }

    public class StackDemo {
        public void Myfunc() {
            var str = new cppstring ("Hello from stack");
        }
    }

Providing std::string like operations is left as exercise for the reader.

algorithmsRcool · on Aug 9, 2020

But that object is not a string, it is char buffer. You cannot directly substitute one for the other.

Yes I realize they can be used to store the same data, but you've defined a new type that is incompatible with an actual string object.

shpongled · on Aug 5, 2020

Easiest way to get started if you're coming from a GC background is to just liberally `.clone()` everything in Rust. Once you're used to move semantics and the syntax, then you can start messing around with borrowing. It definitely has a learning curve, but I find it a breeze to write once you grok the ownership rules.

nestorD · on Aug 5, 2020

One thing I love about Rust is that allocation are very explicit and easily spotted which helps a lot when one wants to avoid them.

I found C++ to be treacherous around corners cases on this subject.

Matthias247 · on Aug 5, 2020

I actually don't think so. I've seen plenty of Rust libraries that copy strings about 5 times unnecesssarily between usages, just because `.clone()`, substrings and co are so convenient. Those all could have been optimized away, but the authors of that code didn't knew or didn't try.

And if you do `MyAwesomeStructure::new()` you might actually trigger a whole bunch of allocations which are invisible.

So a "yes" from my side on Rusts ability to remove allocations if you try hard enough. A "no" however on allocations being extremely explicit and easy to see for non experts.

nestorD · on Aug 5, 2020

dthul's comment [0] covers my point of view fairly well. Clones and new are easy to spot, you might not be careful about them because you do not care or have other priority but you can find/grep them quickly: they are explicit.

Meanwhile C++ has a lot of implicit memory allocation and things that might or might not allocate.

[0]: https://news.ycombinator.com/item?id=24061235

pjmlp · on Aug 5, 2020

Unless you are going to review the whole code, written from scratch, there is no way to actually be aware of all allocations in Rust without help from a memory profiler.

dthul · on Aug 5, 2020

Just one or two days ago I asked here on HN how memory allocations in C++ are considered to be more hidden than in Rust and got some good replies: Especially constructors, copy constructors, assignment operators etc. can introduce non-obvious allocations.

For example:

  T t;
  a = b;
  T t2 = t;

can all allocate in C++. The equivalent in Rust:

  let t: T; // won't allocate
  let t = T::new(); // might allocate
  a = b; // won't allocate
  let t2 = t; // won't allocate
  let t2 = t.clone(); // might allocate

So in Rust you can tell that as long as there is no function call, there won't be an allocation.

pjmlp · on Aug 5, 2020

Like operator overloading and implicit conversations.

dthul · on Aug 5, 2020

True, those are even more places where C++ can implicitly allocate. The operator overloading also applies to Rust. Rust has no implicit conversions though, which are arguably worse since they are invisible (that's why I usually mark all my expensive single argument constructors as "explicit").

pjmlp · on Aug 6, 2020

Sure it has, you cannot ensure Deref implementations don't allocate.

dthul · on Aug 6, 2020

While technically true, the documentation makes it very clear that Deref should only be implemented for smart pointers and never fail. So no allocations in practice.

pjmlp · on Aug 6, 2020

Ah problem solved then, we just need to document very clear that C code should not corrupt memory, how did I never thought of it.

cogman10 · on Aug 5, 2020

I think the point was more that allocations are fairly explicit.

There are a few places in C++ where allocation can happen pretty much invisibly. A copy constructor is an example of that. You might see a new allocation simply by calling a method.

With rust, you usually won't see an allocation unless it is explicitly called for. You can follow the call tree and very easily pick out when those allocations are happening.

pjmlp · on Aug 5, 2020

Rust also has operator overloading and implicit conversations, both can cause memory allocations.

cogman10 · on Aug 5, 2020

Certainly. However, both of those things (from what I've seen) are fairly rare.

The C++ copy constructor comes up fairly frequently.

pjmlp · on Aug 5, 2020

Like unsafe Rust code, there is the theory and then there are the code bases that one finds out in the wild on crates.io and in-house, not necessarily using best practices.

dthul · on Aug 5, 2020

Rust does not have implicit type conversions.

pjmlp · on Aug 6, 2020

Via Deref.

dthul · on Aug 6, 2020

Not in practice. It's only used for reference coercions to existing objects.

You would need to ignore the documentation and bend over backwards to use it for an actual type conversion, because you can only return references.

pjmlp · on Aug 6, 2020

Bending backwards is a synonym to code quality delivered by consulting/offshoring projects across most big corporations.

darksaints · on Aug 5, 2020

Don't forget scanning. Yes, moving blocks of memory around is expensive, but it can also be done concurrently. Scanning, AFAIK, cannot be done concurrently, and thus remains the primary blocker to lower latency. And scanning is something that is entirely eliminated with static memory management.

pron · on Aug 5, 2020

Scanning is most certainly done concurrently with ZGC. Even root scanning is on its way to become fully concurrent, which is why we're nearing the goal of <1ms latency.

jerrinot · on Aug 5, 2020

I didn't know about fully concurrent root scanning, thank you!

How does root scanning work wrt to Loom? Are stacks of virtual threads treated as roots? I guess there is no other option?

pron · on Aug 5, 2020

No, virtual thread stacks are not roots! This is one of the main design highlights of the current Loom implementation. In fact, at least currently, the VM doesn't maintain any list of virtual threads at all. They are just Java objects, but the GC does treat them specially.

kjeetgill · on Aug 5, 2020

Interesting! In what way are they handled specially?

pron · on Aug 5, 2020

Unlike any other object, the location of references on a stack can change dynamically, so the GC needs to recognise those objects and walk their references differently. There are other subtleties, too.

pkolaczk · on Aug 5, 2020

Right, it is concurrent, but it is still costly. It brings rarely used data into the caches and pushes useful data out of the caches. If some parts of the heap were swapped out, the impact of concurrent scanning can be quite dramatic.

pron · on Aug 5, 2020

Ah, but you can pin GC threads to specific cores, and a reference-counting GC also has such non-trivial costs. In practice, however, people in the 90-95% "mainstream" domain that Java targets are very happy with the results. Of course, there are some applications that must incur the costs of not having a GC. In general, though, the main tangible cost of a GC today, for a huge portion of large-scale applications, is neither throughput nor latency by RAM overhead.

pjmlp · on Aug 5, 2020

I also enjoy the low level C++ like tooling that .NET offers, but I am also looking forward to the fruits of Panama and Valhalla in that direction.

pkolaczk · on Aug 5, 2020

Pinning doesn't solve the problem of cache L3+ thrashing.

pron · on Aug 5, 2020

It could if the GCs used non-temporal instructions that bypass the L3. Of course, how much of a problem this is in practice in most applications is something that would need to be measured.

pkolaczk · on Aug 5, 2020

Point taken. What about swap?

pron · on Aug 5, 2020

Yeah, swapping could be really bad and should be avoided. So: don't swap :) Java's memory consumption can't go up indefinitely. The most important setting is the maximum heap size. Set it to a good level and don't swap.

haxen · on Aug 5, 2020

All the modern GCs scan the heap concurrently, the hardest problem is scanning the GC roots in the call stack. ZGC is currently implementing concurrent stack scanning.

wokkel · on Aug 5, 2020

Yes it can. Stop the world collection is a thing of the past. See for example: https://developers.redhat.com/blog/2019/06/27/shenandoah-gc-...

jerrinot · on Aug 5, 2020

Modern Garbage Collectors do concurrent scanning.

I believe most GC implementations have non-concurrent "initial marking" phase, but that's typically fairly quick. It has to scan roots of your object graph, think stack, JNI, etc.

dan-robertson · on Aug 5, 2020

Scanning can be done incrementally with each allocation (Such that allocations become slightly more expensive but no individual allocation does loads of scanning work). Scanning can also be done concurrently.

rurban · on Aug 5, 2020

Scanning is also entirely eliminated by using no global heap allocations. With copying and small stack allocations there's not need to scan much, and can easily stay below 1ms.

bananaface · on Aug 5, 2020

Can't you just allocate a huge block up-front and throw stuff into it with a custom allocator? I don't know if Rust allows you to do that kind of thing.

steveklabnik · on Aug 5, 2020

Yes. There are some trade offs, you can’t parameterize the standard library data structures over an allocator yet, for example.

AtlasBarfed · on Aug 6, 2020

So arena allocation and buffer reuse?