Rewriting m4vgalib in Rust (cliffle.com)
372 points by zimmerfrei 17 days ago | 44 comments



If the author is reading this, this might be the greatest single article I’ve read that expresses the benefits of Rust and how minimal the trade-offs are. Wonderfully written piece, thank you!

BTW, in your bounds checking examples, I think it can be further simplified (would need to check the output to see if it's any better), this:

    fn fill_1024(array: &mut [u8], color: u8) {
        // Perform an explicit, checked slice of the array before
        // entering the loop.
        let array = &mut array[..1024];

        for i in 0..1024 {
            array[i] = color;
        }
    }
Could be:

    fn fill_1024(array: &mut [u8], color: u8) {
        for byte in &mut array[..1024] {
            *byte = color;
        }
    }
Here's the playground: https://play.rust-lang.org/?version=stable&mode=debug&editio...

Thanks again, this is a wonderful piece.


I'm quite terrible at assembly, but it looks like both got inlined and compiled to a memset:

    leaq 8(%rsp), %rdi
    movl $1024, %edx
    movl $255, %esi
    callq *memset@GOTPCREL(%rip)
for the 0xFF fill and

    leaq 8(%rsp), %rdi
    movl $1024, %edx
    movl $238, %esi
    callq *memset@GOTPCREL(%rip)
for the 0xEE fill.

However it turns out the original also compiles to that, because the inlining means the fixed size array is visible to the optimiser.


With a lower opt-level, it does not get optimized to memset, and the succinct version actually compiles better, eliding the bounds check in the loop and, for some reason, needing less stack space.

https://godbolt.org/z/Wvuen5

This leads me to believe the shorter approach is actually better.


> With lower opt-level

Yeah but who'd do that?

OTOH I found an "issue" with the playground: the compiler realises that since only a single array is ever created the functions can only ever be called with the proper bounds, so even with inline(never) everything gets compiled down to a memset without bounds check.

And trying to defeat that by modifying buffer to be 1023, the compiler realises the assertions are always going to crash so it creates a weird-ass "bad" version (which it calls), only implements the slicing (and its error) in the base and doesn't even bother outputting code for the simpler version.

Tricksy compiler. Using rand() and slicing so it can't know whether it's getting slices of 1023 or 1024 elements and generating a random fillchar (otherwise the fillchar gets moved into the functions) finally gets it to generate the expected code… and remove one of the two functions, calling the other one instead.

So yeah, at this point I will conclude your version works just as well as the article's at getting optimised.


I think you have illustrated the point quite well actually:

>And trying to defeat that by modifying buffer to be 1023, the compiler realises the assertions are always going to crash so it creates a weird-ass "bad" version (which it calls), only implements the slicing (and its error) in the base and doesn't even bother outputting code for the simpler version.

With optimizations off, we can see what the compiler is doing before optimization passes potentially obfuscate it. In practice when the call is nested behind many layers it may not be able to elide the bounds check in the loop in all cases; the opt-level may be hiding the truth because it is defeating the test case.

Whether that's true in this case or not is irrelevant, of course; I'd say it's almost always preferable to have better code as early as possible in the code generation process.


> > With lower opt-level

> Yeah but who'd do that?

Anyone who does `cargo run`. A project I'm working on right now is slow enough in debug builds that I usually debug it with light optimizations enabled. Lean on the optimizer and you'll find it a leaky abstraction.


> Anyone who does cargo run.

The entire discussion starts from looking at getting bounds checks optimised away, not optimising seems counter-productive to this concern?

> lean on the optimiser and you’ll find it a leaky abstraction.

The abstraction here is not really leaky though: observable behaviour would be as expected, it's when you look at how this behaviour is achieved that things start looking odd.


> The entire discussion starts from looking at getting bounds checks optimised away, not optimising seems counter-productive to this concern?

I'm aware of the history of the conversation, but the point where I joined concerned which idiom performs better unoptimized, a case which is relevant because everybody cargo-runs.

> The abstraction here is not really leaky though: observable behaviour would be as expected, it's when you look at how this behaviour is achieved that things start looking odd.

In many circumstances performance is part of observable behavior.


Author also wrote "Learn Rust the dangerous way".

> In this series, I'm trying something different. Let's take a grungy, optimized, pointer-type-punning, SSE-using, heap-eschewing, warning-disabling C program and look at how we could create the same program in Rust.

http://cliffle.com/p/dangerust/0/


I dislike nitpicking, but the following is wrong:

    "In C, bodies[i] is exactly the same as *(bodies + i) — it performs pointer arithmetic and a dereference and nothing else. In particular, it assumes that you have some reason to know that i is a valid index for the array. This is the common case in C and gets a shorthand using square brackets."
In C, bodies[i] is not (bodies + i), with the exception being char arrays. The general rule is:

    bodies[i] = *(bodies + i * sizeof(bodies[0]))


No, it's actually defined to be the same. The pointer arithmetic in C follows your intuitive understanding of array indexing.

    #include <stdio.h>
    
    int main() {
        int array[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
        printf("%d %d\n", array[4], *(array + 4));
        return 0;
    }
prints "4 4".


+ in C advances pointers in sizeof-pointed-to thing increments. It's why ++ works too. Similarly - on pointers returns a difference in units of sizeof-pointed-to thing.


The semantics of a[i] are exactly those of *(a + i). Which is why i[bodies] is actually legal C code.


This was more interesting than the average "rewrite in Rust" blog post. Specifically, I found the part about how compiler-enforced memory safety allows the use of patterns that would otherwise be difficult to use safely interesting.


> compiler-enforced memory safety allows for the use of patterns that would otherwise be difficult to use safely

I have to say, this has been amazing. For context, I'm a C++ developer who has had minimal run-ins with the borrow checker because I was already used to thinking about ownership. I refactored a large-ish Rust crate I maintain to use references for a fundamental type. I would have considered it irresponsible to do that with a C++ codebase unless absolutely required. Even if I did the upfront analysis to make sure it was safe, I'd need to re-do that exercise for every architectural change, at least. Instead with Rust, I just tried it on a whim, knowing the borrow checker had my back.


In C++ you can have some help via static analysis.

Unfortunately very few care to actually use those tools. https://www.bfilipek.com/2019/12/cpp-status-2019.html?m=1#wh...

As I learned since Turbo Pascal days, if it isn't part of the language, it seldom gets used.


That's quite right.

But also there is a big difference between "the static checker will probably catch errors of this kind" and "the compiler will certainly catch errors of this kind". The latter is far more liberating.


Worth pointing out, I think, that that was the original justification for Rust: that by building a safe language that still gave you the level of control you'd expect from C(++), they could experiment with all sorts of otherwise difficult and subtle concurrent code to figure out what would actually work/help.

Much like Erlang's concurrency fell out as a consequence of needing to confidently write high-availability code, Rust's general safety falls out of needing to write concurrent code fearlessly.


Oddly enough that was not the original justification; Rust was initially conceived as something closer to Swift. It's really the community which saw the possibility to fulfill a need and ran with it, and of course Graydon Hoare & co who accepted / allowed it.

Rust used to have split stacks and green threads for instance, and had special syntax for GC’d values (and unique pointers TBF).


https://www.youtube.com/watch?v=HkUUJ-rOUPg has an output of the demo, and this is indeed very impressive for hardware that isn't supposed to output video at all...


May I also chime in and present my old bitbox console, a VGA console on the same chips, with a dozen games and emulators for it? http://bitboxconsole.blogspot.com/?m=1


You need a lot more photos, in my opinion. I wanted to look at some to help me engage with the project and clicked around to the blog, then Github, then the wiki, then Hardware but left when I didn't see any photos anywhere.


Thanks a lot for the feedback!


That was seven years ago. I liked this better, from four years ago:

https://m.youtube.com/watch?v=7yXxhvKmVb0


They are unbelievably performant! I think he needs to join the Android Auto team and make their car software run smoothly like this!


He's actually already part of the Fuchsia team, so he very well could be working on the successor to Android.


That's great news! I really hope they improve the Android Auto responsiveness.


Even if Android were pure C++ instead of Java, given how those frameworks got designed, I doubt it would help much.


I lack the knowledge of that particular piece of software, but may I say that this has been extremely interesting to read!

Also, I quite love the trend of "we rewrote our software in Rust" lately. I loved the speed of C/C++ about 15 years ago but the mounting complexity and the increasingly huge game of whack-a-mole pushed me away from them.

Now I am gradually and casually learning Rust and I absolutely love what I am seeing and experiencing.


This is a persuasive article, but it is still not clear whether Rust will achieve a self-sustaining mass of developers. That is, precisely: enough that each new person who masters Rust will be able to find work writing Rust. (To be clear, it would be good if this were to happen.)

The deepest observation in the article is the remark about the cognitive load imposed by the need to avoid pitfalls inherent in the programming model.

I coded K&R C in the 80s, and while I never shipped an argument-type bug, avoiding them was a continual distraction. I coded C90 from 2006 to 2012, with a similar cognitive load I was happy to leave behind, although again I never shipped a bug caused by its failings. (We had bugs, but they were specification bugs: building the wrong thing.)

Writing C++ post-14 has been an increasing pleasure. The cognitive load noted is present at times, but decreasingly so as the Standard Library and other libraries get more powerful.

I am disappointed that the author found it too hard to implement a correct program with C++, but more disappointed that he chose to blame his tools for his failure.

It has become fashionable to insist that correct programs cannot be written in this or that unfashionable language, but in fact a great many correct programs have been written in all of them, not excepting octal machine code.

What varies is how much work is needed, and how many people can do it: it is certain that, for any chosen language past or future, most people cannot or will not write a correct program in it. Rust is not different in this.

Many people coding C++, and perhaps most coding C, are not writing correct code, and have over the years written a very great deal of bad code. Will those same people succeed in learning Rust, and learning to write good code in it? It appears that the people currently adopting Rust are more skilled than is typical, so it seems hard to generalize their results to more typical programmers.

C++ and Rust embody different choices in emphasis: Rust, safety; C++, library power. Stewards of C++ have decided the best way to get substantial correct programs is to enable encapsulating semantics in well-tested, well-optimized libraries that can be used without compromising performance or system architecture. Rust has chosen to emphasize making wrong low-level code hard to express, although its ability to express libraries will only increase.

It will not be clear for a long time which will have better results. Will enough people learn Rust to write the correct low-level code needed? Will enough C++ programmers learn to use libraries that make (re-)writing dangerous low-level code unnecessary?

What is clear is that the number of C++ programmers is growing faster than ever, and interest in coding to high level C++ libraries, as measured by attendance at an ever increasing number of conferences, is exploding. Even attendance at each ISO Standard meeting is quite a lot larger than at each previous one, now reliably in hundreds. The number of new C++ programmers in any unit time is still larger than the number of new Rust programmers, by a large factor.

So, the experiment is interesting and valuable. What we can be confident about, either way, is that, for every C coder, or C++ coder writing low-level, often bad code, switching to Rust or to C++ using reliable libraries will result in more good code. It matters much less which they choose.


Why did Rust not go with standard C syntax?


Rust did copy C syntax to a large degree. It didn't copy some of the mistakes, such as type declarations, which are hard to parse (typedef makes parsing depend on earlier declarations) and hard to understand ("array of function pointers" is a head-scratcher in C, easy in Rust).


Aside from the feedback loop, C-style decls also make the language less regular in the presence of type inference as you need an "inference pseudo-type" to take the place of the type you're omitting.

Also prefix casts have similar issues as prefix type + typedef.


There are features Rust has that aren't in C, and so have no precedent. So it's not really possible.

We did take a lot of influence from C, though.


Because we now have several decades of evidence of how C and C++ syntax make compiling difficult.

If you look at Rust in the context of other languages, you see that most of the languages designed in the last 10-15 years have made similar syntax choices.


I appreciate not having to enclose my conditions in parentheses, and that the main operand of statements like match, switch, etc. isn't wrapped in parentheses either.


I am sorry, but if you think language syntax "beauty" is something to be considered while picking the language to develop in, then you're probably not senior enough to make decisions about which one of them to use for the project.

There's only one true answer to this: "it doesn't matter at all - I'll learn it in a week".


Beauty is a proxy for several related properties (including productivity, expressiveness and safety), which absolutely do matter and are factors to consider.


Well, but by this definition C++ is a bad language :) Don’t get me wrong, but if it’s hard to write an algorithm to parse it - then how do you expect humans to?


Humans aren't able to correctly parse C++.

There are numerous examples, but a common one is that humans will usually parse `T x();` as constructing an object of type T with the default constructor, but it actually is a forward-declaration of a function returning T.


Simply: humans do not like to write code that compilers find hard to parse. So in practice it doesn't matter very much.

Natural languages are enormously harder to parse than any programming language, and people delight in expressing just the constructs that are hardest to parse, yet people do not usually have difficulty communicating with one another. When they do, syntax troubles are vanishingly rarely the reason.


The obvious counterpoint to this is brainfuck. A few examples:

    ,+[-.[-]-,+]

    >>[-]<<[->>+<<]

    >++++++++[-<+++++++++>]<.>>+>-[+]++>++>+++[>[->+++<<+++>]<<]>-----.>->+++..+++.>-.<<+[>[+>+]>>]<--------------.>>.+++.------.--------.>+.>+.
So. Now that we've established that syntax "beauty" matters in an extreme case, can we agree that people can have different opinions about where the line is?

Can we then agree that having an opinion different than your own does not make someone an inherently less senior(???) or capable developer?


No, we won’t agree here because brainfuck is clear and concise. There’s no duplicity in what the operator may mean depending on the context so it most perfectly fulfils this role - is it the emojis you’d like to see there?

I have a strong opinion that you should like everything. Are you going to make a point that your genetic makeup somehow prefers tabs over spaces? :)


Actually, Brainfuck has excellent syntax. The language is very limited, but within those limits the syntax is as clear as can be.




