Hacker News

I love Rust and use it every day, but the syntax bloat is something I will never get over. I don't believe there's nothing that could be done about it. There are all sorts of creative grammar paths one could take in designing a language. An infinite number, in fact. I would really like to see a transpiler that could introduce term rewriting techniques that can make some of that syntax go away.


It's a pain to write all that boilerplate, I agree. I don't think it's bloat though - I've been doing Rust for a few years now, and when I revisit old, mostly forgotten code, I love that boilerplate. I rarely have to do any puzzling about how to infer what from the current file, it's just all right there for me.

I feel this way about all the verbosity in Rust - some of it could likely be inferred, but having it all written down right where it is relevant is great for readability.


Having done a bit of C lately (lots in the past) and quite a bit of Rust, Rust is not verbose!

The functional syntax the author of this (good) article complains about is what this (long experience in procedural C like languages) old programmer has come to love.


>when I revisit old mostly forgoten code, I love that boilerplate. I rarely have to do any puzzling about how to infer what from the current file, it's just all right there for me.

This is going to sound absurd, but the only other language I had this experience with was Objective-C.

Verbosity is super underrated in programming. When I need to come back to something long after the fact, yes, please give me every bit of information necessary to understand it.


Useful verbosity is fine to me. However, I never wish to see another line of COBOL, thank you very much.


This is a really good point, IMO. I've never written extensive amounts of Objective-C, but in my adventures I've had to bolt together GUIs with Objective-C++ and I learned to love the "verbose" nature of Obj-C calls whenever I had to dive back into the editor for a game engine or whatever because it meant I didn't have to rebuild so much state in my head.


What you want is for the complex things to be verbose and the simple things to be concise.

Objective-C makes everything verbose. It’s too far in the other direction. Memories of stringByAppendingString


Eh, sure, I'm willing to buy Objective-C went too far in some places. It still works for coming back to, though. :)


That's true. I found this when writing F# with an IDE versus reading F# in a PR without one: it really becomes easier to read if you at least have the types on the function boundary.

F# can infer almost everything. It's easier to read when you do document some of the types though.


> F# can infer almost everything. It's easier to read when you do document some of the types though.

F# is also easier to avoid breaking in materially useful ways if (like TypeScript) you annotate return types even if they can be inferred. You'll get a more useful error message saying "hey stupid, you broke this here" instead of a type error on consumption.


"Creative" grammar introduces parsing difficulties, which makes IDE tooling harder to build and less effective overall. My overall guess is that Rust made the right choices here, though one can endlessly bikeshed about specifics.


Creative grammar can introduce parsing difficulties, but it doesn't have to.

I've made a couple small languages, and it's easy to end up lost in a sea of design decisions. But there are a lot of languages that have come before yours, and you can look to them for guidance. Do you want something like automatic semicolon insertion? Well, you can compare how JavaScript, Python[1], Haskell, and Go handle it. You can even dig up messages on mailing lists where developers talk about how the feature has unexpected drawbacks or nice advantages, or see blog posts about how it's resulted in unexpected behavior from a user standpoint.

You can also take a look at some examples of languages which are easy or hard to parse, even though they have similar levels of expressivity. C++ is hard to parse... why?

You'd also have as your guiding star some goal like, "I want to create an LL(1) recursive descent parser for this language."

There's still a ton of room for creativity within constraints like these.

[1]: Python doesn't have automatic semicolon insertion, but it does have a semicolon statement separator, and it does not require you to use a semicolon at the end of statements.


> you can look to them for guidance. Do you want something like automatic semicolon insertion? Well, you can compare how JavaScript, Python[1], Haskell, and Go handle it

You can't look at JavaScript/Python/Go (I don't know about Haskell), because Rust is a mostly-expression language (therefore, semicolons have meaning), while JavaScript/Python/Go aren't.

The conventional example is conditional assignment to variable, which in Rust can be performed via if/else, which in JS/Python/Go can't (and require alternative syntax).
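
To make the if/else point concrete, here is a minimal Rust sketch. The function names `classify` and `last_expr` are made up for illustration and do not come from the thread.

```rust
// Hypothetical helpers, named only for this illustration.

/// `if`/`else` is an expression in Rust, so its value can be bound directly.
fn classify(n: i32) -> &'static str {
    let label = if n < 0 { "negative" } else { "non-negative" };
    label
}

/// A block evaluates to its final expression when that expression has no
/// trailing semicolon. Writing `{ 40 + 2; }` instead would make the block
/// evaluate to `()`, which is why the semicolon carries meaning here.
fn last_expr() -> i32 {
    let v = { 40 + 2 };
    v
}
```

Calling `classify(-3)` gives `"negative"` and `last_expr()` gives `42`.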


> You can't look at JavaScript/Python/Go (I don't know about Haskell), because Rust is a mostly-expression language (therefore, semicolons have meaning), while JavaScript/Python/Go aren't.

I have a hard time accepting this, because I have done exactly this, in practice, with languages that I've designed. Are you claiming that it's impossible, infeasible, or somehow impractical to learn lessons from -- uhh -- imperative languages where most (but not all) programmers tend to write a balance of statements and expressions that leans more towards statements, and apply those lessons to imperative languages where most (but not all) programmers tend to write with a balance that tips more in the other direction?

Or are you saying something else?

The fact that automatic semicolon insertion has appeared in languages which are just so incredibly different to each other suggests, to me, that there may be something you can learn from these design choices that you can apply as a language designer, even when you are designing languages which are not similar to the ones listed.

This matches my experience designing languages.

To be clear, I'm not making any statement about semicolons in Rust. If you are arguing some point about semicolon insertion in Rust, then it's just not germane.


Not the parent, but you can certainly have an expression-oriented language without explicit statement delimiters. In the context of Rust, having explicit delimiters works well. In a language more willing to trade off a little explicitness for a little convenience, some form of ASI would be nice. The lesson is just to not extrapolate Rust's decisions as being the best decision for every domain, while also keeping the inverse in mind. Case in point, I actually quite like exceptions... but in Rust, I prefer its explicit error values.


Ruby is a great example of a language that’s expression oriented, where terminators aren’t the norm, but optionally do exist.


> I have a hard time accepting this, because I have done exactly this, in practice, with languages that I've designed.

I don't know which languages are yours.

Some constructs are incompatible with optional semicolons, as semicolons change the expression semantics (I've given an example); comparison with languages that don't support such constructs is an apples-to-oranges comparison.

An apples-to-apples comparison is probably with Ruby, which does have optional semicolons and is also expression oriented at the same time. In the if/else specific case, it solves the problem by introducing inconsistency, in the empty statement, making it semantically ambiguous.


Ruby is expression-oriented like Rust and doesn't have semicolons. Neither do most functional languages.


Have you also written tooling - e.g. code completion in an IDE - for those small languages? There are many things that might be easy to parse when you're doing streaming parsing, but a lot more complicated when you have to update the parse tree just-in-time in response to edits, and accommodate snippets that are outright invalid (because they're still being typed).


Yes, that's a good example of exactly what I'm talking about. Code completion used to be really hard, and good code completion is still hard, but we have all these different languages to learn from and you can look to the languages that came before you when building your language.

Just to give some more detail--you can find all sorts of reports from people who have implemented IDE support, talking about the issues that they've faced and what makes a language difficult to analyze syntactically or semantically. Because these discussions are available to sift through in mailing lists, or there are even talks on YouTube about this stuff, you have a wealth of information at your fingertips on how to design languages that make IDE support easier. Like, why is it that it's so hard to make good tools for C++ or Python, but comparatively easier to make tools for Java or C#? It's an answerable question.

These days, making an LSP server for your pet language is within reach.


Tooling should not depend on code text, but on the language's AST.


I'm not an expert as I do not work on these tools but I don't think IDEs can rely solely on ASTs because not all code is in a compilable state. Lots of times things have to be inferred from invalid code. Jetbrains tools for example do a great job at this.


In practice though, getting the AST from the text is a computational task in and of itself and the grammar affects the runtime of that. For instance, Rust's "turbofish" syntax, `f::<T>(args)`, is used to specify the generic type for f when calling it; this is instead of the perhaps more obvious `f<T>(args)`, which is what the definition looks like. Why the extra colons? Because parsing `f<T>(args)` in an expression position would require unbounded lookahead to determine the meaning of the left angle bracket -- is it the beginning of generics or less-than? Therefore, even though Rust could be modified to accept `f<T>(args)` as a valid syntax when calling the function, the language team decided to require the colons in order to improve worst case parser performance.
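
A small sketch of the ambiguity described above, using only the standard library. The helper names `parse_all` and `two_comparisons` are hypothetical; the second function shows that, in expression position, text shaped like a generic call is read as comparisons.

```rust
// Hypothetical helper names; `str::parse` and `Iterator::collect` are std APIs.

/// Parses every item that is a valid u32, using the turbofish to name the
/// target types explicitly at the call sites.
fn parse_all(items: &[&str]) -> Vec<u32> {
    items
        .iter()
        .filter_map(|s| s.parse::<u32>().ok())
        .collect::<Vec<u32>>()
}

/// In expression position `<` and `>` are comparison operators, so this body
/// is a tuple of two booleans, not a generic call like `a::<b, c>(d)`.
#[allow(unused_parens)]
fn two_comparisons(a: i32, b: i32, c: i32, d: i32) -> (bool, bool) {
    (a < b, c > (d))
}
```

With the `::` in place, the parser knows at the `::<` token that generic arguments follow, so it never has to guess between the two readings.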


This is why using the same character as a delimiter and operator is bad.


How does C# manage without the turbofish syntax? What’s different in Rust?


It's not impossible to handle the ambiguity, it's just that you may have to look arbitrarily far ahead to resolve it. Perhaps C# simply does this. Or perhaps it limits expressions to 2^(large-ish number) bytes.


Comments tend not to be part of the AST, which makes that harder.


That's really cool that you think Rust syntax could be significantly improved. I'd really love to hear some details.

Here's the example from the post:

  Trying::to_read::<&'a heavy>(syntax, |like| { this. can_be( maddening ) }).map(|_| ())?;

How would you prefer to write this?


That whole example feels like a strawman, from my (maybe limited) experience something that's rather the exception than the norm.

First, lifetimes are elided in most cases.

Second, the curly braces for the closure are not needed and rustfmt gets rid of them.

Finally, the "map" on result can be replaced with a return statement below.

So, in the end we get something like:

  Trying::to_read(syntax, |like| this.can_be(maddening))?; 
  Ok(())
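
For a compilable illustration of the first and third points, here is a rough sketch. All function names are invented for this example, and the `Trying`/`this` names from the post are not modeled.

```rust
use std::num::ParseIntError;

// Hypothetical functions, written only to illustrate the points above.

/// Explicit lifetime annotation: legal, but not required here.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

/// Same signature with the lifetime elided; the compiler infers that the
/// returned reference borrows from `s`.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

/// Discarding the success value with `map`.
fn check_with_map(s: &str) -> Result<(), ParseIntError> {
    s.parse::<i32>().map(|_| ())
}

/// The same behavior spelled as `?` followed by `Ok(())`.
fn check_with_question_mark(s: &str) -> Result<(), ParseIntError> {
    s.parse::<i32>()?;
    Ok(())
}
```

Both `first_word` variants have the same type, and both `check_*` variants return the same result for any input; the difference is only in how much is written out.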


That part is actually not so bad?

  Trying\to_read\[&'a heavy](syntax, |like| { this. can_be( maddening ) }).map(|_| ())?;

I can't improve it that much


Rust has an extremely powerful macro system, have you tried that?


Rust macros are one of the more annoying features to me. They're great at first glance but whenever I want to build more fancy ones I constantly bump into limitations. For example they seem to be parsed without any lookahead, making it difficult to push beyond the typical function call syntax without getting compiler errors due to ambiguity.


Procedural macros have a peek function from the syn crate. macro_rules macros can stuff this into the pattern matching.

e.g.

https://turreta.com/2019/12/24/pattern-matching-declarative-...
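
As a rough illustration of the `macro_rules` side of this (not taken from the linked article), the sketch below uses a hypothetical macro named `describe!` with two arms. The arms are tried in order, so the leading tokens decide which one applies.

```rust
// `describe!` is a made-up macro for illustration only.
macro_rules! describe {
    // Arm 1: an identifier, a literal `=>` token, then an expression.
    ($key:ident => $value:expr) => {
        format!("{} maps to {}", stringify!($key), $value)
    };
    // Arm 2: any single expression. Reached only when arm 1 does not match.
    ($value:expr) => {
        format!("value {}", $value)
    };
}
```

Here `describe!(answer => 42)` takes the first arm and `describe!(1 + 2)` falls through to the second, because `1` is not an identifier.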


But proc macros are limited by requiring another crate (unless things have changed in the last year). Sure, it’s just one extra crate in the project, but why must I be forced to?


Asked and answered in an adjacent comment.


But there's the weird limitation that procedural macros have to be in a separate crate.


Why is that weird? Procedural macros are compiler plugins. They get compiled for the platform you're building on, not the one you're building for, and so they need to be a separate compilation unit. In Rust, the crate is the compilation unit.


Because you can't just throw together a simple procedural macro to use in a specific project, as you can in other languages.


Nim, which technically accomplishes all (I assume) of the Rusty things that require syntax, manages to do it with quite a lot nicer syntax.


Nim accomplishes memory safety using a garbage collector. That's pretty dissimilar to Rust and more comparable to Go or D.


Nim allows you to choose what memory management method you want to use in a particular piece of software. It can be one of various garbage collectors, reference counting or even no memory management. It allows you to use whatever suits your needs.


> > Nim accomplishes memory safety using a garbage collector.

No memory management in Nim equals no memory safety guarantees. Or no? Well in that case the statement above is true.


You can get management and safety with one of Nim's modes, as per the peterme link in my sibling, if you would like to learn more.


I don’t understand why you all are posting tedious details and well actuallys when the original assertion was (way back):

> Nim, which technically accomplishes all (I assume) of the Rusty things that require syntax, manages to do it with quite a lot nicer syntax.

Nim does not have something which gives both memory safety and no ((tracing garbage collector) and/or (reference counting)) at the same time. End of story.

The fact that Nim has an off-switch for its automatic memory management is totally uninteresting. It hardly takes any language design chops to design a safety-off button compared to the hoops that Rust has to jump through in order to keep its lifetimes in check.


>Nim does not have something which gives

You are simply incorrect, appear unwilling to research why/appear absolutist rather than curious, and have made clear that what I think is "clarification" or "detail expansion" you deem "tedious" or "nitpicking" while simultaneously/sarcastically implicitly demanding more details. That leaves little more for me to say.


It would be helpful if you pointed out the Nim memory management method that works the same as the Rust one.



You have managed to point out that tracing garbage collection and reference counting are indeed two ways to manage memory automatically. Three cheers for your illuminating clarification.


Sorry about being an arse. I got frustrated by all the talking-past-each-other.


While tracing garbage collection is indeed one possible automatic memory management strategy in Nim, the new --mm:arc may be what darthrupert meant. See https://uploads.peterme.net/nimsafe.html

Nim is choice. :-) {EDIT: As DeathArrow also indicated! }


Reference counting is a form of garbage collection.


Terminology in the field can indeed be confusing. In my experience, people do not seem to call reference counted C++ smart pointers "garbage collection" (but sure, one/you might, personally).

"Automatic vs manual" memory management is what a casual PL user probably cares about. So, "AMM" with later clarification as to automation options/properties is, I think, the best way to express the relevant ideas. This is why I said "tracing GC" and also why Nim has recently renamed its --gc:xxx CLI flags to be --mm:xxx.

Whether a tracing collector is even a separate thread or directly inline in the allocation code pathway is another important distinction. To muddy the waters further, many programmers often mean the GC thread(s) when they say "the GC".

What runtimes are available is also not always a "fixed language property". E.g., C can have a tracing GC via https://en.wikipedia.org/wiki/Boehm_garbage_collector and you can get that simply by changing your link line (after installing a lib, if needed).


Terminology in the field is what CS books like "The Garbage Collection Handbook" (https://gchandbook.org), or "Uniprocessor garbage collection techniques" (https://link.springer.com/chapter/10.1007/BFb0017182), define.

People don't call reference counted C++ smart pointers "garbage collection", because they aren't managed by the runtime, nor optimized by the compiler, but rather rely on basic C++ features.

But they do call C++/CX and C++/CLI ref types automatic memory management, exactly because they are managed by the UWP and CLR runtimes respectively:

https://docs.microsoft.com/en-us/cpp/cppcx/ref-classes-and-s...

https://docs.microsoft.com/en-us/cpp/dotnet/how-to-define-an...


I doubt you are, exactly, but I think it's really hard to argue that the terminology, as often used by working programmers, does not confuse. ("Need not" != "does not"!) All that happened here is darthrupert made vague remarks I tried to clarify (and j-james did a better job at [1] - sorry!). Most noise since has been terminology confusion, just endemic on this topic, embedded even in your reply.

I may be misreading your post as declaration rather than explanation of confusion, but on the one hand you seem to write as if "people not calling RC smart ptrs 'GC' is 'reasonable'" yet on the other both your two books include it as a form of "direct GC" - GC Handbook: The Art of AMM with a whole Chapter 5 and the other early in the abstract. darthrupert just reinforced "working programmer usage" being "not academic use" elsewhere. [2] GCHB even has a glossary - rare in CS books (maybe not in "handbooks"?) So, is your point "Academics say one thing, but 'People' another?"

C++ features you mention were intended to blur distinctions between "compiler/run-time supported features", "libraries", and "user code". Many PLs have such blurring. Such features, basic or not, are optimized by compilers. So, neither compiler support nor "The Runtime" are semantic razors the way I think you would like them to be (but might "explain people/working programmers"). If one "The" or "collection" vs. "collector" are doing a lot of semantic work, you are in confusing territory. Also, human language/terms are cooperative, not defined by MS. MS is just one more maybe confusing user here.

Between intentional blurriness, loose usage, and many choices of both algos & terms used in books, papers, documentation and discussions, and the tendency for people to just "assume context" and rush to judgements, I, for one, don't see existence of confusion as mysterious.

Given the confusion, there seems little choice other than to start with a Big Tent term like "memory management" and then qualify/clarify, though many find "not oversimplifying" tedious. I didn't think this recommendation should be contentious, but oh well.

[1] https://news.ycombinator.com/item?id=31440715

[2] https://news.ycombinator.com/item?id=31445489


I see now that the GP wrote “a garbage collector” (not the article). Oops! “A reference counting method” doesn’t roll off the tongue. So it appears that your nitpicking was indeed appropriate.


Until you get a reference cycle. Then it's a form of garbage accumulation.


See the neighboring subthread: https://news.ycombinator.com/item?id=31438134 (which has details/links to more information and is more explicit than just the 4th footnote at the end of the mentioned twice before peterme link.)


Well, it's not exactly garbage collection in the way it's commonly understood, I believe:

https://nim-lang.org/blog/2020/10/15/introduction-to-arc-orc...


That doesn't really matter for syntax. You could easily add lifetime syntax to Nim.


It matters for the question of whether Nim manages to do the same things as Rust with better syntax.


Okay, so instead of Nim, consider a hypothetical language that has Nim-like syntax but Rust semantics. What would be the problem with that?


I'm curious what that assumption is based on. Rust and Nim are pretty different, and both of them have features that the other doesn't even try to have.


This is an interesting comparison of memory semantics I stumbled upon: https://paste.sr.ht/blob/731278535144f00fb0ecfc41d6ee4851323...

Nim's modern memory management (ARC/ORC) is fairly similar to Rust. ARC functions by reference-counting at compile time and automatically injecting destructors: which is broadly comparable to Rust's ownership + borrow checker.

(A big difference is that Nim's types are Copy by default: this leads to simpler code at the expense of performance. You have control over this, keeping memory safety, with `var`, `sink`, and others, as highlighted in the above link.)

https://nim-lang.org/blog/2020/10/15/introduction-to-arc-orc...

For reference cycles (the big limitation of reference counting), there's ORC: ARC + a lightweight tracing garbage collector.

As I understand it Rust also cannot handle reference cycles without manually implementing something similar.

https://nim-lang.org/blog/2020/12/08/introducing-orc.html

https://doc.rust-lang.org/book/ch15-06-reference-cycles.html
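
To illustrate the Rust side of that claim, here is a minimal sketch in the spirit of the linked book chapter. The `Node` type and both function names are invented for this example. The first function builds a cycle of strong `Rc` pointers, which is never freed. The second makes the back edge a `Weak` pointer, which is the manual workaround.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical node type for illustration.
struct Node {
    next: RefCell<Option<Rc<Node>>>, // strong edge
    prev: RefCell<Weak<Node>>,       // weak edge, does not keep the target alive
}

fn new_node() -> Rc<Node> {
    Rc::new(Node {
        next: RefCell::new(None),
        prev: RefCell::new(Weak::new()),
    })
}

/// Two nodes pointing at each other with strong edges. Each count is 2, so
/// dropping the local handles leaves both at 1 and the memory is leaked.
fn strong_cycle_counts() -> (usize, usize) {
    let a = new_node();
    let b = new_node();
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    *b.next.borrow_mut() = Some(Rc::clone(&a));
    (Rc::strong_count(&a), Rc::strong_count(&b))
}

/// Same shape, but the edge from `b` back to `a` is weak. `a` keeps a strong
/// count of 1, so both nodes are freed when the local handles are dropped.
fn weak_back_edge_counts() -> (usize, usize, bool) {
    let a = new_node();
    let b = new_node();
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    *b.prev.borrow_mut() = Rc::downgrade(&a);
    // The weak edge can still be followed while `a` is alive.
    let back_edge_alive = b.prev.borrow().upgrade().is_some();
    (Rc::strong_count(&a), Rc::strong_count(&b), back_edge_alive)
}
```

The leak in the first function is memory-safe, but nothing in the language detects or collects it; choosing where to put the `Weak` edge is left to the programmer.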


ARC is not "fairly similar" to idiomatic Rust. ARC is not idiomatic in Rust; it's a fallback for when you can't make do with lifetimes and borrowing.


Nim passes by value by default, which eliminates much of the complexity overhead of lifetimes and borrowing in most programs. (the compiler does optimize some of these into moves.)

But when you do want to pass by reference: that's where Nim's move semantics come in. These are what are fairly similar to Rust's lifetimes and borrowing, and what the paste.sr.ht link briefly goes over.

If you're interested, you can read more about Nim's move semantics here:

https://nim-lang.org/docs/destructors.html


Note that Rust doesn’t have ARC; there is an atomic reference counted pointer, but it’s not automatic, which is what the A in ARC stands for.
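
A short sketch of what "not automatic" means in practice for `std::sync::Arc`. The function name `shared_sum` is hypothetical; the point is that the programmer opts in by wrapping the value and writes each count increment explicitly.

```rust
use std::sync::Arc;
use std::thread;

/// Shares a vector with one worker thread through an explicitly cloned `Arc`.
/// Returns the strong count observed while both handles exist, and the sum
/// computed by the worker.
fn shared_sum() -> (usize, i32) {
    let data = Arc::new(vec![1, 2, 3]); // opt in to reference counting
    let worker_handle = Arc::clone(&data); // explicit increment of the count
    let count_while_shared = Arc::strong_count(&data);
    let sum = thread::spawn(move || worker_handle.iter().sum::<i32>())
        .join()
        .expect("worker thread panicked");
    (count_while_shared, sum)
}
```

Here `shared_sum()` returns `(2, 6)`: two handles existed at the time of the count, and the worker summed the shared vector.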


Tongue in cheek: Then it's exactly like (modern) Nim, only that Nim does the fallbacking automatically as needed ;) There are lots of devils in the details, I assume.


Rust had a lot to contend with, so I can't blame them for somewhat subpar syntax. Maybe it's gonna be revised. Maybe some guys will make a porcelain layer or a rustlite.



