Hacker News new | past | comments | ask | show | jobs | submit login
Show HN: Algorithm Cookbook in Rust (github.com)
320 points by EbTech on June 19, 2017 | hide | past | web | favorite | 71 comments

MIT license is good, but I always wish that sample code was available under CC0 -- and presented with explicit permission to copy and paste without preservation of notices. (A request to link back optionally when practical is fine.)

That makes it easier for sample code to achieve the ends of maximum code reuse and popularization of the underlying technology.

Agreed. I once tried to submit a small snippet I found on stackoverflow to an open source project. Stackoverflow user content is licensed under creative commons, cc-by-sa (see https://stackoverflow.blog/2009/06/25/attribution-required/). I included attribution, but it was rejected. I totally understand why. CC0 is much preferred in this case.

Your anecdote also emphasizes the requirement for open source projects to maintain rigorous intellectual property policies. Good on both you and the project for the appropriate handling!

Clean IP history is especially important for "universal donor" sample code intended to be copy/pasted. If there turns out to be an IP violation in library code, with luck it may be possible to excise it from the library -- and hopefully not too many downstream will have forked it. But when code is copy/pasted, the linkage is severed and the damage becomes increasingly difficult to repair.

For those reasons, I'd prefer to see something like a contributor list of public identities for collections of sample code, so that it is transparent who is making the licensing promises.

FYI, newer Stack Overflow code is licensed under an MIT license with a link in code comments serving as attribution. See https://meta.stackexchange.com/questions/272956/a-new-code-l...

Looking at that link I don't think it was implemented. The footers on stack overflow pages still indicate cc-by-sa.

Why was it rejected?

Though I don't know the specific project, I can tell you that it is common for projects to reject contributions under licenses that cannot be subsumed by the project's main license.

For example, if a project under the Apache License 2.0 were to accept a contribution of CC-BY-SA code, the complete product could no longer be available under the Apache License (and other licenses with similar terms) -- causing problems for downstream consumers.

Thank you. But it seems not finished? I am trying to implement some basic data structures and algorithms in Rust, but the compiler nearly kills me. Is there any data structure & algorithms implementation tutorials for Rust newbies, like Learning_ Rust_With_Entirely_Too_Many_Linked_Lists[1]?


Data structures and algorithms are possibly the hardest part of Rust to dive into unless you have a really good reason. The best advice for newbies is to learn with other stuff and use existing algorithms. Of course, if you are trying to implement stuff for a class or something similar then you might be out of luck, and might consider trying the Rust IRC and such to work through the errors as they come up.

I made this cookbook specifically to show it doesn't have to be so hard. While designing an industrial-strength data structure library is an advanced skill, there are easier ways to implement the core ideas that should suffice for contest and coursework purposes. Very few people have to build actual libraries.

> Data structures and algorithms are possibly the hardest part of Rust to dive into

What else is there?

I think it was too much for the. to say "Data structures AND algorithms". It is mostly data structures that can be difficult.

Agreed. It's very unnatural to represent data structures with circular references, or pointers in general. I had to re-think how I approach certain problems to make it more Rust-friendly.

CRUD webapps. It can get venture funding too if one can design a login page.

So what is the point of a language in which things are hard to implement?

Algorithms and data structures are inherently hard to implement correctly and safely given low-overhead, bare-metal designs.

Rust tries to guarantee that you've implemented things correctly and safely, and therefore makes all the formal verification of such into a requirement.

Other low-overhead-bare-metal languages, meanwhile, trust you to have done the formal verification yourself using non-compiler-toolchain tools like linters and static analyzers.

The people who use these other languages who do ensure correctness+safety, will have done all the same work they do in Rust—just using third-party tools instead.

The people who use these other languages but who do not ensure correctness+safety, might seem to have an "easier time", but they will almost always end up with algorithms/data structures that—while seeming to work in most cases—have fatal flaws or vulnerabilities.

To use Rust is simply to sign up for "doing the work" of formally verifying your code up-front, rather than brushing it off as something to think about in some vague, undefined "later."

It's not any harder to correctly implement a data structure, even an "advanced" one, in Rust-with-unsafe than it is in C++-with-exceptions. I don't know where this meme comes from.

I don't think this claim is completely unfounded from perspective of people coming to Rust from other languages. Consider for example:

* Poor ergonomics of raw pointers. Lack of arrow syntax, verbose pointer arithmetic (at least now there is an experimental offset_to).

* Once you have pointers in data structures, properly specifying lifetimes is now your job. Compiler is no longer your friend, but an whisperer of evil (Compiler: Of course it safe. It has 'static lifetime, it must be safe. How do I know, you ask? I saw you dereferencing a pointer).

* Cyclic data structures (without additional layer of indirection) require unsafe, or reference counting and dynamic borrow checking.

* Need to account for zero-sized types when doing pointer arithmetic.

* Need to uphold Rust invariants in unsafe regions of code - forming non-unique mutable references to the same memory is hardly unusual when working with cyclic data structures, or in general (consider for example a swap without checking if memory locations are identical).

It seems to me that Rust moves part of required work from users of data structure to the designer / developer of data structure. You need to invest more work upfront, but once done users don't need to be on constant lookout for accidental API misuse that results in undefined behaviour.

Agreed but don't you have to do all this in C/C++ too if you want your data structure to be really safe? Except perhaps zero sized types.

No, those are Rust specific things. For example, aliasing mutable references in C++ is potentially dangerous, but not undefined behaviour per se.

Regarding making a "really safe" data structure, this is quite tricky question. Safety means quite different things in those communities. Moreover those different concepts of safety are not readily transferable between languages, at least not in useful sense.

In Rust you would say that data structure is safe if it doesn't cause undefined behaviour when used without any unsafe user-code. Essentially once you write a safe data structure the safety is enforced by a compiler, even in a presence of malicious user-code (as long as it avoids unsafe blocks of code). In C++ on the other hand, a user would have only themselves to blame if they broke a precondition expressed somewhere in a documentation and caused undefined-behaviour.

This is exactly why I would postulate that writing correct data structure in Rust, as opposed to say C++, may require more effort. For similar reasons using dependent types doesn't make programming any easier. Of course, this may turn out to be a worthy investment in the long run.

> Cyclic data structures (without additional layer of indirection) require unsafe, or reference counting and dynamic borrow checking.

Doesn't this apply to any language?

Some of it is probably that you're not able to not-handle some categories of flaws, and the compiler yells at you for them instead of pushing it off to a runtime error you may never see. Instant pain on the ignorant implementer rather than "works on my machine™".

Efficiency, predictability, increased security.

"My other goal is to show developers that C++ and Java kinda suck" -- really?!

Rust is one of my favorite programming language, and yet I learned how much I missed java after learning Rust. The fact that everything is dynamically dispatched and non-primitives are referential type allows you to program things without thinking too much. Plus, JIT & Hotspot can optimize these away - perform stack allocations for local objects and short-circuit dynamic dispatch for nonvirtual functions.

That said, my biggest Rust-gripe has been writing recursive / self-referential data structures. Borrow system is just not very convenient for that and you end up restructing the code to be a bit more Rust-friendly.

Yeah, it's a pretty poor way to represent Rust to an audience that is ostensibly it's target audience.

Any developer who can't name something that "kinda sucks" about their language probably isn't ready to learn a new language.

The quoted comment isn't about particular shortcomings in a language, it's a blanket pejorative, and it doesn't reflect well on the author's credibility.

Fair point, it's fixed now.

Would be a stronger argument if you had a better rating on CF :) As of now I will stick to the Petr / Egor philosophy.

What's their philosophy? I think Petr was using C# at some point.

Well I would say their philosophy first of all relies on strong tools. Intellij with CHelper plugin allows for simpler testing and also has incredibly powerful debugging as I am sure you have seen in their screencasts.

Also I think they sort of show the value of Java's simpler approach with everything being a reference in terms of writing code quickly. The alternative approach that C++ competitors use relies on a very specific style unlike what production code at say Google uses, with pointers used as sparingly as possible. With everything being a reference it is much easier to work with graphs for instance.

To me Rust goes even further into the C++ direction where it becomes more and more difficult to quickly write code that compiles. With Java I have no issue writing 100 lines and having everything compile the first time around, particularly with an editor like Intellij. With C++ even Clion struggles significantly with producing helpful info, simply due to the complexity of C++.

Anyway don’t have enough time to really eloquently think through all aspects of this. Also I was just kidding about CF rating, you are far better than me obviously.

It's not achieving it though. The code could be more idiomatic and the project could have a better file organization.

Suggestions are welcome; I'm still learning Rust.

To put context around my provocative comment earlier: I write C++ professionally. C++ is the right choice for my organization, though for purely historical reasons. In the long run, I think we'd all be better off if Rust or similar languages replaced their predecessors. So my comment came from a place of personal frustration, the same that led me to find Rust in the first place.

The old wording was too negative; I'm just excited to show that contest programming, which one might imagine to be hard to translate, is not only practical but indeed arguably nicer in Rust.

Just be careful to not adapt the michaelochurch approach. Very risky :)

It's a worthy cause. At HN we might all be aware that Java and C++ suck, but others might not know. Assuming others know what you do is a common cognitive bias. To us, this might be a slightly childish thing to have as a goal, but it might be a revelation to some readers.

> At Hacker News we might all be aware that Java and C++ suck, but others might now know. Assuming others know what you know is a common cognitive bias.

The irony here is astonishing.

I am a frequent commenter on HN and I rather like C++. I acknowledge its shortcomings without resorting to the tribalism that is marring the reputation of the Rust community.

Please don't assume that all of Hacker News has a homogenous perspective on the utility (or lack thereof) of extremely common languages like C++ or Java. I've said this before in threads like this and I'll say it again: one language does not need to win.

"One language does not need to win" is the majority position of the Rust community. Of course, it itself is not completely homogeneous either.

Oh believe me, I know the Rust community does not itself endorse this behavior (or mindset). I do believe the vocal minority that pushes that agenda actively harms the community; however, I appreciate your effort for clarification.

I'm not saying that Java or C++ are useless per se, just that they suck compared to many other options we have available.

Of course they both have tremendous utility, but so too does a 15 sqm house. Doesn't mean it doesn't suck.

A graph with the members first, next, and endpoints is not exactly self explanatory, something better than a "A compact graph representation" comment would be nice.

Especially since the ownership model of rust makes the classic graph representation of edges owning nodes impossible.

It is only impossible​ using safe Rust. Unsafe code blocks should allow to represent this behind safe API.

To add to the list, here's my rust-noob implementation of the Kalman filter.


GPL3 means it's not very reusable in any projects that aren't GPL3. Have you considered a BSD or MIT license? or CC0 if you want to get code reuse? Or at the very least LGPL2

Maybe that's what they want. Personally I use GPLv3 for any project I can, because my priority is user freedom and not code reuse.

Which of the four freedoms (ignoring that only freedom #0 and #2 apply to users) is violated by using more liberal licenses like MIT or CC0? Freedom #0 is clearly violated by GPL projects: GCC for years tried to be prevent to be run as a library or in combination with other programs. Running ZFS or the better proprietary graphics drivers on Linux is a bit questionable legally. GPL being incompatible with app store rules kind of breaks freedom #2 and #3.

> Which of the four freedoms [...] is violated by using more liberal licenses [...]?

None of them [which, as you probably know, goes without saying because they're both free software licenses]. The problem is that a project using a lax/pushover license is that the project can be used as a tool for violating user freedom. For examples of this, see the PlayStation and other such products which were developed thanks to the existence of free software under lax/pushover licenses.

As an aside, the thing that personally made me care about the prolonged effects of copyleft is that I realised that if I wanted to contribute to a world without proprietary software, every non-copylefted program that I wrote was potentially acting against my intentions. You might not see it that way (and I didn't see it that way for a long time), but that's why I made my decision.

> Freedom #0 is clearly violated by GPL projects: GCC for years tried to be prevent to be run as a library or in combination with other programs.

That is simply not true, and is a non-sequitur to boot. It _is_ true that the GNU project decided to make it hard to create proprietary modules [or otherwise external modules] for GCC, but nothing stopped a user [in principle] from doing either of those things. The GPL doesn't require a project to make themselves usable as a library.

> GPL being incompatible with app store rules kind of breaks freedom #2 and #3.

But "app stores" are acting against user freedoms, which is why you can't distribute GPLv3 software [without additional permissions] in most "app stores".

In fact, I would argue the GPL is doing its job well here. It's preventing the re-distribution of free software in a way that harms user freedoms. That's sort of the whole point. Just because you don't care about how "app stores" treat users doesn't justify claiming that copyleft is acting against users here...

Consider using AGPL for anything that might be run on a server somewhere.

Thanks, that's a fair point. I do value user freedom and also come from the R language where GPL rules among packages. It's true though that Rust crate culture leans much more to MIT.

I think GPL3/MIT as a dual license should be a decent compromise.

> It's true though that Rust crate culture leans much more to MIT.

This is a fairly worrying trend I've noticed in new languages. In general it seems that they are systematically creating ecosystems where copyleft is much less modularised than in other (older) languages.

For example, in Go and Rust, LGPL is effectively as strong as GPL because it's non-trivial to make packages/crates replaceable for an end-user with a binary. This results in a general distaste towards LGPL even though it's objectively _the language's fault_ that LGPL isn't as friendly as it is in C. This causes everyone to license things as Apache or MIT (or _maybe_ MPLv2) and as a result the ecosystem of copylefted software is reduced.

> and as a result the ecosystem of copylefted software is reduced

You might be worried, but I'm not. In fact, I'd say that it's a magnificent trend. The less copyleft software, the better.

> The less copyleft software, the better.

All technical discussions will end in license arguments. Put simply, I don't agree (obviously). What I don't understand is this viceral hatred of copyleft to the point where you will make even weak forms of copyleft unfriendly in a language ecosystem. Aren't languages meant to be agnostic to license arguments...

At the end of the day, I don't care either way. I make all of my standalone Go and Rust code GPLv3. It's just a shame that people are willing to stunt their own languages just to make a point about licensing.

> Put simply, I don't agree (obviously).

Right. That's reasonable. The point of my comment was to express disagreement with something that seems obviously true to you. I'm happy about the decrease in copyleft for its own sake. I dislike copyleft because I dislike intellectual property, on which, copyleft relies upon. (There's no need to debate this further. I've done it many times already.)

> It's just a shame that people are willing to stunt their own languages just to make a point about licensing.

You say "stunt," I say "free." Mischaracterizing this into a group of people indulging in some trivial holy war is absurd.

Others are more practical than me. Languages are permissively licensed to drive adoption. Regardless of the theory of copyleft, and regardless of rationality, people are afraid of it and stay away from it.

> I dislike copyleft because I dislike intellectual property, on which, copyleft relies upon.

I promise I won't debate this further, I just wanted to say that I also dislike the concept of intellectual property. The reason I like copyleft is because it takes the draconian machinery of copyright and puts it to work protecting users.

Yes, it inconveniences developers, but I think users are more important. You wouldn't prioritise the "right to jerry-rig" of bridge architects over the people who use their bridges.

> You say "stunt," I say "free."

I'm not sure how making it harder to change out parts of a compiled program makes the language "more free". It's a feature that's missing.

> Languages are permissively licensed to drive adoption.

I don't disagree with that (though I think compiler _implementations_ should be copylefted). But that's not the point, my point was that if you can't replace parts of a compiled program then LGPL loses its appeal.

> I just wanted to say that I also dislike the concept of intellectual property.

Right. I understand the "copyleft fights against IP" argument. :-)

> I'm not sure how making it harder to change out parts of a compiled program makes the language "more free". It's a feature that's missing.

Really? Are we really not above misappropriating words? My point wasn't to lay claim to the word "freedom." My point was to say how silly it is to bake our biases into our descriptions of the world around us. The various perspectives on freedom are well covered by the Internet at large, and extend far beyond the realm of IP.

What would be the point of dual licensing GPL3 and MIT? So far as I can tell, MIT is a strict subset of GPL3, and any GPL3 project can use it anyway.

Releasing it under MIT too would allow it to be used in closed source software projects. GPL does not allow that use case.

I think the point int_19h meant is that you don't need to additionally license software under GPL when you already license it under MIT

This is great. I am curious if anyone else knows of a similar cookbook for Golang.

I have implemented quite a few algorithms and data structures for fun here [1]. It's not authoritarive by any means, and not everything is in Go (there's also some Rust, Swift, TS and JS), but maybe a few things could turn up useful.

[1] https://github.com/peferron/algo

Awesome. Thanks for the links.

Thanks for sharing this. I did a deep dive into Rust when I was tasked with implementing an interpreter[0] for school, and found the language to be very enjoyable to work in. While I have tried to convince other friends to try it, they still object citing a steep learning curve and the thought that it will never replace C++ or Java in mainstream development ecosystems. It is resources like these that will help lower that barrier to productivity. :)

[0] https://github.com/sejr/core-interpreter

Just a heads up: please put github tags

Offtopic/meta. I'm wondering about a technology that allows us to NOT re-implement an algorithm cookbook every time a new language shows up.

I'm pretty sure any technology that could do that would also obviate programmers for most tasks, since it apparently understands how to program and learn new programming technologies well enough to implement algorithms both correctly and idiomatically based on a specification.

yea, I'm convinced it's possible to at least save a ton of work there. Probably not for optimized libs, but something is much better than nothing.

Rust though might be an example where such a thing wouldn't work (had it already existed), since the type system is legitimately different. But some things might work, and even a test suite you could develop against would be wonderful.

An ABI that could express sum types would be a huge step forwards for the computing world.

.net and the JVM sort of achieve this, but at the cost of going managed for everything. That's mostly fine (and possibly entirely fine, to be honest; very few people who think they need non-GC actually do), but there are still a few diehards who insist on non-GC, and Rust is pretty much the first good language to have that option in a comprehensive way.

Like dynamic libraries and FFI? Reimplementing algorithms is the only viable way to get them completely in a new language. It's also the only viable way to improve them.

But implementing algorithms is an enjoyable way of learning a language.

Ah yes, a unicorn.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact