Love to see things like this all the same as they tend to solidify protocols/specifications. This of course can have both good and bad results.
It is something I am interested in buying but I have not been able to find reviews of it.
The only reason I haven’t read it yet is that I think it deserves a lot of attention and I haven’t made time yet.
Still undecided as to which language to work in. I wanted to use Rust, but the one part I'm unsure about is where tree-like data structures will need to be implemented.
I guess that's tough to do in Rust? I will give it a try.
Not everyone agrees with me, of course.
As I said, I haven't read the book yet, so I can't tell you how easy/difficult it will be in Rust. The author is learning Rust now, incidentally. But if you need a graph (which you probably will), I'd advise you to use something like https://crates.io/crates/petgraph instead of building one yourself. The difficulty is in writing graphs, not using them.
To "get around" this (but see the next paragraph), we usually associate each node with a simple identifier such as an integer (usize), and store all nodes in a Vec. Neighbors are indexed by this identifier rather than by a pointer. (Indirection? Maybe. But I'm pretty sure most processors support a base+offset mode just as easily as a direct reference mode.)
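A minimal sketch of that layout, in case it helps. The names (`NodeId`, `Node`, `Graph`) are made up for illustration and not taken from any crate; petgraph's real API is different and more complete.

```rust
/// A node identifier: just an index into `Graph::nodes` (hypothetical name).
type NodeId = usize;

/// A node stores the ids of its neighbors, never pointers to them.
struct Node {
    neighbors: Vec<NodeId>,
}

/// The graph owns every node; nodes do not own each other.
#[derive(Default)]
struct Graph {
    nodes: Vec<Node>,
}

impl Graph {
    /// Appends a node and returns its identifier.
    fn add_node(&mut self) -> NodeId {
        self.nodes.push(Node { neighbors: Vec::new() });
        self.nodes.len() - 1
    }

    /// Adds a directed edge; panics if `from` is out of range.
    fn add_edge(&mut self, from: NodeId, to: NodeId) {
        self.nodes[from].neighbors.push(to);
    }

    /// Returns the neighbor ids of a node; panics if `id` is out of range.
    fn neighbors(&self, id: NodeId) -> &[NodeId] {
        &self.nodes[id].neighbors
    }
}
```

Cycles are no problem here (an edge from `c` back to `a` is just another `usize`), because the borrow checker only sees one owner: the `Vec` inside `Graph`.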
Honestly, I think this is good practice even outside of Rust. If you're using pointers, and you want to associate some extra data with a node that isn't part of its graph structure -- a label, say, or some other structure whose relationships you're modeling -- there's no good way to add that data without modifying the definition of a node. A pointer indexes a _single_ point in memory. An abstract identifier may index any number of points in memory -- just create another table containing the new information you want to associate.
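To make the "just create another table" point concrete, here is a rough sketch. `SideTable` is a hypothetical name of my own, and a `HashMap` is only one possible backing store (a parallel `Vec` would work too).

```rust
use std::collections::HashMap;

/// Same idea as before: a node is identified by a plain integer.
type NodeId = usize;

/// Hypothetical side table: extra per-node data kept outside the node type.
struct SideTable<T> {
    data: HashMap<NodeId, T>,
}

impl<T> SideTable<T> {
    fn new() -> Self {
        SideTable { data: HashMap::new() }
    }

    /// Attaches (or replaces) the value associated with a node id.
    fn set(&mut self, id: NodeId, value: T) {
        self.data.insert(id, value);
    }

    /// Looks up the value for a node id, if any was attached.
    fn get(&self, id: NodeId) -> Option<&T> {
        self.data.get(&id)
    }
}
```

You can keep one `SideTable<String>` for labels and another `SideTable<f64>` for weights, and the node definition never has to change.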
> Honestly, I think this is good practice even outside of Rust.
It is, and they use this way of writing their code a lot in game dev. They call it Entity Component System, or ECS for short. The concept of ECS extends a bit beyond what you described above but fundamentally I perceive your description to fall in line with ECS.
An ECS library for Rust exists, named Specs. It might be of interest.
It's not all that different from a database, though.
Indeed! I'm tempted to coin a variant of Greenspun's Tenth Rule: any sufficiently complicated program contains an ad-hoc, informally specified, bug-ridden, incomplete implementation of a database engine.
> There are benefits but they come with downsides
Downsides relative to what? The downsides listed are all in common with traditional pointer-based references, so I would argue that garbage collection is rather orthogonal to the question of storing indices instead of pointers. Any allocation comes out of a memory arena of some kind, be it an explicit vector of slots or the implicitly-defined standard heap. The tools for solving these problems are the same in all cases.
Certainly, Rust's references avoid all of the problems you listed. Rust pointers essentially embed the semantics of a garbage collector at compile-time, in the domain where ownership patterns can be strictly verified. But nodes within a graph are already a poor fit for the ownership model -- the entity that "owns" a node is really the graph itself, not its neighbors -- so that's the level of granularity at which Rust's references are useful. You need something else within the scope of the graph.
EDIT: On reflection, you might be referring to a language like Java which has an ambient global garbage collector. Indeed, using indices instead of pointers means you're on your own -- you've allocated the memory arena through the standard means, but then you take on the responsibility of managing that memory yourself. This is a fair criticism! Purely in my experience, data modeled as a loose graph of directly-related objects is a lot more difficult to understand and maintain than data modeled indirectly using some form of identifier -- mostly because of the effects I mentioned in my earlier post on associating new information to an entity.
But if you know through some other means the exact time when a node should be deleted, you can delete it at that time, and anyone following a soft reference will find that it's no longer there, which may be a way of catching a bug. This is how both databases and entity component systems work. But it does mean that resolving a reference can fail, and you have to handle that somehow.
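Here is a small sketch of what "resolving a reference can fail" looks like in Rust. The generation counter is my own addition (an assumption, not something described above): it is one common way to make sure a stale handle doesn't silently resolve to whatever was later stored in the same slot. `Handle`, `Slot` and `Arena` are hypothetical names.

```rust
/// A soft reference: a slot index plus the generation it was issued for.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Handle {
    index: usize,
    generation: u32,
}

struct Slot<T> {
    generation: u32,
    value: Option<T>,
}

struct Arena<T> {
    slots: Vec<Slot<T>>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { slots: Vec::new() }
    }

    /// Stores a value, reusing the first free slot if there is one.
    fn insert(&mut self, value: T) -> Handle {
        if let Some(index) = self.slots.iter().position(|s| s.value.is_none()) {
            let slot = &mut self.slots[index];
            slot.value = Some(value);
            return Handle { index, generation: slot.generation };
        }
        self.slots.push(Slot { generation: 0, value: Some(value) });
        Handle { index: self.slots.len() - 1, generation: 0 }
    }

    /// Deletes the value; every handle issued for it stops resolving.
    fn remove(&mut self, handle: Handle) -> Option<T> {
        let slot = self.slots.get_mut(handle.index)?;
        if slot.generation != handle.generation {
            return None;
        }
        let value = slot.value.take()?;
        slot.generation += 1; // invalidate outstanding handles
        Some(value)
    }

    /// Resolving can fail, so the caller has to handle `None`.
    fn get(&self, handle: Handle) -> Option<&T> {
        let slot = self.slots.get(handle.index)?;
        if slot.generation != handle.generation {
            return None;
        }
        slot.value.as_ref()
    }
}
```

The `Option` in the return type of `get` is the "you have to handle that somehow" part: the compiler won't let you use the value without deciding what happens when it's gone.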
Something I wish I'd known (although it wouldn't have changed my decision to purchase) is that it's not an exploration-style book (i.e. "Let's cat this file and find out what it contains and why."); it's more of an explanation (i.e. "When I cat this file, it outputs XYZ which means ABC which I know from my research of the git source."). So the author isn't taking you along on their research, but rather coming back to you after the research is done to explain their findings from the ground up.
This means early chapters have a lot of, "You'll just have to trust me XYZ means ABC." But this is also understandable given the complexity of git; there isn't really a square one.
I also would have preferred the author use something like Python instead of Ruby for the reference implementation. IMO Python is a little more ubiquitous and easier to install/set up than Ruby. Ruby also leaves Windows devs at a disadvantage. But that's just me being pedantic.
Overall I'd give the thumbs up.
It's so much fun, but not that practical for scalable websites.
Git-based KV stores have a somewhat different purpose than regular KV storage. They are intended for communication between entities running in parallel, as a sort of transactional memory.
They are not intended for users' data storage.
My understanding is that they want to get it fully ported before Python 2 EOL.
Strongly recommend using some standard FOSS license before plenty of people add commits and it becomes a big mess to clear up the licensing situation later.
Also, not having a license file isn't a messy situation; it means “this project is protected under Berne Convention copyright”: the author is the only one holding all rights to the code, and every use that is not explicitly allowed is copyright infringement (unless it's fair use).
That being said, it would be nice of the author to put the code under a permissive license to allow other people to play with his code too (at the moment, even forking it is copyright infringement…).
Not that it changes much, but from the GitHub Terms of Service:
> By setting your repositories to be viewed publicly, you agree to allow others to view and "fork" your repositories https://help.github.com/en/articles/github-terms-of-service#...
First: GitHub has a Terms of Service which was somewhat-recently amended to make this license grant explicit:
"Any User-Generated Content you post publicly, including issues, comments, and contributions to other Users' repositories, may be viewed by others. By setting your repositories to be viewed publicly, you agree to allow others to view and 'fork' your repositories (this means that others may make their own copies of Content from your repositories in repositories they control)."
(Crucially, it doesn't require an open-source license, though.)
Second: even without that, there's such a thing as an implied license:
Like, if you write something down on a piece of paper, you can't then sue the owner of the paper for copyright infringement.
Similarly, if you upload code to GitHub, and tell it to share your code, you can't then sue them for sharing your code, ToS or no ToS.
And that license grant is solely through GitHub as a service; it's unclear that a local clone is even permitted.
That license grant has been added specifically to make GitHub itself waterproof (AIUI), so it makes sense it doesn't extend to users' rights. Look, but don't touch.
The ToS doesn't say there's an implicit reproduction license, though; it says there's an explicit reproduction license.
The other licenses can still be argued to be implicit. For instance, you have a decent argument that local clones are an implicit license – GitHub provides a "Clone or download" button directly on the repo page, and it's one of the main use cases of GitHub. (Other arguments exist.)
Then it's totally excluded from my claim, which targets «every use that is not explicitly allowed». :)
Your first point doesn't really add much though, since it falls under the “explicitly allowed” part of my comment.
Overall, my whole point still stands: if anyone went on GitHub, downloaded the project and did anything with it that went beyond fair use, that would be copyright infringement because neither the author nor GitHub granted you any permission to do so.
But can you compile it? I'm not sure… Better ask your lawyer. And what about running the compiled binary? I don't think you're allowed to do that.
Hosting that code yourself somewhere else would be illegal though.
It wasn't offered for free local reproduction since that right was not explicitly granted, and GitHub's license grant does not grant it either (as far as my reading goes). Though the country you're in may have a private copy exception, in which case you'd be in the clear I think (depending on the specifics of that exception).
And in turn, the project you submitted it to cannot relicense that patched section of code (e.g. to become GPL-licensed) without your permission, as it does not belong to them.
(Edit for clarity)
Edit: I edited my comment to say “could” instead of “would” because the original author could argue that the author of the patch implicitly gave him the right to redistribute his patch by contributing it to a public repository. I'm not sure it would stand in court, but I'd say it would have a non-zero chance of success.
But if the author, who initially claimed the project was a learning project, decided to use it commercially, he clearly wouldn't be allowed to use the patch. (And again, it could be different if the patch author willingly contributed to a commercial product.)
The answer is that making the change is already usually copyright infringement (though I think some countries have a concept of private copies being exempt from these types of restrictions). But redistribution of your patch would definitely be copyright infringement because a license to create a derived work was not given to you -- and patches are by definition derived works.
For Rust that is (afaik) MIT. Why don’t you go try it? ;)
Performance for higher-level languages is usually great insofar as you're able to essentially write C code in that higher-level language. When the language's limitations inhibit you from writing the C code you want to write, performance usually suffers. In Java's case, the lack of value types and stack allocation can be a major performance hindrance. Boxing is also a problem, although, as the mailing list post notes, this is easily overcome via manual specialization.
In addition there are many other solutions to safe parallelism and/or concurrency, some of which don't require a type system at all; Erlang is famous for safe concurrency and is untyped.
Lastly, there's good old fashioned multiprocessing which can be safe just by not sharing memory.
There is no one feature that is new in Rust, but it has a relatively unique set of features in the non-GC language world; ATS is the only one coming to mind, though I'm sure there are some other niche ones.
I love this combination in Rust because latency-sensitive operations in GC'd languages are notoriously hard to achieve. Lisp was able to be an operating system because nobody needed to run Quake at 100fps on a Lisp machine. With GC you can pick latency or throughput but can't reliably get both without coding around the GC.
This does mean for me that when considering things that Rust is particularly good at, latency-sensitive applications stand out; this is not to say it's bad at non-latency-sensitive applications, just that one has a lot more choices when latency is a non-issue.
I understand some of these are very likely for educational purposes (like this one and others; it's good for getting more familiar with the language), but it still seems to be a bit of a strange trend (especially since people who don't need to learn are doing it, seemingly just because "yay rust").
Furthermore, the resulting code feels just so sturdy. I can also expose it as C and it can be used from the likes of Python.
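For anyone curious what "expose it as C" can look like, here is a minimal sketch. The function `sum_bytes` is a made-up example, not from any real project. To actually load it from Python (e.g. via `ctypes`) you would also need to build the crate as a `cdylib`, which isn't shown here.

```rust
use std::slice;

/// Hypothetical exported function. From C it would be declared roughly as
/// `uint64_t sum_bytes(const uint8_t *ptr, size_t len);`.
///
/// `#[unsafe(no_mangle)]` keeps the symbol name unmangled so C can find it;
/// older toolchains and editions spell this attribute `#[no_mangle]`.
///
/// # Safety
/// `ptr` must be null or point to at least `len` readable bytes.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn sum_bytes(ptr: *const u8, len: usize) -> u64 {
    if ptr.is_null() {
        return 0;
    }
    // SAFETY: the caller promises `ptr` is valid for `len` bytes.
    let bytes = unsafe { slice::from_raw_parts(ptr, len) };
    bytes.iter().map(|&b| u64::from(b)).sum()
}
```

Everything behind that thin `extern "C"` boundary stays ordinary safe Rust, which is a big part of why the result feels sturdy.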
The value of simple things like this is immensely underestimated.
Say what you want about Rust vs. C/C++, but you cannot tell me that Rust's build process isn't easy. In fact, building for other platforms is pretty trivial: just `rustup target add <target>` or whatever and you can target pretty much any common platform and many uncommon ones, and those targets will get updated with everything else. In fact, it's so nice that I have to convince myself to not use it as the way to distribute CLI apps (`cargo install <tool>`).
Actually there's a lot of great reasons to rewrite everything in every language. Git is an especially good piece of software to implement everywhere because it's relatively stable and it's pretty useful.
As for actual reasons, one good example is so you can keep your dependencies in the language, using the language package manager. For Go nobody even questions that this is worth it; it enables painless cross compiling and completely static, libc-free binaries. For Rust that may not be a thing, but you do at least get the benefits that you could integrate Git functionality without having to hack around in porcelain.
This one here is a learning experience by its own description, but I would suggest people stop complaining about "rewriting everything" in $LANGUAGE. The opposite complaint is often cited as a reason not to use the language (that, for example, basic programs haven't already been ported). If we did build an alternate world with feature parity, unit testing, optimizations, in a memory-safe language, I doubt many people would be complaining about the strange trend of rewriting things anymore.
That said, even when using those bindings, I've had to drop down to wrapping the porcelain sometimes. I'd love to have a native Rust git implementation. It'd be easier to hack on, and to abuse for the type of git interactions I've written.
And it'd be one less external dependency to worry about when cross-compiling. I love how easy it is to install things from source in Rust - and it's pretty easy to add a flag to make the program use your CPU's special instruction set to make it even faster.
> stop complaining
Not a complaint; more a question as to why there's a specific move around rust. I appreciate the reply; that's exactly the kind of response I was looking for.
> especially since people who don't need to learn are doing it
Don't need to learn?? There's always something to learn.
These days I'm getting lazy, so for something like git, I'll often exec out to the CLI app instead of fiddling with bindings, especially if it's not a performance-critical part of my app. However, I'd definitely look for a rewrite first and clean bindings second to use as a lib, especially if the rewrite had a suitable license (anything not copyleft).
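For reference, "exec out to the CLI" in Rust can be as small as the sketch below. The helper names `git_command` and `run_git` are hypothetical, and it assumes a `git` binary is on the PATH. `-C <path>` is git's own option for running as if started in that directory.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) a `git -C <repo> <args...>` invocation.
fn git_command(repo: &Path, args: &[&str]) -> Command {
    let mut cmd = Command::new("git");
    cmd.arg("-C").arg(repo).args(args);
    cmd
}

/// Runs git and returns its stdout, or its stderr wrapped in an error.
fn run_git(repo: &Path, args: &[&str]) -> io::Result<String> {
    let output = git_command(repo, args).output()?;
    if output.status.success() {
        Ok(String::from_utf8_lossy(&output.stdout).into_owned())
    } else {
        let stderr = String::from_utf8_lossy(&output.stderr).into_owned();
        Err(io::Error::new(io::ErrorKind::Other, stderr))
    }
}
```

Something like `run_git(Path::new("."), &["rev-parse", "HEAD"])` would then give you the current commit hash as text, at the cost of spawning a process and parsing output yourself.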
HN discussion: https://news.ycombinator.com/item?id=19485609
Maybe it won't run on your toaster, but if you are making git commits through your toaster you've got other issues.
This made me snort out some coffee on a Monday. Thank you!
I think your point is also a good one that I don't see represented often: portability for portability's sake is kinda silly. It's use that's important.
Edit - apparently rustc can now be linked to musl instead of glibc in nightly. Cool!
Is this not a successful build? At that point it seems like the OS is just not providing the necessary tooling, but Rust will run on the architecture.
Break out of these patterns by writing useful software in it that people would like to have on less popular platforms, so more people feel motivated to build and maintain build tools for it.
The author says quite clearly that it's "for fun and education".
Although Rust is not as portable as C, going through these hoops would mean that (modulo codegen bugs) the generated C code should still be as memory-safe as the original Rust code.
No, there is not any fully featured or official way to do that at the moment.
Why re-invent the wheel? Ada is superior. :^)