More on GVFS

qarioz · on Feb 8, 2017

GVFS stands for GNOME Virtual file system since 2007. Looking at you Microsoft.

sho_hn · on Feb 8, 2017

Not sure why you're being downvoted. That's indeed what gvfs commonly stands for in industry. It's a poor and confusing name choice.

woof · on Feb 8, 2017

Outside the "industry" (aka the small world of GNOME and KDE), GVFS is not commonly known at all.

For the Gnome community this might be both poor and confusing, for the rest of us it's ... meh.

scbrg · on Feb 8, 2017

That's a fairly good argument to ignore any name collision; "oh, it only affects people that are not me".

Given that the products are quite similar - both are virtual file systems after all - confusion is actually quite likely. If you're the kind of person who has reasons to speak about virtual filesystems at all (which, honestly, most people don't) you should probably know about both.

Had this been back in the day one would guess it was deliberate... but hey, everybody tells me Microsoft are supposed to be good guys these days, so I guess Hanlon's Razor applies.

AnkhMorporkian · on Feb 8, 2017

I see no problem in it in that the two domains are very, very unlikely to be confused in context. Pretty much the only time confusion is likely to be sown is if there is a headline akin to the one submitted here.

AFAICT, there's essentially no overlap of features beyond the fact that they're virtual file systems. The feature sets are completely different, and there's no intersection where if you read more than a few words into whatever you're looking at you'll be left scratching your head thinking "Oh jeez, which one is it this time?"

1125 · on Feb 8, 2017

I don't know... The first 3 articles shown on a Google search for "gvfs" are based on that small world of GNOME.

louhike · on Feb 8, 2017

Will it still be the case next? I'm not saying it is a good choice, but maybe they consider they will gain enough traction to surpass it in the results.

scbrg · on Feb 8, 2017

If that's the idea, that would make the choice of name directly malicious, rather than just an unfortunate oversight. I don't see how it would make qarioz's original point any less valid at all.

stuaxo · on Feb 8, 2017

Which ends up being crappy for anyone on the gnome side googling for gvfs.

bkor · on Feb 8, 2017

The g in gvfs is not an abbreviation for GNOME. It's either gio or glib and part of that. See the picture on https://developer.gnome.org/gio/stable/ch01.html#gvfs-overvi..., it's more low level than something specific to GNOME.

smcleod · on Feb 8, 2017

yes that's what I thought the post was about until I saw the domain it came from as well...

fragmede · on Feb 8, 2017

Not to mention, this is supposedly a new Microsoft, but this looks an awful lot like steps one and two in "embrace and extend" ploy that they perfected in the 90s.

ekidd · on Feb 8, 2017

From the article:

> Lots of branches – Users of Git create branches pretty prolifically. It’s not uncommon for an engineer to build up ~20 branches over time and multiply 20 by, say 5000 engineers and that’s 100,000 branches. Git just won’t be usable. To solve this, we built a feature we call “limited refs” into our Git service (Team Services and TFS) that will cause the service to pretend that only the branches “you care about” are projected to your Git client. You can favorite the branches you want and Git will be happy.
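To make the arithmetic and the "limited refs" idea in the quote concrete, here is a minimal sketch. The function name `visible_refs`, the `favorites` parameter, and the ref naming scheme are all hypothetical; nothing here is taken from Team Services or TFS. It only illustrates a server projecting a filtered subset of refs to one client.

```python
# Hypothetical sketch of "limited refs": the server advertises only the
# branches one user has favorited, plus a default branch.

def visible_refs(all_refs, favorites, always=("refs/heads/master",)):
    """Return the subset of refs the server would project to one client."""
    wanted = set(favorites) | set(always)
    return {name: sha for name, sha in all_refs.items() if name in wanted}


# Arithmetic from the quote: ~20 branches per engineer, 5000 engineers.
engineers = 5000
branches_each = 20
total_branches = engineers * branches_each

# Build a fake ref table of that size (the "hashes" are placeholders).
all_refs = {"refs/heads/master": "0" * 40}
for e in range(engineers):
    for b in range(branches_each):
        all_refs[f"refs/heads/users/e{e}/topic{b}"] = f"{e:020d}{b:020d}"

mine = visible_refs(all_refs, favorites=["refs/heads/users/e42/topic3"])
print(total_branches, len(all_refs), len(mine))
```

The client only ever sees the two refs in `mine`, which is presumably why an unmodified Git client stays usable against such a server.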

This is almost certainly a result of trying to have one company-wide monolithic repository that holds the source code of hundreds or thousands of separate projects.

Git is more pleasant when you break your codebase into isolated components. These can be pretty large—the Linux kernel has 16 million lines of code—but if your codebase is many times larger than a complete modern kernel, you might want to split it.

If you have 5,000 engineers all pushing branches to a single master git repository, you may want to either rethink your repository structure, or at least have maintainer subtrees the way Linux does.

izacus · on Feb 8, 2017

> Git is more pleasant when you break your codebase into isolated components. These can be pretty large—the Linux kernel has 16 million lines of code—but if your codebase is many times larger than a complete modern kernel, you might want to split it.

As part of a company that switched to the same kind of monorepo structure from several separate repositories - there is nothing "pleasant" about having to deal with multiple Git repositories for connected components. Subtrees, submodules are utter hell of maintenance, checkout bugs (which hurt CI) and bad UX across the board ("why doesn't this build?" "you forgot to checkout submodules" "no you forgot to move the commit pointer" "no you forgot to change dependent tests because you didn't see them in an isolated repository"...)

It's not a GOOD approach, but it's the best approach compared to all other more terrible ones.

luckydude · on Feb 8, 2017

Git would do well to copy BitKeeper's nested repository architecture. Unlike Git, a collection of nested repositories (what we call submodules) works exactly like a monolithic repository. Want to see diffs for your tree? Same command. Want to check in your changes? Same command. Want to update your collection? Same command. Etc.

It's a lot more work than the hack that is submodules (we understand that hack, we did exactly the same thing for years to track outside source drops). But it is worth it, everything just works.

http://www.mcvoy.com/lm/bkdocs/nested.html

http://www.mcvoy.com/lm/bkdocs/productline.pdf

krupan · on Feb 8, 2017

I'm really intrigued by this. I love mercurial (and tolerate git), but they both handle sub repositories poorly. How much of my mercurial (and/or git) muscle memory would I have to unlearn to switch to bk. Just for basic commands, I mean.

You could tell me to go try it out myself, but a nice side-by-side comparison showing that bk does all that git/hg do (if that's true) could go a long way to attracting git/mercurial people over to bk.

luckydude · on Feb 8, 2017

Well we support fast-import so you can get your history into BK and play trivially.

BK does more and less than git/hg. It's got richer history, it was trivial to write a bk fast-export that git could import. Writing a bk fast-import wasn't so bad but an incremental one is hard because git doesn't store as much history (no per file history, actually no file object, just pathnames that have some data, not the same, we have a DAG per file, makes history better, makes merges faster and easier).
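A toy model may help with the "file object" versus "pathnames that have some data" distinction. This is only a sketch under my own assumptions: `FileObject`, `Delta`, and `file_id` are made-up names, and it is not BitKeeper's actual data structure. The point is that when a file has a stable identity and its path is just one attribute of each delta, history stays continuous across a rename.

```python
from dataclasses import dataclass, field


@dataclass
class Delta:
    rev: int      # per-file revision number
    path: str     # where the file lived at this revision
    author: str


@dataclass
class FileObject:
    file_id: str                          # stable identity, independent of path
    deltas: list = field(default_factory=list)

    def record(self, path, author):
        """Append a new delta; a rename is just a delta with a new path."""
        self.deltas.append(Delta(len(self.deltas) + 1, path, author))

    def path_at(self, rev):
        return self.deltas[rev - 1].path

    def history(self):
        return [(d.rev, d.path, d.author) for d in self.deltas]


f = FileObject("file-0001")
f.record("src/util.c", "jane")
f.record("src/util.c", "john")
f.record("lib/util.c", "jane")   # renamed, same file object
print(f.history())
```

Asking for the history of `lib/util.c` here is a walk over one short list, with no need to scan every commit in the repository or guess at renames.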

Mercurial copied our UI so for a lot of stuff if you know hg <cmd> then bk <cmd> will just work. BK predates all these systems, they picked up some of bk's commands.

There is a cheatsheet (somewhat out of date but it will get you going) at:

http://mcvoy.com/lm/bkdocs/bk_refcard.pdf

And you can email dev <at> bitkeeper.com or hit up http://bitkeeper.org for more info.

Cheers.

ekidd · on Feb 8, 2017

> Subtrees, submodules are utter hell of maintenance, checkout bugs (which hurt CI) and bad UX across the board ("why doesn't this build?" "you forgot to checkout submodules" "no you forgot to move the commit pointer" "no you forgot to change dependent tests because you didn't see them in an isolated repository"...)

I do not recommend using submodules, except in very limited cases.

What I'm suggesting is that if your code base is big enough to break git (considerably larger than the Linux kernel's 16 million lines of code), you might want to break it into multiple projects with their own release cycles. Build them separately, release them via an internal release process, etc.

Personally, I like largish repos. But "an order of magnitude larger than the Linux kernel" is just too big, in my opinion, at least for git-based projects. If you have 100 or 200 million lines of code, you probably have natural points to split up your architecture.

iainmerrick · on Feb 8, 2017

Seconded. Submodules especially are a tempting feature but never, never, ever work the way you want.

Submodules used to be incredibly buggy and terrible; they've improved in recent versions of Git so now they're merely really bad.

luckydude · on Feb 8, 2017

They are pretty profoundly broken in my opinion in that they take what is a distributed scm and turn it, for the submodules, into a centralized system.

Here's an example. Everyone who has used a good SCM discovers a pretty useful work flow, we call it merge and test. The idea is you have N pull requests sitting around. You don't want to just pull them in, you want to see if they all work together. So you clone your integration tree to a test tree. In BitKeeper, you'd then add each of the pull requests as an incoming parent (bk parent -i <repo>) and then you just merge them (bk pull). Once it's all merged, you build it, test it.

Facebook does this, a lot of people do this. Why do they do it that way? To avoid polluting the main tree with garbage. It's the opposite of continuous integration (aka shoveling shit into your integration as fast as possible). Smart companies want their integration tree to be a stable base on which to build, not a layer of continuously integrated quicksand. If the tests all pass you can push the whole wad to the integration tree, or rebase each to tip (we hate this, doing that means you turned your history into CVS history and you lose all sorts of useful information).

So how do submodules break that workflow? Easy, they don't support sideways pulls. If two people are working on a submodule and they have not pushed to the main tree, try and sideways pull from Jane to John and you won't get Jane's work. As a dev who worked on the nested collections, I get why that doesn't work, there are a zillion corner cases where it gets complicated dealing with that state (and there is a goldmine of test cases in the open source BK code, see src/t/t.nested* and start reading).

Here's an example of an obvious thing you have to deal with (if you don't think in DAGs this is gonna hurt a little). Suppose you have a subrepo called libc. Jane has a clone of the collection as does John. Their clones are just the top repo, no subrepos are present. They both pull from different clones that have libc present and each of those clones have modified libc. So that means there is implied (different) work in Jane's missing libc and John's missing libc. Which means if Jane pulls John's clone then the libc DAG has forked and needs to be merged. BK recognizes that, even though the subrepos are not present, and tells Jane when she pulls John's collection that she needs to populate libc so she can merge it. The centralized model of submodules side steps that entire class of problems, at the cost of no sideways pulls, no workflow other than the centralized CVS like model.
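The libc scenario above can be sketched as an ancestry check on the subrepo's DAG. This is a rough illustration, not BK's implementation: it assumes the top-level repo records each subrepo's tip and enough graph metadata to answer "is X an ancestor of Y" without the subrepo being populated. The names `ancestors` and `subrepo_pull_action` and the commit labels are hypothetical.

```python
def ancestors(dag, tip):
    """All commits reachable from tip (inclusive); dag maps commit -> parents."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(dag.get(commit, ()))
    return seen


def subrepo_pull_action(dag, local_tip, remote_tip):
    """Decide what a pull must do for one subrepo, given only the tips
    recorded in the two top-level repos."""
    if local_tip == remote_tip or remote_tip in ancestors(dag, local_tip):
        return "nothing"              # we already have the remote's work
    if local_tip in ancestors(dag, remote_tip):
        return "fast-forward"         # remote is strictly ahead of us
    return "populate-and-merge"       # the DAG has forked


# libc history: Jane and John each pulled a different change on top of "base".
libc_dag = {"base": [], "jane1": ["base"], "john1": ["base"]}

print(subrepo_pull_action(libc_dag, "jane1", "john1"))
print(subrepo_pull_action(libc_dag, "base", "john1"))
```

The first call is the case described above: neither tip contains the other, so Jane has to populate libc and merge, even though her clone never had libc checked out.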

Submodules are the quick hack to get you semi-coupled collections, BK/nested is the hard work to make a collection have the semantics of a monolithic repo.

woodrowbarlow · on Feb 8, 2017

does anyone know if mercurial is more pleasant in this regard?

e12e · on Feb 8, 2017

No, not really. Apart from the reply above about bitkeeper, I'm not aware of any new or old (d)vcs that does "sub-modules" well.

There's AFAIK been some orgs working on mercurial in order to support large mono-repos - I don't know if it's better or worse than git today. I see that Illumos (opensolaris) have shifted from mercurial to git, for example:

https://wiki.illumos.org/display/illumos/Mercurial+Workflow

(Solaris/Sun was a somewhat early adopter of "big" mercurial repos).

Working with big mono repos is not without challenges: https://code.facebook.com/posts/218678814984400/scaling-merc...

FWIW I tried using sub-modules for mercurial some years ago - in the end I realized it was probably not worth it. I'm also skeptical about go's use of "vendoring" dependencies for a similar reason.

luckydude · on Feb 8, 2017

So one of the main developers of hg is on record saying don't use hg's answer to this problem. I haven't looked at what they have, that was enough for me to move on.

bluejekyll · on Feb 8, 2017

I think GitHub/Gitlab (enterprise or public), do an excellent job of capturing the monorepo desires, but with lots of disparate code repositories.

That is, mono repos are great for single company wide access, easy to fork and commit against different areas, have a single system for driving all CI changes across repos.

I agree that having a single huge repo kinda sucks in Git, but it's also basically a nonstarter for companies that want all the features of git, but their repo and build is monolithic. Would you require people to first completely rebuild their software into separate Git repos before using Git? This could take years!

I've been looking at this exact problem at my work, and there is no easy answer here. I think MS probably picked the easier path forward in their case.

LeifCarrotson · on Feb 8, 2017

> This is almost certainly a result of trying to have one company-wide monolithic repository that holds the source code of hundreds or thousands of separate projects.

Correct. But both Google and Microsoft use this method. And I'm sure they've put in a lot more man-years of investigation into it than you and I have!

cwyers · on Feb 8, 2017

Facebook and I believe Twitter, too. It's almost as if anyone who actually has to contend with problems at this scale prefers a monorepo and anyone who likes to speculate on how to solve other people's problems on message boards thinks it's a dumb idea.

iainmerrick · on Feb 8, 2017

True, but I believe a lot of both Facebook's and Twitter's dev culture comes from ex-Google employees. So there could still be some groupthink going on. (Alternatively, a victory for successful ideas?)

Indirect evidence: both of them released open source clones of Google's proprietary build system "Blaze" -- Facebook's "Buck" and Twitter's "Pants". Google was actually last to the party, with "Bazel".

mjn · on Feb 8, 2017

Google reportedly [1] also uses a virtual filesystem (FUSE-based, in their case) to let individual clients get a convenient/efficient view into their giant monorepo. Microsoft looks like they're doing something very similar here, except building their vfs/monorepo solution on top of git and open-sourcing it (Google's is built on top of an in-house VCS called Piper).

[1] http://cacm.acm.org/magazines/2016/7/204032-why-google-store...

bluejekyll · on Feb 8, 2017

Isn't google's system inspired from their use of Perforce as well? Is Piper Git like, or more Perforce/SVN like?

iainmerrick · on Feb 8, 2017

Piper is a reimplementation of Perforce, because they had an existing gigantic source repository in Perforce and needed something compatible. There was an earlier project to migrate to SVN but that failed.

I believe Git was never in contention due to the obvious scalability problems, but GVFS does seem like it might potentially be usable at Google.

blktiger · on Feb 8, 2017

_Programming_ is more pleasant when you break your codebase into isolated components. Instead of trying to force git to adapt to these monolithic repositories, we should be breaking them down into manageable pieces for developers to work on. If you still need 10 years of history, just keep the old repository around for archival purposes.

[Edit] Not that I'm against GVFS, there is nothing wrong with not wanting to download the entire repository in some cases.

iainmerrick · on Feb 8, 2017

I have done a (very) little bit of work in Chromium/WebKit, which at the time was split across two repos. The dance you had to do to make a change spanning both repos (make a commit in repo A that adds a hook for the thing you need but still compiles, make repo B use it, maybe a third commit to tidy up A) was incredibly tedious and I'm sure introduced extra bugs.

When the Chromium team forked WebKit (renaming it Blink) they merged it into Chromium, specifically citing ease of development.

To some extent I think the problems with split repositories are just down to bad design. WebKit was never a nice clean API, it was just a mess of whatever hacks Apple and Google needed for their respective browsers. The Chromium/WebKit split wasn't made on sensible engineering grounds, it just reflected the Google/Apple administrative divisions (Conway's law: "organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations")

I still feel like modular code is better, and properly designed modular code can work well in modular repositories. But good modular code can work in monorepos too. Monorepos have the advantage that you can do huge codebase-wide refactorings atomically. Clang seems to use that approach -- all the internal APIs are constantly in flux.

jamesmiller5 · on Feb 8, 2017

Gerrit, the review system Chromium uses does support the exact feature of committing across repo boundaries using "Topics" . You can verify a "Topic" is atomically submittable across repositories before submission and then commit it to those repos all at the same time.

https://www.gerritcodereview.com/releases/2.12.md#New-Featur...

iainmerrick · on Feb 8, 2017

Sounds good! Although it says: "This setting should be considered experimental, and is disabled by default"

I assume all the repos involved have to be managed by the same Gerrit instance too. In that case it's not clear to me what you actually gain from multiple repos (besides working around Git's scaling issues).

jamesmiller5 · on Feb 9, 2017

If I'm not mistaken it's turned on by default in the latest 2.13.5 version, though I haven't had a chance to play with it much other than a quick smoke test.

> In that case it's not clear to me what you actually gain from multiple repos (besides working around Git's scaling issues).

I'm not sure what you mean here, would you please elaborate?

iainmerrick · on Feb 9, 2017

If you have 100 repos (say) with all code reviews and commits going through Gerrit, how is that different in practice from having a monorepo? Why would you want or need multiple repos?

If you're pulling in repos from third parties, outside of your Gerrit workflow, that's different -- but in that case I don't see how you can enforce those atomic multi-repo commits.

luckydude · on Feb 9, 2017

I dunno about Gerrit, but in BitKeeper, a commit that spans multiple repos is atomic, it either works or is completely rolled back (yes, of course, if there is a damaged disk and you can't write the data I'm lying but if it is possible to be atomic it is atomic).

We lock the top repo and then go do the work in the subrepos. All repos respect a lock in the top repo but we have a way to say "yeah, yeah, there is a lock but that is your lock, go ahead".
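The lock-then-work pattern described here can be sketched in memory. This is a toy under stated assumptions, not BitKeeper code: the `Collection` class, the `lock_owner` field, and the "empty message fails" rule are all invented for illustration. It shows one top-level lock that every repo respects, an exception for the lock holder, and a rollback when any part of the commit fails.

```python
class Collection:
    """Hypothetical top repo plus subrepos, guarded by one top-level lock."""

    def __init__(self, names):
        self.history = {name: [] for name in names}  # repo -> commit messages
        self.lock_owner = None                       # holder of the top lock

    def _append(self, owner, repo, message):
        # Every repo respects the top lock, unless the caller holds it.
        if self.lock_owner is not None and self.lock_owner != owner:
            raise RuntimeError(f"{repo}: locked by {self.lock_owner}")
        if not message:
            raise ValueError(f"{repo}: empty commit message")
        self.history[repo].append(message)

    def commit_all(self, owner, changes):
        """Apply every change in `changes` (repo -> message), or none."""
        if self.lock_owner is not None:
            raise RuntimeError(f"collection locked by {self.lock_owner}")
        self.lock_owner = owner
        before = {repo: len(log) for repo, log in self.history.items()}
        try:
            for repo, message in changes.items():
                self._append(owner, repo, message)
        except Exception:
            # Roll every repo back to its state before this commit.
            for repo, length in before.items():
                del self.history[repo][length:]
            raise
        finally:
            self.lock_owner = None


c = Collection(["top", "libc", "kernel"])
c.commit_all("jane", {"libc": "fix strlen", "top": "bump libc"})
try:
    # The kernel part succeeds first, then the libc part fails.
    c.commit_all("john", {"kernel": "add driver", "libc": ""})
except ValueError:
    pass
print(c.history)
```

John's partial work in `kernel` disappears along with the failed `libc` change, which is the all-or-nothing behaviour being described. A real system also has to survive crashes mid-commit, which this sketch ignores.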

aseipp · on Feb 8, 2017

Many monolithic repositories still have very clear isolation and component separation. There's a difference between "I'm doing this out of good taste" and "I'm doing this because this tool makes my life insufferable, otherwise". For dedicated organizations, with active development -- singular repos have some keen benefits, IMO.

Git is absolutely bad at handling even moderately large repositories (unless it is 100% text based for its entire life), and its support for handling submodules completely sucks. I didn't even work on a multi-million line codebase. Few hundred kLOC. But with like 20 submodules though (library components), and the amount of organizational overhead is annoying as shit. (Not to mention the tools we developed to stop the active ~20 developers from footgunning themselves when they committed non-existent pointers, explaining carefully how they worked, etc). We have a binary or two, but we are anxious and careful to control and update them when we do (they were added before our time, and every update at this point on such a long-lived repo can visibly affect performance. Splitting it out is even more annoying ultimately and loses our history, and does not reduce the size of the original repo). Git is just bad here. Almost everyone is a volunteer. This kind of shit is a waste of time for everyone, frankly.

What if git was 5x as slow as it was now? Would that mean we should create 5x more repositories, divide every repository up into 5 more? Delete your historic repositories with 5x frequency? Would trying to improve that mean we're trying to "force git to adapt for monolithic repositories", as you claim? There are very real limits to this kind of reasoning, bending backwards for your tools.

Tools like Bitkeeper are far better at this and have a way better UI, out of the box. The "Product Lines" feature means subrepositories are much more transparent for all users (with full push/pull bidirectional interoperation, which is absolutely critical, and it does crazy things like -- gasp, cloning all subrepositories by default). But BitKeeper can also handle 50GB repositories fairly easily and quickly, including large binary files. No multi-repos necessary, no LFS. It "just works". That's excellent and it removes concerns and worries you might have later.

I also think you understate the true value of continuous history. It's not something to toss aside so quickly, and any extra effort to use it means you lose a lot of insight. Maybe if you don't have to dig old code constantly (or your company throws away/rewrites its product constantly I guess).

In my current job, for the OSS project I worked on -- it's got VCS history dating back to 1995. Yes, I have traced changes, design decisions, old relics, and even bugs back nearly ~20 years. Why? Because it was easy and there, and the best way to discover why something was done. It is absolutely one of the most important discovery and historic tools, for this reason. Many developers -- including myself, use it all the time.

New developers cannot possibly understand the rationale of an obscure change 10 years ago by someone with a PhD without this kind of help, at least not easily. Some files may not have been touched in like 8 years! This repository is also relatively quick to clone, luckily. The 20 submodules make it a damper.

One of my last jobs was working on a product that was 15 years old, and had only one repository. They stored binaries in it, but it had had many of them updated over the years (small JAR files, mostly, one copy of GCC). It took 6 hours to checkout, but honestly that was the main thing that sucked - performance. However, all of the same benefits applied. And that one truly was continuous -- you could get a look at the repo, as it was built for customers, years ago, to look at the behavior. Atomic refactorings across years of code, over a million lines of code, and 50+ developers and libraries was possible in a single commit.

That's a good thing. But the performance is a tool deficiency, not something to be hand waved away. That second repository, the 15 year one, was like, 50GB total. That's peanuts in today's terms. Other VCSs would have been much more tolerable (including BK, or probably something like Perforce, in terms of pure speed).

chris_wot · on Feb 8, 2017

> In my current job, for the OSS project I worked on -- it's got VCS history dating back to 1995. Yes, I have traced changes, design decisions, old relics, and even bugs back nearly ~20 years. Why? Because it was easy and there, and the best way to discover why something was done. It is absolutely one of the most important discovery and historic tools, for this reason. Many developers -- including myself, use it all the time.

I have to second this. When I was working on the VCL component of LibreOffice I often had to backtrack through positively ancient history to try to understand decisions. Unfortunately, history before 2000 is not available, and even worse was the absolutely stupid decision someone made to merge changes but not keep commit history for branches - consequently there are poor summaries of multiple changes for a single code commit. It can be incredibly frustrating!

luckydude · on Feb 8, 2017

I'll third this. I hired a guy, a really bright guy, and put him to work. He was debugging some problem, I think a windows problem, and found the file that contained the code with the problem, looked at it, and said "this code is gross, I'm gonna rewrite it".

Did I mention he was bright? He was, and he took a look at the history in bk revtool (a graphical tool, shows you the dag, you can left/right click on a pair of nodes and see diffs, double click on a node and see that version of the file).

He double clicked on the very first rev and lo and behold, there was the code he was about to type in. He said it was exactly how he imagined it.

Hmm, says he. I wonder why it changed. He started clicking around and went "ohh, so on this OS they have that problem". Some more clicks "oh, and windows has this problem with wrapping pids". Etc.

Had he not had that history (and had it not been really fast to click around and see it), he would have rewritten the file to be back where we started, losing all the bug fixes.

History matters. Being able to see it quickly and easily matters.

iainmerrick · on Feb 8, 2017

> Git is absolutely bad at handling even moderately large repositories (unless it is 100% text based for its entire life), and its support for handling submodules completely sucks. [...] Tools like Bitkeeper are far better at this and have a way better UI, out of the box.

Wait, are you saying... Git is a half-assed copy of just the bits of BitKeeper that were needed for the Linux kernel?!? Written by people who don't care about UI?

(I'm mostly joking, of course, Git is great in many ways. But it's so terrible in others. The living embodiment of "worse is better".)

luckydude · on Feb 8, 2017

As the original BitKeeper guy, the Git "design" is painful. I can tell you this, if Git was as good or better than BK, we would have folded up shop years ago. We kept going (and open sourced BK) so people could see how a decent SCM works.

The fact that Git has no per file history data structure, no actual revisioning of renames, blame is insanely slow (see below for a demo), etc. Yeah, Git is really really popular, it won, no question. But man, did the world get screwed because of that.

At this point I'm well on my way to being retired and playing with tractors, but I'd love it if Git ripped off everything useful in BK. I don't see it happening, Linus is proud of his "design", it's pretty entrenched. At least with BK out there as open source, you can fast-import your git tree, play around and see what the world could have been like.

Oh, yeah, the blame demo. Git's file format isn't a file format, it's a repository format. With no formal file object, Git has to paw through the entire history to get the annotations for a single file. Most of the time you don't care but if you want to be responsive to bug reports, support requests, being able to figure out who changed what in a file is really helpful. So we benchmarked our blame implementation against Git's blame implementation, here's a little video about it:

http://www.mcvoy.com/lm/bkdocs/blame.ogv

It's pretty hilarious IMHO.

emmelaich · on Feb 8, 2017

Considering how fast git is compared to everything else for common operations, I'm quite happy for `blame` to be slow.

Optimise for the common case is standard engineering practice as I'm sure you know.

(btw putting design in quotes is a bit rude; as a fan of yours it pains me a little to say this)

luckydude · on Feb 8, 2017

Sorry about the design comment, I just don't know what else to say. It's a really poor design for an SCM system. As an SCM person, I'm horrified at the Git design. No file history? No file object? No file rename history? You guess at renames? Say what? If I were teaching a class on SCM and someone turned in Git as a class project I'd flunk them.

It's fine if what you want is a compressed tarball server, that's the essence of what it is and exactly what Linus wanted. From that point of view the design is brilliant. From an SCM point of view it is very lacking.

As for fast, Git is fast until it is not. See the Facebook benchmarking Git thread, it was posted here. Git commit and pull performance don't scale up at all.

Git is very fast when everything fits in memory. It's horrible when things don't fit in memory. Remember, it names everything by hash, doing that is neat from a math point of view, it sucks from a disk point of view, even an SSD point of view.

Finally, "optimize for the common case" I think is a bit off the mark here. Linus wasn't optimizing for performance, he was optimizing for ease of implementation. He simply doesn't care about the history. If he could have figured out a way to have a sliding window of history that was just enough to merge every old repo out there, I suspect he would have done that. To him, history is baggage, he just cares about the tip.

SCM people care about all the history, it's all useful at some point. The goal is to record all of that, efficiently, so that when someone needs it, they have it. The video is just showing that we got the efficient part done.

iainmerrick · on Feb 9, 2017

Personally I would really love it if "git blame" were both fast and reliable. When working with other people's code I constantly look at file version history to figure out why things are the way they are. Unfortunately, big merges and renamings tend to break the history in Git.

luckydude · on Feb 9, 2017

Renames break history in Git. Yup, that movie I linked to was not looking for renames and Git was still 1000x slower than BitKeeper. The default on blame in Git is to not look for renames because doing so is so slow. It's a non-issue in BitKeeper, the pathname of an object is an attribute of the object, just like the contents. It's exactly like a file system, we have an inode that is a unique name for the file, where it lives can vary from delta to delta, BitKeeper doesn't care.

"Big merges" break history in Git. Not in BitKeeper, BK passes changes by reference not by value. What that means is suppose you have 3 users, A, B, and C. And you have a file with 1000 lines of code. A & B clone that repo and have identical versions of the file. A changes the top 501 lines, B changes the bottom 500 lines. Now C clones A and pulls B. In the merge, C has to pick between A's line 501 and B's line 501 but the other lines are merged unchanged. In most naive systems, it will look like C wrote either the top 501 lines or the bottom 500 lines when you run blame. That's what I mean by pass by value, the merged lines are a copy.

In BitKeeper, the unchanged merged lines are passed by reference, they are in the history exactly once. Only the line where it was merged (changed in the merge) will show up as belonging to C.
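The A/B/C example can be simulated with a toy blame model. This is a sketch of the pass-by-reference versus pass-by-value idea only, not of how BitKeeper or any other SCM actually stores history; the function names are hypothetical and the merge is simplified to a line-by-line three-way comparison on files of equal length.

```python
from collections import Counter


def merge_by_reference(base, local, remote, merger, resolve):
    """Three-way merge over lists of (author, text). Unchanged and
    one-sided lines keep their original author; only true conflicts
    are attributed to the merger."""
    merged = []
    for i, (b, l, r) in enumerate(zip(base, local, remote)):
        if l != b and r != b and l != r:
            merged.append((merger, resolve(i, l[1], r[1])))  # real conflict
        elif l != b:
            merged.append(l)   # only the local side changed this line
        else:
            merged.append(r)   # remote change, or untouched base line
    return merged


def merge_by_value(base, local, remote, merger, resolve):
    """Naive variant: every line that differs from the local side is
    copied in and re-stamped as the merger's own work."""
    merged = merge_by_reference(base, local, remote, merger, resolve)
    return [m if m == l else (merger, m[1]) for m, l in zip(merged, local)]


base = [("orig", f"line {i}") for i in range(1, 1001)]
a = [("A", f"A's line {i}") if i <= 501 else base[i - 1] for i in range(1, 1001)]
b = [("B", f"B's line {i}") if i >= 501 else base[i - 1] for i in range(1, 1001)]


def pick_local(index, local_text, remote_text):
    return local_text  # C resolves the single conflict, keeping A's wording


by_ref = Counter(author for author, _ in merge_by_reference(base, a, b, "C", pick_local))
by_val = Counter(author for author, _ in merge_by_value(base, a, b, "C", pick_local))
print(dict(by_ref), dict(by_val))
```

With the by-reference merge, C is blamed for exactly one line (the conflicting line 501); with the by-value merge, C is blamed for all 500 lines that came in from B.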

You'd be amazed at how long it takes people to trust this. They are so used to pass by value semantics (and hate them because now a bunch of code done by someone else appears to be done by C and C gets stuck with the bugs).

iainmerrick · on Feb 9, 2017

OK, you got me interested. "brew install bitkeeper" worked so I might give it a try!

Is there anything like Github or Bitbucket that supports BK?

If there were a service to easily host BK repos, and especially if you could easily make it readable via Git, that could be pretty compelling.

(Edit to add: http://bkbits.net maybe?)

luckydude · on Feb 10, 2017

Yeah, bkbits.net is it for now.

Does brew tell you who did the packaging? I'm wondering if that's one of my guys or someone else (I'd love it if it were someone else, be good to get people hacking).

iainmerrick · on Feb 10, 2017

I think this is the package spec: https://github.com/caskroom/homebrew-cask/blob/master/Casks/...

So it's using the installer from bitkeeper.org; I don't know if the people who added it to homebrew are affiliated.

aseipp · on Feb 8, 2017

This reasoning has very serious limits, however. Every tool influences your behavior. Imagine if Git was somehow 10x slower. It's already quite fast still, but still way slower (e.g. on Windows for some cases). Does this mean you should now use 10x the amount of repositories to compensate, because "it's more pleasant to use Git with smaller repositories"? There's a line between bending your back for the tool, vs the other way around.

Put another way: what if git was suddenly 100x faster, at every scale for every repository? Would you put more stuff in there? Some limits are real (e.g. Google has like an 80TB source repository, you're screwed there). But some of the limits are absolutely artificial in a sense.

There's definitely something to be said for misusing Git in ways it shouldn't be used. And if you're Google or Microsoft or Facebook -- whatever, you have a thousand other problems.

But honestly, "I want it to be able to handle large repositories" has never seemed like a fundamental "misuse" of any version control tool. It actually seems like something we've only told ourselves is a misuse, because literally every available tool is just incredibly bad at it, fundamentally.

jpsim · on Feb 8, 2017

It's a little funny that GVFS itself is hosted on GitHub which means you won't be able to pull it as a GVFS repo.

Stratoscope · on Feb 8, 2017

That makes perfect sense; they want the code to be where the people are.

And the GVFS repo doesn't have the problems GVFS tries to solve. The current .git directory is only 425KB. You could put three copies on a floppy disk!

whatnotests · on Feb 8, 2017

I recall that for Windows "Longhorn" they were supposed to include a file system based on SQL Server.

Never happened. We ended up with "Vista" instead.

Maybe they've learned their lessons on that.

I do recall some years ago seeing someone had posted (here maybe?) that they were using git to manage their home directory and everything in it.

Having the ability to easily switch between versions of my operating system could possibly be great.

  git checkout -b upgrade/service-pack-12
  # ...trying things...
  # ..decide it was a terrible idea...
  git reset --hard
  git checkout -
  # phew!

sigi45 · on Feb 8, 2017

You can do that with nixos.

Anyway when they said this about a File System based on SQL Server, I really liked the idea. I had something similar in my head: Instead of installing software to folders, you add a node and all necessary files are referenced in this node.

That is basically what apps are doing right now already. But imagine that your folders become tags and meaning and when you put images into a 'year' or in an 'event' node they are already sorted.

nolok · on Feb 8, 2017

> But imagine that your folders become tags and meaning and when you put images into a 'year' or in an 'event' node they are already sorted.