Wow, this scratches a lot of my itches about Git. I teach a Git course at my alma mater, and the things that confuse people the most (the index, how to undo mistakes, etc.) all seem addressed head-on. At first glance, this seems substantially easier to teach than Git.
The Git compat seems like a great idea for this to really take off. My team is totally PR based though, so if/when (Git-compatible) feature branches land in JJ, I'm excited to switch.
While I agree that git could use a rethink from a user tooling perspective, I really appreciate the existence of the index. I'm not necessarily tied to this specific implementation of it, but having a staging area where chunks are added piecemeal is an enormous benefit.
I honestly wish git forced `-p` for operations that support it. I've worked on too many teams where people would just commit everything in their working directory, and it would inevitably make reviewing their PRs a nightmare. Particularly when changes that shouldn't have been part of that set were accidentally committed wholesale.
> just commit everything in their working directory
But that's what they've tested. I've had far more problems in the other direction, where the commit doesn't contain the complete set of things it's supposed to, because all the tooling - every single IDE and compiler - looks at the working directory, not the index, and I've missed something.
The index is definitely confusing for new users and users of other VCS.
> having a staging area where chunks are added piecemeal is an enormous benefit.
It would be quite fun if we could have hierarchical commits, so I could add bits to a commit without having to squash/amend. Then you'd see the top-level work item as a "commit" and the individual changes as "subcommits".
> It would be quite fun if we could have hierarchical commits, so I could add bits to a commit without having to squash/amend.
Yes! I've been saying this for years.
Branch merge commits are already this, kind of. However their UI sucks and there's an idea gap which prevents them from being fully understood as such "supercommits" composed of subcommits.
For one, the user should be forced to give meaningful, high level commit messages for merge commits and the tooling should fold commits made on a merge branch into the merge commit by default (effectively linearizing the merged history).
>> hierarchical commits, so I could add bits to a commit without having to squash/amend.
> Branch merge commits are already this, kind of. However their UI sucks and there's an idea gap which prevents them from being fully understood as such "supercommits" composed of subcommits.
Isn't that "idea gap" mainly the fault of all these weird newfangled git workflows that discourage or outright forbid branching? If you have no branches to merge, you can't have a merge commit to act as a "supercommit".
Call your "supercommit" a "feature", and what you need to implement it is... A feature branch.
The history graph presented not as a list but as a collapsible tree sounds good! Might need a change in merge commit message culture though, or more: in the workflows I have experienced, the best time for writing the message you'd see on the collapsed presentation would be branch time, not merge time.
Ideally you'd have a data model that keeps a "what is this" description for branches that can be updated as long as the branch lives and that gets moved to the commit message on merge. Could this be done with some combination of convention and hooks? I guess a file in the root that is always kept on target/into branch state, with the lost file content of the source/from branch moved to the commit message would be all that's needed?
Like I said, I’m not sold on git’s specific implementation. But breaking things into smaller, focused commits is—in my experience—a hallmark of good development practice.
There should absolutely be better tooling around it, so that these piecemeal commits can be tested in isolation from one another. That’s a far better approach than just throwing up our hands and committing everything in the working tree, even if half of it has nothing to do with what’s intended.
How do you even end up in that situation? If you start a new feature or bugfix having random local changes lying around, you are _already_ doing it wrong. Start a feature branch, do clean commits of the current state, test, push, review, merge.
> It would be quite fun if we could have hierarchical commits, so I could add bits to a commit without having to squash/amend. Then you'd see the top-level work item as a "commit" and the individual changes as "subcommits".
Would something like Mercurial's queues be what you're looking for?
I'm not opposed to carefully crafting a good commit, I'm opposed to a state-heavy, badly named, inconsistent feature to do it. And if you use any decent git UI (including -p on the CLI), you don't really need it to be that persistent very often. It could just be a list of chunks to select in the "make commit" window, which is of course exactly how most git UIs' "make commit" windows look. It's a single step then, no intermediary persistent state (with three names).
JJ seems to work around this in the other direction, by making the concept of commits and rebases much more lightweight, which I think is a refreshing enough take that I'd like to try it.
> I’ve worked on too many teams where people would just commit everything in their working directory, and it would inevitably make reviewing their PRs a nightmare.
And they always claim to review and clean their commits before pushing, or at least before merging, but never do.
> My team is totally PR based though, so if/when (Git-compatible) feature branches land in JJ, I'm excited to switch.
I'm trying to find the part of the docs that refers to this functionality as missing but I can't -- does jj not have the ability to create, update, pull from, merge branches, etc?
1. Create a fork via GitHub's web UI
2. Clone using the SSH protocol (`jj git clone git@github.com:...`)
3. Make your changes and "commit" (`jj close/commit`)
4. Create a branch pointing to the commit you just created (`jj branch my-feature -r @-`)
5. Push it to your fork (`jj git push`)
So that doesn't seem too onerous, I guess? Maybe the "too many manual steps" are to clean up afterwards. I just tried it in a test repo and deleting the branch from the fork and then `jj git fetch` from there does delete the local branch, but it doesn't abandon the commits, so you need to run `jj abandon` on them if you don't want them anymore.
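Putting those steps and the cleanup together, the whole round trip looks roughly like this (repository URL, branch name, and revision are placeholders; commands are only as described above, not checked against a particular jj version):

```sh
jj git clone git@github.com:me/my-fork.git   # clone the fork over SSH
cd my-fork
# ...edit files; the working copy is tracked as a commit automatically...
jj close                        # "commit" the change (alias: jj commit)
jj branch my-feature -r @-      # point a branch at the commit just closed
jj git push                     # push it to the fork

# Cleanup once the branch has been deleted on the remote:
jj git fetch                    # this also deletes the local branch
jj abandon <revision>           # drop the now-unwanted commits, if desired
```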
I don't think skrebbel was saying that the branches are not git-compatible, I think they were just wondering if there is support for branches at all, but I'm not sure. Anyway, there is support for branches. They behave more like Mercurial's "bookmarks" in that they are assumed to have the same name on all remotes. That usually works well, but it gets annoying if there are unrelated branches all called e.g. "fix-formatting" on multiple remotes you pull from. Jujutsu would then tell you that the branch has a conflict because it doesn't know where it's supposed to point. I'll probably add some way of telling `jj git fetch` which branches to import, but I'm not sure yet what the best solution is.
I think a lot of git is overkill for what most people use it for. It's time for a new tool to do these basic operations: push/pull/undo/redo. Branching is helpful, but if you work in a relatively small codebase, a tool that did cp -r under the hood would work perfectly 99% of the time.
Interesting ideas! I especially like the automatic rebasing and associated ideas.
It's a bit strange to see "jj st" will automatically add all files to the "working copy" commit. This means that when I initially create my project, run "npm install" to install 500 MB of dependencies, and then run "jj st" to figure out what to include in my first commit, the command is going to copy all of that into a commit object for no reason. I don't think I like that behavior, to be honest. Is this a requirement of any of the other unique features? Could this be turned off, or modified to be an explicit operation?
[append]
To be clear, the fact that it's `jj st` is actually a big part of the problem for me: this is the command I use to figure out what to put into my `.gitignore` in the first place! I don't think that any of the "read-only" commands should be making these kind of changes.
In this case, a global .gitignore would be a help (or something like Yarn's plug n play?). I know that doesn't really address what you're pointing to here; I can see what you're saying.
Perhaps that's a good time to use the undo operation.
> This kind of thing is also pretty much standard for most programming languages today
No, it is not. Have a look at java, golang or rust.
None of those languages advocate for embedding deps in your source tree as it's quite clearly a terrible anti-pattern to follow.
I wouldn't consider how JavaScript does stuff a good example.
It seems to me bitcharmer was railing against the notion that you would have X MB of dependencies in your project folder. I was just pointing out that this is basically standard practice today (and the alternatives are usually awful), and that even if you do have your deps in the project folder, you don't usually check them into version control, so his entire point is moot.
500 MB may be excessive for git, but it is a good practice to check in your compiler together with your source if you use perforce. It’s just an extreme case of monorepo.
" It also means that you can always check out a different commit without first explicitly committing the working copy changes (you can even check out a different commit while resolving merge conflicts)."
That's really interesting. Having to manage stashes is annoying.
If you had two worktrees set to the same branch and made a new commit in one of them (thus changing the commit the branch ref points to), what would happen in the other worktree?
Either it wouldn't have the right commit checked out anymore, or git would have to miraculously change what files are checked out in it -- which would likely come as a big surprise to whatever you were doing in that working tree.
Ergo, it is forbidden.
I periodically find myself creating a new (temporary) branch referring to the same commit when I want a new worktree looking at the same place, or just create a new worktree with a detached HEAD looking at the same commit.
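For reference, a minimal sketch of both workarounds (paths and branch names are made up):

```sh
# Option 1: a throwaway branch at the same commit
git worktree add -b tmp-review ../same-spot main

# Option 2: a detached HEAD at the same commit
git worktree add --detach ../same-spot-detached main

# Cleanup when done
git worktree remove ../same-spot
git branch -d tmp-review
```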
Git could hide this for you and just make it look like you cloned the repo twice.
What happens if you make a commit in dir1 while dir2 is on the same branch? Absolutely nothing, until you fetch, and worktrees could work the same way. If there is any ambiguity left, you could prefix the branch names similar to the remotes: origin/master, worktree2/master.
The only sane way to use worktrees today is, like you say, with detached heads. Which well…isn’t that sane for many other reasons.
If you want a separate clone you know where to find it, but then you will be plagued by the uncertainty I mentioned in my other comment (https://news.ycombinator.com/item?id=30400869).
I did not say that was the only sane way to use worktrees. I frequently use them for separate strands of ongoing work (on separate branches per work strand, named appropriately). Less common (for me) is the use case of easy reference to older versions, with more ephemeral worktrees checking out a tag, a new branch (created for the purpose, and short-lived), or a particular commit (viewed as a detached HEAD).
git could simply prevent you from committing to that branch in that case, and perhaps print a warning when adding a worktree. The current behavior is weird. Another alternative is to allow the commit, but leave it in a detached-head state, and let the user choose how to fix that retrospectively.
The way I use worktrees I'd say the majority are read-only anyhow, so abiding by restrictions because it might get confusing if a commit were made is pretty annoying.
Can you elaborate on why you would want two worktrees set to the same branch? I think if I were trying to compare two approaches involving incompatible changes, creating a branch for one approach would feel more natural anyway, and then I'd be able to use the worktrees.
When you're done with a worktree, you just delete it (via `git worktree remove`). Any additional branches or stashes etc are part of the repository that the worktree was part of, and they (of course) remain.
When you're done working in a separately cloned repository in another folder, if you're anything like me, before deleting it (via `rm -rf`) you'll want to check very carefully for additional branches, unpushed work, anything stashed. Deleting an entire repository that's been around for a while is a risky operation in that it may have accumulated other unrelated work in addition to the current branch (which was the primary reason for the clone), and you'll want to check carefully before deleting the whole lot.
Additional worktrees within the same repository are great for medium-term ephemeral separate strands of work. They can be created and removed without concern.
Worktree allows you to check out other branches, stashes or commits which are local only / not yet pushed.
What’s also great is that you don’t need to do a pull in the other folder. Before I discovered worktree, I had my repository checked out in two places, so was either pulling both constantly, or when I needed the alternate, it was often very far behind and I had to pull a lot.
Hmm, I tried it once and iirc I couldn't see commits/branches of the other workspace without pushing and fetching so I really didn't see any difference. With different checkouts you can also easily do that by adding the other checkout as a remote.
Not sure what you were doing wrong, but I use worktree regularly and you can absolutely see the same commits from both the primary and worktree side.
The great thing about worktrees is that everything local is still available in the worktree and vice-versa, you can even stash push on the one side and pop on the other. You also don’t need to push or pull on either side first.
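A small illustration of that stash sharing, assuming two worktrees of the same repository in `dirA` and `dirB`:

```sh
cd dirA
git stash push -m "half-finished experiment"   # stash in one worktree

cd ../dirB
git stash list                                 # the stash is already visible here
git stash pop                                  # ...and can be applied directly
```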
I have noticed that for submodules, although they share the .git folder, they don't share references, so you can't see local branches and stashes between the worktrees, but submodule pulls are still quicker since they share objects, so those don't need to be re-pulled in the other folder.
Something probably got messed up on your setup (or maybe you checked out a commit instead of a branch?) because you definitely can see all worktrees' branches from inside each other when using git worktree.
Yes, as per my other comment, I’ve noticed with submodules that the references are isolated between the different folders, but they do share the same objects in the .git folder, so you still get efficiency there in not needing to store or pull duplicate data.
Maybe I'm too brainwashed by git, but it seems to me one of its benefits is the way the index works. You're encouraged to be fairly explicit about what you're adding to a commit, which encourages making a nice version history. Why would I want everything I do to automatically be added to a commit by default? Doesn't this encourage me to either put all kinds of unintended, not-ready crap in my commits, or constantly manage the equivalent of .gitignore? What am I misunderstanding about the benefits of how this works?
Your phrasing makes me wonder if I'm just too brainwashed by git as well, but, yeah, I wouldn't want anything to be added by default at all. While working I do all kinds of stuff I'm not going to commit, like debug statements and rough sketches, maybe some temporary scripts. And even though I do usually glance over diffs in the end, I don't really read them: that is, even if I did, it's not really my thoughts I see on the screen at that point, so I'm as likely to notice mistakes there as I am in somebody else's code, if not less.
When I'm adding a change, on the contrary, it's something I wrote minutes to seconds ago, I know what's in that code, and by adding that I'm kind of making a statement that this wasn't some random bullshit I do with the code, but something I intend to keep (even if later I'll completely change that while interactively rebasing or something like that).
On the other hand, I didn't complain that much back in the day when I was using SVN. But I feel like my current workflow that employs most unique features of git is actually much better. Basically, horribly inconsistent CLI is the only complaint I have about git.
This line of reasoning works really well for those who understand (and want to understand) how git works. I know plenty of people who don't care about their development tools (e.g. scientists), and for them the index is a chore in the way of more important problems.
The bigger picture is that the project is trying to provide a better interface on top of the existing git data store. As someone who used darcs for a major project before Git dominated DVCS, there are much better user interfaces than what Git offers, and attempts to improve upon Git are worth encouraging.
Unlike picking your own text editor, a whole team has to agree about the choice of DVCS, making it harder to try new ones.
The beauty of a solution with a Git-compatible data store is it allows some people on the team to experiment with it while still collaborating with people using Git.
> Maybe I'm too brainwashed by git, but it seems to me one of its benefits is the way the index works.
And maybe I'm too brainwashed by Mercurial, but to me the index is nothing but a single weird commit with only downsides (why do we need a UI for working with the index that's different from the one for interacting with commits, when the capabilities should be the same? Why are we limited to a single index and not several? Why the different versioning/shareability characteristics, etc.).
I largely prefer Mercurial's "public vs draft" strategy and its "commit what you have and we give you the tools to tidy up the series whenever you feel like it" approach. In practice it means that you have as many "indexes" as your series is long, and with hg's mutable-history and amazing history-rewriting extensions like absorb, it's much more convenient, fast and safe to work with than git's inconsistent equivalents.
The stage is git's killer feature to me. I'm very iterative, so I tend to change a lot of tangential things as I narrow down the best implementation of my code. Knowing that I'm free to change whatever I want and only commit the parts that I've determined are good is very liberating.
My workflow is basically just change whatever I want to get to my goal. When I finally get the behavior I want, stage the changes and stash the rest. Once I've verified that everything works, commit the stage and drop the stash. I don't have to think about anything not directly related to my commit. Feels like with tools like jujutsu, pijul, mercurial...I'd have a lot of manual cleanup to do _before_ committing...and then what if I accidentally clean up something I didn't realize I needed? It's gone. Whereas, with git I can just pop the stash and stage whatever piece I missed.
I like that this is not "a dvcs written in rust" but rather "a dvcs with these awesome improvements over git." which incidentally happens to be written in rust because of course it's the right language for this.
1) What are the advantages of using the native backend as compared to git?
2) Are there any potential issues one has to be aware of when using jj to contribute to a git repository?
I remember when I was the only team member using the git-svn plugin, my coworkers were confused when I svn-committed many commits at once, with some of them breaking the time order of the svn history (as git-svn commits in svn were recorded with the git timestamp and not the actual svn commit timestamps).
> 1) What are the advantages of using the native backend as compared to git?
Very few. The disadvantages are generally much larger. The main advantage is that you won't run into a (harmless) race that happens once in a while with the git backend (https://github.com/martinvonz/jj/issues/27). Disadvantages include not being able to interact with a git repo and performance problems (both size and speed).
I should add a note about this to the README.
The backend exists mostly to prove that it's possible and to make sure that the backend API doesn't become tied to Git.
> 2) Are there any potential issues one has to be aware of when using jj to contribute to a git repository?
The main one is that you should avoid pushing commits with conflicts. It won't corrupt the remote repo, but git clients won't know what to do with the conflicted files. The fix is to fix the conflict and force push the conflict-free commit.
I'll file a bug to prevent pushing commits with conflicts.
I hope to one day make it better, but it's very low priority right now because the Git backend works well and has the big advantage that it's compatible with Git :) Also see https://news.ycombinator.com/item?id=30403737.
If we ignore the UI level stuff, like doing something by default when you switch to another branch, etc. -- is it in any way "smarter" than git? For example, are there situations when I couldn't reorder my commits when doing git rebase -i without resolving some conflicts manually, while jj would be able to correctly handle them for me?
Yes and no. For example, if you reorder two commits, the new child commit will never have conflicts even if the parent had conflicts. That's because the changes from both original commits still apply afterwards. Similarly, if you rebase a commit and it results in conflicts, you can then rebase it back and the conflicts will go away. See the demo at https://asciinema.org/a/HqYA9SL2tzarPAErpYs684GGR
> git clients won't know what to do with the conflicted files
This seems kind of similar to pushing files with conflicts using the git client? Maybe the Jujutsu formatting is slightly different, but the concept is the same.
These are representations of conflicts that are stored as special files; they are not regular files with conflict markers in them. If you look at the commit with regular `git`, you're going to see a weird file with some JSON in it.
One of the many breakthroughs in git was precisely not being patch based. Being patch-based means that the tools for comparing versions are hard-coded into the data format. One of the major upsides of CVS->SVN, for example, was support for file moves. Simple issues like this weren't possible to fix without reworking the whole system.
With a version-based model, how you compare them is determined by the user space, whether that's merges, or viewing history. Even compression isn't all that fundamental to much of anything; it's possible to implement underneath the system, eliminating whatever redundancy you want. With patch-based systems, all of that comes baked into the underlying representation.
On a deeper level, this feeds into how git can be distributed, robust, yet very simple.
It also means cherry-picking commits between branches prevents clean merging afterwards, meaning you have to force yourself to keep branches short-lived. The patch graph Pijul gives you doesn't have this issue. I'm not an expert, but when reading the Pijul blog it made a lot of sense to me.
No, that's mostly the fault of many bad decisions in the git user space. The data model is quite brilliant and quite general.
It'd be possible to write a really good user space around the git data model if someone had the time and inclination.
My personal gripes:
* I like to commit every time I change a few lines of code. To have a clean version history, git requires me to rebase all of those commits. I'd like to have "major" and "minor" commits, which the git data model can support in several different ways (as a list, a tree, or DAG; I prefer the tree model). The user space makes anything like this incredibly painful. It encourages rewriting history, which is actually rather bad.
* Large files are managed very, very badly. It's ironic since the git data model is ideal for things with large files.
* Related, partial trees aren't really supported. This gives scaling issues. In many cases, I just want to work with a few files, or just the latest commit. Some of this is technically possible but practically painful enough to be not worthwhile.
* Nested projects / submodules.
Of those, only submodules might require changes to the underlying data model.
I think the programmer tendency to think of usability as an implementation detail is probably why open-source app UX was considered a joke for so many years. Only now is that perception starting to change, because enough people have taken an interest in the space who make usability a priority.
More like, people who've taken an interest in the space are having a much harder time finding jobs that aren't "fill this electron app/SaaS with dark patterns."
> Start working on a new change based on the <main> branch -- jj co main
Did you consider the recent git nomenclature change to use "switch" for branch operations, and "co" for file operations? I actually can't tell from so deep in my "git Stockholm syndrome" whether that distinction is really hard on new users or not, but the fact that git expended the energy meant they thought it was meaningful
And, do you have plans on supporting signed commits, either via gpg or the newfound SSH key signing?
I'm super excited to try out the conflict resolution mechanism, because that's a major pain point for my long-lived PR branches
> Did you consider the recent git nomenclature change to use "switch" for branch operations, and "co" for file operations?
Actually, isn't "restore" for file operations? My impression was that everyone agrees that `git checkout` does too many different things. In particular, it's both for switching branches and for restoring file content. So they added the new `git switch` and `git restore` with limited scope. I strongly suspect that they would have used `git checkout` for the former if that wasn't already taken by the existing command.
> And, do you have plans on supporting signed commits, either via gpg or the newfound SSH key signing?
No, that's not something I've even started thinking about. I'll have to read up on how it works in Git first. Patches welcome, though :)
> I'm super excited to try out the conflict resolution mechanism, because that's a major pain point for my long-lived PR branches
You mean so you can continuously rebase them without having to resolve conflicts right away? Yes, that's one of the benefits of first-class conflicts. Another benefit, which took me a long time to realize how useful it is, is the auto-rebase feature.
As I was trying to find the correct command in jj for where to add the new argument, I realized that this may not play nice-nice with jj's commit-everything model, so I'm actually prepared for the issue to be closed WONTFIX :-)
As for the patches welcome part, I notice that the Google CLA bot is on your repo. While I do have the CLA signed for my Gmail address, it seemed like some major tomfoolery to try and add my current github email address (and the one which backs my GPG key, to bring this full circle!) to the existing CLA process. Do you intend to keep that CLA mechanism in place, or was it just an oversight?
> I realized that this may not play nice-nice with jj's commit-everything model
Yes, I suppose you might not want to sign every working copy commit, but as you noted on the issue, it would probably make sense on `jj close/commit` (aliases).
> Do you intend to keep that CLA mechanism in place
I started working on this project internally and then open-sourced it. I don't think I'm allowed to remove the CLA bot. I understand that it's annoying :(
This looks interesting and I'm glad to see that people are not viewing SCM as a solved issue. While I don't think I'd use this on professional projects yet, I'm interested in looking at this for personal projects since I can use the Git back-end and continue to use Github to host my code. I feel like Git has become such a de facto standard that nothing is going to replace it any time soon. I've been programming for long enough to remember feeling the same way about SVN though, so I assume something will eventually supplant Git.
I remember that well too (along with Rational Clearcase), and you may totally be right, but git does feel different because it works so well with all sorts of projects, big and small. There were always operations that required hacks. With git that doesn't feel like the case to me, perhaps with one exception (but I don't think this is git's fault as much as it's mine for not knowing git well enough to use the tools it provides): if I'm working simultaneously on two different branches (for example one branch is a bug fix that requires 20 minutes to build and deploy before I can test it, so I work on other things while it's running), I often have two different checkouts so I can (mostly) avoid having to git stash.
That said though, you are probably right something will eventually supplant git. It would be arrogant to think that my failure to imagine something better means there isn't anything.
I totally agree. Back when I used Subversion it felt to me like there obviously must be a better way to version control and that the tool got in my way all the time. Git on the other hand almost always can solve my problems and while some things could be improved (confusing UX, a bit too complex, plus bad handling of conflicts) it is much closer to the right tool for VCS than Subversion ever was.
What are the problems that Subversion has? My only experience of svn is of simple personal projects and in that scope it worked pretty well - not contesting your opinion, but would like to know at which point svn becomes problematic.
If subversion had branches, they were not nearly as easy to use as in git.
It also didn't have a great notion of offline work (to my memory).
For what it's worth, SVN was pretty straightforward and worked well enough at the time. Later, Mercurial addressed SVN's deficiencies with a familiar interface.
In my experience, pull requests were not a thing in Subversion. We would attach patch files to Jira for code reviews. Git allowed for a much easier workflow, with merging branches instead of directly applying patches to trunk. This probably has a lot to do with the fact that we used Bitbucket for Git hosting and a plain Subversion server with no extra features to help with code reviews.
> There were always operations that required hacks. With git that doesn't feel like the case to me, perhaps with one exception
I think the way Git handles renames is ugly – it doesn't actually record them, it just tries to guess when they occur based on file contents (inevitably imperfect). I think Subversion handled this better.
The problem is that a Git tree object only contains file name, file mode, and blob/subtree hash. If it also contained some kind of "file ID" (such as a UUID), then you could track renames properly – renaming a file would change its name but not its file ID, so you'd track the rename properly, even if the contents changed at the same time.
Given they didn't do that... maybe create a file-id Git attribute? (Alas, "git mv" doesn't update .gitattributes, so renaming a file forgets the attributes; but it could learn.)
I do agree that renaming/moving files isn't great but I guess that is one of the downsides of having a distributed version control system. With Subversion, ClearCase, and others, they were centralized and very much meta-data driven.
I'm not quite sure why Linus decided not to make renames/moves/copies explicit/strict and my only guess is it simplified things. Being able to say with 100% certainty all the time that something was renamed or copied is extremely handy, but with Git we just get a percentage like the following shows:
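(The original example output isn't preserved here; the following is only illustrative, with an invented path and percentage, to show the kind of similarity guess meant:)

```sh
$ git show -C --summary <commit>
 copy internal/mapper.go => cmd/mapper.go (87%)
```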
I guess knowing the mapper.go file in the example above was likely copied is good to know, but it would be much better to know this with 100% certainty.
> I do agree that renaming/moving files isn't great but I guess that is one of the downsides of having a distributed version control system. With Subversion, ClearCase, and others, they were centralized and very much meta-data driven.
I don't think it has anything to do with the fact that Git is distributed. It is a question of how rich the repository data model is. The richness of the repository data model is an orthogonal concern from distributed-vs-centralised.
I think Linus just wanted to keep it as simple as possible – but maybe in some areas he made it too simple. The Subversion developers, on the contrary, maybe went too far in the other direction.
> I don't think it has anything to do with the fact that Git is distributed.
I think it does since meta-data increases merge complexity, which is probably a headache when dealing with a distributed system. With a centralized system, you are basically creating a lock when you do a rename/copy/etc. which doesn't translate well to distributed systems.
With Git as it is currently implemented, you only need to worry about one state which is the tree. With meta-data, the number of states that needs to be tracked can increase significantly.
Having said all of that, I guess it probably wouldn't be too hard to implement a layer on top of Git that ensures every directory has something like a .git.meta file, with hooks that ensure all renames, copies, etc. are recorded in the meta-data file.
> I think it does since meta-data increases merge complexity, which is probably a headache when dealing with a distributed system.
I disagree. In Git, merging always happens locally, between two local commits; there is nothing distributed about merging itself.
> With a centralized system, you are basically creating a lock when you do a rename/copy/etc. which doesn't translate well to distributed systems.
I don't think that is really true. If I rename a file in Subversion, that doesn't involve "creating a lock". I rename the file locally (using "svn mv"), Subversion locally remembers I have done it, but doesn't send anything to the server. When I want to commit, it then contacts the server and uploads the commit data, including the rename tracking info. Now, that commit operation takes a write lock on the branch – to prevent any other commit for that branch being processed at the same time – but it is the exact same write lock regardless of whether renames are involved or not. If we take Subversion as an example of a centralised version control system with rename tracking, the rename tracking and the locking are orthogonal. (When I say "locking" here, I mean the internal locks within the Subversion server, not the end-user-visible advisory locks you get with the "svn lock" command–that advisory locking feature is unrelated to the topic of rename tracking.)
Like Git, merging is essentially a local process in Subversion – the client constructs a merge commit, and then sends the completed merge commit to the server. The actual merge, meaning construction of the merge commit contents, always happens on the client side, at least as far as the core Subversion protocol is concerned.
> Now, that commit operation takes a write lock on the branch – to prevent any other commit for that branch being processed at the same time
In fact on the whole repo, since branches are just subdirectories. (Shows how long it has been since I've actually used Subversion, I'm starting to forget it.)
Jujutsu looks interesting, but one thing that I find missing from git is historical branch tracking. Once two branches are merged, git does not tell me which series of commits used to belong to which branch. I can't check out main from two weeks ago if a merge happened in the meantime, because that information is lost.
I fear that to add such a feature, additional information would need to be stored in the git repo itself, which would require a change to git.
Git has the concept and knowledge of which side a merge came from.
A commit having multiple parents (a merge commit) maintains the order of those parents within the commit. The branch names are famously lost, but by convention you can say that the first parent should always be the main branch.
As with all other git things, the UX for this feature isn't the best and its use varies wildly across tools and organizations.
To view the main branch log only, you have to pass arcane flags like `--first-parent` to git log.
Some people don't know the difference between merging from a branch into main vs the other way around; the end result looks the same, right? One such mistake messes up the history.
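For reference, that first-parent view looks like this (it follows only the first parent of each merge, which by the convention above is the main line):

```sh
git log --first-parent --oneline main
```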
That's the problem with git. It is powerful, but some things are just not practical. If I have to be careful how I merge, then that's not user friendly. On the other hand, adding a field that stores the name of the branch at commit time would be almost trivial programmatically, and it would enable a much more user-friendly interface.
I don't need to repeat that Git has terrible UI, but you can get this "history" with `git merge --no-ff <MYBRANCH>` (no fast-forward).
In fact, it's a pretty common branch-based flow. Our team pushes feature branches to remote, where they are vetted by QA, and we have a chance for review.
When pulling changes into any branch, we always `git pull --rebase`. When merging feature branches, we always `--no-ff`.
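Roughly, that flow looks like this (branch names are placeholders; a sketch, not an exact recipe):

```sh
git checkout main
git pull --rebase              # keep the local branch linear when pulling
git merge --no-ff my-feature   # always record a merge commit for the feature
```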
For small, local merges, we don't use `--no-ff` because it's useless noise. But if someone forgets out of habit... Oh well. I spend on average 0.000000001% of my time going through git log. And when I do, I tell it to show it as a tree.
> The command-line tool is called jj for now because it's easy to type and easy to replace (rare in English). The project is called "Jujutsu" because it matches "jj".
Yeah, it's just a coincidence that the word Jujutsu means magic in Japanese (呪術).
I've long wished Pijul had a git backend, because I think it might be superior, but I strongly feel the only way it could get mass adoption, even if it's truly better, is if people could just try it on their git repos like this.
Removing the "convince my team to move away from git" part of a better DVCS could make jujutsu very popular.
Or other git compatible spinoffs could be made that follow this approach and explore other design spaces.
It would be nice if this were in nixpkgs; it's a pain to manage all the different language environments, and it would save me from trying to get it to compile.
Right now:

    error[E0554]: `#![feature]` may not be used on the stable release channel
      --> lib/src/lib.rs:15:12
       |
This is awesome! And it’s so exciting to see folks investing energy into this problem space.
Out of curiosity: did you add support for multiple backends because you don’t think Git will ultimately be sufficient for the experience you want to create? Or did you just want to ensure that the tool didn’t become unnecessarily coupled to Git?
Good question. You're right on both guesses. One reason is that I want to make sure the API works well enough that it's easy to replace the backend. The current native backend is called `local_store` because it stores files locally (just like the Git backend does). I want to be able to add another backend that fetches objects from a remote instead (and caches them locally), although another option is to hide that part inside the backend like Git does with its partial clone support. I also want to be able to add more functionality that the Git backend doesn't have. Maybe that would be tracking of renames, or maybe something I haven't thought of yet.
> I also want to be able to add more functionality that the Git backend doesn't have. Maybe that would be tracking of renames, or maybe something I haven't thought of yet.
Awesome, that would enable a realistic no-risk path to adoption for most projects:
- trying jujutsu on work codebase
- if useful, tell a coworker and maybe they try it
- present internally about pros
- team tries and agrees it's better, adopts it
- after some months confidence is built, give a presentation on features you could get with jujutsu's native backend, gauge interest/worth
In git you can reorder any sequence of commits without any conflicts. However, there is no nice "porcelain" for it.
The basis for it is the read-tree command.
In git, every commit is a snapshot and not a delta. (Repeat that three times.)
Any arbitrary snapshots of files can be arranged into a git history. They don't even have to be related. We could take a tarball of GCC, and make that the first commit. Then a tarball of Clang and make that the second commit.
The read-tree command will read any commit's state into the index, and from there you can make a commit out of it.
To reorder some N commits you're looking at, save all of their hashes somewhere, and then rewind the branch: git reset --hard HEAD~N. Then use git read-tree to read those hashes in whatever order you want, committing them one by one. To reuse their commit messages you have to use commit -C <hash>, though most of them likely won't make any sense, because the text of commit messages usually talks about changes between a commit and its main parent.
What will happen is that your history now has N commits which represent the same states as the N you had there before, in a different order. For instance, if you reverse them, then the oldest commit has all the features (is the same code baseline) as what the newest HEAD was previously. And then the subsequent child commits basically remove all the changes that were made, so the latest now looks like what the N-th looked like.
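As a concrete sketch of that procedure for N=2 (swapping the two most recent commits; assumes a clean working tree and is illustrative only):

```sh
A=$(git rev-parse HEAD~1)    # older commit
B=$(git rev-parse HEAD)      # newer commit

git reset --hard HEAD~2      # rewind the branch below both

git read-tree "$B"           # load B's snapshot into the index
git commit -C "$B"           # commit it, reusing B's message/author

git read-tree "$A"           # load A's snapshot
git commit -C "$A"           # commit it on top

git reset --hard             # sync the working tree with the new HEAD
```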
How I sometimes use this:
Suppose I made a fairly complex change that I would like to break up so that it is presented as a sequence of two or more states of the code.
There are cases when the following workflow is easy: first I make a commit out of the changes. One commit with everything. That commit is what I want the final state to look like. Then I revert some of the changes to produce the state before that state. If I have mostly been adding things, this might consist of just deleting them. Or I can do a git reset --patch HEAD^ to selectively revert. I commit this penultimate state. Then repeat the process: revert some more changes from that one, commit and so on, as many times as I see fit.
So what I end up with is the states of the code I want to present as a history, but exactly in the wrong order. Using "git rebase -i" doesn't work nicely; there are ugly conflicts, and some of them have to be encountered twice. Here is where the git read-tree trick comes into play: I simply reverse those exact states of the code to put them in the right order on the branch. After that I might do a rebase -i just to reword the commit messages to frame the code states from the POV of being changes from their parents.
You might think: why not just "git commit --patch" from the original working state to produce commits in order. The reason is that doesn't always make sense. For that you had to remember to make the commit as you were working. Because say you refactored something and then made a change. You can't easily do a "git commit --patch" which separates the refactoring from the change. Sure, if you do the refactoring and then a commit, and then make the change, you are good. But suppose you've conflated them already; now what? You can commit everything and then back out the changes that were done on top of the refactoring, and commit that. Then reorder the two states.
> In git, every commit is a snapshot and not a delta.
I know this, but I also cannot reconcile it with what happens when you cherry-pick a commit from another branch. I'm confused because cherry-pick really seems to be taking the delta, not the snapshot.
I'm thinking that cherry-pick takes that commit and makes a diff with its parent and then that's what you get when you call it. Is it how it works behind the scenes?
I'm also thinking that the fact that each commit has at least one parent means that, conceptually, we can use it as if it was a delta (at least in the case of commit with one parent only), if you get what I mean.
EDIT: I'm not familiar with the internals of Git, just a user. Commenting out of curiosity.
When you cherry-pick a commit from another branch, it's not doing anything like a read-tree to make your current work look like that commit's snapshot state. It's doing something like a three-way merge between that commit, your baseline and a common ancestor (I'm guessing: the same one that would be identified by git merge-base <hash0> <hash1>). That's why cherry-pick identifies conflicts.
A cherry-pick does not just do a diff with its parent 0 and then patch it onto your work. In many cases, that wouldn't work because that's not a three-way merge, but a two-way merge, which has disadvantages.
However: there may be situations when the commits are so distant, it might make sense just to turn that one into a patch and work the patch onto your baseline. Even if that needs manual work, it can be simpler. You will only get conflicts (patch hunk rejects in that case) that are relevant to that change. I've had lots of experience working with patch stacks (both before and after Quilt was introduced to make that a bit easier) so I'm comfortable migrating patches from one code base to another without a three-way-diff process within version control.
It's doing something weirder than that, but "it applies a diff between a specified commit and its parent on top of your current work" is more accurate/intuitive than "it does a three-way merge with a common ancestor". No common ancestor is involved.
Both the "revert" and "cherry-pick" operations initialize a sequencer object with a single operation in the sequence. (This is the same mechanism underlying "git rebase -i").
From there we can find the call to sequencer_pick_revisions, which leads us to the sequencer implementation, where there's a fast-path for ordinary cherry-picks leading us to a short function single_pick, which in turn calls do_pick_commit: https://github.com/git/git/blob/v2.35.1/sequencer.c#L2066
In turn, do_recursive_merge calculates a head_tree from the current HEAD (https://github.com/git/git/blob/v2.35.1/sequencer.c#L645), and then calls either merge_incore_nonrecursive (the current default, from the new "ort" merge strategy) or merge_trees (from the older "recursive" merge strategy) with three trees (i.e., with no further ancestry information): the base is the parent it found, and the two sides being merged are your current HEAD and the (tree of) the commit being cherry-picked.
That is to say, at no point does it care whether the two commits even have a common ancestor! It's just doing an operation on trees. It is doing a three-way merge, yes, but the graph of the merge it's doing is one that potentially doesn't actually exist in reality.
Or, in other words, it's trying to compute the tree that could be equally well described as the result of
- applying the diff of the commit you're cherry-picking to your current HEAD
- applying the diff between your current HEAD and the parent of the commit you're cherry-picking to the commit you're cherry-picking
So this is more powerful than purely applying a diff with no information about the base of the diff, but it is very much like applying a diff.
One way you can test this without drilling into source code is to make two independent commit histories (using either two git repos, or git checkout --orphan, or whatever) where a commit in one history has a diff that would apply to a file in the other history if applied with the "patch" command. Then try cherry-picking it into the second history. It should work.
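Here's roughly what that experiment could look like (paths and messages are made up; `git init -b` needs a reasonably recent git):

```sh
git init -b main pick-demo && cd pick-demo
printf 'one\ntwo\nthree\n' > file.txt
git add file.txt && git commit -m "base"

git checkout --orphan other            # start a second, unrelated history
printf 'one\ntwo\nthree\n' > file.txt
git add file.txt && git commit -m "unrelated base"
printf 'one\nTWO\nthree\n' > file.txt
git commit -am "change two"            # the commit we'll cherry-pick

git checkout main                      # back to the first history
git cherry-pick other                  # no common ancestor, yet it applies
```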
(In the revert case, it more or less does the same thing, just with "base" and "next" flipped. That is, if you have commits A, B, and C, and you want to revert B, it does a three-way merge where the base is the tree after B, one side is the tree after A, and the other side is the tree after C!)
FYI, this is also how `jj undo` works, except that it's a three-way merge at the repo level (https://github.com/martinvonz/jj/blob/d9b364442e2246a734d600...). So that applies changes to branches, checkouts (think: git HEAD), and sets of anonymous heads. This is how you can undo an operation even if it wasn't the most recent one (just like you can `git revert` a commit that wasn't the most recent one).
TIL about `git read-tree`. Thanks! I think it will prove very useful for me. That said, the easiest way to reorder commits is to `git rebase -i $some_base` and then reorder the `pick` lines in your $EDITOR into the order you want.
> In git, every commit is a snapshot and not a delta. (Repeat that three times.)
Yes, but every commit also is notionally a delta. That the delta is reconstructed as needed is not terribly important. It's best to accept a commit as both a reference to a state of the world and a delta to the recorded parent(s).
Lastly, I avoid `git reset --hard`. I often have extant changes that I don't want to lose. Instead what I do is `git rebase -i --autostash`.
Problem is, you may get ugly conflicts along the way which serve no purpose to the end goal, and have to be resolved more than once in different directions.
It's an abstraction inversion to be doing that. The system doesn't work with diffs and merges: that's just layering on top. The system contains snapshots, and so if you want to rearrange snapshots in a different order, that's the best abstraction layer to work at.
Oh, I actually messed it up. Which in hindsight I don't know why because it's really much simpler than what I started with... but then I usually do it with more intermediate steps, and on an actual command line, not on a mobile phone...
Anyways:
You start with:

    original-state
    A
    final-state
    B
    intermediate-state
You want:

    original-state
    X
    intermediate-state
    Y
    final-state
What you just need to do is:

    git revert HEAD
Which gives you:

    original-state
    A
    final-state
    B
    intermediate-state
    revert (C)
    final-state
Then X = A + B and Y = C, so with one rebase -i:

    pick A
    squash B
    reword C
No conflicts. The important thing is that with appropriate reverts and reapplication of commits, you can make your history contain the states you want in the order you want, even if in between you have repeated states in the wrong order. Then you squash things so that you only keep the states you wanted in the first place.
Edit: but yes, you can achieve the same outcome with read-tree or checkout.
Edit2: come to think of it, you can probably do what you want with a couple of git replace --graft and a noop filter-branch.
I’m curious to know if this has any similarities with fossil (https://fossil-scm.org/ ). If it does, it would be nice to see that in the documentation.
That Fossil doesn't support rebase is a fiction, since it supports cherry-pick. All that's missing is a script around cherry-pick, which is all rebase is.
> It combines features from Git (data model, speed), Mercurial (anonymous branching, simple CLI free from "the index", revsets, powerful history-rewriting), and Pijul/Darcs (first-class conflicts), with features not found in either of them (working-copy-as-a-commit, undo functionality, automatic rebase, safe replication via rsync, Dropbox, or distributed file system).
You lost me at "free from the index". The index is one of the most important parts of Git that makes my life easier. Opinionated DVCS UIs make my life harder -- all of them.
> The working copy is automatically committed
Right, so, the reason the index is powerful is that I get to do `git add -e`, `git commit`, and repeat until I'm done or ready to throw remaining changes away. I very much want the index / workspace distinction.
> Automatic rebase
> Comprehensive support for rewriting history
This is very welcome. At the very least this confirms something important: the fallacious argument that "rebase rewrites history, so it's eeeevil" is finally dead.
If you click the link that text points you to (i.e. https://github.com/martinvonz/jj/blob/main/docs/git-comparis...), there's an explanation there for how to achieve the same workflows. I get that it's different, but I don't think it's worse. I consider myself a (former) git power user (I think I have ~90 patches in Git itself) and I've never missed the index since I switched to Mercurial ~7 years ago.
> With Jujutsu, you'd instead use jj split to split the working copy commit into two commits.
This is more confusing? Often, when debugging or writing a fix, I have extraneous code that I wouldn't want to commit. With an index I'm always sure of what I commit, but with this workflow you have to keep track of such stuff all the time, and if you forget, that stuff makes it in?
Not to mention that another benefit of an index is being able to change commits and have git replay your working diff.
Yeah, I feel this is going to bother me, or at least be difficult to get used to.
I often have temporary files that I do not want to commit, nor do I want to add them to .gitignore (because I want to commit them later).
But then, I'll have to spend some time using jj split. If it is powerful enough, then maybe the end result is just that those files only live in the last commit.
Also, what happens on push? I'd never ever want the working copy to be pushed to the remote repo. I could not find anything in the documentation about that.
Yah... I have a long-standing habit of commenting `TODO` for some debugging or WIP code. Then, before I commit, I can just do `git diff | grep TODO` and see all the new TODOs I've added.
Also, I often rebase and `edit` commits to split them or undo parts of them.
Rebase and all this is all about making commits that have just the right content, and keeping history clean and linear. The tools have to make this possible and easy.
I get that git feels... barebones for this. You really have to understand what you're doing when using git like I do, so I get that it's not very accessible.
Better UIs are great, but on the other hand, we need to be able to get down to the low level.
> You cover `git add -p`, but I want `git add -e`.
Interesting. I don't think I've heard anyone use `git add -e` before. It should be easy to add that feature, but it would be very low priority since so few users seem to like to manually edit patches.
> Also, I often rebase and `edit` commits to split them or undo parts of them.
You can do that by checking out the commit, then `jj squash/amend` (aliases) and all descendants will be automatically rebased on top, and branches pointing to them will be updated too. There's also `jj edit` for editing the changes in a commit without updating to it. And there's `jj split` for splitting a commit (without needing to update to it).
> Rebase and all this is all about making commits that have just the right content, and keeping history clean and linear. The tools have to make this possible and easy.
I did, and it's almost certainly nicer than Git for commit splitting.
But even though I might use a tool specifically designed for user-friendly commit splitting, I still want: `git add -e`, `git diff --staged` (to see what I added with `git add -e`) vs `git diff` (to see what I left out), and `git commit` w/o `-a` to commit the contents of the index. This is easier for me than `$WHATEVER commit` followed by `$WHATEVER`'s commit splitting method.
That said, I have to congratulate you again on not taking the ugh-rebase-rewrites-history/history-rewrite-bad road. This is a huge step forward for DVCS.
It may well be that what rebase workflows needed was a UI that non-power users could use.
That gist seems like a simplified version of https://github.com/mhagger/git-imerge, so check that out if you haven't. (I haven't looked at git-imerge in a long time, so I should read about it again myself.)
EDIT: The idea of being able to suspend/resume and push/fetch in-progress merges/rebases is very cool indeed. It's hard to tell if it could speed up the task of rebasing across thousands of commits or not -- the README doesn't say. I like Viktor's script because it's very focused on one thing: quickly rebasing local commits across thousands of new upstream commits with a minimum of conflicts -- a simple solution to a simple problem statement (though the problem can feel huge when you have it).
I like and use the index in git too, but I wouldn't be so quick to dismiss other models that might end up solving the same use cases in a different way...
From the README, it looks like there's robust support for editing and splitting commits. So maybe in practice the flow is similar to using the index, with the added benefit that your work is backed up by commits along the way, and the simplicity of not having the index as an extra concept.
In general, when evaluating X from the perspective of Y, we immediately see the things about Y that we like and that X lacks; it takes more time to see whether, in the fuller context of X, those things are even necessary.
The index is a power user feature. Its forced presence in Git effectively constitutes a usability barrier for new users. After all, a VCS is effectively a glorified abstraction for "save a file." Any barrier imposed between changing a file and committing it can get in the way and confuse people. The Git index does this.
Furthermore, the index is effectively a pseudo-commit without a commit message. Any workflow that uses the index can be implemented in terms of actual commits instead.
I think because Git doesn't have strong usability in general and especially around history rewriting, many Git users feel that the index or an index equivalent is somehow a required feature of a VCS because Git's shortcomings give that illusion. However, if you use a VCS with better history rewriting (such as Mercurial with evolve), you'll likely come around to my opinion that the index can be jettisoned without meaningful loss of functionality or productivity.
> Right, so, the reason the index is powerful is that I get to do `git add -e`, `git commit`, and repeat until I'm done or ready to throw remaining changes away.
You don’t need the index for that. In fact I’d say it gets in the way, because the presence of the index means less pressure to improve the ability to edit commits: while it’s easy to add stuff to HEAD, it’s much more painful to remove content from it.
If that is solved, then the value of the index drops precipitously, because you can create a commit with a purpose and select its content, instead of having to do it the other way around and then forgetting what change you were trying to craft.
The index is mostly useful to me for splitting a commit into multiple ones. You do that with a sequence of "git add -p" and "git commit" commands. I am interested in how to do this with jj, because otherwise it looks like a very interesting tool.
So just for understanding: repeated `git add -p` followed by a `git commit` turns into repeated `jj split; jj squash`, since you create a commit each time?
That would work, yes, but there's also `jj squash -i` to move part of the child commit into the parent. There's also the more generic `jj move` command for moving part of any commit into any other commit (ancestor, descendant, sibling), so `jj squash -i` is equivalent to `jj move -i --from @ --to @-` (where `@` is syntax for the working-copy commit and `@-` is syntax for its parents).
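A rough side-by-side of the two loops (the jj flags are the ones described above and may differ by version):

```
# git: peel slices off the working directory into commits, one at a time
git add -p && git commit
git add -p && git commit

# jj: everything is already in the working-copy commit, so carve it up instead
jj split                       # interactively pick the first slice into its own commit
jj squash -i                   # or move part of the working-copy commit into its parent
jj move -i --from @ --to @-    # the generic form of the same operation
```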
> `git diff --staged` is superior. For one, you get all the options to `git diff`.
Most of which you can get with `git show`, at least where they're relevant to “what have I added”. And of course you can also use `git diff` on commits if you need something super specific.
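For instance, under the amend-into-HEAD workflow being discussed (both are standard git commands):

```
git show             # the whole of HEAD, i.e. everything "staged" so far via amend
git diff HEAD~ HEAD  # the same content through git diff, with all of its options available
```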
> For another, it has no side effects, unlike `git commit --amend -p`.
… “git commit --amend -p” is the replacement for subsequent “git add -p”, as the GP was talking about a workflow where they’d intersperse staging stuff and looking at what they’d staged.
> You lost me at "free from the index". The index is one of the most important parts of Git that makes my life easier. Opinionated DVCS UIs make my life harder -- all of them.
Mercurial has an 'index' / staging area, but it's not exposed by default. You can access it with some extra CLI options, but there's an alternative approach that may be 'better' and is worth looking into:
> If you need the index, you can gain its behavior (with many additional options) with mercurial queues [1] (MQ).[2] Simple addition of changes to the index can be imitated by just building up a commit with hg commit --amend (optionally with --secret, see phases [3]).
MQs (optionally) expand on the idea of only a single staging area:
> This single "intermediate" area is where git stops. For many workflows it's enough, but if you want more power MQ has you covered.
> MQ is called Mercurial Queues for a reason. You can have more than one patch in your queue, which means you can have multiple "intermediate" areas if you need them.
If you only want to use one queue/index then that's fine too.
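A minimal sketch of the amend-based approach mentioned above (standard Mercurial commands; the secret phase just keeps the WIP commit from being pushed):

```
hg commit --secret -m "WIP: assembling the change"   # the WIP commit acts as the staging area
# ...more edits...
hg commit --amend                                     # fold the new hunks into it
hg phase --draft .                                    # when it's ready, make it pushable again
```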
Speaking as a Mercurial user from almost the beginning, it’s not accurate to say Mercurial has a hidden index.
That Steve Losh post is from 2010, and it was mainly highlighting a workflow that was popular at the time for a particular use case. It also highlighted how Mercurial’s plug-in architecture can be used to support different workflows.
Fast forward to the present and the use of MQ isn’t really a thing anymore, but is available for those who want it.
This looks like an interesting project, and I'm glad that people are still thinking about how to improve on version control.
That said, building a version control system seems like a problem similar to a social network: the network effect causes an enormous amount of friction, i.e. most people don't want to use it until other people are using it. A vicious cycle.
The fact that it's compatible with git as a backend is really great though, and is probably the key to driving adoption. However, several immediate thoughts come to mind that give me pause. Answers in the README.md would be super helpful for marketing to me (and apologies if it's there and I missed it):
1. How does it look to others working on the repo using just git? For example, "when the working copy is automatically committed," where does it go? Is there a remote branch that someone could see if they `git fetch` while I'm working? I often put all sorts of things in the code that I don't want to have committed, ranging from harmless (a bunch of personal comments or `IO.puts()`) to very important (API keys and such that I'm testing with; I always move them to env vars before committing, but for first-pass testing to prove the concept I "hardcode" them at first).[1]
2. Similar to "how does it look to others," what sort of burden does "all operations you perform in the repo are recorded, along with a snapshot of the repo state after the operation" put on the git backend? If I'm hacking on an ffmpeg script and I (temporarily) copy a 4GB mp4 file into the working directory so I can easily run `./mycode video.mp4`, does that whole thing get committed, potentially every time it changes? That could easily turn into a 40GB set of changes.[1]
3. Do you have to use the jj backend to get all the features? For example, with a git backend could you get the two things mentioned above, and also "Conflicts can be recorded in commits" as well?
A quick section in the README.md about which features require the jj backend instead of git would be super helpful, and if written well could answer all of the questions above.
To sum up my comment in a TL;DR: to really sell me on a project like this, it has to work with git, and being able to quickly tell which features I can use with git would make it substantially more interesting to me.
> For example, "when the working copy is automatically committed," where does it go?
It becomes a regular git commit with a git ref called something like `refs/jj/keep/9f1a0fb0-a0aa-4a4b-922f-d6d48687996a` pointing to it (to prevent GC). It won't get fetched or pushed by default, except with `--mirror`, I think.
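So a plain git client would only see it by going looking for it, e.g. (ref namespace taken from the example above, not guaranteed):

```
git for-each-ref 'refs/jj/keep/*'         # list jj's keep refs, if any are present
git fetch origin '+refs/jj/*:refs/jj/*'   # fetching them would have to be opted into explicitly
```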
> If I'm hacking on an ffmpeg script and I (temporarily) copy a 4GB mp4 file into the working directory so I can easily run `./mycode video.mp4`
Yes! You'll have to be more diligent about keeping your .gitignore (or .git/info/exclude, etc.) file updated. I plan to add commands for fixing such mistakes by forgetting the commit.
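For example (hypothetical file name, matching the ffmpeg scenario above):

```
echo 'video.mp4' >> .git/info/exclude   # keep the scratch file out of the auto-committed working copy
```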
> Do you have to use the jj backend to get all the features?
> To sum up my comment in a TL;DR: To really sell me on a project like this it has to work with git, and being able to quickly tell which features I can use with git, would make this substantially more interesting to me.
Makes sense. Thanks for the suggestion! I'll put it on my TODO list.
That's very cool, to have support for rsync/dropbox collaboration on the repo files. I'm always sad that there isn't a maintained p2p git implementation anymore. I'd love to just dump a bare repo to a syncthing share and collaborate with a small group of people accessing it.
As one who has set up skunkworks git boxes before, I think OP is probably referring to the fact that there is not a no-brainer out-of-the-box way for git repos on different dev machines to autodiscover each other, even on a local network.
You can have everyone manually set up a git hosting env on their dev machine, then have everyone manually add a remote for every other developer's box, but it sure isn't convenient.
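Concretely, each developer ends up doing something like this for every other machine (host names and paths are made up):

```
git remote add alice ssh://alice-laptop.local/home/alice/project.git
git remote add bob   ssh://bob-desktop.local/home/bob/project.git
git fetch --all
```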
And if you settle on sharing one central box as the canonical node (which IMO is the right answer), you're no longer truly using a distributed system - it's a de facto centralized system.
Exactly, syncthing has flawless p2p discovery and networking/sharing of files. Being able to put a git bare repo effectively on a syncthing share would be super convenient for small private collaboration. Tailscale might be an option to make a bunch of local git repos all start pushing/pulling, but it has some headaches too and isn't really designed for opening up small single services.
> I think OP is probably referring to the fact that there is not a no-brainer out-of-the-box way for git repos on different dev machines to autodiscover each other, even on a local network.
Ah, OK, gotcha. Thanks!
> You can have everyone manually set up a git hosting env on their dev machine, then have everyone manually add a remote for every other developer's box, but it sure isn't convenient.
Yeah. But fortunately not all that much of a hassle for a small(ish) dev team.
> And if you settle on sharing one central box as the canonical node (which IMO is the right answer),
Yeah, we called it "the Dev server". :-)
> you're no longer truly using a distributed system - it's a de facto centralized system.
But in a corporate (or even Open Source project?) environment, The Powers That Be probably want that kind of / that much centralisation anyway.
> But in a corporate (or even Open Source project?) environment, The Powers That Be probably want that kind of / that much centralisation anyway.
Agreed. I have used git in genuinely decentralized ways and have concluded that in practice centralization is the right answer for almost all projects.
Even though having the full history locally is a transformative improvement over Subversion's model, in practice it's really helpful to know there's a canonical source of truth and to coordinate around it.
You can have sync conflicts with multiple people pushing to a bare repo on a shared folder at the same time. This kind of file sharing with git only works on NFS or filesystems that support locking (look at the git manual for this warning). Syncthing doesn't support that--it _might_ work for some time, but stress it with multiple people all at once and it will explode into sync conflicts eventually.
Yeah, I've probably just simply been lucky so far: IME with git, which hasn't been very extensive -- and only with a quite small dev team -- that just never happened to occur.
For comparison, another, simpler (somewhat?) git-compatible client is called "game of trees" or got (about which I know too little to compare/contrast well): https://gameoftrees.org/ .
Am I right to say that the working-copy-as-commit and undo-functionality features are inspired by Piper? Mighty neat to see those make it to a normal-ish VCS.
“Powerful” has become to me a shibboleth for people who are full of it.
I can’t recall the last time a coworker who liked things because they were powerful didn’t end up being untrustworthy. Even dangerous. It’s like nobody remembers the Principle of Least Power.
That said, I will take someone obsessed with “powerful” over “flexible” any day of the week.
Hm, I guess for tools like this I always read "powerful" as "flexible" - as in: this tool has strictly more power/capabilities, making it more flexible. In terms of "dev tool marketing speak" I guess it's the opposite of "robust", meaning: fewer features that are less likely to break on you.
The only flexible tool I use in the real world is the Leatherman, and that's for situations when I don't know what I'll need, or if I'll need it. For every other task I have a tool that is designed (sometimes quite well) for a set of tasks that includes the one at hand (see also Alton Brown, no single-purpose tools).
The Leatherman is part of my EDC, along with an LED flashlight with some respectable lumens so I don't have to use my phone to see in the dark.
In software this is known as the Unix Philosophy, but we violate it quite often, and call those tools 'powerful' or 'flexible'. Everything is a Swiss Army Knife all the time, and we aren't self-aware enough to see how consistently - and sometimes badly - we struggle with this.
But you can't tell an addict they're an addict. They will fight you to the death about how they Don't Have a Problem.
Everything? The default state after checking out a git project with submodules is for them to be in the wrong state! Keeping the state of submodules and the main repo in sync is a complete shambles.
Language matters quite a bit to those considering contributing. It also provides some context for expectations about performance and reliability thanks to language idioms, features and culture.
But the info is right on the linked page. Somehow I don't think people who haven't even followed the submission link are the ones to worry about yet as potential contributors.
People do tend to see tools written in a systems language in another light than tools written in a glue language. Did you read it as "rust (not C)" or did you read it as "rust (not Python)"?
I used to look down on languages other than C, considering programs written in C a better choice than e.g. Python. I have since reversed my stance because I realized that Python is a lot more hackable. I haven't tried Rust yet, but I would prefer it over C for sure; compared to Python, I won't know until I actually get familiar with Rust.
git is already very simple, at least for the 95% of use-cases that I need. add, commit, checkout, push, pull.... maybe a branch and merge here and there, but that's it.
Of course I cannot remember how exactly the "git rebase" command works and have to look it up every time, same as all those switches to the "simpler" commands.
I guess it all depends on what you need. Python looks really complicated too, if you start by browsing the function and class reference. But for someone starting to learn programming, it is apparently pretty easy (or so I've heard).
> A Git-compatible DVCS that is both simple and powerful
We all know the dig here - Git is not simple.
Like many tools, Git has evolved significantly over the years. Git today was not like Git 10 years ago.
Also, like many replacements to existing tools and software, they always start out simple and beautiful. Then they grow in complexity to serve the domain. The reason Git is complicated - not "simple" - is mostly because version control _is_ complex.
I also don't agree that Git is hard to use. I feel it is an odd goal to try to make everything - even tools that experts use - simple to use, when they are fundamentally not simple. I feel like Git is sufficiently complex - no more than it needs to be and certainly not less.
> I feel like Git is sufficiently complex - no more than it needs to be and certainly not less.
Perhaps the biggest mistake (IMO) was to expose the index to the user. I happened to just watch https://www.youtube.com/watch?v=31XZYMjg93o (by the person behind Gitless). They explain the issues well there.
I'm gonna watch this video. Something has been bugging me for a long time, as someone who didn't quite get to see all of the supposed ugliness of Subversion (which by description alone sounds an awful lot like the rerere situation I have with Git).
It really feels to me like a commit identifier should be the combination of a monotonically increasing number and a hash of the content, rather than an either-or. If I merge or rebase branches in git, I lose the index, just as I would in subversion. But at least in svn I have some notion of how far back commit 12345 is in the tree. ac4def2 could be HEAD or four years ago.
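You can approximate that pairing in git today, though the number isn't part of the identifier itself:

```
git rev-list --count HEAD    # how many commits deep this is, e.g. 12345
git rev-parse --short HEAD   # the abbreviated hash, e.g. ac4def2
git describe --tags          # a tag-relative form like v1.2-87-gac4def2 (requires a tag)
```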
I haven’t watched that video, but I agree — the index is one of the biggest hurdles to making git easy to understand and use for newcomers. It’s not hard to explain the fundamental model of how git works — it’s hard to explain the UI, and it’s hard to explain the index. If you remove the index, you’re only working with commits, and that simplifies the UI enormously.
I am, by all accounts, the SME on git at my company - usually the resident VCS expert. I don't like using tools I don't understand, and my brain is pretty good at handling problems that look like graph theory.
After using git for 6 years git still terrifies me. After 9 months of svn I performed open heart surgery to remove a 1GB zip file that some dummy merged before anyone thought to stop him. It was at least 18 months before I tried something similar with git and I made copies of my copies to make sure I didn't fuck it up. And I still almost needed a do-over.
The level of dread I have using git is pretty close to how I feel when using a newly sharpened chef's knife - hypervigilance. And that feeling gets stronger when I hand that knife to someone else. Be very careful, that's super sharp... hold still, I'll get you a bandaid. Now you get a mini lecture on how to hold a knife without cutting yourself next time.
It's not that hard man. Really isn't. Your anecdote sounds like someone who thinks they're awesome at git setting a noob up for failure and then mansplaining when they fuck up.
It's not hard to be better at git than 85% of people, most devs I've met don't understand the basics beyond pull commit push.
It must be admitted that if merges are involved, it becomes somewhat harder to do something like this, as you can no longer use the easy and safe path of an interactive rebase to edit the commit and remove the file, and must reach for filter-branch or similar, which are generally a good deal scarier and easier to make difficult-to-identify mistakes with. (Fortunately, the man page git-filter-branch(1) deals with this specific case as its first example.)
But it’s certainly not the easiest thing to mess up a Git repository to the point of losing data; the reflog keeps it all intact for at least a month unless you very deliberately tell it not to (see the “CHECKLIST FOR SHRINKING A REPOSITORY” section of git-filter-branch(1) for sample invocations after such an edit), and I’d be much more comfortable about open heart surgery on Git’s model than on SVN’s.
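For reference, the kind of invocation that man page points at looks roughly like this (file name hypothetical; back the repo up first):

```
git filter-branch --index-filter 'git rm --cached --ignore-unmatch big-file.zip' -- --all
rm -rf .git/refs/original/              # then follow the shrinking checklist:
git reflog expire --expire=now --all
git gc --prune=now
```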
Almost all of the complexity of VCS is in or around merges. Take away accidental complexity and there is a lot of inherent complexity in merges.
First git project, we had a guy who was bad at merges, and his solution was to avoid them as long as he could, which just made things worse. He also disliked another engineer. That engineer was definitely making mistakes, but at the end of the day his Dunning-Kruger was causing me less trouble than merge boy's. I also suspect a little ageism, but would not have been able to defend that suspicion.
So one day merge boy was loudly exclaiming that old guy had written a terrible bug and the git history proved it. Only the bug in question was something I had specifically looked for in the code review, and I had been happy to find that he had gotten it right. But here in git annotate it shows he got it wrong. What the hell?
So I stepped through the commit history one commit at a time, and sure enough the code was initially correct… until merge boy did a merge and fucked up the three-way merge. Again. And got old guy's name to show up in annotate. I didn't even know you could do that.