I find people are religious about being git cli purists and only interacting with it in this black box (the terminal). On top of that, a lot of people stop learning git after add, commit, push, pull, branch, and merge, so concepts like rebasing and cherry picking are scary.
In this day and age we have state of the art GUI tools that change the game and allow git users to see and interact with the state of a git repository in real time. It's a great way to demystify things like rebasing and interactive rebasing because they show you what's happening in a modern UI designed specifically for git.
I often suggest git CLI purists to get something like git kraken and just use it as a visual dashboard. Watch what happens when you run git commands. You can see everyone else's remote branches and have a much better idea of what's going on than you can without it.
I've worked on teams that just didn't rebase at all and I think people are oblivious to the mess they make on branches with their chaotic merge strategies.
These days, I tend to envision the state of the branch that I want and make that happen. It's so easy to create a new branch before I do anything that if I think something might go south, I just hard reset to my previous state (a branch I created right before the operation)
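For anyone who wants to try the "branch first, reset if it goes south" habit on the command line, here is a minimal sketch in a throwaway repository. The file name and the `backup-before-rewrite` branch name are made up for illustration, and the "risky" step is just a stand-in for a rebase or similar.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "demo@example.com"
git config user.name "Demo"

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two >> file.txt
git commit -q -am "second"

# Bookmark the current state before attempting anything risky.
git branch backup-before-rewrite
before=$(git rev-parse HEAD)

# Stand-in for a risky operation: throw away the last commit.
git reset -q --hard HEAD~1

# It went south, so move the current branch back to the bookmark.
git reset -q --hard backup-before-rewrite
after=$(git rev-parse HEAD)
echo "restored: $after"
```

The backup branch is only a label on a commit, so creating it is cheap and it can be deleted with `git branch -D` once the operation has worked out.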
Gitkraken also has an undo button that handles all sorts of scenarios and I rarely click it without getting exactly what I expected.
> I find people are religious about being git cli purists […] we have state of the art GUI tools
I would humbly suggest you avoid framing it that way, even if you believe it’s true. From someone who is only sometimes a cli advocate, my immediate assumption is that your opinion here may be formed out of naïveté and a bit of fear of the cli.
Please note I’m actually quite a fan of using various GUI tools for basic git workflows, and things like branching diagrams and even diffs are much better in GUI tools than in the console. Some of the tools are damn good and damn convenient most of the time. So you’ll get a huge amount of agreement from me about the benefits of sometimes using GUIs.
But git was designed around the command line, and there are no GUI tools (Kraken included) that have UI for everything or even most of what git can do. Advanced workflows and repo spelunking often require the command line. The cli is the only interface that can do everything, and it’s the advanced interface, so telling people who are already advanced (possibly more advanced than you) and already comfortable on the command line that they’ll find enlightenment in GUI tools isn’t always or even generally true. It’s not hard on the command line to see everyone else’s remote branches. Better to listen to them, and suggest GUI tools only when they express frustration about their workflow that you think a GUI can help with. Also perhaps better to ask them about CLI workflows and find out some of the benefits. One benefit is that CLI workflows are always usable over basic ssh connections.
And worth mentioning, the question’s partially moot for people using Github or other hosting services where some GUI tools are built-in. Using the CLI for all interactions with Github, and using the site for the visualizing diffs & branches is totally reasonable.
> I’ve worked on teams that just didn’t rebase at all and I think people are oblivious to the mess they make
I have to fully back you up and agree there! It’s unfortunate that rebase is a little scary and that some people just don’t like it. Rebasing local work before publishing it is the way git was designed to work, and it’s extremely helpful for creating a history that’s not insane. Squash commits are better than nothing I guess, but you can really tell and appreciate when people are comfortable with rebase and care whether their history is presentable.
I'd really like to know what mess people make when they don't rebase.
I've found that using merge gives a readable trail of when something was merged, whether that be from a branch's original branch, or if you're merging into another branch.
Rebasing just seems to cause a lot more headache when something doesn't go perfectly correct in between commits.
Some people are good at making changes relevant only to one task, making commits at appropriate points to that task, and ensuring that none of the intermediate commits are regressions. Sometimes people prefer to deal with future pain of complicating a bisection with broken, unrelated issues.
Personally, I make a lot of mistakes. I commit too early when I missed some bugs or broken builds (eg build on release but not debug, rhel X but not rhel Y). My working area has unrelated changes. I forget to change branches...
So, I figure out what the destination looks like, do my best to keep things clean and small, and rewrite history before the PR as if I was one of those careful people.
My recent advancement has been to realize that reordering commits for an efficient fixup is much easier than splitting commits, so I'm better off doing things out of order. I also use worktrees to be able to ensure each commit is correct as checked out without stale state.
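A rough sketch of the "commit out of order, then fold it back" approach, using git's own `--fixup` and `--autosquash` options. The file names and commit messages are invented for the example, and `GIT_SEQUENCE_EDITOR=true` is only there so the demo runs without opening an editor; in real use you would normally look at the todo list.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > README
git add README
git commit -q -m "base"
echo parser > parser.txt
git add parser.txt
git commit -q -m "add parser"
target=$(git rev-parse HEAD)
echo cli > cli.txt
git add cli.txt
git commit -q -m "add cli"

# Later: a bug turns up in the parser commit. Record the fix as a fixup of it.
echo "parser fixed" > parser.txt
git commit -q -a --fixup="$target"

# Reorder and fold the fixup into its target. The sequence editor "true"
# accepts the generated todo list unchanged.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3
count=$(git rev-list --count HEAD)
git log --oneline
```

After the rebase the history has three commits again, and the parser fix lives inside "add parser" rather than in a separate trailing commit.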
> I'd really like to know what mess people make when they don't rebase.
I think this part is really telling:
> And worth mentioning, the question’s partially moot for people using Github or other hosting services where some GUI tools are built-in. Using the CLI for all interactions with Github, and using the site for the visualizing diffs & branches is totally reasonable.
GitHub's branch view is uniquely awful, and doesn't try to call out branch history at all. Commit lists should either include a railroad diagram or only include the left-facing history. Anything else is just inexcusably broken.
No disagreement from me, so I don’t know what that snippet is telling you. I wasn’t saying GitHub or any other service is the greatest, I was only saying that using the GitHub site mixes GUI and CLI workflows, since using GitHub is so common.
In some sense that goes to the point that CLI can be better than some GUI tools. The ascii railroad diagram you get with git log might be preferable to what GitHub can do.
> Rebasing just seems to cause a lot more headache when something doesn't go perfectly correct in between commits.
You mean merge conflicts? Rebase gives you an opportunity to deal with small merge conflicts arising from single commits. I much prefer that to a big merge commit dealing with all the conflicts simultaneously.
Also, rebase makes the merge conflicts disappear for future readers, making the included commits nicer to read.
Is this about rebasing vs merging or is it about whether to use `git rebase -i` to polish my own commits?
Sometimes I think maybe we should have two different names for these commands.
I think an interactive rebase requires a good mental model of the changes to split or merge them correctly in addition to knowing the CLI. I find it quite hard at times.
Interesting take. If anyone cares to actually learn how git works when rebasing, then it becomes much simpler. I always rebase and in 10 years of professional use, never once had a problem with it. That doesn’t mean other people don’t.
> If anyone cares to actually learn how git works when rebasing, then it becomes much simpler.
That's a very arbitrarily vague answer. It's very easy to hand-wave the misuse of a tool just by saying "you don't know it well enough". It's much harder to admit that maybe the tool is just difficult to use in the first place.
I have similar experience as the person you are answering to. Unlike them, I got into rebasing only recently. It isn't really a hand-wave of saying 'you don't know it well enough'. This paradox is caused by a quirk in git's design. Git uses a hybrid model for dealing with versions. It uses snapshots for storing them, as well as for operations like fetch, clone, etc. But that mental model causes a lot of confusion for operations like merges, pulls etc. Git uses the diff-patch model for those operations. Even weirder, much of git is actually designed to use the diff-patch model. The first time I ever got a grip on those operations was when I did rebasing. I found git much easier to understand when I could attribute each operation to the model git uses behind the scenes.
> It's much harder to admit that maybe the tool is just difficult to use in the first place.
That is unfortunately true. Git feels like a tool with several parts that were bolted on to solve problems its users (developers) faced. I didn't have that same difficulty with mercurial. I get this feeling that a pure patch based tool like pijul would match their workflow well and still be easy to learn.
Naw, merge is orthogonal to cleaning up local history. Merging a mess is still a mess. Some people are fairly clean from the start, and some aren’t. And it depends on the task at hand, messes are easier to make when you’re doing more than one thing at a time.
Do you care to explain? How can you have a worse history when the `main` / `develop` branch has a linear history? I'd rather have a linear history where git bisect is trivial to use than a mess of merges and a hard to follow history.
This is my reaction based on experience too. I find that in most companies I've worked at that there are generally 2 camps; one is the "I want history to track what I've actually done; good, bad and ugly", the other is "I don't want to wade through 1000 "typo" commits - they should be pretty and streamlined".
I fall into the former camp, but others (who I respect) fall into the latter.
FWIW, I was mainly referring to making local branches presentable and/or just organizing things, not to rebase as a merge strategy. Don’t assume that rebase always means moving the local branch to the head of main, and don’t assume that rebase and merge are interchangeable. There are a lot of ways to rebase.
That's what I thought the discussion was about, locally 'clean' commits. In this context rebase is so much more useful. I recently had to use the `--onto` option of the rebase command. `--onto` may show someone that they didn't really understand rebase even though they could use the simple form of it, `rebase -i x`.
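For reference, the option is spelled `--onto` in `git rebase`. A small sketch of one common use, assuming a made-up situation where `feature-b` was started on top of `feature-a` but should really sit directly on `main`:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b feature-a
echo a > a.txt
git add a.txt
git commit -q -m "work on A"

git checkout -q -b feature-b
echo b > b.txt
git add b.txt
git commit -q -m "work on B"

# Replay only the commits in feature-a..feature-b on top of main,
# leaving the feature-a commits behind.
git rebase -q --onto main feature-a feature-b
git log --oneline --graph --all
```

The three arguments read as "new base, old base, branch to move", which is the part the simple `rebase -i x` form hides.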
I'm someone who has started enjoying the rebasing workflow after using the branch-merge workflow for more than a decade. So perhaps my experience may be useful, since you can compare the same person using both of them. The first thing to realize is that branch-merge and rebase workflows try to achieve different goals. Branch-merge tries to preserve the history of the code, with every wart and quirk included. Rebase tries to achieve a clean history where every commit is an entire fully functional (without known errors) feature with a complete explanation in the commit message. People fight over them, but I find both goals to be meaningful and the choice depends on your priorities. In fact, I do both: publish a clean history in the master and the original messy history as a separate branch.
> I'd really like to know what mess people make when they don't rebase.
As developers, we expect git to let us experiment with features, make mistakes, correct them, roll them back and improve. That stage is also so messy that you'd leave short commit messages that would make sense only to you. This is fine while developing. But this leaves many commits that are functionally broken, partial or rolled back later. That, along with vague commit messages and illogical commit order, makes it really hard for someone else to pick up a working commit from your branch and continue. Heck! I find it hard to choose a commit even from my own older repositories. That isn't the case with good projects like the kernel. You can pick any commit on the master and it will compile and work with all the features advertised up to that commit message. It makes a user's life easier.
> I've found that using merge gives a readable trail of when something was merged, whether that be from a branch's original branch, or if you're merging into another branch.
This is true. It's harder to achieve with rebasing. But it's possible with some work. I usually leave the original messy feature branch intact, and mark the rebased HEAD with a similar-worded tag or branch.
> Rebasing just seems to cause a lot more headache when something doesn't go perfectly correct in between commits.
Don't take any offense, but those are beginner blues. It happens in the early stage of learning rebases when you don't have a full grasp of what is going on. People evolve different strategies to overcome this once they are a bit more comfortable. My strategy is to create a temporary branch for rebasing (at the same commit as the feature branch) and do multiple rebases on it. I do only one or maximum two operations during each rebase. The result of each rebase is reviewed before doing the next round of rebase on the same branch. The original feature branch is left intact in case something goes wrong - though I never needed it.
Other people do rebasing occasionally while developing the branch. They do this after every two or three commits. They end up with a clean history to merge (fast-forward) to master, by the time they finish the branch. All of these operations can be simplified using helper tools like git-revise [1].
My absolute favourite method is to not use rebase at all. Craft the perfect commits as you develop. This can be achieved with a patch stack tool like quilt or stacked-git [2]. It allows you to move your changes to a patch (commit) of your choice. This is like having multiple staging indexes available. You can also split, combine or reorder patches. The patch stack evolves as you develop, but you end up with a perfect history to merge (fast-forward) at the end.
As a counter point. Every time I've seen people get themselves into trouble with git has been when they are using a GUI tool. I have yet to see a gui tool that they can use to get themselves back out of it. There's a lot of value in knowing your way around the command line side of git even if you regularly use gui tools to interact with the repository.
I have the same theory, and had the same experience: I worked for a company where it was impossible to clean up the repository, because a dev using a GUI was regularly pushing back the old tags (and consequently, references to old history), without knowing they were doing it. It was never discovered who they were, since almost the whole team uses Git via GUI.
I've had the exact opposite experience. I've had to bail out cli purists because they just can't see what went wrong.
Objectively git kraken has more visual information density. To get the same information on the cli requires multiple commands and the user has to hold the information in their head between the commands.
Git kraken's buttons are also tightly correlated to commands and you can even bring up a window that shows what cli commands are being run under the hood.
Gitkraken has too much information density, since it weights nearly everything equally or alphabetically. Next to impossible to see what you or your colleagues are working on when there are a lot of branches in flight and half of what Gitkraken is showing you are merged or abandoned branches you don't care about. It also gives up and freezes when opening large repos, like my small company's monorepo.
I canceled my subscription and returned to the CLI + VS Code's native SCM UI because GitKraken is now useless in my work environment. Never mind that the startup time is in minutes. And no, I'm not paying more for the ability to write a bug report and get it to them (their support is functionally nonexistent).
Great for a tiny project. Awful for companies. Not worth the money given how much more time it takes for me to do my work than use the CLI.
Sorry for the strong words, GitKraken is just a great example of bad design.
> Objectively git kraken has more visual information density. To get the same information on the cli requires multiple commands and the user has to hold the information in their head between the commands.
Can you give a precise example? The cli has support for using visual tools for diff/merge, so one doesn't need to use git via gui in order to use graphical interfaces where effective.
> I find people are religious about being git cli purists and only interacting with it in this black box (the terminal)
Git cli purists are probably cli purists in general, not just for git. And for good reason.
When you work with the cli/terminal, you aren't bound by what the creator of the GUI decided should be built.
> I often suggest git CLI purists to get something like git kraken and just use it as a visual dashboard. Watch what happens when you run git commands. You can see everyone else's remote branches and have a much better idea of what's going on than you can without it.
There are git commands for displaying whatever information you want, capable of drawing nice trees and whatever complete with everybody else's branches.
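As a concrete example, the two commands I'd reach for are `git branch -r` and `git log --graph --oneline --decorate --all`. The sketch below builds a throwaway "server" and two clones just so there is something to look at; `alice`, `bob` and `feature/login` are invented names.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d) && cd "$work"

# Throwaway "server" repository and a first clone that publishes two branches.
git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/main
git init -q alice
git -C alice symbolic-ref HEAD refs/heads/main
git -C alice remote add origin "$work/origin.git"
echo base > alice/app.txt
git -C alice add app.txt
git -C alice commit -q -m "base"
git -C alice push -q origin main
git -C alice checkout -q -b feature/login
echo login >> alice/app.txt
git -C alice commit -q -am "add login"
git -C alice push -q origin feature/login

# A colleague's clone: list everyone's remote branches, then draw the graph.
git clone -q origin.git bob
remote_branches=$(git -C bob branch -r)
echo "$remote_branches"
git -C bob log --graph --oneline --decorate --all
```

In a real repository you would run `git fetch --prune` first so the remote-tracking branches are current.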
> It's so easy to create a new branch before I do anything that if I think something might go south, I just hard reset to my previous state (a branch I created right before the operation)
Duh! That's what git is for!
And when you say "It's so easy to create a new branch" I'm genuinely curious to know how you've been creating branches before you discovered the "state of the art GUI tools that change the game".
Some operations lend themselves well to a graphical interface, such as partial staging and interactive rebasing. Others might not. As long as you show and read before pushing to a remote repository, the worst you can do is create extra work for yourself.
However I'm not sure Kraken is a good example. The GUIs for git are all similar, but Kraken has some unfortunate combination of being counter intuitive at times, and very easy to access undo and force buttons.
We have a lot of developers using all common GUI tools and the ones using Kraken are the only ones who regularly end up not only shot in the foot but force pushing the remnants upon their colleagues. For what it's worth, the few using Magit seem to have the least trouble but I suspect they may be more familiar with their tools. The ones using IntelliJ seem alright too, as long as they don't venture too far outside the familiar edit and push cycles.
Generally, any graphical git tool should probably be as closely integrated into the IDE as possible. That's where it's most useful.
> Some operations lend themselves well to a graphical interface, such as partial staging and interactive rebasing
I can't emphasize how pleasant those operations are in magit. I'm not sure if it (TUI) would be considered as textual or graphical. However, the interface model should lend well to a fully graphical implementation.
What GUI tools did you have in mind? git CLI is spectacularly good, GUIs tend to just make things slow and get in your way. With CLI you will also get the same experience on every platform, and even when you SSH.
The reason I learned about cherry picking, rebase & all is because when I tried using IntelliJ's UI to see my history, I saw a lot of options in the context menu and just searched what they meant.
It is also with a GUI that I learned you could rebase on pull, instead of merging and getting an ugly "Merged remote tracking changes" commit.
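On the command line the equivalent is `git pull --rebase` (or setting `pull.rebase` in the config). A minimal sketch with a throwaway "server" and two clones whose histories diverge; `alice` and `bob` are invented names:

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d) && cd "$work"

git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/main
git init -q alice
git -C alice symbolic-ref HEAD refs/heads/main
git -C alice remote add origin "$work/origin.git"
echo base > alice/base.txt
git -C alice add base.txt
git -C alice commit -q -m "base"
git -C alice push -q origin main
git clone -q origin.git bob

# Both sides commit independently, so bob's main diverges from origin/main.
echo a > alice/a.txt
git -C alice add a.txt
git -C alice commit -q -m "alice change"
git -C alice push -q origin main
echo b > bob/b.txt
git -C bob add b.txt
git -C bob commit -q -m "bob change"

# Replay bob's local commit on top of the fetched history instead of merging.
git -C bob pull -q --rebase origin main
merges=$(git -C bob rev-list --merges --count HEAD)
subjects=$(git -C bob log --format=%s)
echo "$subjects"
```

The resulting history is linear, with bob's commit on top of alice's and no merge commit in between.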
How many GUI tools support git's bisection search?
Just last week, I found out that GitKraken doesn't support connecting to multiple Azure DevOps organizations within the same profile. So my team now needs to learn git CLI anyway, to efficiently migrate our 30-odd repositories.
Totally agree. Using git with a UI is much more enjoyable, albeit less purist but who cares. I'm working on 4 different features, more important to be efficient than pure.
If you want to use a private repo on GitHub with GitKraken, it will cost you $60 per year per user, minimum. Everyone wants to embrace, extend, and make a buck!
I think the thing that’s at the root of most git issues is the lack of atomic commits. Most people I know commit at the “block of work” or “ticket” level. Once they have the thing they wanted done, they commit. It means that fundamentally they can’t cherry pick, and rebasing occurs on huge blocks of code instead of single lines.
It’s not there yet, but I kind of think git should be used the way I started using the save option when computers were less reliable. Type a sentence, save. Type another sentence, save.
Funny. The way I found to not get yelled at for having too many commits is to just `git commit --amend` every time I want to checkpoint and only push when the ticket is done.
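A tiny sketch of that checkpoint-by-amending habit, with a hypothetical ticket id and file name:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "demo@example.com"
git config user.name "Demo"

echo "step 1" > work.txt
git add work.txt
git commit -q -m "TICKET-1: implement feature"

# Each checkpoint rewrites the same commit instead of adding a new one.
echo "step 2" >> work.txt
git commit -q -a --amend --no-edit
echo "step 3" >> work.txt
git commit -q -a --amend --no-edit

count=$(git rev-list --count HEAD)
echo "commits on branch: $count"
```

The earlier versions of the commit are not gone immediately; they remain reachable through `git reflog` for a while, which is what makes this safer than it sounds. It is only reasonable while the commit has not been pushed.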
Git has an extraordinary collection of foot guns and unwritten rules. I’ve been using it three years and often feel like I just get by (and also the devs who came up with the conventions in my org maybe could have chosen better)?
For me git is really difficult, and the CLI makes it even more so. That's just my personal experience. That's just me. I have a really terrible short term memory, so when using a CLI I tend to scroll up and down a lot since I forget after a few seconds what was the error I was just shown and what state I'm in. This creates a lot of friction. When using a GUI I see the new state immediately, in a single glance.
My theory is that people who like the CLI have a good (or just working...) short term memory.
I have a sample of 1 to prove my case: my SO, who has excellent memory (short and long term) and handles the git CLI just fine, and really thinks all git guis are like bicycle training wheels for kids.
I on the other hand use tortoise git almost exclusively (and get mocked by her that I'm still riding on training wheels...).
I actually use only Tortoise Git's git log view, all the important actions are available from there, instead of accessing them from their somewhat haphazard right-click menu. In this way it is very much like Git Kraken or SourceTree but more powerful, since it exposes (IMO) the most useful subset of git commands AND also shows in a single glance all the necessary information about the repo.
I find it very easy to explain git branching / merging / rebasing / push/pull to new people using the tortoise git log view. Basically I'm using the same explanation as the one written by somewhereoutth below [0] but without referring to brambles... I also always stress using git reset --hard to get out of really confusing and bad situations, assuming you had committed first.
Three random complaints about git:
- The fact that it doesn't track renames at all, and instead just makes an educated guess based on comparing the file content. I'm not sure how this could be fixed, though, since git doesn't have a daemon running in the background keeping track of changes. This can lead to the dreaded "modify/delete" or "delete/delete" conflicts. Granted, this would happen less if people would merge/rebase their work branch often instead of accruing commits over time and only occasionally merging from trunk. But it does happen, and it's really unpleasant; it often means that git got confused and it takes a lot of digging and comparisons through history to find out why. Having a gui helps me a lot here but it's still a heck of a lot of detective work.
- The fact that there are many workflows and ways to do things and doesn't have a recommended way to do stuff. This can lead to arguments and conflicts, especially between people who have "opinions" (I'm definitely including myself in that group of people!).
This can happen especially with people who switch teams and where the new team works in a different workflow than what they are used to. Actually I've been part of just such an argument this week, it was unnecessary and unpleasant. Especially considering that at the end we both understood each other and agreed that my team's workflow is really not that different than her last team's workflow (for which she was responsible), and our team works under different set of circumstances and with different clients.
- Not specifically about git: Personally I never found it necessary to have a clean history, and I'm confused as to why people think this is important. I certainly don't require this from people in my team.
I review pull requests regularly, and my way of doing that is to run a diff between the start of the feature/bug fix branch and its tip, to see the set of changes introduced by this pull request. I usually don't care about all the "dirty" commits in the middle, although I will look through them if I don't understand something to see how something has developed.
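The same "start of the branch to its tip" diff is available on the CLI with the three-dot form, `git diff main...feature`, which compares the tip of the branch against its merge base with `main`. A sketch with invented file and branch names:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > app.txt
git add app.txt
git commit -q -m "base"

# A feature branch with a "dirty" intermediate commit.
git checkout -q -b feature
echo scratch > notes.txt
git add notes.txt
git commit -q -m "wip"
git rm -q notes.txt
echo feature >> app.txt
git commit -q -am "real change"

# Meanwhile main moves on with unrelated work.
git checkout -q main
echo unrelated > other.txt
git add other.txt
git commit -q -m "unrelated main work"

# Net effect of the branch since it forked, ignoring what happened on main.
changed=$(git diff --name-only main...feature)
echo "$changed"
```

The scratch file that was added and later removed does not show up, and neither does the unrelated work on `main`.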
But I never review pull requests solely through the pull request GUI (in github or azure). I fetch their branch locally and do what I just described through tortoise git. It usually takes me only about 30 seconds to generate the diff which lets me see the files changed. Reviewing the pull request itself usually takes far longer compared to this, especially with people who have worked in areas of the code they are unfamiliar with. I write all my notes offline in my favorite editor, and then copy them into the pull request web GUI. I also try to compile and run their local branch on my machine and run the relevant unit tests added by the author to verify they run.
I find the pull request GUI in repo hosting sites to be very ugly and cluttered, and it only shows the latest changes.
So my theory is that people who insist on having a clean history only know how to review pull requests through the web gui, which does show the diffs in the last commit and doesn't show the whole history. But this is probably me missing something or just my 5-person team being small compared to others.
I don't think any of the three problem areas are intrinsic to gui or cli
The rename aspect is a problem in both cases and is related to how git works internally.
Dirty commits can be really useful for the above: do a commit to rename, then another to modify. That keeps the rename operation clean and modifications in their own diff.
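A small sketch of the rename-then-modify split, with made-up file names. Because the rename commit changes no content, git's similarity guess has nothing to get wrong:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "demo@example.com"
git config user.name "Demo"

printf 'line 1\nline 2\nline 3\n' > old_name.txt
git add old_name.txt
git commit -q -m "add file"

# Commit 1: the rename only, content untouched.
git mv old_name.txt new_name.txt
git commit -q -m "rename old_name.txt to new_name.txt"

# Commit 2: the actual modification.
echo "line 4" >> new_name.txt
git commit -q -am "extend file"

# The rename commit is detected as a 100% similar rename...
rename_status=$(git diff --name-status -M HEAD~2 HEAD~1)
echo "$rename_status"
# ...and history can be followed across it.
history=$(git log --follow --format=%s -- new_name.txt)
echo "$history"
```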
For workflow, that is a team process and documentation problem. Every team should document the typical CLI commands for their workflow. Having no documentation around this is negligent, simply referencing some webpage is lazy, it should be spelled out for you
I’m trying to write a helpful reply instead of downvoting. If you use that kind of criteria to determine someone’s knowledge, you’re going to come across as dogmatic, reductionist, etc. I doubt you are really that way, but it’s how it comes across, and it doesn’t make for good discussion. I’d suggest laying out why you think someone’s preference for a gui means they know nothing.
I only use the CLI (because I learned git before GUIs got useable), but my mental representation of branches and commits is *graph*ical, and most of the git user guides and tutorials I see use a graphical representation to communicate git concepts. I imagine it would have been easier to learn git's functionality if I had an interface to git that was closer to my mental representation.
My mental model of a git repo is basically a bramble bush with labels on it.
The bramble stems and branches are the commit history (which may join as well as split, unlike a bramble). The labels are stuck to particular bits of the bramble. Some labels are even stuck to other labels!
You can move the labels around, and you can glue extra bits of bramble to the tips of what you have. You can even hack about with the bramble, but this is not recommended. Label moving often happens automatically, e.g. when 'growing' a bramble tip.
It gets interesting when you compare two brambles (e.g. remote and local repos). You might determine that one bramble is identical to another, just with the labels moved. Or one bramble is the same as the other but with extra stems added. Or both brambles had a common ancestor bramble, but now they both have extra stems. Or perhaps they are completely irreconcilable.
Learning the underlying model behind git is well worth the effort.
For me watching Steve Smith's talk - Knowledge is Power: Getting out of Trouble by Understanding Git - was a lightbulb moment.
I think this is what can confuse people. We have to face the facts that not everyone will need or be able to grok all of what goes on in git and what makes the car go forward. We can all drive that car still!
Take the recursive merging stuff around minute 39. Do I need to know why git's model for merging is so much better and how it works its magic? I don't think so. It's an implementation detail.
Do I need to know how a gearbox works to drive stick shift?
I have never thought of git as a bramble (tree worked fine for me :)) but the thing I always tell people about too is the labels part. What I think is enough for people to realize is that it's just like say SVN or CVS or any other source control mechanism in that there's a tree (bramble) of commits and then every commit can just be pointed to by a label. I can move those labels around any which way I want. Everything else follows from that on a surface level that is the only thing required to work with git in most situations, including some advanced ones.
You don't need to know why certain operations in git are faster, better, have less conflicts or how they work internally. You just need to know what they do and when to apply them.
I don't need to know why I can't make my car start or switch gears without pushing down the clutch. I just need to know when to push the clutch, i.e. if I want to start the car, push it. When I want to switch gears push it (well, mostly, on cars you want to last a while still lol). Of course some people won't even be able to learn how to drive stick shift and can only ever drive automatic.
> Do I need to know how a gearbox works to drive stick shift?
No, but it helps. If I treat my car's powertrain as a black box then it won't be intuitive that:
* I shouldn't slip the clutch to hold my car on an incline.
* Blipping the throttle gives me smoother downshifts.
* Double clutching lets me shift from 2nd to 1st while rolling.
* I can use the engine to slow my descent on longer hills and avoid brake fade
* I should park the car in gear for safety
* etc.
Whether it's better for a given person to memorise that list of facts or to understand the concepts behind them would depend on how much they drive. As a developer I've found it helpful to gain an understanding of the tools I use daily beyond "here's the commands to copy/paste when you want to do xyz".
It may not have sounded like it but I'm actually going to agree with some of this. We might be on a different part of the gradient so to speak. Maybe you can find a good analogy but at least from my point of view the gearbox analogy is falling short now and we'd need a different one or look at git commands one by one :)
Like you say, you can just remember those things. In fact 4 out of those 6 were taught in driving school and you had to remember them. One I only learned because trucks still needed that (double clutching while shifting - not just your case, just in general) but cars didn't when I learned. I personally don't like copy and pasting commands like that but I see a lot of people doing that even for stuff that should be second nature because you need it all the time. I think - to stay in the analogy - for me this is the difference between knowing that I should engine brake and how to do it when I want to slow down vs. having a piece of paper in the glove compartment that I pull out and check for what to do and how every time I approach a red light ;)
I'm also someone that likes to get an understanding of the things that I use and do every day. The thing is that there are so many things we use and do all the time that I think (almost) everyone just has to keep a certain level of abstraction away from many things, because there are just so many rabbit holes out there and it's not beneficial for most people to have explored every single rabbit hole all the way to the end (the 'gradient'). Git commit graphs are a DAG and lots of cool things can be done with DAGs, most of which I totally forgot about since I learned about and enjoyed them in university and have never needed them again at that level. It's good to know they exist and be able to dig in when needed/wanted.
You're absolutely right, there are about a million and one rabbit holes you can go down if you _really_ want to understand the tools you use but in the vast majority of those cases you're better off relying on a higher level abstraction.
The interesting part to me of your analogy is that it can demonstrate how having an understanding of what's going on under your layer of abstraction allows you to generalise.
To wring the last bit of life out of the gearbox analogy: if you tell a mechanically inclined driver that blipping the throttle will make their downshifts smoother, they'd hopefully understand how rev matching can be generalised to upshifts too. If you tell a "black box" driver then they probably wouldn't be able to do the same. Of course, the analogy falls apart a bit because understanding rev-matched upshifts isn't particularly useful :)
I've forgotten most of the "advanced" git knowledge I ever learned but I get a lot of value out of the (admittedly not very advanced) understanding that, as you said, commits are a DAG and that branches are just named pointers to nodes on the DAG. That understanding lets me generalise to, for example, backing up a branch (with git branch/tag) before doing a tricky rebase so I can restore it (with git reset --hard) if I need to undo.
I agree that exploring every rabbit hole to the end isn't beneficial - understanding the DAG is typically the only "advanced" git knowledge I need and I've only very rarely had to peel back more layers. I think this:
> It's good to know they exist and be able to dig in when needed/wanted.
is a good way of putting it. My ideal low-level understanding of most tools is knowing just enough that I know what to search for if I ever need to go deeper.
> That understanding lets me generalise to, for example, backing up a branch (with git branch/tag) before doing a tricky rebase so I can restore it (with git reset --hard) if I need to undo.
You can also do `git reflog` on branches to see what they used to point to :)
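A minimal sketch of the reflog route, assuming a throwaway repo created with mktemp (the file name and commit messages are made up for illustration):

```shell
# Throwaway repo so nothing real is touched.
repo=$(mktemp -d) && cd "$repo"
git init -q -b master
git config user.name "Example" && git config user.email "example@example.com"
echo one > file.txt && git add file.txt && git commit -q -m "first"
echo two >> file.txt && git commit -q -am "second"
before=$(git rev-parse master)

# Simulate a mistake: move the branch back one commit.
git reset -q --hard HEAD~1

# The branch's own reflog lists where it used to point.
git reflog show master

# master@{1} means "where master pointed one move ago".
git reset -q --hard 'master@{1}'
```

This only works while the reflog entry still exists, so it is a safety net rather than a replacement for a backup branch.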
To a large extent, yes. But when you find yourself stopped at a red light on a steep upgrade, and some idiot behind you decides to wait for the light three inches from your back bumper, things will go better when the light turns green if you have a decent mental model of the physical mechanism of the clutch. Sometimes you want to let those abstractions leak a bit.
I tested a Subaru manual at the dealer a few years back. The exit from the dealership was uphill onto a busy road. So, I stopped and reached for the handbrake.
There was no handbrake.
I managed to work my way back down to the dealership and asked, "Where's the handbrake?"
It turned out that Subaru had decided to replace it with a button and an "automatic" "hill-holder" feature.
Actually you don't need a mental model of the clutch at all for that. You just need to know how to handle steep inclines. There's a technique for that. All you need to know for that is the fact that on a steep incline you will go back a bit until your clutch has a chance to get you moving forward against the hill and that in this case you need to apply that technique. Like knowing when to use rebase, when to use cherry-pick etc. Now why you will roll back on the steep hill is not something a normal operator of a car needs to understand. Neither the clutch part nor the gravity one actually.
Also, if he's literally 3 inches, I would say the best approach is to slowly ease off the brakes until you actually touch the other car but do it so slowly that it's not an impact. All without even pushing the clutch. Then you don't even need to use the brake while pushing the gas pedal trick, because the guy behind you is your brake pedal ;)
For a group of people supposedly primarily in the SV area this is a scary comment thread.
I learned to drive stick in the Bay Area and the way everyone I know there drives stick on an incline is to use the parking brake with a second hand when letting off the brake and letting out the clutch.
Now, when you get good you can stop doing this for most, but for really steep streets (I now live in San Diego and we have a couple; Laurel being one) it’s still an excellent skill to have.
You should go drive a bit in (some parts of) Spain! Not sure if something like this exists where you are.
Imagine: Small towns, really narrow one way streets with foot traffic and underground parking. Getting out of some of those underground car parks is scary stuff!
You get onto a steep incline to get out of the car park but you come around a corner onto that. Cars might be coming down towards you at this point and they're sometimes hard to see. It's cramped too. So you can't just take it w/ speed to get up there. Also on the top you have pedestrians on your side, so you might need to stop on the incline, then when pedestrians have scurried away, go a little further but not directly onto the street until you can actually see if cars are coming. Half your car is still on the incline at that point.
Thanks, I'll be watching this. It seems I get myself in trouble every couple of months and have to zip up a directory, do a fresh checkout and overlay it. Sometimes something seems to get lost, and git will refuse to push to origin on a branch that I've had checked out for a while, yet a fresh checkout works (at the cost of a bit of history, as I like to commit locally a lot and squash just before I'm ready to push to the git repo).
Conceptualizing a commit graph with branch/tag labels is the easy part. What gets messy is operations regarding remote vs. local (as you mention) and regarding the various local state one may have (working directory, index/staging). Then there’s also the issues that arise with merging, but you have those with every VCS.
I love mercurial’s solution of having phases for commits. Anything you created locally is Draft phase and can be mutated without “force”. Anything which was pulled or pushed is automatically marked as Public and is not easily mutated. Visualizing the commit graph also colors commits based on phase making it really easy to get a grasp of what’s going on.
Is there an extension or something for Git to have similar behavior to this?
Instead of using colors, a visualization that can be combined with branch colors would be nice. For example using bold for remote (“public”) commits, or dashed lines for local (“draft”) commits.
My mental model of a git repo is basically a stack of backup tapes with sticky notes on them that say “this is tape #2983 and it's almost the same as tape #2429, except that I fixed a typo.” Things like `git log` and common workflows make you think “this is change #2983, where I fixed a typo”, and that's where the trouble starts.
It's an organizational and personal thing, but you really have to get in the habit of treating commits as single diffs to the codebase and not the equivalent of ctrl + s in your editor.
I try and make a habit of squashing my commits so that git log is a narrative of development and each commit is a page number, not a sticky note.
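One non-interactive way to squash, sketched in a throwaway repo (interactive rebase is the more common tool; the file name and messages here are invented for the example):

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q -b master
git config user.name "Example" && git config user.email "example@example.com"
echo base > notes.txt && git add notes.txt && git commit -q -m "base"

# Three "ctrl + s" style commits.
for n in 1 2 3; do
  echo "wip $n" >> notes.txt
  git commit -q -am "wip $n"
done

# Move the branch back three commits but keep all the changes staged,
# then record them as a single commit.
git reset -q --soft HEAD~3
git commit -q -m "Add notes"
git log --oneline
```

Only do this to commits that have not been pushed, since it rewrites history.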
I for one like how git works. Some of the command names are weird, but I know more sane names are being introduced (e.g. git restore, git switch).
Yes, you need to follow a tutorial to use it effectively. Designing for beginning users is often detrimental to advanced users, and I object to that being called "user friendly".
I'm not very vocal about git usually because I'm not an expert, I have no big complaints, and I'm not that interested in arguing about it. I think there's a lot of us that are perfectly happy with git; we just make less noise.
I'd argue the opposite - that the model was so good and so much better than anything else that it was adopted despite the UI/UX flaws. Things can be good, have a flaw and still be considered good. That doesn't mean that we pretend the flaw doesn't exist.
Mercurial is built on what is almost the same model and has what I think is a better UI/UX. Some people say that git is technically superior but you really need to be an advanced user for it to matter, or work on a really large project.
At work, we switched from SVN to Mercurial, and from Mercurial to Git. Most people were happy to switch away from SVN, but few enjoyed the change from Mercurial to Git. Personally, it took me much longer to get used to Git than with Mercurial, even though Mercurial was my first DVCS. I now have a slight preference for git, but I am happy with either (just don't bring back SVN!).
Why Git came to dominate and not Mercurial? I am not sure, but I don't think technical reasons explain everything. Its association with Linux probably helped a lot.
I was also a Mercurial fan who had to reluctantly accept the rise of Git. Git was always a bit quicker, but Mercurial was much more predictable until you really understood the ins and outs of Git.
> Why Git came to dominate and not Mercurial?
The answer is actually really simple and non-technical: Github. We take these feature rich online code hosting platforms for granted now, but Github was really the leading edge of the wave. It made it easy for people to work together on writing software so it had great takeup and started a Git snowball effect.
It has been described as distributed SVN. I really hated SVN, almost to the point of wanting to go back to CVS, so I didn't want anything that took any inspiration from it. But maybe, in reality, it is not that bad.
The funny thing about the "svn vs git" debate (if one could call it that), is that people always tend to focus on the whole "svn is centralised whereas git is not" bit as the main argument.
Yet, whenever I've worked on git with others, it's been on github (i.e. a centralised model). And I've worked in a decentralised way on svn on several occasions, simply by making my own local repositories and merging changes back to the parent repository when I'm done (which is effectively what git does too when working with remote repositories).
I feel that a lot of what ends up being 'bad' about svn really boils down to the fact that you need some good conventions to be honoured across the project in order to get things done (including using it as part of a decentralised workflow), but humans being humans take shortcuts and mess things up for everyone else. Which really means at the end of the day, the problem isn't the technology per se, but human relationships, manifesting as commit behaviours. Whereas git just imposes its highly opinionated model of doing things on you in order to ward off some of the more destructive human behaviours, which in a sense is good, but at the same time, it means that git can be too rigid, and svn effectively gets a bad rep for being potentially more flexible and scriptable. I've been in many situations where I had to get into an incredibly convoluted manual process to work around git's mental model to get it to work for me, when the equivalent in svn (for better or worse) would have been fairly straightforward.
(Disclaimer: This is just my personal experience from happily using both with no specific preference for one mental model over the other. If anything, I think I may probably prefer svn a bit more now. I'm probably completely wrong about all of it.)
It dominated the world because GitHub was better than BitBucket and had more generous free levels.
If only that wasn’t the case we’d all be using a sensible source control system now instead of one where people repeatedly say things like “if you just learn the underlying model…”
BitBucket limited the number of repos you could have while GitHub allowed you to have as many as you wanted so long as they were public. Which had the (intended?) effect of making it more useful socially.
Yes, but again, the number of people collaborating on Linux kernel development is truly tiny.
I think perhaps you're over-estimating the network effect for VCS's -- having multiple version control binaries on your machine is low cost, and aliases can, up to a point, give you a consistent interface to them all.
The highly sophisticated demographic of kernel developers would not have been at the pub insisting that their friends drop svn and use git (exclusively!), anymore than they would have been trying to get any other projects they worked on to adopt bitkeeper previously.
The fact that bitkeeper was required for 'collaboration in Linux development' for so long supports this position.
They are bad. Some concepts like rebase etc. I still don't understand.
In software we often go for further complexity instead of less. I think because most of us who are the lead developers are often the most intelligent. And we often enjoy these complicated abstract models and they come easy to us. However in satisfying our own intellectual vanity we often don't see how many we leave behind. Which is good for our hourly rates, but less good for creating affordable and simple software.
Anyway, one of the main practical advantages git had was decentralized repos, meaning you were not dependent on an external server, which meant git was often much faster for daily tasks than centralized versioning systems.
I'm sorry if this sounds arrogant, but if you don't understand basic Git concepts like rebase, I don't think you're in a position to comment on the architecture of Git
Git's UI/UX is one of the worst engineering sins to be committed in the last two decades, and this website shows why. Literally nothing about git is intuitive and the "underlying model" is entirely ad-hoc. Instead of celebrating how Linus built git in only a few days he should be castigated for knowingly setting up ill-conceived software to go viral.
I think the underlying model is fantastic, but the names for operations, and how they are grouped, are somewhat ad hoc.
The idea of a directed graph of file system snapshots is pretty intuitive. Add in branches as pointers to locations in that graph. This is a fantastic model for source control.
However, the operations that stage a potential update to the graph of snapshots are pretty confusing. The "index" is a terrible name that is overloaded with so many other non-git meanings, none of which really map to git usage.
That, and all the rest of the names are pretty hard to understand. Particularly reset, whose documentation is inscrutable without translation from git-speak into technical language, or at least a dictionary of what all those words that are used actually mean. And since reset is such a useful tool and has about eleventy different functions, it all becomes impossible to learn from the docs on your own.
The well-documented underlying model used to confuse me a lot - especially for operations like merging, rebasing (squash, reorder, ...), cherry picking, etc. They started to make sense only when I realized that git uses diffs/patches to propagate changes between unrelated commits. While the on-disk format is purely snapshot-based, the tool itself is a hybrid. As far as I can tell, this trips up a lot of others too.
Agreed, the first step is realizing that the data structure is a graph of snapshots. And then a lot of the primary operations are about calculating deltas between arbitrary commits, and then applying those deltas elsewhere.
You shouldn’t present your own opinions as facts. I find Git intuitive, and it works as I expect it to do. I suspect a lot of people agree since Git became the champion of the DVCS crusades.
It’s the first thing in the article. Git reflog. (BTW you won’t get any disagreement from me about the lack of intuitiveness of the command name. But, in a way, git’s whole reason for existing is to undo, and almost everything you do in git can be undone by design.)
The reflog is very helpful but I don't think it counts as an "undo". Some operations (like git add or git push) won't show up in the reflog.
Even the operations which do show will often require more thought to undo than a hypothetical "git undo" would. I know how to use the reflog but I often go out of my way to avoid it because "git branch tmp HEAD; git $POSSIBLE_MISTAKE; git reset --hard tmp; git branch -D tmp" requires less effort than deciphering the reflog's output.
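Spelled out, that pattern might look like the sketch below. The "possible mistake" here is an amend chosen purely for illustration, and the repo is a throwaway one:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q -b master
git config user.name "Example" && git config user.email "example@example.com"
echo one > file.txt && git add file.txt && git commit -q -m "good commit"
before=$(git rev-parse HEAD)

# 1. Bookmark the current state.
git branch tmp HEAD

# 2. Do the risky thing (stand-in for $POSSIBLE_MISTAKE).
git commit -q --amend -m "oops, rewrote the message"

# 3. Not happy? Point the current branch back at the bookmark.
git reset -q --hard tmp

# 4. Clean up the bookmark.
git branch -q -D tmp
```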
Git reflog absolutely counts as the first step of undo for several workflows, but you’re right there are other commands needed for some kinds of undo.
Undoing a push does require a different set of commands, but my point, to the question @amelius asked, is that you can undo both push and add, and whatever other mistake you’re thinking of, difficult or not.
Ah yes, the classic `rm --undo` saved me so many times. No, command-line interfaces rarely offer undo. The onus is on the user to not do irreversible things when they may need to be reversed.
Git, coincidentally, does have something equivalent to undo history: the reflog.
> for knowingly setting up ill-conceived software to go viral
I doubt that was the intention. Linux just needed a versioning system tailored to its needs, and that's exactly what Git is. Can't blame its creators that other people used it for scenarios it wasn't built for.
Right, you are the cop-out I am complaining about. Linus is the premier figurehead of the premier open-source project. When a build tool he made for that project goes viral it isn't an accident. If it was anybody else's pet versioning control system would we even be talking about it?
What you're saying is that anything Linus does, for whatever purpose, should be designed to cover a wide variety of use cases and please a large number of people, even if he's just writing something for himself. I think that's an unrealistic expectation.
If it went viral, yes. I agree most of the time I feel like the world is run by evil folks twiddling their fingers in secret.
This is not one of those times. I agree there's a chance there are better word choices or feedback for some commands, but overall once you 'learn the language' it really is a lean, mean, well designed piece of software.
Folks that complain about the UI/UX don't realize it wasn't designed for less technical folks. It was designed for the folks who needed it.
Its success must at least partially prove that the UI/UX is not 1/10. Any real engineer will tell you, there are times where they wish they could do something better but the requirements and constraints left them making tough decisions, and that doesn't mean they aren't proud of their work.
Its success is also partially due to the fact that it is lean and mean, which allows it to be applicable to nearly all software projects of any flavor. So I don't understand why folks argue it could have been done better. If it was 'better', in my view it wouldn't have been successful. The success was driven by its succinct design and Linus' take it or leave it attitude.
Technically accurate studio monitors don't sound as pleasing to the ear as good well tuned speakers. But they are exceptional at the job they were designed for. This is like that.
There is no cop-out. I use git, with pleasure and gratitude, for everything, because of its excellent design and performance. It appeals to my intuition, and I like the interface. You are entitled to your opinion, but not to confuse it with objective reality.
But some types of things have in fact changed, sometimes for seemingly trivial reasons.
For example, the master/main branch shift. Everything broke when that change was made, but it happened and it wasn't a big deal.
I'm not seeing the difficulty here. It seems that a more reasonable interpretation is that git has the type of interface that is hard to learn, but intuitive once learned.
What I don't get is that the UX problem is not hard. I still like the underlying commit tree model a lot.. it's just some commands that are not clear enough (reset has 5 meanings and can backfire quickly). Maybe it's just retrocompatibility inertia.
I recommend you create aliases. I've created 'git-squash', 'git-back' and 'git-update' aliases, and I almost forget they are not part of core git. Combined with scm-breeze, which allows files and branches to be referenced with numbers... it's really fast and powerful.
Please correct me if I'm wrong, but Linus probably didn't plan for git to be adopted by other people than just kernel developers. Because that was what git was created for originally.
Mercurial is still actively developed, and isn't really "dead", although it did lose a lot of popularity and "mindshare".
Projects like Firefox and nginx use it, although others like Vim and Python switched to GitHub. I don't know if Facebook is still using Mercurial internally, but for their public stuff they use GitHub.
I think that if GitHub had supported Mercurial like BitBucket or Google Code it would still be a lot more popular, but ah well...
> # remove the last commit from the master branch
> git reset HEAD~ --hard
I see this all the time, this "commits ON a branch" mentality.
My secret to understanding git is: branches are just addresses, labels, pointers, aliases to commits. A branch is just a label pointing to a commit.
So, you don't "remove (or add) a commit from a branch", you change where a branch points to.
Commits are tree nodes, they have a parent and they can have children. If you point a branch to a commit, you can now think of a branch as that commit and its parent commits.
So, `git checkout master` is just `git checkout <commit>`. But in a smarter way, as git is storing this as a special reference for you. If you do a `git checkout commit-ref` git will warn you that you are now working in a commit tree without a special name (detached).
Oh, I wrongly "committed to master". Ok, just point master back to the previous commit.
Oh, I want master to point to another point in time. Just find the commit reflecting that point in time and `git switch master; git reset <commit ref>`.
Reset should be called "point the current branch to this commit and make the working tree reflect the state of the code at that commit".
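A small sketch of the "branches are labels" view, using a throwaway repo (the branch name `rescued` is made up for the example). Moving master back does not delete the commit; it just stops being labelled:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q -b master
git config user.name "Example" && git config user.email "example@example.com"
for n in a b c; do
  echo "$n" >> file.txt
  git add file.txt && git commit -q -m "commit $n"
done
tip=$(git rev-parse master)
parent=$(git rev-parse master~1)

# "Oh, I wrongly committed to master": point master back one commit.
git reset -q --hard HEAD~1

# The old tip still exists as an object; hang a new label on it.
git branch rescued "$tip"
git log --oneline --all
```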
> ...So, you don't "remove (or add) a commit from a branch", you change where a branch points to.
Well, once committed, the corresponding deltas to parent are part of the branch (that is a node of the branch graph). Thus, if user wants to have this commit in a different graph, then the deltas need to be regenerated and recommitted (then deleted from the wrong graph).
As for the branch name, in Git it's just a label for the graph leaf. However, some SCMs maintain the branch name for all constituent nodes in the branch graph.
If you are talking about the thing that git calls a branch, it's literally just a pointer to a commit. You can look it up manually. Here is a random repo I have on my PC:
In principle all these objects are in the .git/objects folder. But the encoding is some binary thing for branches, and anyway it's often stored differently (I assume for space saving reasons). But git can explain what every object is.
The git software just updates the branch to create the illusion that it's not just a pointer to a commit. But really, it's literally a file with the sha1 hash of the commit.
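For anyone who wants to poke at this themselves, here is a hedged sketch in a throwaway repo. It assumes the default "files" ref storage and a ref that has not been packed; otherwise the plain file may not be there, which is why the `cat` is guarded:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q -b master
git config user.name "Example" && git config user.email "example@example.com"
echo hello > file.txt && git add file.txt && git commit -q -m "first"

# HEAD names a branch...
git symbolic-ref HEAD

# ...and the branch resolves to a commit hash.
git rev-parse master

# With loose refs, that hash is literally the content of a small text file.
if [ -f .git/refs/heads/master ]; then
  cat .git/refs/heads/master
fi

# Ask git what kind of object the branch points at.
git cat-file -t master
```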
> Well, once committed, the corresponding deltas to parent are part of the branch (that is a node of the branch graph).
Not sure what you mean by deltas here, but if you are talking about changes, git doesn't store changes; each commit references the state (the content) of the whole repository.
The diffs are just a representation calculated for you.
Like: `git show <commit B>` shows a diff, but actually what happens is that git calculates the difference between all of the repository's files as they were (their contents) in commit A vs their contents in commit B.
Git does that in a very efficient way, but commits are actually snapshots.
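A quick way to see the snapshot-vs-diff distinction, sketched in a throwaway repo with two made-up files:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q -b master
git config user.name "Example" && git config user.email "example@example.com"
echo a1 > a.txt && echo b1 > b.txt
git add a.txt b.txt && git commit -q -m "commit A"
echo a2 >> a.txt && git commit -q -am "commit B"

# The commit object references one tree (the full snapshot) plus its parent.
git cat-file -p HEAD

# The snapshot of commit B lists every file, including the untouched b.txt.
git ls-tree -r --name-only HEAD

# The diff is computed on demand by comparing two snapshots.
git diff --name-only HEAD~1 HEAD
```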
For anyone who is hesitant to spend time learning the git cli like I was, just spend an hour on this interactive tutorial [1] and I guarantee you all of these solutions would have come to you easily.
If anyone is curious: git diff --cached is the same as git diff --staged. I don't know why they aliased them but I run git diff --cached very frequently and after seeing this post's reference to --staged I wondered what the difference was. Turns out there is no difference.
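If you want to convince yourself, a tiny check in a throwaway repo (file name invented for the example) compares the two outputs directly:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q -b master
git config user.name "Example" && git config user.email "example@example.com"
echo one > file.txt && git add file.txt && git commit -q -m "first"

# Stage a change so there is something to show.
echo two >> file.txt && git add file.txt

cached=$(git diff --cached)
staged=$(git diff --staged)
if [ "$cached" = "$staged" ]; then
  echo "identical output"
fi
```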
I use --cached as well but I think I will try switching to --staged since that name makes more sense to me. I similarly have gotten away from checkout in favor of switch/restore.
Or you can spend some time once to understand how git works, and never run into a similar situation again. git is built on a beautiful and very simple model, and it's easy to unbreak the repository 99% of time if you understand it (and I'm saying this as a very mediocre programmer).
It's not going to help when one fucks up a complicated merge/rebase and can lose a ton of work if something goes wrong. Understanding what you're doing goes a long way.
I've seen this a couple of times at $DAYJOB when someone doesn't understand how rebase works, smashes keys until it looks like they've achieved their goal, and then I have to ssh into their box and try to unravel the damage they've done, hoping that reflog has not been GC'd yet and there's something to revert to.
A 1000x this. I'm unfortunately that guy. I've learned never to use rebase and to copy and paste important code to a separate text editor (or use my IDE's separate undo cache) before trying a complex merge, and then just trying all over again if I fuck up.
That's easier than trying to get git working right, lol. It feels like every git command is a PhD rabbit hole. All I know is that if I screw something in git, trying to fix it will just screw it up worse. Reset early and often and it usually works in the end. Lol, it's terrible...
I've gotten out of sync with the remote where that would still fail and decided to just reclone. I think the reason it happened is git-history purists rewriting and pushing.
That just resets to your last local commit. To sync back up to the remote, you need to at least do a git reset origin/HEAD --hard (and capitalization matters, I don't know why).
But that doesn't work if you rewrote history in some way (I'm not sure why that's even possible). In that case your local git can get pretty messed up, some gui tools (IntelliJ) start to bug out and fail to diff properly, and it's easier to just start over.
After ten years of using git, I'm more confused by it than ever. People here keep talking about mental models, but I've read a shit ton of docs on it and am still totally confused. Probably some of us just aren't smart enough lol.
Not sure what origin/HEAD is supposed to point to. HEAD is a ref that points to the tip of the current branch. It's not going to be available for remotes (it just doesn't make sense).
To reset the current branch, you need to do two things:
$ git fetch
to fetch commits from the default remote and point remote branches accordingly, and
$ git reset --hard origin/master
to reset HEAD to the remote branch. Substitute master for any other remote branch if you wish.
That's it. I don't know if this is more difficult than re-cloning from scratch (especially if you're doing frontend and then have to reinstall node_modules and such…)
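The two steps can be tried end to end without a network. This sketch fakes the remote with a local directory called `upstream` (an assumption for the example; any remote works the same way):

```shell
base=$(mktemp -d) && cd "$base"
git init -q -b master upstream
git -C upstream config user.name "Example"
git -C upstream config user.email "example@example.com"
(cd upstream && echo v1 > app.txt && git add app.txt && git commit -q -m "v1")

git clone -q upstream local && cd local
git config user.name "Example" && git config user.email "example@example.com"

# Local history diverges from the remote...
echo "local experiment" >> app.txt && git commit -q -am "local experiment"
# ...while the remote moves on.
(cd ../upstream && echo v2 >> app.txt && git commit -q -am "v2")

# Step 1: update origin/master. Step 2: point the current branch at it.
git fetch -q
git reset -q --hard origin/master
```

Note that the local commit is dropped from the branch by the reset (it stays reachable through the reflog for a while).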
I don't know what the difference is, honestly :( I believe you, but I see both online and nobody seems to know the difference and it seems to work. Shrug? Most of my git life is copy pasting somebody else's commands because the model of it is really too complex for me. I've given up on caring... it's easier and faster to rewrite some code than deal with git's commands.
Same thing for node modules, lol. Delete the folder and try again. If that still doesn't work, delete the lock file and try again.
It's a terrible practice, I know, but it's the only thing that actually seems to work in my experience.
EDIT: Apparently origin/master points to a specific branch. origin/HEAD points to the top of the "default" remote branch, which is often, but not always, master. Or something like that. I don't know for sure.
Two things to make sense of how origin/HEAD is working.
First is that HEAD always points to the commit that you currently have checked out. So, for example, if you have the master branch checked out, HEAD will be whatever is the latest commit in master in your local copy of the repository.
Second is that to git there is no difference between the originating repository and your own. If you had your copy of the repo fully exposed, the remote machine that you cloned it from could push changes to your copy exactly the same way that you push changes to the remote machine.
That second part is important because it means that the remote machine you're pulling from also has its own HEAD pointer. When you reset your state to origin/HEAD, you're telling git to set your own HEAD to point to the same thing as the remote machine's HEAD. This is very likely to be the same as asking it to reset to origin/master because HEAD is probably pointing to master on the remote machine.
The reason it's likely that HEAD=master is that in most circumstances the repo on the remote machine isn't manually being touched. When you first set up a repo there is only a single branch so that is what HEAD points to. If no one ever logs into the machine and executes a checkout command, that's what the server is going to continue to point to.
However since there's no guarantee that HEAD=master, you shouldn't rely on that and instead always use the actual branch name.
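To see what origin/HEAD resolves to in a given clone, something like this sketch can help. It again uses a local directory named `upstream` as a stand-in remote and a made-up `dev` branch:

```shell
base=$(mktemp -d) && cd "$base"
git init -q -b master upstream
git -C upstream config user.name "Example"
git -C upstream config user.email "example@example.com"
(cd upstream && echo v1 > app.txt && git add app.txt && git commit -q -m "v1" && git branch dev)

git clone -q upstream local && cd local

# origin/HEAD is a symbolic ref stored in your clone, naming the remote's HEAD branch.
git symbolic-ref refs/remotes/origin/HEAD    # refs/remotes/origin/master

# If the remote's HEAD moves, the local copy can be refreshed explicitly.
git -C ../upstream checkout -q dev
git remote set-head origin --auto >/dev/null
git symbolic-ref refs/remotes/origin/HEAD    # refs/remotes/origin/dev
```

Which is one more reason to name the branch you actually mean, for example origin/master, when resetting.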
Thank you, that clears it up! And probably explains why I've lost work when resetting other branches; I wanted to revert to the latest remote tip of their branch, not reset that branch to master, but I screwed up because of HEAD. Thank you for clarifying!
Even though I understand git really well it’s quicker to just blow it away and start again rather than accidentally make it worse realising you didn’t understand it as well as you thought you did.
Slightly orthogonal but a less destructive way of 'Fuck this noise' is to use worktrees. They were a game-changer for me. I often have several tasks on the go and it's really useful to have different branches in different directories and not have to stash changes if I need to quickly switch to a different task.
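A minimal worktree sketch, assuming a throwaway repo in a directory called `main` and a made-up `hotfix` branch checked out in a sibling directory:

```shell
base=$(mktemp -d) && cd "$base"
git init -q -b master main && cd main
git config user.name "Example" && git config user.email "example@example.com"
echo v1 > app.txt && git add app.txt && git commit -q -m "v1"

# Half-finished work stays exactly where it is...
echo "half-finished" >> app.txt

# ...while a second working directory gets its own branch.
git worktree add -q -b hotfix ../hotfix
(cd ../hotfix && echo fix > fix.txt && git add fix.txt && git commit -q -m "hotfix")

git worktree list
git status --short
```

Both directories share one object store, so the hotfix commit is immediately visible from `main` without any fetch.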
> [git reflog … git reset…] You can use this to get back stuff you accidentally deleted
Pro tip - this is true for commits, but not for accidentally dropped stashes. This is why it’s better to commit first, branching if necessary, than to stash.
“If you mistakenly drop or clear stash entries, they cannot be recovered through the normal safety mechanisms. However, you can try the following incantation to get a list of stash entries that are still in your repository, but not reachable any more [git fsck…]”
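For reference, the fsck incantation from the git-stash documentation can be rehearsed safely in a throwaway repo. This is only a sketch: it assumes the dropped stash has not been garbage collected yet and that only one matching entry exists:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q -b master
git config user.name "Example" && git config user.email "example@example.com"
echo base > file.txt && git add file.txt && git commit -q -m "base"

# Stash some work, then "accidentally" drop it.
echo "precious work" >> file.txt
git stash -q
git stash drop -q

# Stash entries are merge commits whose message starts with "WIP on ...",
# so look for unreachable commits of that shape.
lost=$(git fsck --unreachable 2>/dev/null | grep commit | cut -d' ' -f3 |
  xargs git log --merges --no-walk --grep=WIP --format=%H)

# Apply the recovered entry like any other stash.
git stash apply -q "$lost"
```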
> Oh Shit! I accidentally committed to the wrong branch! [git reset … git stash …] A lot of people have suggested using cherry-pick for this situation too
Because of the above warning about stash, cherry-pick is indeed a bit safer and more easily recoverable if something goes wrong when moving to the other branch. This particular situation isn’t dire given the premise that you already committed, so the reflog can be used. The situation where it’s more important is if you’re sitting in the wrong branch, have uncommitted work, and git won’t let you switch branches first. In that case, committing first into the wrong branch is recommended over stashing and switching branches, even though it’s slightly more work. I’ve actually watched people mess this up and then get frustrated with the magic incantation fsck stuff and give up and spazz and nuke their repo instead, while shouting “wait! no no no no…” over their shoulder.
Stash is convenient sometimes, but never necessary, always less safe, and there are always commit flow alternatives. This is why I avoid it myself.
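The commit-first flow could look roughly like this in a throwaway repo (the `feature` branch and file names are invented for the example):

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q -b master
git config user.name "Example" && git config user.email "example@example.com"
echo base > app.txt && git add app.txt && git commit -q -m "base"
git branch feature

# Work lands on the wrong branch: commit it anyway so it is safely recorded.
echo feature > feature.txt && git add feature.txt && git commit -q -m "feature work"
oops=$(git rev-parse HEAD)

# Copy the commit to where it belongs...
git switch -q feature
git cherry-pick "$oops" >/dev/null

# ...then move the wrong branch back one commit.
git switch -q master
git reset -q --hard HEAD~1
```

At every step the work exists in at least one commit, so the reflog can rescue it if something goes wrong.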
My worst git disaster was when I attempted to, iirc, stash a half-completed merge. Maybe I tried to pop the stash; I can't really remember why this seemed like a thing I should do. I've always felt like stash was made of gum and duct tape, and whatever sequence of operations I performed completely destroyed the repo. It's one of the very few times I've ever deleted the directory and started afresh, past the first few weeks of learning git.
so I use that instead. Only instead of an anonymous name, I pick something sensible in case I get distracted and return to the work later. Also, it means I don't ever need to commit to the wrong branch -- `switch -c` carries uncommitted changes along. Sometimes cherry-pick isn't fine-grained enough, and I'll use difftool to distribute changes between two branches.
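A small sketch of the `switch -c` behaviour, assuming a git new enough to have `git switch`. The branch name `fix-login-timeout` and the file are invented for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "you@example.com"
git config user.name "You"

echo base > notes.txt
git add notes.txt
git commit -q -m "base"

# Start editing while still on master...
echo "half-finished idea" >> notes.txt

# ...then create a descriptively named branch; the edit comes along.
git switch -q -c fix-login-timeout
git commit -q -am "WIP: login timeout"
```

master never sees the commit, so there is nothing to move afterwards.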
This is my main reason to use IntelliJ - it tracks all changes without having to commit them. There must be a VS Code plugin to do the same, but I don't know of it.
For someone to get out of a mess they would have to know that they caused a mess in the first place! I'm starting to think trying to get people to use rebase and squash is a losing battle when they frequently just merge without pulling.
I got my team on side to use rebase/squash instead of blind merge commits after I showed them how much easier git bisect is to use when you have a linear commit history in your main branch. Now nobody wants to be 'that person' who breaks the bisect feature in case we urgently need it.
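To illustrate the bisect point, a rough sketch on a made-up linear history of six commits, where the fourth one introduces the "bug". The file names and the pass/fail check are stand-ins for a real test suite.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "you@example.com"
git config user.name "You"

# Linear history: commits 1-3 are fine, commit 4 breaks things.
for i in 1 2 3 4 5 6; do
  if [ "$i" -ge 4 ]; then echo fail > status.txt; else echo pass > status.txt; fi
  echo "$i" > counter.txt
  git add .
  git commit -q -m "commit $i"
  if [ "$i" -eq 4 ]; then culprit=$(git rev-parse HEAD); fi
done

first=$(git rev-list --max-parents=0 HEAD)

# Mark HEAD as bad and the first commit as good, then let git drive the search.
git bisect start HEAD "$first" >/dev/null
git bisect run sh -c 'grep -qx pass status.txt' >/dev/null

first_bad=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1
git log -1 --format='first bad commit: %s' "$first_bad"
```

With a linear history every step of the search lands on a commit that was once the tip of the main branch, which is what makes the result easy to trust.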
Also useful, if you need to backdate commits for whatever reason, are the GIT_AUTHOR_DATE and GIT_COMMITTER_DATE environment variables: when you run git-commit, they override these fields in the commit with whatever date, time and timezone you specify. I use this sometimes when I'm making some previously private work public, and am redoing the commit history to make more logical sense to others who may read it.
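Something like the following, as a sketch. The date, file and commit message are arbitrary.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "You"

echo hello > README
git add README

# Set both variables for this one command only (the date is arbitrary).
GIT_AUTHOR_DATE="2020-01-15 10:30:00 +0100" \
GIT_COMMITTER_DATE="2020-01-15 10:30:00 +0100" \
git commit -q -m "Initial import"

# Show both dates as recorded in the commit.
git --no-pager log -1 --format='author:    %ai%ncommitter: %ci'
```

Setting only GIT_AUTHOR_DATE leaves the committer date at the current time, which is what some hosting UIs display.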
Also useful are git-fast-export and git-fast-import, if you really need to delve into the inner details of a commit. For example, I had three separate but related git repos that I needed to merge, so I created a new repo with separate branches to hold each repo, merged everything manually, committed that to a new branch, then used export/import to edit the commit to have the tips of the three other branches as its ancestors. Maybe there's a better way to do this with other git commands but I found it easier just to delve in and edit the commit data manually.
> I had three separate but related git repos that I needed to merge, so I created a new repo with separate branches to hold each repo
Similar story here: at a previous job we had a monorepo with a Rails app and Rails engines extending its appearance and behaviour per customer.
At some point the architecture became problematic and we moved towards a shell app, a core engine, and extension engines depending on the core.
We refactored code to that end, and "forked" the original monorepo into multiple clones, one per component, then stripping the other components in each one, ending up with 1:1 repo/gem/component. This worked for a while, easing a lot of issues we had previously, allowing for proper dependency expression, independent development and releases... Everything was great and we lived happily ever after.
Then much later we hit a snag (I can't exactly recall what that snag was, IIRC it was not technical but organisational). So we looked at options and decided to merge into one single repo again. To that end we could do a big code drop, starting afresh, but (again I can't recall why) there was a need/requirement to keep at least some git history.
But at that point, "some" ends up ~== "all". So I devised a plan.
I git init'd a blank repo, added each one of the repos as separate remotes, and fetched each of them. Thus all git objects of all these repos were present. Then I checked out each of these remotes' master as a separate branch, created a subdirectory with the component name, moved every file for that checkout into that directory, and committed that. This way a) each project could live separately in the new monorepo and b) there would be no conflict for a merge.
Then came time for the merge. Two options: a) perform N merges in succession or b) perform an octopus merge. a) just felt wrong and ugly, so I decided to see if I could work b) out, but I ended up not being able to achieve that with porcelain commands, as git was being too smart and attempted to look into the history, for some reason I can't recall, which produced senseless conflicts (IIRC git merge isn't entirely asymmetric)
So, since merge commits are merely commits with more than one parent, I figured I might be able to do that with plumbing commands instead. So the steps were:
- for each branch, check out content (but without moving the current HEAD, so, actually, export the git tree corresponding to a specific ref/sha)
- add all that to the index
- create a merge commit object with each branch's sha as parent
And it Just Worked, with the bonus that, since the branches shared untouched common parents up until the commit where we forked, the commit history zipped up properly, by git's design, which is really a DAG of commit objects.
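A rough reconstruction of those steps with plumbing commands, for the curious. This is a sketch under assumptions, not the exact commands used back then: the branch names `app`, `core` and `ext` are invented, and the three branches here are unrelated toy histories rather than forks of a common repo.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "You"

# Stand-ins for the fetched component histories: three branches, each
# with its files already moved into its own subdirectory.
for name in app core ext; do
  git symbolic-ref HEAD "refs/heads/$name"   # start a new parentless branch
  git read-tree --empty                        # clear the index
  rm -rf app core ext
  mkdir "$name"
  echo "$name code" > "$name/lib.rb"
  git add "$name"
  git commit -q -m "Move $name into $name/"
done

# Build the combined tree in a scratch index; HEAD and the real index stay put.
export GIT_INDEX_FILE="$repo/.git/octopus-index"
for name in app core ext; do
  git read-tree --prefix="$name/" "$name:$name"
done
tree=$(git write-tree)
unset GIT_INDEX_FILE

# A merge commit is just a commit object with several parents.
merge=$(git commit-tree "$tree" \
  -p "$(git rev-parse app)" -p "$(git rev-parse core)" -p "$(git rev-parse ext)" \
  -m "Octopus merge of app, core and ext")
git update-ref refs/heads/monorepo "$merge"

git --no-pager log --graph --oneline monorepo
```

The scratch index is a convenience of this sketch; the point is only that `commit-tree` accepts as many `-p` parents as you give it and never looks at history to decide what the tree should be.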
My first introduction to git was ... well, the professor's solution to not having a place to upload. Given that the vast majority of the students involved had done almost no programming before, it didn't go so hot. I managed.
My first "production" introduction to git was really about someone polishing their resume and chanting magic words at me. Merges ... happen? Who decides whether Bob or Cindy's code is used here? It just happens okay.
Then I bought some books on git and was pretty unhappy with how arcane the naming was. How is "reflog" the correct and intuitive choice for "undo"? Was there something I wasn't understanding? No, I'm just supposed to accept that.
Thankfully, my current programming gig is simple enough that I don't have to look at git. Either eventually sanity will come to the naming of commands or something else will appear, just as it has for every other tool.
> How is "reflog" the correct and intuitive choice for "undo"? Was there something I wasn't understanding? No, I'm just supposed to accept that.
Actually your initial intuition was correct. There are things you obviously don't understand about git. "reflog" is short for reference log. As "git help reflog" will tell you:
Reference logs, or "reflogs", record when the tips of branches and other references were updated in the local repository.
It isn't arcane. It's a perfectly logical choice for what the "git reflog" command does, as explained in the man page: "This command manages the information recorded in the reflogs."
Now if you don't know what a reflog is, you will be confused by this. But the solution to that is for you to learn what a reflog is. And yes, in order to use git well you need to have at least a rudimentary understanding of how it works.
I know that some people do find git hard to understand. Often that is because they want it to work a different way than it does. But git is a fairly complex tool designed to solve a very complex problem. It does an outstanding job of doing what it was designed to do. However, if you are not willing to invest the time in understanding git to a minimal level (and many people aren't), you will find it to be confusing.
There is no need for "sanity" to "come to the naming of commands" for git. The commands already have sane names. But if you don't know what those names mean, it will seem to you as if they are in a foreign language. Most of the confusion people have with git is due to them having an insufficient level of knowledge of how it works. Again, if you want to use git well, you have to gain that minimal knowledge. If you are not willing to do that, then you should either become willing or choose to use some other version control program that is more to your liking.
You have made my point for me. "This command manages the information recorded in the reflogs." WHAT?
Of course everything has an obvious name if you have to learn everything about it. The point I am making is that the name ought to be obvious before you have to learn everything about it. "Undo" is a reasonable choice if you barely understand what git is for and "reflog" is not. Once you master a system and agree to all of its axioms and warts, it's all logical from the inside. That's true of almost any system, though.
There is a minimum level of knowledge you need to know to use git. It is not as straightforward as, say a simple text editor. The problem git is solving is more complex than that, and therefore understanding how to use it requires investing a significant amount of time in learning how it works.
If I was going to go back to my previous commit, I would use "git reset" to go back to the commit prior to the one I just committed. The only reason to use "git reflog" as part of that process is to see which commit was prior to the one you are working with now (in order to pass it to "git reset" to undo the changes).
> "Undo" is a reasonable choice if you barely understand what git is for and "reflog" is not.
But "git reflog" is not an undo mechanism. It can be used to determine a commit SHA to reset to (in order to 'undo' the last commit with "git reset"), but it is used for a bunch of other things also. It is appropriately named for what it does.
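To make that concrete, a minimal sketch in a scratch repo: the reflog is only consulted to find where HEAD used to be, and "git reset" does the actual undoing. File and commit names are made up.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "You"

echo one > f
git add f
git commit -q -m "first"
echo two > f
git commit -q -am "second"
good=$(git rev-parse HEAD)

# Oops: throw away the last commit.
git reset -q --hard HEAD^

# The reflog still records where HEAD has been; HEAD@{1} is the state
# just before the reset above.
git --no-pager reflog

# Point the branch back at that earlier state.
git reset -q --hard "HEAD@{1}"
```

In real use you would read the SHA or the HEAD@{n} entry off the reflog output rather than assume it is entry 1.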
If you "barely understand what git is for" then the solution is to learn the minimum you need about git in order to use it effectively. I am not talking about mastering git, that's an entirely different topic. I'm talking about basic-level knowledge. Again, the required initial time investment for git is significant, but in my opinion it is worth it.
If you don't feel the investment is worth it, then you should use something else.
No. You are being deliberately obtuse. There is no way to argue that "reflog" is something that is intuitive or that should be known by somebody who just wants to commit on branches and merge them sometimes. Which is what most people need git for.
If the minimum level of knowledge to operate a car is to know what every piece of it does and how to reassemble it from scratch the designer of said car hasn't been doing his job.
This is exactly the problem with git. You wanted 2 branches and now you have to understand that the underlying system is actually a filesystem cause that's what Torvalds was used to.
Also, if you don't want to use "git reflog" at all for "undoing", you could undo your previous commit with:
git reset HEAD^
Or if you also want to discard the changes you made to files in the repo at the same time:
git reset HEAD^ --hard
But use the --hard option wisely. Be sure you really don't want to keep any of the changes you made (or that you have already saved them elsewhere before running it).
I avoid using git command line for all the reasons above
Personally I use a UI for git which basically solves all of these problems. All the branches and commits are visible. If you want something somewhere, right click on it and you'll get all the available options. Nothing to memorize and everything is available!
My own favourite after testing out a few is SmartGit, but it's paid so not for everyone. There are lots of free git UIs out there as well, but what I like about SmartGit is that it's completely full featured - every obscure git command is available somewhere - so I never need to use the git command line when on my local machine, not even for obscure things like resets, rebase, cherry pick, squashed commits, etc., you name it.
Also SmartGit is cross platform so I can use it anywhere
I generally don't like putting linters on commit hooks. Commits are easy to change and I think people would be better off making more of them. If your lint ends up taking a long time it will discourage frequent commits.
Also, unless you jump through a lot of hoops, linters generally run against what's on-disk instead of what's actually going into the commit. So checks that run at commit time discourage partial adds.
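One of those hoops, sketched roughly: export the index to a scratch directory and point the linter at that instead of the working tree. The file name is made up, and `wc -l` stands in for whatever linter a real hook would run.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "You"

printf 'ok line\n' > script.sh
git add script.sh
git commit -q -m "base"

# Stage one change, then keep editing without staging (a partial add).
printf 'ok line\nstaged line\n' > script.sh
git add script.sh
printf 'ok line\nstaged line\nunstaged line\n' > script.sh

# Export exactly what is in the index to a scratch directory.
staged=$(mktemp -d)
git checkout-index -a --prefix="$staged/"

# A real hook would run the linter on "$staged" here.
echo "lines on disk:   $(wc -l < script.sh | tr -d ' ')"
echo "lines in commit: $(wc -l < "$staged/script.sh" | tr -d ' ')"
```

This costs a full export on every commit, which is part of why it tends to feel like a hoop.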
I dislike that approach as it slows down committing. Preferably linter should be integrated in the editor and there should be one in the CI pipeline, but the steps between writing code and pushing it into CI should be as quick and smooth as possible.
I’ve been using git command line for over a decade. I know the vast majority of options and know how to bail myself out when things go awry. I’m the guy that people come to when they bork a rebase.
I still prefer to use Sublime Merge. I’ve tried almost every git GUI out there and that’s the one that “stuck”, so I rarely use the CLI. It’s the best I’ve ever used, and it takes so much pain out of coming up with all the crazy command lines and all the switches.
I think the steps under https://ohshitgit.com/#accidental-commit-master are wrong: it reverts the commit on the new branch -NOT on the master. This is because git branch auto checks out the new branch. You need to do
git branch NEWBRANCH
git checkout master
git reset --hard HEAD^
If you want to continue working on the new branch you do
>The command’s second form creates a new branch head named <branchname> which points to the current HEAD, or <start-point> if given. [...] Note that this will create the new branch, but it will not switch the working tree to it; use "git switch <newbranch>" to switch to the new branch.
Which is correct, assuming master is already checked out.
0. prior state is that we're on master and have committed something that should have been on a branch
1. create a new branch that is identical to master (i.e., contains the commit) -- note this does NOT checkout the new branch (`git checkout -b some-new-branch-name` would do that)
2. reset current branch (master) to point at the commit before (i.e., strips the commit from master)
3. checkout the new branch to continue work there
At the end, master doesn't contain the commit anymore, and the new branch does. It's all correct.
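The same steps as a runnable sketch in a scratch repo, in case anyone wants to check for themselves. The file and commit messages are made up; the branch name is the one used above.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "you@example.com"
git config user.name "You"

echo base > app.txt
git add app.txt
git commit -q -m "base"

# 0. a commit lands on master that should have been on a branch
echo feature >> app.txt
git commit -q -am "feature work"

# 1. new branch at the same commit; HEAD stays on master
git branch some-new-branch-name

# 2. strip the commit from master
git reset -q --hard HEAD^

# 3. continue on the new branch
git checkout -q some-new-branch-name

git --no-pager log --oneline --all --decorate
```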
I second the recommendation to use git rocket filter [1] from this link.
A guy in our team committed a big file unnecessarily into the repo which already had a year's history behind it, and it was only a month later when I found out. git rocket filter filtered it in a few seconds.
Since all team members had already gotten it, I ran it on all their repos to verify it's gone, and ran the usual git gc incantation [0] to clean their repo.
Came here to post this - better than the link under discussion above. Stepping through a problem interactively helps a ton for diagnosing the right steps to fix something!
Git is just a tool, or better, a toolset for software development.
Saying that it is too hard is, to me, a huge red flag. It means you do not want to spend the time, or the effort, to learn it.
It's like a mechanic saying that a wrench is too complex.
I think your analogy is a bit reductionist. The difference between git and a wrench is that the wrench presents almost all of its intended uses. I can quite confidently say that most people will understand it within a few minutes.
Git is like CSS in the sense that experienced users will sometimes forget the difficult learning process they went through, e.g. "oh use margin auto to centre the div"
I've been there, finding git too hard and giving up, but I came back to it and eventually got past that uncomfortable learning bump.
It would be nice to see something like this for Magit.
Also, the fact that this is so complicated and that this document is as long as it is is strong evidence that git is just poorly designed from a UI standpoint.
I liked how the Japanese translation of the page retains the two versions, but the language difference made the variants pretty much pointless, because there are hardly any "profane" words in Japanese that are offensive enough to warrant censoring by their utterance alone. (There do exist a group of offensive-enough words that relate to class/race discrimination, but no equivalent of a censor-worthy 'shit' nor 'fuck'.) The 2 Japanese translations are merely written as one mildly colloquial version vs another very-slightly-less-colloquial version.
While I'd like to think that this is _somewhat_ useful, I am a little hesitant. The issue I see with these bite-sized recipes is that there is no context, no place for nuance, and no hint that the behavior you see might be different for a variety of reasons.
Take the following for example:
> Oh shit, I need to change the message on my last commit!
> git commit --amend
It's important to realize here that if you are simply trying to edit the last commit message, you *should not* have anything in your index (that is, staged). Otherwise those changes will be recorded in the amended commit! What Git does is essentially move all the changes recorded in the commit you are amending _into_ the index, and then run `git commit -m <amended-message>` ... so if you have files in there, those will get mixed up with the ones in the commit.
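A quick sketch of that pitfall in a scratch repo, with made-up file names:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "You"

echo a > a.txt
git add a.txt
git commit -q -m "Add a.txt (tpyo)"

# Something unrelated is already staged...
echo b > b.txt
git add b.txt

# ...and the "message-only" amend silently takes it along.
git commit -q --amend -m "Add a.txt"

git --no-pager show --stat --format=%s HEAD
```

Running `git diff --cached --quiet` first is a cheap check: it exits non-zero if anything is staged.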
Here's another one:
> Oh shit, I accidentally committed to the wrong branch!
> A lot of people have suggested using `cherry-pick` for this situation too, so take your pick on whatever one makes the most sense to you!
Umm ... No! The solution proposed (with `git reset --soft`) and a cherry-pick are NOT the same! Not even close! You will produce two completely different histories.
This final one, given when this page was written, _may be_ understandably incorrect
My book, Head First Git, was published by O'Reilly this January. I posted a submission here on HN about it https://news.ycombinator.com/item?id=30072348 so if you want any details feel free to peruse that.
It always amuses me that when Linus presented git to “the Google audience” they objected to the complexity. Those people are in fact elite, but measured against what? The TailScale people? The TS people are ex-Google in some places and early-Nix in others, either way no one to fuck with.
I’m frustrated if it takes me a day to work out something that Jeff Dean mentioned to, uh, a friend.
@bradfitz works on TailScale, there’s also a reason he’s a chapter in a book. @jwz is better, by so little you’d never notice, so his chapter is cooler.
It’s admittedly a snippy, snarky, kind of cryptic comment and I probably deserve more downvotes than I got on it.
In a more measured tone: there is, in my opinion, a kind of creeping anti-intellectualism that’s been slowly-but-surely gaining ground on HN for the last 5-ish years.
We used to just really openly admire and respect iconic pros in this business, we used to openly acknowledge that some of the software work we talk about is pretty friggin elite and few if any of us will ever even work on some of it.
Whether it’s Linus or the TailScale pros or John Blow, I’ve seen people get gang-tackled by the “no one ever uses this LeetCode stuff in a real job” crowd repeatedly in the last week.
Knowing how to use “git reflog” to do surgery on a fucked-up repo isn’t necessary every day, but when you need it, you need it bad, and there’s nothing outdated about knowing how the damned tools work.