
Ten Years of Git: An Interview with Linus Torvalds - lclark
http://www.linux.com/news/featured-blogs/185-jennifer-cloer/821541-10-years-of-git-an-interview-with-git-creator-linus-torvalds
======
jordigh
Let's not forget the other contender for replacing Bitkeeper, Mercurial:

[http://lkml.iu.edu/hypermail/linux/kernel/0504.2/0670.html](http://lkml.iu.edu/hypermail/linux/kernel/0504.2/0670.html)

We will also be celebrating Mercurial's 10th anniversary next week during the
3.4 Pycon sprint:

[http://mercurial.selenic.com/wiki/3.4sprint](http://mercurial.selenic.com/wiki/3.4sprint)

~~~
tjradcliffe
I used Git for a couple of years then changed to Mercurial and have never
looked back, except now and then to wonder at the size of the crowd that has
gathered around Git, driven in no small part by the success of github (which
is not a bad thing!)

Git is an entirely appropriate tool for managing something as large and
complex as the Linux kernel, but I'm doing much simpler stuff, and Mercurial
is simpler and has a much shallower learning curve. I'd recommend it
unequivocally to teams of modest size working on average applications.

The state of git documentation is much better today than it was five or six
years ago, so maybe its complexity is less of a big deal, but so far I've not
encountered any issues with Mercurial that make me think "If only I was using
git this would be easy!"

What I found with git was I had to maintain a fairly complex mental model of
the current state of affairs, and because I'm not very intelligent that took
quite a bit of effort. With Mercurial the model is much simpler, so I can
spend more of my very limited attention on writing code. It's quite possible
that I simply never took the time to learn git properly, but with Mercurial I
didn't have to.

~~~
imakesnowflakes
Mercurial is not only simpler, but also has got really powerful features like
revsets.

[http://www.selenic.com/hg/help/revsets](http://www.selenic.com/hg/help/revsets)

I have tried to switch to git multiple times. Every time, I keep coming back
to Mercurial. The most difficulty I have with git is the nonsensical
naming of concepts. For example, a branch is a pointer to a commit. Because of
this, I experience a big mental block when I try to reason about something.
With Mercurial this is very much easier. And if you are using a DVCS in any
capacity you'll have to do this often.

Another great thing about Mercurial is how easy it is to get help. You
can head to the IRC chat room and can have a very good chance of catching one
of the developers who, in my experience, were very helpful...

~~~
stormbrew
> The most difficulty I have with git is the nonsensical naming of
> concepts. For example, a branch is a pointer to a commit.

This is a frustrating thing about how people tend to talk negatively about
git. I think what you mean is "not named in a way I'm used to," because
there's nothing nonsensical about this naming at all, and it's actually an
incredibly simple and lightweight way to reason about naming things in version
control (imo, I suppose). And it permeates basically every level of how git
organizes information in a pretty darn uniform way.

To me mercurial's two kinds of built-in branches and several plugins to do
branch-like things is much more complex, but I still wouldn't call it
nonsensical. That'd just be admitting that I stopped thinking about it when it
was strange and unfamiliar to me.

~~~
imakesnowflakes
>I think what you mean is "not named in a way I'm used to,"..

Of course, that is the whole point of naming something sensibly. Just, for
example, imagine how hard it would be if you had to learn/work in a version
of C that calls pointers 'branches'.

>To me Mercurial's two kinds of built-in branches and several plugins to do
branch-like things is much more complex, but I still wouldn't call it
nonsensical.

I don't know what you mean by 'several plugins to do branch like things'.
Isn't it more frustrating that people complain about a tool because they have
to enable some advanced stuff by configuration? But I agree that Mercurial is
actually more complex than Git, because it provides more options to the user.
So the complexity of Mercurial is a side effect of it being more powerful IMHO.

So my point is that naming a 'pointer to a commit' a branch is
nonsensical because it goes against our notion of the word 'branch' from real
life. It adds an unnecessary burden for a human being trying to think in the
language of the tool, without actually making the tool more powerful. Git
could have been as powerful as it is now even if it had named things better,
right?

~~~
stormbrew
> Of course, that is the whole point of naming something sensibly. Just,
> for example, imagine how hard it would be if you had to learn/work in a
> version of C that calls pointers 'branches'.

Yes, if you rename <arbitrary thing> to <other arbitrary thing>, it is very
likely to result in nonsense. This is not such a case.

A branch in git is literally a name for a _branch of the DAG that is the
commit tree_. It's possibly the least abstract interpretation of the concept
there is. There is nothing nonsensical about it.

But the objection seems to be that you don't _work with it_ like in svn or hg
or p4 or cvs (which are also all different from each other), which does not
make it nonsensical, merely unfamiliar. This is the distinction I'm driving
at.

~~~
imakesnowflakes
>But the objection seems to be that you don't work with it..

The objection is that when naming is skewed at the lower levels, it becomes
harder to reason about higher-level concepts, and that creates ambiguity.

For example, let us continue with the idea of a 'branch'...

Suppose you define a branch as 'a set of commits'. Then it is easy to imagine
a 'remove' operation on a 'branch' will remove all the commits in that
'branch'. There is no ambiguity.

But when you define a branch as 'a pointer to the last commit in a consecutive
set of commits', it is no longer clear what a 'remove' operation on a branch
is supposed to do.

Does it simply delete the pointer, in which case the commits will remain
untouched? But that would not be consistent with the abstraction of the
'branch'.

Does it remove all the child commits from the history, in which case the
operation would be consistent with the abstraction of a 'branch'?

Now the remove operation is ambiguous as to what it actually does.

Note that there would be no ambiguity if the user did not know that a branch
is actually a pointer to a commit, because git hides all the change-sets that
are not reachable from a branch. So a user's concept of a 'branch removal' is
maintained. But I think it is pretty well accepted that using git requires
that you know stuff like this...

So another way of looking at the problem is that, Git forces the user to work
at multiple levels of abstraction simultaneously. And this creates ambiguity
because the user wouldn't know which level of abstraction to use when
reasoning about something, and defeats the whole point of having abstractions
in the first place IMHO.

~~~
bad_user
> _Then it is easy to imagine a 'remove' operation on a 'branch' will remove
> all the commits in that 'branch'. There is no ambiguity._

The problem is not with Git's use of pointers, but with your own thinking.

One should not be able to "remove" commits, because any operation should be
undo-able. In git removing commits implies updating a pointer. And if those
commits end up not being referenced by anything else, then they'll get garbage
collected. What git does is very close to how persistent data-structures work.
And many people complain about it just because it's unfamiliar.

And in your example, of course there's ambiguity, how can it not be? What
happens with the branches that are forked from your branch? That's the
definition of ambiguity right there.

> _So another way of looking at the problem is that, Git forces the user to
> work at multiple levels of abstraction simultaneously._

In my experience, the problem with Git is that people don't bother to read
documentation for a tool that they are using every day.

~~~
imakesnowflakes
>What happens with the branches that are forked from your branch?

Care to elaborate? Do you mean when removing a branch, what happens to
forked/child branches? There is no ambiguity: a change set cannot exist
without its parent and cannot be moved to a different parent without changing
its identity. (The revision hash is a function of its ancestors too.) So if
you remove a branch, the forks/child branches will be removed as well.

Anyway, that was just a made up example to show how naming can affect
reasoning. I don't think Mercurial or Git allows you to delete branches
directly....

~~~
Manishearth
Since Git is graph based, there's no guarantee that removing the branch's
commits will work, since there might be other branches using them.

Of course, if you delete a branch and then run git prune, its commits should
disappear as long as they weren't part of another branch.
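
To make the pointer model concrete, here is a minimal Python sketch. It is a toy, not git's actual object format or its real collection logic: the commit ids, the `reachable` helper, and the `collect_garbage` helper are all made up for illustration, and real git has more roots than branch tips (tags, reflogs, and so on).

```python
# Toy commit graph: each commit id maps to the list of its parent ids.
commits = {
    "a": [],       # root commit
    "b": ["a"],
    "c": ["b"],    # only reachable from "feature"
    "d": ["b"],    # tip of "master"
}

# A branch is just a name pointing at one commit.
branches = {"master": "d", "feature": "c"}


def reachable(commits, branches):
    """Return every commit id reachable from some branch tip."""
    seen = set()
    stack = list(branches.values())
    while stack:
        commit_id = stack.pop()
        if commit_id not in seen:
            seen.add(commit_id)
            stack.extend(commits[commit_id])
    return seen


def collect_garbage(commits, branches):
    """Return a copy of the graph without commits that no branch can reach."""
    live = reachable(commits, branches)
    return {cid: parents for cid, parents in commits.items() if cid in live}


# Deleting a branch only removes the pointer; the commits are untouched.
del branches["feature"]

# A later collection pass drops "c", while shared history ("a", "b") survives.
survivors = collect_garbage(commits, branches)
print(sorted(survivors))
```

In this toy, "a" and "b" survive because "master" still reaches them, which is the "as long as they weren't part of another branch" condition.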

------
henrik_w
"The trick wasn't really so much the coding but coming up with how it
organizes the data."

I think this really is the key to understanding git as well. When you
understand the git data structures, git makes sense. Otherwise, it can be
quite difficult to grasp
([http://ftp.newartisans.com/pub/git.from.bottom.up.pdf](http://ftp.newartisans.com/pub/git.from.bottom.up.pdf)
was really useful for me).

~~~
krylon
Not only git. There is a nice quote from Fred Brooks Jr. (I think), that if
you show me your flow charts and algorithms, I remain as clueless as before,
but show me your data structures, and everything else will follow. (I am
paraphrasing this from memory, Brooks put it more eloquently.)

~~~
henrik_w
Yes! I love that quote - so true. Linus has the same opinion:

"Bad programmers worry about the code. Good programmers worry about data
structures and their relationships." [1]

I actually referred to this when discussing switching from Java to Python. One
of the biggest problems for me was not easily being able to see the types of
the arguments to a function in Python. [2]

[1] [http://programmers.stackexchange.com/questions/163185/torval...](http://programmers.stackexchange.com/questions/163185/torvalds-quote-about-good-programmer)

[2] [http://henrikwarne.com/2014/06/22/switching-from-java-to-pyt...](http://henrikwarne.com/2014/06/22/switching-from-java-to-python-first-impressions/)

~~~
squeaky-clean
Off topic from the article, but this is something I literally just spent my
morning wrestling with and fixing, so I'd like to talk/rant about it a little.
My apologies in advance for rambling.

You can use doc comments to signify types. You mention you use PyCharm in your
article, and it supports doc comments in its autocomplete and code analysis.
I find most popular libraries are commented well enough that PyCharm can
understand them.
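
As a rough illustration of what such a doc comment can look like, here is a sketch using reStructuredText-style `:param:` / `:type:` fields, one of the docstring styles tools commonly read. The `calc_total` function and the way the tip is applied (on top of the taxed cost) are assumptions made up for this example.

```python
def calc_total(meal_cost, tax_percent, tip_percent):
    """Return the total bill in dollars.

    :param meal_cost: price of the meal in dollars, e.g. 44.50
    :type meal_cost: float
    :param tax_percent: tax as a fraction of the meal cost, e.g. 0.0675
    :type tax_percent: float
    :param tip_percent: tip as a fraction of the taxed cost, e.g. 0.15
    :type tip_percent: float
    :rtype: float
    """
    # Assumption for this sketch: the tip is computed on the taxed amount.
    taxed_cost = meal_cost * (1 + tax_percent)
    return taxed_cost * (1 + tip_percent)


total = calc_total(44.50, 0.0675, 0.15)
print(round(total, 2))
```

The docstring carries the type information that the signature alone does not.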

I also always try (and encourage others) to use names that are declarative for
both purpose and type. Of course, you can't always rely on third party
libraries (or even colleagues) to be so nice, but I find a good name (almost)
always removes all the confusion that normally comes from a lack of static
typing.

For example, one of Codecademy's very first Python lessons includes some code
like this:

    
    
        meal = 44.50
        tax = 0.0675
        tip = 0.15
    

Which I think is unclear. Imagine these were function arguments.
calc_total(meal, tax, tip) is vague. Is meal an object, or a numeric value? (I
could possibly see it containing a list of all items in the meal with prices).
Tax is almost always a percentage, but what about tip? Judging by the
above values, we can assume a 15% tip, but we don't know if the customer was
stingy and tipped $0.15, and by name alone, we can't tell at all.

    
    
        calc_total(meal_cost, tax_percent, tip_percent)
    

It's now immediately clear to me the type and range of values it accepts. The
tip_percent is also an example of when a good name can provide info that even
static typing could not, because in either case it is a floating point (please
let's not get into a debate about Decimal or currency types :P ). This is a
very basic example, but it applies at all levels. Don't call the parameter
"users" if the function is not expecting an iterable of User objects. Maybe
"usernames" would be better. Etc.

But of course, this only helps if the code you're working with is named well.
In something like Java, you have better protection and tools when working
alongside lower quality code. I also completely agree with you on point #2
about No Static Types. Navigating through my editor is so much easier in a
static language than it is in PyCharm with large projects. And the most
annoying thing is that autocomplete breaks with ORMs, and most ORM usage is
actually flagged as a warning or error. Ugh.

I also feel very confident in the automated refactoring in something such as
an IntelliJ Java project, or ReSharper, but am apprehensive about using
PyCharm to refactor anything with usages spanning more than one file. Same
goes for Javascript (or any other dynamic language, I suppose. Those are just
the two I use).

Enjoyed the posts by the way, adding your blog to my reading list.

~~~
S4M
The example you mentioned, translated into Java, would be:

    
    
        Double meal = 44.50;
        Double tax = 0.0675;
        Double tip = 0.15;
    

And the function declaration would be:

    
    
        Double calc_total(Double meal, Double tax, Double tip)
    

So even with the types, you have exactly the same problem as in python, and
creating a special class to wrap up percentages is something I find quite
heavy.

Your alternatives would be to have naming conventions, like you do, or some
proper documentation that tells you what the function expects for its
arguments, with an example of a function call, which I particularly like in
python because of the REPL.

~~~
vilhelm_s
But perhaps a better type system can still be helpful. E.g. F# supports units
of measure, so you can declare 'dollar' as a unit, and write

    
    
        let calc_total (meal_cost : float<dollar>) (tax : float) (tip : float) =
           meal_cost * (1.0+tax) * (1.0+tip)
    

This way it is at least clear from the types that the numbers are meant to be
multiplied, not added.

~~~
S4M
I never used F#. In your example, would the code:

    
    
        meal_cost + tax
    

trigger a compiler exception?

~~~
vilhelm_s
Yes, you get

    
    
        The type 'float' does not match the type 'float<dollar>'
    

There is an online typechecker, so one can try out F# without installing it:
[http://www.tryfsharp.org/Learn/scientific-computing#units-of...](http://www.tryfsharp.org/Learn/scientific-computing#units-of-measure)

~~~
S4M
And how about:

    
    
        meal_cost * tax
    

?

This one makes sense in the real world, so I suppose an advanced type system
allows the programmer to specify what operations are legit or not across
different types.

~~~
vilhelm_s
One could imagine systems for that, but units of measure don't do any
customization of different operations. It's just unit checking for
arithmetic, like in high-school math.
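
For anyone who wants to see the idea without F#, here is a rough Python sketch of the same kind of unit checking. It is only an analogy: the `Quantity` class is invented for this example, it checks at run time rather than at compile time, and it only models multiplying by a dimensionless number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number tagged with a unit; an empty unit means dimensionless."""
    value: float
    unit: str = ""

    def __add__(self, other):
        # Addition only makes sense when both sides carry the same unit.
        if self.unit != other.unit:
            raise TypeError(f"cannot add {self.unit!r} to {other.unit!r}")
        return Quantity(self.value + other.value, self.unit)

    def __mul__(self, other):
        # Simplification: at most one side may carry a unit in this sketch.
        if self.unit and other.unit:
            raise TypeError("unit products are not modelled in this sketch")
        return Quantity(self.value * other.value, self.unit or other.unit)


meal_cost = Quantity(44.50, "dollar")
tax = Quantity(0.0675)

# Fine: dollars times a dimensionless factor is still dollars.
taxed = meal_cost * (Quantity(1.0) + tax)

# Rejected: dollars plus a dimensionless number has no sensible unit.
try:
    meal_cost + tax
except TypeError as err:
    print("rejected:", err)
```

So `meal_cost * tax` goes through and `meal_cost + tax` does not, with no per-operation customization involved.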

------
thomasfl
I wonder if BitKeeper owner Larry McVoy has ever regretted not open sourcing
his software? Git tooling is a whole industry now. The most prominent example,
GitHub, recently became one of the 100 most popular websites on the planet.

~~~
luckydude
Regret it? Sure. I'd do it in a heartbeat if I could figure out how to make it
work. Still would and there is plenty in BK that Git doesn't have. Like
submodules that actually work _exactly_ like a monolithic tree, just lets you
clone what you need.

But we've never figured out how to make it work financially. If anyone has any
ideas I'm all ears (though pointing at github and saying "do that" isn't an
idea that I can execute).

BTW, BK used to be pretty darned close to open source, you got the source code
under a funky license that said "don't take out the part that lets us make
money". We stopped shipping the source when we learned that the very first
thing that someone committed to the repo was taking out the part that let us
make money.

~~~
qzw
Very cool of you to share your thoughts. I sympathize with your dilemma. It
seems that the people who end up making money out of free/open source software
are often not the ones who write the code. And I remember reading back in the
day that Linus would talk with you extensively about the nuts and bolts of
DVCSs, so I'm sure all git users owe you some gratitude for inspiring him to
create git and getting the fundamentals right from the get go.

Out of curiosity, and please feel free not to answer, is BK still a viable
commercial product bringing in significant revenue? And what obstacle do you
see with going the platform/service route like github? I assume that's
something you've seriously considered, even without open sourcing BK.

~~~
luckydude
Yeah, BK still pays the bills for our team. We're small though, I recently
found out that perforce has around 250 people, we're less than 1/10th that.
But we pull in millions a year, enough to pay our people above scale even in
the bay area, so far, so good.

I'll admit we've fallen off the radar (well, we were never really on the
commercial radar, the only "marketing" we ever did was getting Linus to use it
and that wasn't intended as marketing, it was intended to keep the kernel from
diverging like the BSDs did. But it turned out to be a form of marketing that
has kept us alive).

We're gonna try some actual marketing. Stay tuned. We'll probably screw it up
:) But we hired a marketing company, I've gone back to writing papers, we'll
give it a try. If you have ideas on how we can put ourselves back out there,
we'd love to hear them.

As for viable, heck yeah. We work well on big repos (better than git), we've
got what we call nested collections of repositories (did I mention I suck at
marketing, yeah, I came up with what to call it) that are sort of like
submodules except they work exactly like a single repo, sideways pulls work,
anything that works with one repo works with N repos, that includes all the
guis, command line, everything. We've got an answer for binaries that works
for gaming companies. We've got a sane user interface (that's what Mercurial
copied, in a somewhat sketchy way).

Git is sort of like the wild west, it never met an idea it didn't want to
implement (at least partially). We're more enterprise ready (yeah, over used
term) in that we work hard to make sure that BK has all the guard rails, seat
belts, etc, so that you can deploy to people who could care less how any SCM
works and they don't drive themselves over a cliff. Definitely less cool than
git in that we take away some (bad) options, but safer.

We have seriously considered open sourcing a version of BK. We've been doing a
lot of performance work and we essentially have two BK's, the almost SCCS
compat ascii format slow version (slow but any version of BK will talk to it),
and the fast one with a new binary file format (stuff like show the top commit
comments are 35x faster in the linux kernel, that number goes up as you add
more csets). We considered open sourcing the slow one but that effort has
stalled. It could be revived, it just has to be worth it to us.

The github ship, in my opinion, has sailed. Maybe we could have open sourced
BK back before git and done a github thing but it's all flashy UI stuff and we
sort of suck at that. We're really good at systems stuff (you'll see when we
start doing marketing, we scale, git doesn't) but flashy? Not so much. We do
our UI in tcl/tk (I know, I know, but we have one UI person who makes it all
work on windows/linux/macos and tcl/tk is a big part of that. At least we
wrote a C like language that compiles to tcl byte codes so we're out of tcl.
Thank God.)

~~~
hyperpallium
Wouldn't the standard open source-as-freemium work? i.e. a free open source
version with enough cool features (e.g. nested repositories), but not
efficient. It's free marketing to keep you on the radar, that targets the
people who appreciate your systems chops (and, like Atlassian, also gets it in
under the radar, to developers). Enterprise customers happily pay ridiculous
sums for full versions. And git/hg makes you immune to the competitive danger
of open source clones.

I'd value your thoughts on this, as I also have a popular open source
competitor, that followed me. The strategy seems sensible, but it might
undermine perceived value; and it's a hassle to maintain two versions...

Also, can I please ask a technical BK question: How much does git differ from
BK internally? i.e. git has graphs of commits, content-addressable for
efficient checks of identity and integrity. Did _git_ get any of that from BK?
Or was it more the workflow and distributed concept of everyone having a copy
of the repository? Many thanks!

~~~
luckydude
Linus definitely did his own thing with git. The general ideas came from BK,
BK gave you clone/pull/push/commit as the model. Everyone copied that because
it just makes sense. The all or nothing clone model came from BK.

How it is all glued together differs quite a bit. BK has the concept of a
revisioned file, git does not, it versions trees. That's why Linus thinks
renames are silly, he doesn't care about them, he cares about the tree.

The graphs of commits comes straight from BK, that's BK's changeset file -
which is sort of neat in that it is a version controlled file itself. BK is
the only system that I know of that uses a versioned file to store the
metadata.

OK, so on the business model thing, I'm not sure. The way we did the old
compatible format is compatible but it's pretty slow, it converts to the new
format in memory and then converts back if you write it out. It's slower than
the older implementation (but this way we have one in memory format, less
bugs). I thought it was good enough for small projects, my team overrode me
and said "too slow".

As for enterprise customers "happily paying", um, no. We constantly get whacked
with "if you don't do this or that we're moving to git". Which could be viewed
as a good thing, we have to keep making it better, but it gets tiresome.

~~~
hyperpallium
Thanks! Renames make archeology difficult in git. I've become reluctant to
change {file,directory} names, even when it's clearer...

BTW: What are the benefits of versioning changesets themselves? Isn't it rare
to only change the changeset?

Chained conversions are elegant, but slower code is unsatisfying... I guess
such hobbling is the essence of open-source-as-freemium. :(

I meant they "happily pay" for full over free versions. (For them, it's also
paying for "new" features!)

~~~
luckydude
Renames are a thing and git made the wrong choice there. It's not like we are
perfect but we are way closer.

So on versioning changesets I didn't really explain. Lemme try again.

In any DVCS you have a bill of materials, that's what describes the tree.
Git's is different than ours because they don't version files, we do. So our
bill of materials looks like:

    
    
      path/to/file <version>
      path/to/different_file 1.1
      path/different_dir/a_file 1.19
    

If you "cat" the changeset file as of any version you get what the tree looks
like, a list of files and a list of revisions.

Of course it doesn't work like that because, um, reality and merges and
parallel development. We have UUIDs for each file and each version so it looks
like

UUID_for_a_file UUID_for_a_version

and our UUIDs are pretty sweet, not sha1 or some other useless thing, they are

user@host|path/to/where/it/was|YYYYMMDDHHMMSS|checksum

those are for each node in the graph, for the very first node which is the
UUID for the file, there is a "|<64 bits of /dev/random>" appended.

So the changeset file is just a list of

UUID UUID

Not sure if that helps.

The benefit of versioning the file that holds all that data is we can use BK
to ask it stuff. Want to see the history of the repo? bk revtool ChangeSet
Want to see what files changed in a commit? bk diffs -r$commit ChangeSet Yeah,
we have to process all the UUIDs and turn them into pathnames and revisions
but we can do that and do it fast. So it works.

All the tools we built to look at stuff can look at the metadata. That's
worked out well.
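
For readers who want to picture the data, here is a hedged Python sketch that splits the key layout described above into fields. It is not BK code: the `Key` type, both function names, and the sample values are invented for illustration, and it assumes paths contain neither spaces nor `|`.

```python
from typing import NamedTuple, Optional


class Key(NamedTuple):
    user_at_host: str
    path: str
    timestamp: str                      # YYYYMMDDHHMMSS, kept as text
    checksum: str
    random_bits: Optional[str] = None   # only on a file's very first node


def parse_key(text):
    """Split one '|'-separated key into its fields."""
    fields = text.split("|")
    if len(fields) not in (4, 5):
        raise ValueError(f"expected 4 or 5 fields, got {len(fields)}")
    return Key(*fields)


def parse_changeset_line(line):
    """Split a 'file_key version_key' pair on the first space."""
    file_key, version_key = line.split(" ", 1)
    return parse_key(file_key), parse_key(version_key)


# Made-up sample line: a file key (with random suffix) and a version key.
example_line = (
    "alice@example.com|src/main.c|20150407120000|12345|a1b2c3d4e5f60718 "
    "bob@example.com|src/main.c|20150408093000|54321"
)
file_key, version_key = parse_changeset_line(example_line)
print(file_key.path, version_key.timestamp)
```

Turning such pairs back into pathnames and revisions is the processing step mentioned above.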

~~~
hyperpallium
Thanks!

------
darkmagnus
Here is the first Git commit he talks about in the article:

[https://github.com/git/git/tree/e83c5163316f89bfbde7d9ab23ca...](https://github.com/git/git/tree/e83c5163316f89bfbde7d9ab23ca2e25604af290)

~~~
AceJohnny2
Having read "Git from the Bottom Up" [1], it's interesting and refreshing how
concise Torvalds's explanation is.

Of course, he also didn't have as much to explain at that stage :)

[1] [https://jwiegley.github.io/git-from-the-bottom-up/](https://jwiegley.github.io/git-from-the-bottom-up/)

------
ffn
Jesus, so Torvalds built the MVP for git in 1 day and pretty much scaled it up
for kernel usage in 10 days. Now, as a humble average developer, how do I
achieve Torvaldian levels of productivity?

~~~
vinceguidry
Well, he'd been collecting ideas and requirements for a long time in his head
before he finally made a system out of it. The vast majority of the time, what
you're building isn't all that clear to you before you have to start. So
there's a lot of time wasted just figuring that out. Cut that wasted time out
and you become a powerhouse.

------
worklogin
Linus says of the Github platform -

That's partly because of how the kernel is developed, but part of it was that
the GitHub interfaces were actively encouraging bad behavior. Commits done on
GitHub had bad commit messages etc, because the web interfaces at GitHub were
actively encouraging bad behavior. They did fix some of that, so it probably
works better, but it will never be appropriate for something like the Linux
kernel.

I haven't ever looked at the kernel workflow, nor have I ever heard systematic
criticism of Github's methodology. Does anyone else have input on what Linus
may mean by his opinion?

~~~
gregkh
The kernel development model is documented here:
[http://lxr.free-electrons.com/source/Documentation/developme...](http://lxr.free-electrons.com/source/Documentation/development-process/)

Github doesn't scale at all for large projects. The kernel is averaging over 8
changes an hour, 24 hours a day. That rate of change can never be handled by
doing pull requests and web site review. It only can work with email and
review and scriptable processes.

~~~
quantumet
Eh, I mildly disagree on web site review. Perhaps not GitHub's pull
request/review model, but big projects like Android and Chromium do all their
review on web interfaces.

~~~
edejong
Using Gerrit, which is an automated mimic of Linus' development model.

By the way, we've been successfully using Gerrit on smaller-scale projects as
well (around 8 change-requests / day) and don't even want to think about going
back to pull-requests.

~~~
sunnyps
What you've said is essentially correct, but I'd like to make a minor
correction: Chromium uses rietveld whereas Android (and some Chromium related
projects) use gerrit. Rietveld and gerrit have similar workflows except that
gerrit is more integrated with git (I think). Both are named after Gerrit
Rietveld, a famous Dutch designer.

------
jammycakes
One thing worth saying here is that Git is now the dominant player in the
source control market. According to almost every survey I've seen, it's
overtaken Subversion (according to some reports by a fairly large margin) not
only for open source and hobbyists but in the enterprise as well. It's more or
less become a lingua franca for communicating source code between teams these
days, as well as being the preferred deployment option for many cloud hosting
providers and even some package managers. No other source control system has
ever managed that.

As such, you simply can't afford not to know it these days. In fact, I can't
help thinking that not using Git should now be a major red flag for job
candidates and prospective employers alike. Sure, not using your favourite
tool may not be that big a deal, but not using what is to all intents and
purposes the lingua franca of source code communication is a serious omission.

~~~
minusSeven
Can't agree with this assertion. Your source control of choice should not be
in any way a red flag. It's just a choice, and because of the learning curve I
chose to learn Mercurial instead. I don't see how source control matters to that
extent.

If I need to use it I will learn it then. I don't think most people care as
such.

------
restalis
"Why do you think it's been so widely adopted?

Torvalds: I think that many others had been frustrated by all the same issues
that made me hate SCM's"

Actually, the adoption was caused by the network effect. It took off with a
core group of Linux-involved developers promoting it religiously, then it went
on like a fashion (we are git-wielders, we are cool), and ended up being the
default choice for a lot of other projects that either depended on some degree
on 3rd party git-managed modules, or wanted to benefit from the existing
disciplined git-trained developers, or just... because (for all things being
equal). Having this presumption that the adopters actually thought for
themselves and Torvalds's brain-child got where it is now thanks to its own
technical merits is a pleasant thought though.

------
dangero
Somewhat off topic, but I'd love to see something more git-like that works for
large repositories. I work on video games and our repos are several TB. It's
not practical to use git in those situations and Perforce feels too clunky to
me.

~~~
jewel
Have you tried git-annex? It handles large files really well and is as close
to git-like as you can get. :)

~~~
dangero
I haven't tried it, but I've looked at it. I think it adds too much
complexity. Who decides which files go on the annex and which don't? Doesn't
seem like a great automated solution for a scalable team.

I'm looking for something more like CVS/Perforce where you can check anything
in, but then with a more Git feeling interface. What I'd really like to do is
to look at Git feature by feature and build the closest thing possible that
does not include cloning the entire history locally under normal use. I know
that is the fundamental paradigm of Git, but it seems that something closer to
a hybrid of Git and Perforce could be created. Not sure exactly what it would
look like. It's just jarring to jump from Perforce/CVS to Git and I don't
think that's completely due to the local repository model. It's how branching
works, it's how merging works, etc.

~~~
jewel
> I haven't tried it, but I've looked at it. I think it adds too much
> complexity. Who decides which files go on the annex and which don't? Doesn't
> seem like a great automated solution for a scalable team.

I would put all binary files into git annex (as determined by `file`). This
can be done by a commit hook automatically. With another hook that makes sure
that `git annex pull` is run when the user checks out code, you'd have a
solution that was close enough to automatic for most use cases.
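
A hook like that needs some rule for "is this binary?". The sketch below is only an illustration of that decision step: instead of shelling out to `file`, it swaps in a simple NUL-byte heuristic, it does not call git or git-annex at all, and the helper names `looks_binary` and `split_for_annex` are made up.

```python
import os
import tempfile


def looks_binary(path, sample_size=8000):
    """Heuristic: treat a file as binary if its first bytes contain a NUL."""
    with open(path, "rb") as handle:
        return b"\0" in handle.read(sample_size)


def split_for_annex(paths):
    """Partition paths into (annex_candidates, regular_files)."""
    annex, regular = [], []
    for path in paths:
        (annex if looks_binary(path) else regular).append(path)
    return annex, regular


# Small demo on two throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "notes.txt")
    blob_path = os.path.join(tmp, "texture.bin")
    with open(text_path, "w") as handle:
        handle.write("hello\n")
    with open(blob_path, "wb") as handle:
        handle.write(b"\x89PNG\0\0\1")
    annex, regular = split_for_annex([text_path, blob_path])
    annex_names = [os.path.basename(p) for p in annex]
    regular_names = [os.path.basename(p) for p in regular]

print(annex_names, regular_names)
```

A real hook would then hand the first list to the annex and leave the second list to plain git.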

(You'd have to help your users with weird situations from time to time, but
that's true of git anyway.)

------
netinstructions
I wrote an article a few months ago summarizing where Git is in 2015 if anyone
is curious[1]. Most developers at my company don't seem to want to learn a
"new" SCM so we're stuck with SVN (or CVS for some projects).

We rarely have a stable trunk. Branching takes too long so most people don't
want to do it. Merging two branches is such a scary thing that people avoid
branching in the first place. Or I've seen them copy code from one branch in
one Eclipse window and paste into another separate project (on another branch)
in another Eclipse window. They repeat this manually for the next 5 to 20
files. I don't think they've realized a merge can often be as simple as a
button click or one SCM command. But hey, it works for them.

The one time on my team that someone was interested in exploring Git was when
we couldn't find an up-to-date Maven SCM connector for SVN that played nicely
with Eclipse. The solution? Use an older version of Eclipse.

[1] [http://www.netinstructions.com/the-case-for-git/](http://www.netinstructions.com/the-case-for-git/)

~~~
praneshp
I work for a huge-internet-corp, which I don't want to name here (pretty easy
to find out from my username and a couple of Google searches, if you care). I
sympathize (actually empathize) with your problem (developers don't want to
learn a new SCM). We had a big drive last year where management brought down
the hammer and told us to move to git, 100%. Several senior developers (many
of them actually architects now, not really writing much code) made a lot of
noise on internal mailing lists. One person would make terrible mistakes that
one reading of any git manual would have prevented, and then complain loudly
(with a lot of swearing) about how much it sucked compared to SVN. I was
annoyed by this, because except for the hammer thing, the company did
everything else perfectly. We have a stable corp GitHub, several training
sessions were offered, and the reasons were clearly explained. It was the
first time I was happy about the slightly dictatorial approach taken towards
the whole thing, instead of trying to reason with 50-year-old babies.

------
DigitalSea
I just wanted to say thank you Linus for giving us Git. Before Git I was using
SVN and while the wounds are healing, I will bear these SVN scars for many
years to come. As a developer, I've found Git has made my life so much easier,
and not only that, thanks to GitHub I can collaborate on open source projects
as well as my own with ease.

------
plongeur
99% of the time I use:

- git status

- git add ...

- git reset ...

- git commit ...

- git log ...

- git push/pull ...

I guess that's "the basics" - what command should I learn next?

~~~
naggie
> git rebase -i

Very useful!

~~~
qzw
Yes, but watch out if you've already pushed to remote branches or even done
merge/rebase with other local branches. You're rewriting history.

~~~
Zikes
I most often use it because I forgot to include a certain file in my most
recent commit. I'll just commit that file then rebase -i to merge the two.

~~~
regularjack
If you want to alter the most recent commit, you can use `git commit --amend`;
no need to rebase.

[http://git-scm.com/book/en/v2/Git-Basics-Undoing-Things](http://git-scm.com/book/en/v2/Git-Basics-Undoing-Things)

------
pedrow
I wish Linus would now turn his attention to:

1) make (ought to be a simple concept but never seems like it's quite been
done right)

2) init (only he has the clout to solve the pid 1 controversy)

Will check back in 2025 for a progress report.

~~~
stinos
Ok, but only if he first turns his attention to a proper git
submodules/multiple repositories kind of thing

------
misiti3780
off topic: is there a third-party service that will let me collapse all
sub-comments and just read the parent comments (and then expand them if they
seem interesting)?

~~~
kbart
There's an API for HN
([https://github.com/HackerNews/API](https://github.com/HackerNews/API)), it's
not that hard to make one yourself. This is _Hacker_ News after all.
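
For what it's worth, a minimal sketch of that idea in Python, assuming the item endpoint and the `kids`, `by`, `text`, `deleted`, and `dead` fields described in the linked API repo. The function names are made up for illustration, and the `fetch` parameter exists only so the logic can be tried without network access.

```python
import json
from typing import Callable, Dict, List
from urllib.request import urlopen

ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{}.json"


def fetch_item(item_id: int) -> Dict:
    """Fetch one item (story or comment) from the public HN API."""
    with urlopen(ITEM_URL.format(item_id)) as resp:
        # The API answers `null` for unknown ids; normalise that to {}.
        return json.load(resp) or {}


def top_level_comments(
    story_id: int, fetch: Callable[[int], Dict] = fetch_item
) -> List[Dict]:
    """Return only the direct replies to a story.

    Each entry keeps the ids of its own direct replies, so an interesting
    thread can be expanded later by fetching those ids in turn.
    """
    story = fetch(story_id)
    collapsed = []
    for kid_id in story.get("kids", []):
        comment = fetch(kid_id)
        if not comment or comment.get("deleted") or comment.get("dead"):
            continue  # skip removed or missing comments
        collapsed.append({
            "id": comment["id"],
            "by": comment.get("by"),
            "text": comment.get("text", ""),
            "replies": comment.get("kids", []),
        })
    return collapsed
```

Calling `top_level_comments(<story id>)` makes one request per top-level comment, so a real tool would probably want to cache or parallelise the fetches.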

~~~
misiti3780
i know - i was hoping someone else did it already! thanks

------
INTPenis
Linus is not a magician; he is, however, very good at working with systems and
spotting exactly where those systems need improvement.

I respect this quality immensely.

------
xsace
Now if you can tackle dependency management and build tools, Linus, I would
appreciate it. With love, me.

------
sytse
There was a nice 10-year overview of git linked from the article:
[https://www.atlassian.com/git/articles/10-years-of-git/](https://www.atlassian.com/git/articles/10-years-of-git/).
I think Atlassian are great sports for featuring GitLab (which competes with
their Bitbucket and Stash products) in there, thanks!

~~~
netinstructions
Kind of a nitpick, but that infographic shows that in 2014 Git was used by 33%
of developers. Atlassian didn't say which source that came from, but I'm going
to guess it came from a 2014 Eclipse Community Survey of 876
respondents[1][2].

I found the poll a bit misleading. There were _separate_ categories for Git
and GitHub, but AFAIK you were only supposed to choose one. I'd wager most of
the GitHub users are also using Git, so the graph would look more like
this[3].

[1] [http://www.slideshare.net/IanSkerrett/eclipse-community-survey-2014](http://www.slideshare.net/IanSkerrett/eclipse-community-survey-2014)

[2] [http://eclipse.dzone.com/articles/eclipse-community-survey-2014](http://eclipse.dzone.com/articles/eclipse-community-survey-2014)

[3] [http://i.imgur.com/CEkIHSQ.png](http://i.imgur.com/CEkIHSQ.png)

~~~
sytse
I agree that using separate categories for git and GitHub doesn't make sense.

------
jkot
I am sad Git got widely adopted. git-svn was almost a superpower a few years
ago. :-)

~~~
kinghajj
It still is, depending upon where you work. "I'm having trouble with merging
the latest development changes into my feature branch. Tortoise says there are
tree conflicts, what--" "Done, I'm committing the result to the SVN repo now."
"What?!"

