
What are the odds that two pull requests get completed at the exact same time? - Sindisil
https://blogs.msdn.microsoft.com/oldnewthing/20180326-00/?p=98335
======
lev99
> I found that adorable. You [Linux] have 45,000 files. Yeah, call me when
> your repo starts to get big. The Windows repo has over three million files.

That Linux repo bashing at the end was very nasty.

Yes, the Windows git repo is larger than the Linux git repo. Windows also has
a larger scope than the Linux kernel. If you want to dick-wag, please bring
some performance benchmarks, not footprint benchmarks.

~~~
gargravarr
Agreed. If they want to get into a p*ssing contest, they ought to consider
they're comparing their entire OS to just the kernel. Let's pick any one
distro and I bet they'll be eating their words. Then consider everything that
counts as 'Linux' and I bet the daily commit count would result in Windows'
development being classed as a 'critical network outage'.

~~~
soneil
I read it as a pretty fair comparison within the context. They're not
comparing Windows vs. Linux. They're comparing the challenges their repo
faces vs. the challenges the Linux kernel repo faces. And within that
context, I really hope the Windows repo is substantially larger; otherwise
the Linux kernel has gone very, very wrong.

~~~
gargravarr
It was a fair comparison until they started using words like 'cute', mocking
the average commits per day and challenging the kernel team on size - that's
when it got unnecessarily nasty. Almost condescending, Microsoft telling the
guys _who wrote git for their own use_ that they don't know how to run it.

------
TillE
This article doesn't say much of anything, the real meat is in one of the
links:

[https://www.visualstudio.com/learn/race-to-push/](https://www.visualstudio.com/learn/race-to-push/)

------
panarky
Microsoft brags that their repository contains 3 million files with 4 million
commits.

Compare that with Google's repository which had 1 billion files and 35 million
commits -- as of 2015.

_On a typical workday, they commit 16,000 changes to the codebase, and
another 24,000 changes are committed by automated systems._

_Each day the repository serves billions of file read requests, with
approximately 800,000 queries per second during peak traffic and an average
of approximately 500,000 queries per second each workday._

[https://cacm.acm.org/magazines/2016/7/204032-why-google-stores-billions-of-lines-of-code-in-a-single-repository/fulltext](https://cacm.acm.org/magazines/2016/7/204032-why-google-stores-billions-of-lines-of-code-in-a-single-repository/fulltext)

~~~
kyberias
Apples and oranges. Or some other fruits...

Microsoft's repo in question here is only for the WINDOWS OS.

Google has all their code in one repository. And it's not git.

Microsoft's repo may well be the biggest git repo.

~~~
lev99
I've heard many people claim Microsoft has the largest git repo and I've never
heard anyone successfully dispute that claim.

~~~
keule
It makes sense though. The largest monolith has the largest repo. ;)

~~~
lev99
I imagine that the Windows OS repo includes IE (and Edge), Paint,
Explorer.exe, Windows Management Systems, Cortana, and a ton of other
projects that would merit their own repository in an open-source community.

~~~
gargravarr
While separate repos do make sense for individual projects normally, as we
well know, most Windows components are incredibly tightly bound together (how
many different things can crash Explorer?) so a mono-repo seems to make sense
for it.

~~~
kyberias
I don't think it's fair to say that _most_ Windows components are _incredibly_
tightly bound together.

------
dylan-m
I'm really confused by this monorepo thing. So many people have gleefully
hopped onto this particular cult, and the most reasonable argument I've
heard is that it helps glob together interdependent parts of a system. But I
feel like that throws away everything we take for granted about what makes
good software: that it's better to separate your code into distinct pieces
with well-defined scope (and their own version numbers that reflect their
interfaces); to expose your interfaces rather than your implementations; and
that if someone is dependent on the exact implementation of module B, then
maybe your design is fucked and you should start over before you hurt
someone.

So I'm skeptical of that first argument. What does this do for you that a
decent architect doesn't do better? Is it a way for dangerously lazy project
managers to justify themselves? Is it because it works for Google so of course
it works for everyone? Is it a workaround for GitHub's private repo limit, but
nobody wants to admit it because they're afraid GitHub will figure it out? If
Windows needs this, how the hell are Debian, Fedora and Ubuntu doing all these
releases? Do you know how many repositories they have to deal with?!

~~~
jasonpeacock
Amen. Bragging about having a large repo is showing the world that you can't
design large software systems.

Just as creating branches in Git is cheap, creating repos in Git is just as
cheap. The difference is that you need actual design and engineering
discipline to version your interfaces and manage those dependencies.

Having a kitchen-sink repo is poor engineering and a massive code smell.
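That "discipline to version your interfaces" boils down to something like
semantic versioning plus compatibility checks at dependency-resolution time.
A minimal sketch of the idea, assuming plain `MAJOR.MINOR.PATCH` versions and
a caret-style rule (function names here are illustrative, not from any
particular tool, and pre-1.0 edge cases are ignored):

```python
# Minimal semver-style compatibility check: the kind of discipline
# multi-repo setups lean on when repos depend on each other's releases.

def parse(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))

def compatible(required, available):
    """Caret-style rule: same major version, and available >= required."""
    req, avail = parse(required), parse(available)
    return avail[0] == req[0] and avail >= req

print(compatible("2.1.0", "2.3.4"))  # same major, newer -> True
print(compatible("2.1.0", "3.0.0"))  # major bump breaks the interface -> False
```

A monorepo sidesteps this machinery by building everything from one commit;
the trade-off these comments argue about is whether avoiding it is a feature
or a crutch.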

------
huebnerob
I don't know why we as an industry can fetishize shaving milliseconds off
interaction, response, or compile time, then turn around and have dick-
measuring contests about how poorly we can manage version control.

~~~
WorldMaker
Those sound like two sides of the same coin to me?

Microsoft is bragging about shaving milliseconds off source-control time
(some of which has been merged upstream into mainline Git; most of which is
in GVFS, which they've made available [ETA: this particular bit is in VSTS,
which they've not made available as an open-source effort, but it also may
have a smaller audience than GVFS]), and the dick-measuring here is how that
reads in direct comparison to why they needed to invest in shaving those
milliseconds off. Six of one, half a dozen of the other.

------
vinceguidry
I'm rather surprised that Microsoft chose to use git for Windows rather than
roll their own SCM. It seems clearly not suited to the needs of Windows. For
a software company that's so invested in dogfooding, git's a really odd
choice.

~~~
thrownaway954
Did you ever use TFVC (shudder) or SourceSafe (vomit)? I'm glad they chose
git... keep them productive and making Windows 10 better.

~~~
gargravarr
I know SourceSafe was a disaster and didn't think anyone used it until I
started with a firm in 2013 - thankfully they were on SVN, but the first
commit of the repo was from 2009, which stated simply, 'Initial import from
VSS'. The company was founded in 1995...

TFVC is annoyingly proprietary but seems to follow a solid client/server
model. Git support is very nice.

~~~
thrownaway954
I don't know about solid... to me, when I can't work and have to track down
someone to check in their changes to unlock a file, there are issues. Maybe
other places have better practices in place to prevent this stuff from
happening and I was just unlucky.

------
_bxg1
I could never work on Windows. I'd just want to burn the whole thing to the
ground and start over. There is no way an OS actually needs to be that
massive, as helpfully illustrated by the comparison to Linux's metrics at
the end.

~~~
WorldMaker
Kernel versus a vast monorepo of most of the OS, many of its pack-in apps,
etc. I've heard it said the core NT kernel itself is mostly on par with the
Linux kernel; it's when you consider all the other things that are part of
Windows that it gets that huge. (Also why much of the work the GVFS team did
was making sure that partial checkouts work well, because few people need a
working tree of the entire thing at once.)
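Stock Git has since grown a rough version of the same idea in `git
sparse-checkout` (GVFS goes much further, virtualizing the object store
too). A throwaway local demo, with made-up directory names:

```shell
# Build a toy repo with two top-level components, then materialize only
# one of them in the working tree. Requires git >= 2.25.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p shell kernel
echo hi > shell/a.txt
echo hi > kernel/b.txt
git add .
git -c user.email=you@example.com -c user.name=you commit -qm init

git sparse-checkout init --cone   # restrict checkout to matching dirs
git sparse-checkout set kernel    # keep only kernel/ (plus root files)
ls                                # shell/ is gone from disk, not history
```

History still contains both directories; only the working tree shrinks,
which is the point when almost nobody needs all three million files at once.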

------
joshstrange
This really isn't about pull requests getting completed at the exact same time
AFAICT, it's about 1 person attempting to accept a pull request around the
same time as 1 or more people and based on how many changes you made it you
may "win" the pull request race or "lose" and have to start over.
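The mechanism described is just optimistic concurrency: each completing PR
tries to advance the branch tip from the commit it merged against, and if
someone else advanced it first, the loser re-merges and retries. A toy
compare-and-swap model of that race (none of this is VSTS's actual API; the
branch tip is modeled as a counter):

```python
import threading

class Branch:
    """Branch tip modeled as a counter guarded by a compare-and-swap."""

    def __init__(self):
        self.tip = 0
        self._lock = threading.Lock()

    def compare_and_advance(self, expected_tip):
        """Advance the tip only if it still equals expected_tip."""
        with self._lock:
            if self.tip != expected_tip:
                return False          # lost the race: caller must retry
            self.tip += 1
            return True

def complete_pr(branch):
    """Re-merge against the current tip until our push wins the race."""
    while True:
        seen = branch.tip             # merge/validate against this tip
        if branch.compare_and_advance(seen):
            return

branch = Branch()
threads = [threading.Thread(target=complete_pr, args=(branch,))
           for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(branch.tip)  # 50: every PR eventually wins exactly once
```

The "changes you made" part of the comment maps to how expensive each retry
is: a big merge takes longer to redo, so it is more likely to find the tip
moved again by the time it tries to push.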

