500 GB strikes me as quite small. The entire Git repository (it sounds like Microsoft uses Git internally) must be a whole heck of a lot larger than this if there are over 60,000 commits every few weeks.
Still, it’s a mammoth project, which makes it all the more impressive that it works (most of the time).
If the reported size came from that fs command (assuming it behaves like Linux's du) and he ran it inside his git clone, then it includes all the git files, including file versions (which, with 60k changes every 2 weeks, should be pretty huge) and branches (different editions, targeted to different markets, etc., whose changes may not be trivial).
I don't think he copied the tree without the git files to make the count, or deleted or moved the .git directory elsewhere for that. Nor is it just source code in there: (versioned) bitmaps, third-party driver blobs and more are probably in there too.
That’s quite a few assumptions, at least one of which likely isn’t correct; at Microsoft, the .git directory doesn’t contain “all the git files, including file versions […] and branches”:
“GVFS allows our developers to simply not download most of those 3.5 million files during a clone, and instead simply page in the (comparatively) small portion of the source tree that a developer is working with.”
“the Windows repo, which has over 3 million files in the working directory, totalling 270GB of source files. That’s 270GB in the working directory, at the tip of master. To clone this repo, you would have to download a packfile that is about 100GB in size, which would take hours. And once you’ve succeeded in cloning it, local git operations like checkout (3 hours), status (8 minutes), and commit (30 minutes) would still take way too long to run”.
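To put "would take hours" in perspective, here is a rough back-of-the-envelope sketch in Python. The 100 GB packfile size is from the quote; the link speeds are purely my assumptions, and the helper name `download_hours` is made up for illustration. It ignores protocol overhead and server-side throttling, so real clones would likely be slower.

```python
def download_hours(size_gb: float, mbit_per_s: float) -> float:
    """Hours needed to transfer size_gb (decimal gigabytes) at mbit_per_s."""
    bits = size_gb * 1e9 * 8             # gigabytes -> bits
    seconds = bits / (mbit_per_s * 1e6)  # megabits per second -> bits per second
    return seconds / 3600

if __name__ == "__main__":
    PACKFILE_GB = 100  # figure taken from the quoted post
    # Assumed link speeds, not from the post.
    for speed in (50, 100, 1000):
        print(f"{speed:>5} Mbit/s: {download_hours(PACKFILE_GB, speed):.1f} h")
```

At an assumed 100 Mbit/s that works out to a bit over two hours just for the transfer, before git has unpacked or checked out anything.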
I'd say it's very likely that a repository of such size would be "shallow cloned", i.e. with `git clone --depth n`, which truncates the fetched history to the most recent n commits. This would substantially cut down on the size of a large repository with hundreds of thousands of changes.
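For anyone who hasn't used it, a shallow clone looks roughly like this when scripted from Python. This is only a sketch: the function names, the example URL and the destination directory are placeholders I made up, and nothing here says this is how Microsoft actually clones the Windows repo. The only git feature used is the standard `--depth` option of `git clone`.

```python
import subprocess

def shallow_clone_command(url: str, dest: str, depth: int = 1) -> list[str]:
    """Build the argv for a shallow clone keeping only the last `depth` commits."""
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    return ["git", "clone", "--depth", str(depth), url, dest]

def shallow_clone(url: str, dest: str, depth: int = 1) -> None:
    """Run the shallow clone; raises CalledProcessError if git fails."""
    subprocess.run(shallow_clone_command(url, dest, depth), check=True)

if __name__ == "__main__":
    # Placeholder URL and directory, for illustration only.
    cmd = shallow_clone_command("https://example.com/big-repo.git", "big-repo", depth=1)
    print(" ".join(cmd))
```

Note that `--depth` also implies `--single-branch` unless you pass `--no-single-branch`, so other branches are not fetched either. It only trims history, though: the full working tree at the tip still has to be downloaded and checked out.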