
Just because people are talking about it: I work at MSFT, and the numbers Wired quotes for the lines of code in Windows are not even close to being correct. Not even in the same order of magnitude.

Their source claims that Windows XP has ~45 million lines of code. But that was 14 years ago. The last time Windows was even in the same order of magnitude as 50 million LOC was in the Windows Vista timeframe.

EDIT: And, remember: that's for _one_ product, not multiple products. So an all-flavors build of Windows churns through a _lot_ of data to get something working.

(Parenthetical: the Windows build system is correspondingly complex, too. I'll save the story for another day, but to give you an idea of how intense it is, in a typical _day_, the amount of data that gets sent over the network in the Windows build system is a single-digit _multiple_ of the entire Netflix movie catalog. Hats off to those engineers, Windows is really hard work.)

> the numbers Wired quotes for the lines of code in Windows are not even close to being correct

OTOH you can't blame them for being incorrect if you (as in, Microsoft, not you personally) are being so secretive about the figures. I'm pretty sure everyone would love to see how Microsoft works internally, especially now that you teased us with that Windows build system.

Yes you can blame them for being incorrect. If they don't have a correct and relatively up-to-date figure, they should be clear about that. "Microsoft declined to comment on how many lines of code Windows has now" or "Windows XP used 45 million lines of code, but that was 14 years ago so it's not a very good comparison to anything".

They're not only incorrect but also stating facts out of context. Furthermore, Microsoft doesn't have any obligation to expose their real numbers if they don't want to.

If you need to cite numbers, at least give the context instead of pulling some number out of the blue from 15 years ago and assuming it's still the same Windows.

This makes me think of a parallel world, where all the big IT companies show off their LOC regularly, starting a "LOC war", as if it would mean anything of value.

The reason Google developed Piper was basically that Perforce couldn't scale beyond a single machine, and our repo was getting too large to even be usable on the biggest, beefiest x86 server that money could buy (single machines with terabytes of ram and zillions of cores).

If Microsoft has close to the same amount of code in a single repository, then they must have also written their own version control service that runs on more than one machine.

The last rumor I heard is that Microsoft bought a license to the Perforce source code, and created their own flavor to host internal code ("Source Depot"?), which presumably still runs on a single machine.

> single-digit _multiple_ of the entire Netflix movie catalog

Strange unit of comparison, although I may start using it.

Facebook gets a Flickr worth of photos every few days.

As someone who subscribes to ArchiveTeam's philosophy, it's going to be a dark day when the time comes to scrape Facebook before it goes under with that much data behind the scenes.

It's already backed up to Blu-ray. They'll just hand it over.

Yeah, right. How and who exactly would they hand that over to, given that there are privacy settings on the photos that people expect to have respected?

Sarcasm, son...

You'd be surprised at how many people assume these bigcos are open to doing the right thing when they're shutting down.

Hey, I've still got my complimentary MySpace zip file kicking around somewhere.

I had a hard drive crash on me with all of my photos some years back and my "backup" strategy failed. Dumping a myspace backup got me some of my most precious photos back. Thanks MySpace!

Even more. Most people don't even realize that things like FB aren't eternal.

It isn't even clear what "the right thing" is here.

I wonder if this system could be used for 'burns'.

Such as: "Hacker News has an Ask dot com userbase's worth of good posters" (This obviously does not include me ;))

How many VW Bugs is that?

Well, I remember when Library of Congress and encyclopedias were used as units of measure. I would guess Netflix is the new stack of media.

Uh, how much is it actually? (Simple searching didn't seem to get me the answer.)

We should have a list of these things.

I did the same thing. I suspect that it's just "A LOT".

That's the same as saying "the same order of magnitude"

But while "1" and "9" are the same order of magnitude, so are "1" and "0.1". To say something is within an order of magnitude of Netflix, you cover a possible range from 0.1 to 10 times the Netflix catalog, a range whose upper end is a hundred times its lower end...

Um, 0.1 and 1 are not considered the same order of magnitude.

I was abbreviating. We're talking about an open end of the interval, right?

He didn't say "within an order of magnitude", he said, "the same order of magnitude." There is a difference.

So you think he meant within a factor of 3 and 1/3? That's not what the original Netflix comment said.

"The same order of magnitude" means a multiplicative constant of a number from [1,10). 1/3 is decreasing by an order of magnitude. That is the typical usage in the sciences. Anyway, it's splitting hairs over something even less worthy of splitting hairs over.

> But while "1" and "9" are the same order of magnitude, so are "1" and "0.1". To say something is within an order of magnitude of netflix, you cover a possible range from 0.1 to 10 times the Netflix catalog

"is same order of magnitude as" is not a transitive relationship.

You seem to live in a base-tenny world. Many nerdies train themselves to do hex mental math in kindergarten, and the geekies do it base-36 because those extra letters "are there". So perhaps a 256 to 1296 multiplication ratio (those 2 numbers there are in base 10).

The order of magnitude is well known to be defined in base 10. That's regardless of your base of choice.
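For what it's worth, the two readings being argued over in this subthread are easy to pin down in a few lines. This is only a sketch: the function names `magnitude_bucket` and `within_factor_of_ten` are made up for illustration, and both definitions are my paraphrase of the comments above (the base-10 "same bucket" reading versus the looser "ratio under 10" reading), not anything official.

```python
import math

def magnitude_bucket(x: float) -> int:
    """Base-10 exponent k with 10**k <= x < 10**(k + 1); x must be positive."""
    return math.floor(math.log10(x))

def within_factor_of_ten(a: float, b: float) -> bool:
    """Looser reading: the larger value is less than 10x the smaller one."""
    lo, hi = sorted((a, b))
    return lo > 0 and hi < 10 * lo

# Bucket reading: 1 and 9 share a bucket, 1/3 falls one bucket lower.
print(magnitude_bucket(1), magnitude_bucket(9), magnitude_bucket(1 / 3))  # 0 0 -1

# Ratio reading is not transitive: 1~9 and 9~80 hold, but 1~80 does not.
print(within_factor_of_ten(1, 9), within_factor_of_ten(9, 80), within_factor_of_ten(1, 80))  # True True False
```

Under the bucket reading "same order of magnitude" is an equivalence relation; under the ratio reading it is not, which seems to be where the disagreement comes from.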


I've only heard people talk about binary orders of magnitude; hexadecimal, no.

I assume they don't use msbuild, because they don't completely hate themselves.

So, how many lines of code does it have?

I can't say. I work here, but I don't speak for the company.

Wouldn't it be nice if by working there you felt empowered to speak for the company? The company is you and your colleagues.

I'd bet he has a lot of colleagues more qualified than you to question what he cares to reveal about their collective private intellectual property.


Not sure I understand your point. The source the article refers to claims Windows XP is ~45 million LOC, not me. I myself didn't give any specific numbers about the size of the codebase.

Now, I don't think anyone would come for my head if I did give you a number -- what harm could it do, after all? But, personally, the line I draw is that I don't get too specific about internal data beyond a general order of magnitude, because I'm not here to speak for the company.

That figure is common knowledge, not new information that antics is sharing with us. https://en.wikipedia.org/wiki/Source_lines_of_code#Example

And why is that number a corporate secret?

I doubt it is, and I don't think I'd get in trouble for sharing it.

But, it's not my job to decide whether or not that information should be shared, because it's not my job to speak for the company.

I'm having a hard time understanding why people think this is not a reasonable position.

Because in a room full of interested people you said "I know something you don't." and the "ner ner ne-ner ner" was implied.


> Seems like it gives you a hard-on [...] please stop acting like you're the shit

This comment breaks the HN guidelines. We ban accounts that do this repeatedly. Please post civilly and substantively, or not at all.



I agree. The corporate policy is to not disclose that number, so all that the parent comment translates to is an appeal to authority. "You're wrong because I work for Microsoft and I know better!".

But we're not told what the correct number is, and we have no way of assessing the validity of the claim anyway, so the whole discussion is completely vacuous.

I would agree in the sense that LOC isn't an informative metric at all, since there is surely a lot of auto-generated and copy & paste code in there, likely superlinearly more than in a medium-sized software project. It doesn't matter whether it's off by some orders of magnitude, as it's beyond imagination and comparison anyway.

If combined with other data, it could show how efficient or inefficient programmers are.

Well, you just said what it doesn't have, so I guess you speak for the company after all.

Anyway, let me guess. Judging by how the size of all binaries shipped with Windows varied between releases, I'd be inclined to think Windows 10 does not have significantly more lines of code than Windows Vista.

So I'd guess at most 100 million lines of code?

You are of course free to consult your employer and draw your line however you like.

For me, I'm comfortable saying that I don't speak for the company and leaving the numbers within an order of magnitude. When it becomes my job to decide which numbers are and aren't fit to talk about publicly, I'm happy to update you.

I wish you'd use standard numbers... like "Libraries of Congress".



It's a pretty hard number to come up with. Most employees only have access to a small fraction of the codebase. Even if you had access to all of it, it's hard to say what actually counts as Windows and what doesn't.

And they probably didn't count all of the test crap and IDW tools that nobody's used since Bill Gates was there, but still get built every time.

Would you rather work on the Linux kernel instead of Windows?

What's the relation to the post to which you're replying?

I once torrented a ripped Windows XP + Office 2003 bundle, about 120MB altogether. The basic functionality worked great.
