I (and others) have put a lot of effort into making the Linux Chrome build fast. Some examples are multiple new implementations of the build system ( http://neugierig.org/software/chromium/notes/2011/02/ninja.h... ), experimentation with the gold linker (e.g. measuring and adjusting the still off-by-default thread flags https://groups.google.com/a/chromium.org/group/chromium-dev/... ) as well as digging into bugs in it, and other underdocumented things like 'thin' ar archives.
But it's also true that people who are more of Windows wizards than I am a Linux apprentice have worked on Chrome's Windows build. If you asked me the original question, I'd say the underlying problem is that on Windows all you have is what Microsoft gives you and you can't typically do better than that. For example, migrating the Chrome build off of Visual Studio would be a large undertaking, large enough that it's rarely considered. (Another way of phrasing this is it's the IDE problem: you get all of the IDE or you get nothing.)
When addressing the poor Windows performance, people first bought SSDs, something that never even occurred to me ("your system has enough RAM that the kernel's file system cache should hold everything anyway!"). But for whatever reason, on the Linux side some Googlers saw fit to rewrite the Linux linker to make it twice as fast (that effort predated Chrome), and all Linux developers now get to benefit from it. Perhaps the difference is that when people write awesome tools for Windows or Mac, they try to sell them rather than give them away.
Including new versions of Visual Studio, for that matter. I know that Chrome (and Firefox) use older versions of the Visual Studio suite (for technical reasons I don't quite understand, though I know people on the Chrome side have talked with Microsoft about the problems we've had with newer versions), and perhaps newer versions are better in some of these metrics.
But with all of that said, as best as I can tell Windows really is just really slow for file system operations, which especially kills file-system-heavy operations like recursive directory listings and git, even when you turn off all the AV crap. I don't know why; every time I look deeply into Windows I get more afraid ( http://neugierig.org/software/chromium/notes/2011/08/windows... ).
Unlike on Linux, it's quite difficult to include system header files piecemeal because of the years of accumulated dependencies. If you want to use the OS APIs for opening files, creating/using critical sections, or managing pipes, you will either find yourself forward-declaring everything under the sun or including Windows.h, which alone, even with symbols like VC_EXTRALEAN and WIN32_LEAN_AND_MEAN defined, will noticeably impact your build time.
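For reference, those macros have to be defined before the include; a minimal sketch (this still pulls in a lot of declarations, which is the point being made):

```c
/* Illustrative only: trimming what Windows.h drags in.
   Even with these macros the header remains very large. */
#define WIN32_LEAN_AND_MEAN  /* drop cryptography, DDE, RPC, shell, winsock */
#define VC_EXTRALEAN         /* drop further rarely-used declarations */
#define NOMINMAX             /* avoid the min/max macros clashing with the STL */
#include <windows.h>
```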
DirectX header files are similarly massive. Even headers that seem relatively benign (the Dinkumware STL <iterator> file that MS uses, for example) end up bringing in a ton of code. Try this -- create a file that contains only:
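The comment's example appears to have been cut off; a hedged reconstruction of the experiment (the file name and the exact measurement commands are my guesses) is to preprocess a file holding just that one include and look at how much text comes out:

```
REM just_iterator.cpp contains the single line:  #include <iterator>
REM /P writes the preprocessed output to just_iterator.i instead of compiling
cl /P just_iterator.cpp

REM count the lines of the preprocessed result
find /c /v "" just_iterator.i
```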
MSBuild does suck in that there is little implicit parallelism, though you can hack around it. I suspect much of the Windows build slowness comes from that lack of parallelism in MSBuild.
As for directory listings, it may help to turn off atime updates, and if it's a laptop, to enable write caching to main memory. I'm not quite sure why Windows file system calls are so slow; I do know that NTFS supports a lot of neat features that are lacking in the ext file systems, like auditing.
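A sketch of the atime tweak on Windows (fsutil needs an elevated prompt, and a reboot may be required before the change takes effect):

```
REM query the current setting (0 = last-access updates enabled)
fsutil behavior query disablelastaccess

REM disable last-access-time updates on NTFS
fsutil behavior set disablelastaccess 1
```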
As for the bug mentioned, it's perfectly simple to load the wrong version of libc on Linux, or to hook kernel calls the wrong way. People hook calls on Windows because the kernel is not modifiable and has a strict ABI. That's a disadvantage if you want to modify the behavior of Win32/kernel functions, but a huge advantage if you want to write, say, graphics drivers and have them work after a system update.
Microsoft doesn't recommend hooking Win32 calls for exactly the reasons outlined in the bug: if you do it wrong, you screw stuff up. On the other hand, Rubyists seem to love the idea that you can change what a function does at any time; I think they call it 'dynamic programming'. I can make lots of things crash on Linux by patching /etc/ld.so.conf so that a malware version of libc is loaded. I'd hardly blame the design of Windows when malware has been installed.
Every OS/kernel involves design trade-offs, and not every trade-off will be productive for a given use case.
Access times: According to the comments on that blog entry (and according to all the search hits I could find), atime is already disabled by default on Windows 7, at least for new/clean installs.
The reverse, incidentally, was usually okay. If you could build it with MSBuild, it usually worked in Visual Studio unless you used a lot of custom tasks to move files around.
I personally believe the fact that Visual Studio is all but required to build on Windows is one of the most common reasons you don't see much Windows-friendly OSS, aside from projects that are Java-based.
You don't necessarily have to use VS to develop on Windows. MinGW works quite well for a lot of cross-platform things, and it is GCC-based and works with GNU make.
My experience with porting OSS between Windows and Linux (both ways) has been that very few developers take the time out to encapsulate OS specific routines in a way that allows easy(ier) porting. You end up having to #ifdef a bunch of stuff in order to avoid a full rewrite.
This is not a claim that porting is trivial. You do run into subtle and not-so-subtle issues anyway. But careful design can help a lot; then again, that requires starting out with portability in mind.
I also used Digital Mars for the same, but DMC sometimes fails with big builds.
I use Visual Studio now because I'm using DirectX and I just want something that works out of the box.
/m enables parallelism on the .NET side. Perhaps the same thing exists for the native compiler?
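For what it's worth, I believe these are the two separate knobs (verify the flags against your toolchain's docs): msbuild's /m parallelizes across projects, while the compiler's /MP parallelizes source files within one project:

```
REM build projects in parallel, one build node per core
msbuild Solution.sln /m

REM compile source files within a project in parallel
REM (can also be set per-project as <MultiProcessorCompilation>)
cl /MP foo.cpp bar.cpp baz.cpp
```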
Even better, you can cross-compile with MinGW. So if your toolchain doesn't perform well on Windows, just use GCC as a cross compiler and build your stuff on a Linux or BSD machine. Then use Windows for testing the executable. (On smaller projects you usually don't even need Windows for that, since Wine does the job as well.)
(Full disclosure: I'm the maintainer of a Free Software project that makes cross compiling via MinGW very handy: http://mingw-cross-env.nongnu.org/)
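A minimal sketch of that cross-compiling workflow (the target triplet i686-w64-mingw32 is one common spelling; yours may differ depending on the distribution):

```shell
# cross-compile a Windows executable on a Linux/BSD box
i686-w64-mingw32-gcc -O2 -o hello.exe hello.c

# smoke-test it without leaving the machine
wine hello.exe
```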
(Granted, there might be windows-specific stuff. I haven't checked the windows code)
What we weren't able to compare between the compilers was link-time optimization and profile-guided optimization since Microsoft crippled VC express by removing these optimizations.
So when someone claims that 'VC++ generates significantly better code than GCC', I want to see something backing that up. Had I made a blanket statement that 'GCC generates significantly better code than VC++', someone would call on me to back that up as well, and rightly so.
Also, if you were doing anything heavily floating-point, MSC 2010 would be a bad choice because it doesn't vectorize. Even internally at Microsoft, we were using ICC for building some math stuff. The next release of MSC fixes this.
It would have been interesting to compare the quality of the respective compilers' PGO/LTO optimizations (particularly PGO, given that GCC and ICC code is sometimes up to 20% faster with that optimization), but not interesting enough for us to purchase a Visual Studio licence.
And yes, we use floating point math in most of our code, and if MSC doesn't vectorize, that would certainly explain worse performance. However, this further undercuts the blanket statement 'VC++ generates significantly better code than GCC' that I was responding to.
PGO, on the other hand, seems very unlikely to fail due to memory constraints; at least I've never come across that happening. The binary produced for the profiling stage will of course be bigger, since it contains instrumentation, but I doubt the final compilation stage requires a lot more memory, even though it examines the generated profile data when optimizing the final binary.
It seems weird that PGO would not work with Chromium, given that it's used in Firefox (which is not exactly a small project) to give a very noticeable speed boost. (Remember the 'Firefox runs faster under Wine with the Windows binary than with the native Linux binary' debacle? That was back when Linux Firefox builds didn't use PGO while the Windows builds did.)
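For readers unfamiliar with the mechanics, the GCC PGO cycle under discussion is a three-step build; the program and workload names here are placeholders:

```shell
# 1. build an instrumented binary
gcc -O2 -fprofile-generate -o app main.c

# 2. run it on a representative workload to produce .gcda profile data
./app typical-input.txt

# 3. rebuild, letting the compiler consume the recorded profile
gcc -O2 -fprofile-use -o app main.c
```

The profiling binary is the one that is bigger and slower; the final binary carries no instrumentation.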
If you make a claim such as 'GCC produces significantly worse code than alternate compiler A' then it's completely reasonable to ask for something to support it. Tone wise perhaps the post could have been improved, but the principle stands.
Well, if that was true, then you could just buy the better tool and use it, right? I suspect they don't exist because
a) On Linux, you have the existing linker to build your better one on. On Windows, you'd have to write your own from scratch, making it less appealing to attempt (the only people who'd buy it would be those running huge projects like Chrome).
b) What you said about the file system itself just being plain slow.
PS: (Long time follower of evan_tech - nice to see you popup around here :) )
There's also the issue that it seems to slow down exponentially with directory size. In short, worst FS for building large software.
As for the OP's build time complaint about Xcode - don't do that. Use the make build instead. Not only does Xcode take forever to determine whether it needs to build anything at all, it caches dependency computations, so if you also make modifications outside of Xcode, welcome to a world of pain :) (I know evmar knows, but for the OP: use GYP_GENERATORS=make gclient runhooks to generate makefiles for Chromium on OS X)
Apple's switching from gcc to clang/llvm, and doing a lot of work on the latter, which is open source.
Every file on Windows keeps ~1024 bytes of info in the file cache; the more files, the more cache is used.
A recent finding that sped up a timestamp check over 300,000+ files on our systems from 15 s to 3 s was moving from _stat to GetFileAttributesEx.
One would not think of doing such things; after all, the C API is nice, and open, access, read, _stat are almost all there. But some of these functions do a lot of CPU-intensive work (and one is not aware of it until a little bit of disassembly is done).
For example, _stat does lots of divisions, a dozen kernel calls, strcmp, strchr, and a few other things. If you have Visual Studio (or even Express), the CRT source code is included for one to see.
access(), for example, is relatively "fast", in the sense that there is just a mutex (I think; I'm on OS X right now) and then a call to GetFileAttributes.
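A hedged sketch of the substitution described above (Win32-only; the file name is a placeholder and error handling is trimmed): instead of _stat, ask the OS for just the attributes and timestamps.

```c
/* Sketch: fetch a file's last-write time without _stat's extra work
   (no mode-bit synthesis, no time-format conversions). Win32 only. */
#include <windows.h>
#include <stdio.h>

int main(void) {
    WIN32_FILE_ATTRIBUTE_DATA data;
    if (GetFileAttributesExA("some_file.txt", GetFileExInfoStandard, &data)) {
        /* Raw FILETIME is enough for a "did this change?" comparison. */
        printf("last write: %08lx%08lx\n",
               (unsigned long)data.ftLastWriteTime.dwHighDateTime,
               (unsigned long)data.ftLastWriteTime.dwLowDateTime);
    }
    return 0;
}
```

If all you need is change detection, comparing the two FILETIME words directly avoids the Unix-time conversion that _stat performs.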
And back to RamMap - it's very useful in the sense that it shows you which files are in the cache and what portion of them, and also in that it can flush the cache, so one can run several tests and optimize for both hot-cache and cold-cache situations.
A few months ago I came up with a scheme, borrowing an idea from Mozilla, where they pre-read (in a second thread) certain DLLs that would eventually be loaded.
I did the same for one of our tools. The tool is single-threaded: it reads, operates, then writes, and it usually reads 10x more than it writes. The processing stage was made multi-threaded through OpenMP, but reading was not, so instead I had a second thread read ahead from a list of files, so that when the first thread wanted to read, it got the data from the cache. If the pre-fetcher fell behind, it skipped ahead. There was no need to even keep the contents in memory - just reading them was enough.
For another tool, where reading patterns could not be found easily (a deep tree hierarchy), I made something else instead: saving in a binary file what was read before for the given set of command-line arguments (filtering some), and reusing that later. It cut certain operations down by 25-50%.
One lesson I've learned, though, is to let the I/O do its job from one thread... unless everyone has some proper OS, with some proper drivers, with some proper HW, with....
Tim Bray's Wide Finder and Wide Finder 2 competitions also had good information. The guy who won it has some good analysis of multi-threaded I/O on his site (I can't find it now)... But the gist was that it's too erratic - sometimes you get speedups, but sometimes you get slowdowns (especially with writes).