
I don't know this poster, but I am pretty familiar with the problem he's encountering, as I am the person most responsible for the Chrome build for Linux.

I (and others) have put a lot of effort into making the Linux Chrome build fast. Some examples are multiple new implementations of the build system ( http://neugierig.org/software/chromium/notes/2011/02/ninja.h... ), experimentation with the gold linker (e.g. measuring and adjusting the still off-by-default thread flags https://groups.google.com/a/chromium.org/group/chromium-dev/... ) as well as digging into bugs in it, and other underdocumented things like 'thin' ar archives.

But it's also true that people who are more Windows wizards than I am a Linux apprentice have worked on Chrome's Windows build. If you asked me the original question, I'd say the underlying problem is that on Windows all you have is what Microsoft gives you, and you typically can't do better than that. For example, migrating the Chrome build off of Visual Studio would be a large undertaking, large enough that it's rarely considered. (Another way of phrasing this is that it's the IDE problem: you get all of the IDE or you get nothing.)

When addressing the poor Windows performance, people first bought SSDs, something that never even occurred to me ("your system has enough RAM that the kernel cache of the file system should be in memory anyway!"). But for whatever reason on the Linux side some Googlers saw fit to rewrite the Linux linker to make it twice as fast (this effort predated Chrome), and all Linux developers now get to benefit from that. Perhaps the difference is that when people write awesome tools for Windows or Mac they try to sell them rather than give them away.

Including new versions of Visual Studio, for that matter. I know that Chrome (and Firefox) use older versions of the Visual Studio suite (for technical reasons I don't quite understand, though I know people on the Chrome side have talked with Microsoft about the problems we've had with newer versions), and perhaps newer versions are better in some of these metrics.

But with all of that said, as best as I can tell Windows really is just slow for file system operations, which especially kills file-system-heavy operations like recursive directory listings and git, even when you turn off all the AV crap. I don't know why; every time I look deeply into Windows I get more afraid ( http://neugierig.org/software/chromium/notes/2011/08/windows... ).




When developing on Windows it pays significant dividends to manage your include files. There are a number of files provided as part of Visual Studio and/or the Windows SDK that bring in a tremendous number of other large files.

Unlike on Linux, it's quite difficult to perform piecemeal inclusion of system header files because of the years of accumulated dependencies. If you want to use the OS APIs for opening files, creating/using critical sections, or managing pipes, you will either find yourself forward declaring everything under the sun or including Windows.h, which alone, even with symbols like VC_EXTRALEAN and WIN32_LEAN_AND_MEAN defined, will noticeably impact your build time.
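
For what it's worth, the usual mitigation looks roughly like this (the wrapper header is hypothetical; the macros are the standard ones recognized by the Windows headers):

    // slim_windows.h (hypothetical wrapper): the one place windows.h gets included.
    #ifndef SLIM_WINDOWS_H_
    #define SLIM_WINDOWS_H_

    #define WIN32_LEAN_AND_MEAN  // drops winsock, OLE, DDE, RPC, shell, crypto, ...
    #define VC_EXTRALEAN         // trims a bit more on top of that
    #define NOMINMAX             // keeps the min()/max() macros from clobbering std::min/std::max
    #include <windows.h>

    #endif  // SLIM_WINDOWS_H_

Even trimmed this way, windows.h is still heavy; the macros just cut the worst of it.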

DirectX header files are similarly massive. Even header files that seem relatively benign (the Dinkumware STL <iterator> file that MS uses, for example) end up bringing in a ton of code. Try this -- create a file that contains only:

    #include <vector>

Preprocess it with GCC (g++ -E foo.cpp -o foo.i) and MSVC (cl -P foo.cpp) and compare the results -- the MSVC 2010 version is seven times the size of the GCC 4.6 (Ubuntu 11.10) version!


I came here to say this. Any time Unix people complain about slow builds "on Windows," 9 times out of 10 it's because they're ignoring precompiled headers. The other 1 time out of 10 it's because they're using configure scripts under Cygwin and finding out how expensive Cygwin's implementation of fork() can be.
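
For reference, a minimal MSVC precompiled-header setup looks roughly like this (pch.h/pch.cpp are made-up names; /Yc and /Yu are the create/use-PCH compiler switches):

    // pch.h (hypothetical): every heavy, rarely changing header goes here once.
    #pragma once
    #define WIN32_LEAN_AND_MEAN
    #include <windows.h>
    #include <map>
    #include <string>
    #include <vector>

    // pch.cpp contains just `#include "pch.h"` and is compiled with /Yc"pch.h"
    // to create the .pch file; every other .cpp starts with #include "pch.h"
    // and is compiled with /Yu"pch.h", so these headers are parsed once per
    // build instead of once per translation unit.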


You can build almost any Visual Studio project without using Visual Studio at all. Visual Studio project files are also MSBuild files. I've set up lots of build machines sans Visual Studio, and projects build just fine without it.

MSBuild does suck in that there is little implicit parallelism, but you can hack around it. I have a feeling that the Windows build slowness probably comes from that lack of parallelism in MSBuild.

As for directory listings, it may help to turn off atime, and if it's a laptop, enable write caching to main memory. I'm not quite sure why Windows file system calls are so slow; I do know that NTFS supports a lot of neat features that are lacking in the ext file systems, like auditing.

As for the bug mentioned, it's perfectly simple to load the wrong version of libc on Linux, or hook kernel calls the wrong way. People hook calls on Windows because the kernel is not modifiable and has a strict ABI. That's a disadvantage if you want to modify the behavior of Win32/kernel functions, but a huge advantage if you want to write, say, graphics drivers and have them work after a system update.

Microsoft doesn't recommend hooking Win32 calls for the exact reasons outlined in the bug: if you do it wrong, you screw stuff up. On the other hand, Rubyists seem to love the idea that you can change what a function does at any time; I think they call it 'dynamic programming'. I can make lots of things crash on Linux by patching ld.so.conf so that a malware version of libc is loaded. I'd hardly blame the design of Windows when malware has been installed.

Every OS/kernel involves design trade-offs, and not every trade-off will be productive for a given use case.


I agree with your MSBuild points; in fact, I don't think it's that bad a build system, just under-utilized (hidden, for most users, behind the shiny GUI tools).

Access times: According to the comments on that blog entry (and according to all search hits that I could find, see for example [1]) atime is already disabled by default on Windows 7, at least for new/clean installs.

1: http://superuser.com/questions/200126/are-there-any-negative...


Regarding MSBuild, the biggest problem I had with it is that if you built projects with Visual Studio, using most of the standard tooling for adding references and dependencies, you'd often be left with a project that built fine with Visual Studio, but had errors with MSBuild.

The reverse, incidentally, was usually okay. If you could build it with MSBuild, it usually worked in Visual Studio unless you used a lot of custom tasks to move files around.

I personally believe the fact that Visual Studio is all but required to build on Windows is one of the most common reasons you don't see much OSS that is Windows friendly, aside from projects that are Java based.


> I personally believe the fact that Visual Studio is all but required to build on Windows is one of the most common reasons you don't see much OSS that is Windows friendly, aside from projects that are Java based

You don't necessarily have to use VS to develop on Windows. MinGW works quite well for a lot of cross-platform things; it is GCC, and it works with GNU make.

My experience with porting OSS between Windows and Linux (both ways) has been that very few developers take the time to encapsulate OS-specific routines in a way that allows easy(ier) porting. You end up having to #ifdef a bunch of stuff in order to avoid a full rewrite.

This is not a claim that porting is trivial. You do run into subtle and not-so-subtle issues anyway. But careful design can help a lot. Then again, this requires that you start out with portability in mind.
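
A tiny sketch of the kind of encapsulation I mean (names made up; the point is that the #ifdefs live in one place instead of being scattered all over the code base):

    // file_util.h -- hypothetical portable wrapper; callers never see the OS API.
    #pragma once
    #include <string>

    bool FileExists(const std::string& path);

    // file_util.cc -- the only file that knows which OS it's on.
    #include "file_util.h"
    #ifdef _WIN32
    #include <windows.h>
    bool FileExists(const std::string& path) {
      return GetFileAttributesA(path.c_str()) != INVALID_FILE_ATTRIBUTES;
    }
    #else
    #include <sys/stat.h>
    bool FileExists(const std::string& path) {
      struct stat st;
      return stat(path.c_str(), &st) == 0;
    }
    #endif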


I like to make multi-platform code, and I do it with CMake, Boost, and Qt. My target platforms are Linux/g++ and Visual Studio (not mingw). It usually works OK after a little tweaking, but you have to maintain discipline on whichever system you're coding on, and not use nonportable pragmas etc.


I used to build wxWidgets and all my personal Windows software using MinGW and GCC. Sadly, GCC is far, far slower than other compilers on Windows (and once or twice I have run into compiler bugs in GCC).

I also used Digital Mars for the same, but DMC sometimes fails with big builds.

I use Visual Studio now because I'm using DirectX and I just want something that works out of the box.


> MSBuild does suck in that there is little implicit parallelism

/m enables parallelism on the .NET side. Perhaps the same thing exists for the native compiler?


What is preventing you from using MinGW? That way, you could use the GNU toolchain (Make, GCC, Binutils etc.) and still have full access to the Win32 API. You could reuse almost all of your Unix build scripts, and the rest usually boils down to making your scripts aware of different file extensions (.exe/.dll instead of no extension/.so).

Even better, you can do cross compiling with MinGW. So if your toolchain doesn't perform well on Windows, just use GCC as a cross compiler and build your stuff on a Linux or BSD machine. Then use Windows for testing the executable. (On smaller projects, you usually don't even need Windows for that, since Wine does the job as well.)

(Full disclosure: I'm the maintainer of a Free Software project that makes cross compiling via MinGW very handy: http://mingw-cross-env.nongnu.org/)


VC++ generates significantly better code than GCC. Enough so that performance-minded projects usually wouldn't consider MinGW/GCC for Windows code.


Does it matter during development though? You could always develop on GNU toolchain and then make a final build in VC once the feature code is complete.


I actually don't understand the slowness the original poster sees with Windows. We use Perforce and VC++ .slns, sometimes with apps split into DLLs, and get none of the slowness the poster talks about. Actually, we get significantly better performance with this than under Unix with GCC. No-change or small-change rebuilds take a few seconds, with the time dominated by the link and proportional to the link size.


Only if you are careful to use just the subset of the C++ spec supported by both compilers and avoid all GCC-specific features.


In my experience, you don't have to be careful. I've written lots of C++ that compiles fine on Windows (using MinGW/MSYS) or Linux/Mac using GCC. Can you provide an example of where GCC-specific features are included without the developer explicitly doing so?


This is pretty common and not particularly difficult. Most large cross-platform C++ projects (i.e. most browsers and game engines) compile with both GCC and MSVC. It is easy to naively write code in one compiler that won't build in another, but it's also easy to fix said code once you try to build it with the other compiler.


I mentioned that in the context of an application which is cross-platform already; when something is being developed, it affects (in most cases, anyway) all platforms in the same way, so this limitation is "built in" already.


Which might be a good choice for working on Chromium anyway, what with it being cross-platform :)

(Granted, there might be windows-specific stuff. I haven't checked the windows code)


They would be doing this anyway.


Please show me something to back this nonsense up.


How about you back up calling it nonsense?


We've done compiler tests between VC Express 2010 and GCC (MinGW GCC 4.6.x branch) at work, with GCC beating VC Express at '-O3 -ffast-math -march=corei7' vs '/O2 /arch:SSE2' for our code on an Intel Core i7 (Nehalem). GCC even beat ICC on certain tests on that same Intel hardware.

What we weren't able to compare between the compilers was link-time optimization and profile-guided optimization since Microsoft crippled VC express by removing these optimizations.

So when someone claims that 'VC++ generates significantly better code than GCC', I want to see something backing that up. Had I made a blanket statement that 'GCC generates significantly better code than VC++', someone would call on me to back that up as well, and rightly so.


So when you didn't use the two most important perf features in MSC, its performance was underwhelming. This is no surprise.

Also, if you were doing anything heavily floating-point, MSC 2010 would be a bad choice because it doesn't vectorize. Even internally at Microsoft, we were using ICC for building some math stuff. The next release of MSC fixes this.


Well, we obviously didn't enable PGO/LTO for GCC either when doing these tests, as that would have been pointless.

It would have been interesting to compare the quality of the respective compilers' PGO/LTO optimizations (particularly PGO, given that GCC and ICC code is sometimes up to 20% faster with that optimization), but not interesting enough for us to purchase a Visual Studio licence.

And yes, we use floating point math in most of our code, and if MSC doesn't vectorize then that would certainly explain the worse performance. However, this further undercuts the blanket statement 'VC++ generates significantly better code than GCC' which I was responding to.


I believe that at least one of the projects the original blog post mentioned - Chromium - can't be compiled with LTO or PGO enabled. Apparently the linker just runs out of memory with it, as it does with most large projects.


Well, it makes sense that LTO would have high memory requirements, given that the point of the optimization is to look at the entire program as one entity rather than on a file-by-file basis, and I have no doubt this can cause problems with very large projects.

PGO, on the other hand, seems very unlikely to fail due to memory constraints; at least I've never come across that happening. The resulting code for the profiling stage will of course be bigger since it contains profiling code, but I doubt the compilation stage requires a lot more memory even though it examines the generated profiling data when optimizing the final binary.

It seems weird that PGO would not work with Chromium, given that it's used in Firefox (which is not exactly a small project) to give a very noticeable speed boost. (Remember the 'Firefox runs faster with the Windows binary under Wine than with the native Linux binary' debacle? That was back when Linux Firefox builds didn't use PGO while the Windows builds did.)


With respect, this is not how claims work. You can't make a claim and then expect your opponents to have the burden of proof to refute it.

If you make a claim such as 'GCC produces significantly worse code than alternate compiler A', then it's completely reasonable to ask for something to support it. Tone-wise, perhaps the post could have been improved, but the principle stands.


Chrome on Linux is awesome. At some point in the past year I stopped having to use command-line options to keep it in memory, as it now defaults to that. Brilliant and slick; I love it. Chrome's performance and syncing are the reason I was able to transition almost entirely over to Linux from Mac this year with very little workflow disruption.


Perhaps the difference is that when people write awesome tools for Windows or Mac they try to sell them rather than give them away.

Well, if that were true, then you could just buy the better tool and use it, right? I suspect such tools don't exist because

a) On Linux, you have the existing linker to build your better one on. On Windows, you'd have to write your own from scratch, making it less appealing for anyone to write (the only people who'd buy it would be those running huge projects like Chrome)

b) What you said about the file system itself just being plain slow.

PS: (Long-time follower of evan_tech - nice to see you pop up around here :) )


A large part of the Windows slowness is NTFS, as you allude to in your last sentence. There are myriad ways to slow the damn thing down, and none to make it significantly faster.

There's also the issue that it seems to slow down exponentially with directory size. In short, it's the worst FS for building large software.

As for the OP's build time complaint about Xcode - don't do that. Use the make build instead. Not only does Xcode take forever to determine whether it needs to build anything at all, it caches dependency computations, so if you make modifications outside of Xcode too, welcome to a world of pain :) (I know evmar knows, but for the OP: use GYP_GENERATORS=make gclient runhooks to generate makefiles for Chromium on OS X.)


The make build didn't exist at the time I posted that, AFAIK. The make build is oodles faster than the Xcode build and I use it now, of course, but it's still the slowest platform to build Chromium on by a wide margin.


"Perhaps the difference is that when people write awesome tools for Windows or Mac they try to sell them rather than give them away."

Apple's switching from gcc to clang/llvm, and doing a lot of work on the latter, which is open source.


On Windows, Sysinternals' RamMap is your friend. Also the Windows Internals book (5th edition, I think).

Every file on Windows keeps ~1024 bytes of info in the file cache. The more files, the more cache gets used.

A recent finding that sped up our systems from 15s to 3s on a timestamp check of 300,000+ files was to move from _stat to GetFileAttributesEx.

One would not think of doing such things; after all, the C API is nice (open, access, read, _stat are almost all there), but some of these functions do a lot of CPU-intensive work, and one is not aware of it until a little bit of disassembly is done.

For example, _stat does lots of divisions, a dozen kernel calls, strcmp, strchr, and a few other things. If you have Visual Studio (or even Express), the CRT source code is included for one to see.

access(), for example, is relatively "fast", in the sense that there is just a mutex (I think; I'm on OS X right now) and then a call to GetFileAttributes.
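
A minimal sketch of the _stat-to-GetFileAttributesEx switch for a timestamp check (error handling trimmed; one Win32 call fills a small struct instead of going through the CRT's stat machinery):

    #include <windows.h>

    // Returns the last-write time as a raw 64-bit FILETIME value, or 0 on failure.
    unsigned long long LastWriteTime(const char* path) {
      WIN32_FILE_ATTRIBUTE_DATA data;
      if (!GetFileAttributesExA(path, GetFileExInfoStandard, &data))
        return 0;
      ULARGE_INTEGER t;
      t.LowPart = data.ftLastWriteTime.dwLowDateTime;
      t.HighPart = data.ftLastWriteTime.dwHighDateTime;
      return t.QuadPart;
    }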

And back to RamMap: it's very useful in that it shows you which files are in the cache and what portion of them, and also in that it can flush the cache, so one can run several tests and optimize for hot-cache and cold-cache situations.

A few months ago, I came up with a scheme borrowing an idea from Mozilla, where they pre-read, in a second thread, certain DLLs that would eventually end up being loaded.

I did the same for one of our tools. The tool is single-threaded: it reads, operates, then writes, and it usually reads 10x more than it writes. While the processing stage could be made multi-threaded through OpenMP, reading could not, so instead I kept a list of what to read ahead in a second thread, so that by the time the first thread wanted to read a file, it came from the cache. If the pre-fetcher fell behind, it skipped ahead. There was no need to even keep the contents in memory; just reading them was enough.
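
Roughly, the read-ahead thread looked like this (a sketch with made-up names; the real thing also skips ahead whenever the main thread overtakes it):

    #include <cstdio>
    #include <string>
    #include <thread>
    #include <vector>

    // Second thread: stream through the files the main thread will soon need,
    // so the data is already in the OS file cache when it is read "for real".
    // The contents are thrown away immediately.
    void Prefetch(const std::vector<std::string>& files) {
      std::vector<char> buf(1 << 16);
      for (const std::string& path : files) {
        FILE* f = std::fopen(path.c_str(), "rb");
        if (!f) continue;
        while (std::fread(buf.data(), 1, buf.size(), f) > 0) {
          // Discard; the goal is only to warm the cache.
        }
        std::fclose(f);
      }
    }

    // Usage:
    //   std::thread warmer(Prefetch, std::cref(file_list));
    //   ... single-threaded read/process/write loop ...
    //   warmer.join();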

For another tool, where reading patterns cannot be found easily (deep-tree hierarchy), I did something else instead: saving in a binary file what was read before for the given set of command-line arguments (filtering some out). That was later reused, and it cut certain operations down by 25-50%.

One lesson I've learned, though, is to let the I/O do its job from one thread... unless everyone has a proper OS, with proper drivers, with proper HW, with...

Tim Bray's Wide Finder and Wide Finder 2 competitions also had good information. The guy who won it has some good analysis of multi-threaded I/O on his site (can't find it now). But the analysis was basically that it's too erratic: sometimes you get speedups, but sometimes you get slowdowns (especially with writes).


Thanks for the GetFileAttributesEx tip -- https://github.com/martine/ninja/commit/93c78361e30e33f950ee...


Firefox uses VS2005 for builds in order to support XP pre-SP3.


tl;dr: no one at Google uses Windows; a lot of time was spent optimizing the Linux build, and zero time was spent optimizing the Windows build.



