
24-core CPU and I can’t move my mouse - joebaf
https://randomascii.wordpress.com/2017/07/09/24-core-cpu-and-i-cant-move-my-mouse/
======
titzer
Full disclosure: I work for Google on Chrome.

A Chrome build is truly a computational load to be reckoned with. Without the
distributed build, a from-scratch build of Chrome will take at least 30
minutes on a Macbook Pro--maybe an hour(!). TBH I don't remember toughing out
a full build without resorting to goma. Even on a hefty workstation, a full
build is a go-for-lunch kind of interruption. It will absolutely _own_ a
machine.

How did we get here? Well, C++ and its stupid O(n^2) compilation complexity.
As an application grows, the number of header files grows because, as any sane
and far-thinking programmer would do, we split the complexity up over multiple
header files, factor it into modules, and try to create encapsulation with
getter/setters. However, to actually have the C++ compiler do inlining at
compile time (LTO be damned), we have to put the definitions of inline
functions into header files, which greatly increases their size and processing
time. Moreover, because the C++ compiler needs to see full class definitions
to, e.g., know the size of an object and its inheritance relationships, we
have to put the main meat of every class definition into a header file! Don't
even get me started on templates. Oh, and at the end of the day, the linker
has to clean up the whole mess, discarding _the vast majority_ of the
compiler's output due to so many duplicated functions. And this blowup can be
huge. A debug build of V8, which is just a small subsystem of Chrome, will
generate about 1.4GB of .o files which link to a 75MB .so file and 1.2MB
startup executable--that's an 18x blowup.
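
As a rough sanity check on that figure, here is a back-of-envelope calculation. It assumes decimal units (1 GB = 1000 MB) and takes the rounded sizes quoted above at face value; the variable names are purely illustrative.

```python
# Back-of-envelope check of the ~18x figure quoted above.
# Assumption: decimal units (1 GB = 1000 MB) and the rounded sizes from the comment.
object_files_mb = 1.4 * 1000   # ~1.4 GB of .o files
linked_output_mb = 75 + 1.2    # 75 MB .so plus 1.2 MB startup executable

blowup = object_files_mb / linked_output_mb
print(f"compiler output is about {blowup:.1f}x the size of the linked binaries")
```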

Ugh. I've worked with a lot of build systems over the years, including
Google's internal build system open sourced as Bazel. While these systems have
scaled C++ development far further than ever thought possible, and are
remarkable in the engineering achievement therein, we just need to step back
once in a while and ask ourselves:

Damn, are we doing this wrong?

~~~
mpweiher
> Damn, are we doing this wrong?

Yes, and I mean "we" as in "this industry".

I just recently talked to someone whose Swift framework(s) were compiling at
roughly 16 lines per second. Spurred on by Jonathan Blow's musings on compile
times for Jai[1], I started tinkering with tcc a little. It compiled a
generated 200KLOC file (lots of small functions) in around 200 ms.

Then there are Smalltalk and Lisp systems that are always up and pretty much
only ever compile the current method.

We also used to have true separate compilation, but that appears to be falling
out of favor.

Of course none of these are C++ and they also don't optimize as well etc. Yet,
how much of the code really needs to be optimized that well? And how much of
that code needs to be recompiled that much?

So we know how to solve this in principle, we just aren't putting the pieces
together.

[1]
[https://www.youtube.com/watch?v=14zlJ98gJKA](https://www.youtube.com/watch?v=14zlJ98gJKA)
He mentioned that a decent size game should compile either instantly or in a
couple of seconds

~~~
andrepd
Are you serious? You want to make a product: a Web browser. What technology
are you going to choose? The one that makes your browser fast but gives you
more work, or the one that makes your browser slower but _makes compilation
less of a nuisance to you_? It's mind boggling that some would openly say
that, hey, who cares about performance that much, these compilation times
bother me, the developer. On a browser of all things! Would you be okay with
your browser being 2x or 3x slower?

~~~
gizmo
False dilemma. You can have both. It's okay if a fully optimized release mode
binary takes a bit longer to compile, but compiling a few million lines of
code for a debug build shouldn't take more than a second or two.

Also consider the leverage factor. Improvements to the compiler benefit all
users of the programming language, so it's worthwhile to invest in high
quality compilers.

~~~
striking
In what language can we have both?

Yes, we can use caching compilers
([https://wiki.archlinux.org/index.php/ccache](https://wiki.archlinux.org/index.php/ccache))
to speed up builds with few changes. We can lower optimization levels
(although that barely gives you an increase in compile speed compared to the
runtime speed you lose and the fact that it makes your program do slightly
different things).

There's no slider from "pessimum" to "optimum". You need to do wildly
different things to optimize past this point for compile speed. Erlang hot-
reload and at-runtime-code-gen from other langs come to mind. But that will
almost definitely slow down your program because of the new infrastructure
your code has to deal with.

I have observed that there can be a nice balance with Java and the auto-
reloading tools that are available for it. But I am unaware of their
limitations and how a web browser might trigger those limitations.

~~~
Shorel
> In what language can we have both?

D has very fast compilation times compared with C++.

Rust is another option.

~~~
twic
Rust doesn't compile very quickly at the moment. Helpfully, there's a live
thread about the matter on r/rust [1]. Broadly speaking, it's about the same
as C++. Some aspects are faster, some slower. Points worth noting (from that
thread and elsewhere):

* Everything up to and including typechecking (and borrowchecking) takes a third to a half of the time, with lowering from there to a binary taking the rest of the time; that means (a) you can get a 2-3x speedup if you only need to check the code is compilable, and (b) overall speed isn't likely to improve a lot unless LLVM gets a lot faster.

* Rust doesn't currently do good incremental compilation, so there are potential big wins for day-to-day use there.

* There is a mad plan to do debug builds (unoptimised, fast, for minute-to-minute development) using a different compiler backend, Cretonne [2]. If that ever happens, it could be much, much faster.
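
The 2-3x figure in the first bullet follows directly from the time split. A minimal sketch of the arithmetic, assuming the front end (everything through type and borrow checking) takes between a third and a half of a full build. The second function is an Amdahl's-law style estimate that treats the whole back end as the part that could get faster, which is a simplification; both function names are made up for illustration.

```python
from fractions import Fraction

def check_only_speedup(frontend_share):
    """Speedup from stopping after type/borrow checking instead of building a binary."""
    return 1 / Fraction(frontend_share)

def overall_speedup(frontend_share, backend_factor):
    """Whole-build speedup if only the back end gets backend_factor times faster."""
    frontend_share = Fraction(frontend_share)
    backend_share = 1 - frontend_share
    return 1 / (frontend_share + backend_share / backend_factor)

for share in (Fraction(1, 3), Fraction(1, 2)):
    print(f"front end = {share} of the build: "
          f"check-only is {check_only_speedup(share)}x faster, "
          f"a 2x faster back end gives {float(overall_speedup(share, 2)):.2f}x overall")
```

Even a back end that is twice as fast only buys roughly 1.3x to 1.5x on the whole build under these assumptions.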

[1]
[https://www.reddit.com/r/rust/comments/6m97hl/how_do_typical...](https://www.reddit.com/r/rust/comments/6m97hl/how_do_typical_rust_compilation_times_compare/)

[2] [https://internals.rust-lang.org/t/possible-alternative-compi...](https://internals.rust-lang.org/t/possible-alternative-compiler-backend-cretonne/4275)

------
weinzierl
I grew up on the Commodore 64 (1 Core, 1 hyper-thread :-), almost 1 MHz clock
freq, almost 64 K usable RAM).

The machine was usually pretty responsive, but when I typed too quickly in my
word processor it sometimes got stuck and ate a few characters. I used to
think: "If computers were only fast enough so I could type without
interruption...". If you'd asked me back then for a top ten list what I wished
computers could do, this would certainly have been on the list.

Now, 30 years later, whenever my cursor gets stuck I like to think:

"If computers were only fast enough so I could type without interruption..."

~~~
fredley
It's all just trade offs. However fast or powerful your machine is, software
will use as much of that resource as possible, up to the point where it
occasionally interferes with input (but not too much or you'll switch to
something else).

~~~
anigbrowl
But why? What good is an operating system on a multi-core device that allows
anything to get that close to the performance envelope? This is a fine example
of competition driving change for change's sake rather than real innovation
and everything ending up worse as a result. I like new features as much as the
next person, but not when they compromise core functionality. Not being able
to type is _inexcusable_.

~~~
TeMPOraL
I agree. However at this point, I don't see anything an OS can do to help.

I see plenty of typing slowdowns every other day now. But I'm not sure just
how many of them are the OS's fault. When your typing seems to lag, there are
_two_ places that can be slowing it down - the input side (reacting to
hardware events) and the output side (drawing and updating the UI).

I suppose keyboard buffers are pretty well isolated, and native UI controls
tend to work fine too. The problem is, everyone now goes for non-native
controls. You type in e.g. Firefox, and it is slow not because of your OS, but
because Firefox does all its UI drawing by itself. And God help you if the
application you want to use is done in Electron. There's so many layers of
non-nativeness on top of that, that the OS has close to zero say in what's
being done. There's no way to help that - resource quotas will only make the
problem worse, and giving such a program free rein will only take everything
else down.

All in all, it's just - again - a problem of people writing shitty software,
because of laziness and time-to-market reasons. Blame "Worse is Better".

~~~
mark-r
It's hard to believe the OS couldn't help when Windows 10 has the problem but
Windows 7 doesn't.

~~~
TeMPOraL
In _this particular case_, yes. But this subthread was about a more general
principle of letting an app exhaust the system's performance. What I'm saying is
that, when facing a crappily coded app, the OS can at best choose between
letting it suck or letting the performance of _everything_ suck.

------
blunte
This is great work. I hope MS can improve Windows 10 by fixing this. I just
added a new Win10 laptop with much better specs than my 3 year old rMBP, and
I'm shocked by how much apparently random latency I experience with the UI in
Windows 10 compared to the Mac. That's not to mention the issues of sloppier
track pad (which constantly detects my left hand while I type) or the ungodly
slow unzip (via 7z).

If only Apple would give us more than 16GB of RAM (in a laptop)... what a
frustrating world for developers.

~~~
ShinTakuya
I mean the trackpad issue is generally down to the manufacturer and the
drivers they provide, but I take your point otherwise.

A pro tip with 7z is to use the two pane interface to extract and not dragging
files to Explorer. The latter option extracts the files to a temp directory
before copying them whereas the former extracts directly to the destination.

The Windows file systems are pretty cruddy in general though so decompressing
tonnes of small files in general will take longer than on OS X or Linux.

~~~
SyneRyder
_> I mean the trackpad issue is generally down to the manufacturer and the
drivers they provide..._

Was just wondering this myself. I tried a Surface Laptop this week, and its
trackpad has everything I love from the 2012-era Mac trackpads. A satisfyingly
snappy physical mouse click, gestures & acceleration, and a perfect size. (I
can't stand Force Touch on the new MacBook Pro, and I don't like their new
giant trackpads.)

~~~
zokier
MS is trying to fix the driver situation with their "precision touchpad"
initiative:
[https://arstechnica.com/gadgets/2016/10/pc-oems-ditch-the-cu...](https://arstechnica.com/gadgets/2016/10/pc-oems-ditch-the-custom-touchpad-drivers-give-us-precision-touchpad/)

------
albertzeyer
I was wondering what ETW is, and found in one of his other posts: ETW (Event
Tracing for Windows).

He seems to be one of the main contributors of
[https://github.com/google/UIforETW](https://github.com/google/UIforETW)

Seems to be quite useful.

\---

About the post: tl;dr: NtGdiCloseProcess has a system-wide global lock which
is used quite often e.g. during a build of Chrome which spawns a lot of
processes. This problem seems to be introduced between Windows 7 and Windows
10.

I thought that there would be a solution or a fix but it seems this is not yet
fixed. "This problem has been reported to Microsoft and they are
investigating."

~~~
mhw
As a Windows outsider, I'm puzzled why programs used as part of the Chrome
build system (which I'd expect to only use console I/O) are using APIs that
cause interactions with the GUI? By analogy, is this not like gcc redundantly
setting up a connection to the Xserver each time it is run?

(I'm making an assumption that NtGdiCloseProcess is part of the GUI API (GDI
== Graphics Device Interface) which is why it may interact with the GUI
message passing.)

~~~
slededit
In the very early days of NT the GDI subsystem was in userland and you
wouldn't have this problem. Unfortunately it was too slow for machines of the
90s and so GDI+user32 is very tightly integrated with the kernel.

Even to the point where it does neat things like callback user mode code from
the kernel. Unwinding this without breaking things is nigh impossible at this
point.

------
quotemstr
Why is everyone talking about how C++ compilation is slow or something instead
of talking about the real problem, which is that GDI is doing resource cleanup
on process exit, under lock, for console-mode processes that have probably
never used any GDI resources?

~~~
yuhong
mxatone mentions that "Win32k locking design is just bad" in
[https://twitter.com/mxatone/status/884436870955913216](https://twitter.com/mxatone/status/884436870955913216)

~~~
quotemstr
It's not even bad. It's neglecting any kind of "has this process ever touched
win32k.sys" test.

------
dankohn1
I feel like the (brilliant) post is missing the context that if you were
debugging the latency on Linux, you would have the source code to continue the
investigation until you found and fixed the problem, as opposed to just teeing
it up for Microsoft.

~~~
hghwng
Most of the problems are solvable, if proper tools are used. Microsoft
provides "PDB" files, which contain symbols for ease of debugging. You can
get them from Microsoft's symbol server. Load the symbol and the binary in
IDA, and the generated pseudo code is enough for most scenarios.

In theory, debugging programs on Linux should be easier. However, for some
distributions (like Arch Linux) debug symbols are not provided. You have to
compile the program on your own if you want to debug. It's especially painful
if the target program has a large codebase!

------
ComodoHacker
I remember every other book on Windows programming saying "Process
creation/destruction is expensive, use thread pools (or at least process
pools) instead, that's the way to go on Windows". Perhaps this mindset is
ingrained for Windows QA team too - they don't have [enough] test cases for
such scenarios.
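
The pooling advice those books give can be sketched as follows. This is a minimal Python illustration using the standard library's `concurrent.futures.ThreadPoolExecutor`, not Windows-specific code; `compile_unit` and the file names are hypothetical stand-ins for real work.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def compile_unit(name):
    # Hypothetical stand-in for real work; reports which worker thread ran it.
    return name, threading.current_thread().name

units = [f"file{i}.cc" for i in range(100)]

# A fixed set of workers is created once and reused for every task,
# instead of paying creation/teardown cost per task.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(compile_unit, units))

workers_used = {worker for _, worker in results}
print(f"{len(results)} tasks ran on {len(workers_used)} reused worker thread(s)")
```

The point is only that 100 tasks never cause more than four workers to be created and torn down; a build that spawns one short-lived process per task pays that cost every time.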

~~~
usefulcat
Seems like a feedback loop. It's expensive, so most apps avoid doing that,
meaning there's less need to check for performance regressions. Then if there
is a regression, it further increases the incentive for developers to avoid
it, so in the future even fewer apps will do that, making it even more of an
unusual use case, so even less need to test for it..

------
bitwize
The Amiga prioritized user-input interrupts before all other interrupts, so if
there ever _was_ a time you couldn't move the mouse on the Amiga, it meant
that the system was well and truly crashed.

30 years on and the peecee industry still doesn't know how to design a fucking
system.

~~~
amiga-workbench
I was just thinking that, my A1200 never skipped a beat.

------
martyvis
For my own amusement, whenever I get a new OS build on my machine, I'd open
up a task manager and watch CPU load just by wiggling the mouse a lot or maybe
simply pressing page up and down. I'm pretty sure it's always pretty easy to
generate 25% CPU doing very little. Another thing would be just opening a
local file from within a running application and wondering why multiple
seconds and hence billions of CPU cycles seem to be consumed by what one
expects should be a fairly menial task. (I am pretty sure in DOS 3.3 with
Norton Commander it was quicker.)

~~~
zlynx
That used to be true back when the CPU did all of the GUI rendering. But now
most all of it is offloaded to the GPU. Any GPU that can render Quake 3 Arena
at 120 FPS (and that's ALL of them even Intel IGPs) can wiggle a window around
very easily.

Not sure about file opens. Simple applications like GVIM don't seem to have
seconds of delay for me, but I know what you mean with things like spreadsheet
or word processor files. I guess it is all of the unzipping and XML
processing.

~~~
StillBored
Actually, it's not as accelerated as it was back in the winXP days. Here is a
random link about it:
[https://www.youtube.com/watch?v=ay-gqx18UTM](https://www.youtube.com/watch?v=ay-gqx18UTM).

Basically, the GPU card vendors could hook any part of the win32 GDI pre
windows vista, and they did. Post vista, only a tiny portion of the GDI is
accelerated. In theory you can avoid this by writing your application using a
more modern API, but the vast majority of native windows applications continue
to basically be GDI based due to age, or various GUI toolkits still being GDI
based. Worse there are a number of toolkits (or browsers) which implement
their own drawing routines rather than calling the system supplied ones.

The final composited results with aero are of course accelerated, but that
only really tends to add additional latency. Switching to one of the basic
themes makes win7 noticeably more responsive, but also tends to tear a lot.
I've got an incredibly high end desktop machine, carefully tweaked/optimized,
and I can frequently see ~1/10 of a second lags while windows update
after being maximized/etc. Compared to the 10 year old pretty high end XP
machine (with upgraded SSD/etc, and also carefully tuned) it doesn't seem to
be faster on basic desktop type operations. Fire up a recent game or do a
build and it's massively faster, but running word/firefox/whatever the old
machine "feels" faster.

(tuned, as in I have a dozen or so, tweaks I've been collecting/researching
for the past decade+, on ways to make the machine feel more responsive, it all
started with MenuShow delay in win95, and has grown from there and now
includes all the usual stuff plus tweaking power profiles, and a bunch of less
obvious "feel" things like high DPI mice with fast base speeds).

~~~
stinos
_tweaks I've been collecting/researching for the past decade_

Do you happen to have this available to the public somewhere?

------
ksk
Speaking of W10, there was another annoying W10 bug where if you started
typing immediately after using the touchpad there was a random delay. If you
care about latency and responsiveness, it makes you want to scream at the
people who implement these features.

~~~
mrstone
That's not a bug. It's some kind of feature by Symantec that can be disabled
in the settings.

~~~
ksk
Haha, it never crossed my mind that someone would implement this purposely. I
don't have a W10 laptop so I've only noticed this on other people's machines.

~~~
NuDinNou
I use this feature on my laptop. Without it I would accidentally touch the
touchpad and insert characters where they aren't supposed to be.

------
snarfy
My 16mhz 68000 based Amiga 500 had smoother mouse movement than my 3.2ghz 8
core desktop.

~~~
rbanffy
IIRC, on the Amigas the CPU was barely involved in the process of reading the
mouse events and moving the cursor sprite on the screen.

~~~
snarfy
Yep, hardware sprites.

~~~
TazeTSchnitzel
Modern GPUs still have hardware sprites for the cursor, and OSes still use
them! However, the path between the mouse and updating that sprite has gotten
more complex, alas.

~~~
rbanffy
I imagine a modern day Amiga would run all mouse and window server code on the
GPU, only telling the CPU when the app needs to update its bitmap.

------
stargrazer
I think the best summary of how to do this is to take a look at Herb Sutter's
three article set on "Minimizing Compile-Time Dependencies".
[https://herbsutter.com/gotw/](https://herbsutter.com/gotw/)

"The Compilation Firewall", or pimpl idiom, is a clever mechanism for getting
code out of the header:
[https://herbsutter.com/gotw/_100/](https://herbsutter.com/gotw/_100/)

~~~
cheez
I use this liberally. Generally, it is a runtime performance issue only on
repeated allocations for which you can optimize if needed. Once-only
allocations can be ignored when using compiler firewalls.

~~~
lallysingh
And the second cache miss: 1 for the cache miss, 1 for the actual object.

~~~
cheez
Right. Usually the entire application is not performance critical. Only
certain sections are. For that part of the application, profile and optimize.

------
ericfrederich
I had the same exact problem. A machine that is overkill spec wise but the
mouse and keyboard would freeze up every minute on the minute.

I tracked it down to my desktop wallpaper being on a rotation. Seriously...
how bad does that have to be implemented in Windows to actually hang the mouse
and keyboard?

~~~
Eridrus
For all the hate Windows gets, UI lockup hasn't really been a pervasive issue
for me in years, whereas on my Linux desktop that's just a typical day.

~~~
_arvin
My Linux desktop never locks up. I'd suggest messing around with a different
kernel, double checking graphic drivers, etc.

~~~
Eridrus
I was probably a bit imprecise; I do sometimes get full lock ups where I have
to reboot, mostly when running TensorFlow locally, but usually I just get lock
ups that last < 10 seconds when compiling things in the background.

It's at the level of "annoying, but doesn't impact my work", so I just live
with it.

~~~
_arvin
Ah I see. Which scheduler are you using?

$ cat /sys/block/sda/queue/scheduler

noop deadline cfq [bfq]

I've been very pleased with the responsiveness of BFQ when multitasking.

And actually, I shouldn't say I never get lock-ups. There's been a couple
times where the DE just freezes, and I'll have to hit Ctrl+Alt+F2 to switch to
another tty and restart the display manager. But I attribute that to running
bleeding-edge versions of things and enabling experimental features, so that's
fair.

Lastly, my MacBook Pro (2010 6,2) would also experience random freezes on
Ubuntu (would have to power off/on) and upgrading the kernel from the Ubuntu
default to latest mainline solved that problem completely.

~~~
Eridrus
$ cat /sys/block/sda/queue/scheduler

cat: /sys/block/sda/queue/scheduler: No such file or directory

I guess my work does something weird with our desktop linux installs, laptop
says this though:

$ cat /sys/block/sda/queue/scheduler

noop [deadline] cfq
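
In this sysfs file the bracketed name is the scheduler currently in use, so the two machines in this subthread are running `bfq` and `deadline` respectively. A small sketch for pulling that out programmatically; `active_scheduler` is a hypothetical helper, and it takes the line as a string rather than reading `/sys` so it runs anywhere.

```python
import re

def active_scheduler(sysfs_line):
    """Return the bracketed (currently selected) scheduler name, or None if absent."""
    match = re.search(r"\[([^\]]+)\]", sysfs_line)
    return match.group(1) if match else None

print(active_scheduler("noop deadline cfq [bfq]"))  # bfq
print(active_scheduler("noop [deadline] cfq"))      # deadline
```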

------
iainmerrick
The first time I had to compile a Linux kernel for Android, it took only a few
seconds (on the ridiculously overpowered "build machine" my employer
supplied). I was sure I must have done something wrong, but no, that was the
entire build. It takes longer for Android to reboot than it does to build the
kernel.

It does feel like there's something seriously wrong with the massive C++
codebases we use these days for key infrastructure like browsers, and the
massive compilation times we put up with.

------
hindsightbias
What Knuth said: "Premature optimization is the root of all evil"

What most developers hear: "Optimization is the root of all evil"

------
sgift
Probably not really feasible, but I'd be interested if something comparable
happens when your build process uses many threads instead of many processes. I
still think using processes instead of threads is a hack, though I know the
mainstream opinion says nowadays processes are the way to go and threads are a
hack.

------
elorant
We've come to the point where building the browser from scratch takes more
than building the OS itself.

~~~
Beltiras
Interesting metric actually. Building Windows takes 12 hours [1]. It's a bit
harder to find metrics on Linux. You can build the kernel in 60 seconds
apparently [2] but that is not a complete operating system.

[1] [https://stackoverflow.com/questions/226377/operating-system-...](https://stackoverflow.com/questions/226377/operating-system-compile-time)

[2] [http://www.phoronix.com/scan.php?page=news_item&px=MTAyNjU](http://www.phoronix.com/scan.php?page=news_item&px=MTAyNjU)

~~~
shortsightedsid
Someone's old data from 2012 to build a core-image-sato using Yocto is just
over an hour -
[http://www.burtonini.com/blog/2012/11/15/yocto-build-times/](http://www.burtonini.com/blog/2012/11/15/yocto-build-times/).

That said, core-image-sato used to be just a simple demo image. In my old
build system, we would build the Arago Project for a TI SoC every night and it
would take a few hours but that included a lot of the DSP code as well which
was really slow too. So a couple of hours on average is my guess.

------
coldcode
Try running Carthage update for an iOS app with 80 frameworks. Ever seen a Mac
hit 55GB of ram? MacOS kills it. It takes many of these runs, and all this
does is fetch prebuilt binaries. Yes, everything I said here is beyond stupid.

~~~
sixstringtheory
Is it necessary to update _all_ of them at once? I think that's the same
problem as OP... rebuilding the whole world is slow, so why not do it
incrementally?

Also, 80 frameworks ⊙⊙ I'm assuming some of these are internal, and written in
Swift, meaning they can't be compiled into static libs (easily).

------
cat199
Can take the alternate approach -

Run your builds in a hugely underpowered VM, and wait much longer.. your
regular usage will be largely unimpacted, although the builds take longer.

Source: Currently running a ~1000 package dpb(1)[1] build of my needed openbsd
ports on a dual-core KVM machine hosted on a 8-9 year old amd64x2 2.2ghz. 3
Days and counting, will probably be done around next weekend.

From there, incremental updates are mostly slight, and can complete overnight
from a cron job.

.. [1] [https://man.openbsd.org/dpb](https://man.openbsd.org/dpb)

~~~
metalliqaz
Out of curiosity why do you keep such an old machine around? I have found that
newer machines can be had for free, and 7-year-old Xeon servers can be had for
<$200 from IT recyclers.

~~~
gwern
The electricity consumption numbers are also relevant.

------
goda90
My workstation was recently "upgraded" from a Windows 8.1 workstation with a
4th gen i5 to a Windows 10 laptop with a 5th gen i7. The extra RAM and SSD over
HDD is great, but whether it's because it went from a quad-core to a dual-core
hyperthreaded CPU, or because of the jump to Windows 10, the mouse lag is
considerably more noticeable now. I've convinced them to upgrade my laptop
again, but now this article doesn't give me much hope for my new work toy.

------
nailer
While the issue closing processes slowly is unique to Windows 10, I've found
similar situations of not being able to control my OS on RHEL, Fedora, Ubuntu
and MacOS.

At this point I genuinely believe latency will be the death of general purpose
computing.

iOS and Android very rarely get out of control. Apple is already pushing iOS
devices as laptop replacements. But we lose a lot on these devices with their
locked down OSs and inability to install the software we want.

~~~
SmellyGeekBoy
I've seen Android get "out of control" on lower end devices. If you stick with
higher end devices and/or first-party the experience is usually very slick. I
don't think I've ever had a OnePlus device freeze up or stutter on me, for
instance.

~~~
Dayshine
My OnePlus X certainly does. When charging or after 10/15 minutes of intense
graphical use it starts getting sluggish and unresponsive.

~~~
martinald
Probably the CPU/GPU scaling back because the phone is getting too hot?

------
wjd2030
I just wanted to say good work. I'm impressed you dug so deep. A fix here
could really impact the entire Windows 10 user base.

------
hoodoof
Saw the headline and thought "must be Windows".

I'm not a Windows hater, but one of my long standing gripes about Windows is
that it just seems to have terrible multitasking compared to OSX.

I'm sure there are reasons but it just seems utterly symbolic of Microsoft
that they never managed to get Windows to multitask in a rock solid, smooth
and reliable way like OSX.

~~~
hota_mazi
It seems to have improved these past years but beach balls used to be
comically widespread on OS X. I'm not at all convinced OS X is in any better
shape than Windows.

And OS X certainly doesn't have anywhere near all the kick ass monitoring
tools that Windows has, such as the ones shown in this article.

~~~
hultner
You do know that OS X has DTrace?

~~~
nreilly
And then with DTrace - instruments!
[https://developer.apple.com/library/content/documentation/De...](https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/index.html)

------
lobo_tuerto
I have a new Ryzen 7 CPU (you know, 8 cores, 16 threads), with 64GB RAM and
for HDD a Samsung 960 PRO M.2 drive.

But when I went and plugged in an external Seagate 4TB drive, and tried to "dd
zero" the s#it out of it, my whole system became unresponsive after a while,
obviously I had to reset the machine as it wouldn't "kill -9" the process that
made the system unresponsive.

Trying to type was a no go as keys would sometimes become "stuck". Moving the
mouse around was an exercise in predictability too.

All this happened in the latest Ubuntu 17.04 64bit... #sadstory

------
bane
Seems like a good way to deal with this (outside of trying to convince
Microsoft to fix it) is to spin up a VM and just give it a few cores, and do
the build inside the VM to isolate this behavior.

------
faragon
Why doesn't the OS prioritize UI threads on one or two cores?

~~~
mac01021
How does it know which threads are UI threads?

~~~
vivekseth
I think it should at least know which thread handles mouse movement.

------
drumttocs8
When I converted from Windows 7 to 10 I noticed that I started getting audio
latency/glitches/squelches from my external audio interface- a Focusrite
Scarlett 2i4. The mouse would also sometimes hang. Doing some basic tests, the
problem seemed to be coming from network drivers... but I could never resolve
it. I wonder if the author's discovery has anything to do with the issues.

------
SigSegOwl
That's why my machine gets that slow after running for weeks... explains
everything!

------
hl5
I wonder if the "privacy" features in Win10 play a role here. Seems like some
extra process accounting could cause delays not present in previous versions.

------
rebootthesystem
Thank you for bringing attention to this. Experiencing this on our W10
workstations.

I hope MS does something about this immediately. It's maddening.

------
nomercy400
Does this also occur on other OSes, like Linux, or MacOS? Can you move your
Chrome build to another OS and not experience this problem?

------
fooker
Use the gold linker if your setup permits it.

This issue went away for me when I switched.

------
arwhatever
Wasn't this the title of a Bruce Springsteen song?

------
yuhong
I can't imagine that moving the Win32k stuff back to CSRSS would help much in
this case, right? Though it is still a good thing especially for terminal
servers where hopefully one CSRSS process crashing just terminates the session.

~~~
sjmulder
> moving the Win32k stuff back to CSRSS

Out of curiosity, was there some big move of functionality from CSRSS into
Win32k earlier? When and what?

~~~
yuhong
In NT4.

------
onetokeoverthe
There's too many spinning wheels. Stop it! People are going to just stop using
the internet. If I could, I would. But I can't. So please just build simple
websites!

------
isatty
I read mouse as house and was confused for the longest time.

~~~
mjsweet
Same happened to me... initially I thought it must be some 1st world problem.

------
yanpanlau
Just use linux

~~~
gurkendoktor
Staying responsive (under load) is also still a work in progress on Linux:

[https://www.phoronix.com/scan.php?page=news_item&px=BFQ-Queu...](https://www.phoronix.com/scan.php?page=news_item&px=BFQ-Queued-Linux-4.12)

~~~
dispat0r
That is a disk scheduler and mostly for spinning disks. I haven't seen a stuck
mouse cursor for a long time in linux.

~~~
goodplay
I always found this peculiar about Linux; even when the system is swapping
hard and application windows/window managers completely freeze, the cursor
always remains responsive and its movement is rendered without so much as a
hitch.

I wonder if Xorg has some sort of kernel support to enable this.

~~~
gens
Not on my computer. Something like a WM will freeze while it waits in the queue
to read those 0.2kB it needs off the disk (poor software design, if you ask
me). But the mouse will freeze when IO buffers get filled. Just writing a huge
file to a (slow-ish) USB stick will make my whole computer freeze, including
the mouse, because the kernel USB code doesn't limit the buffer to some sane
size (there was a kernel patch, and there is an option, and even with it turned
on the problem is still there; the probable reason is that USB sucks).

A mouse cursor, AFAIK, is a GPU thing, not that it matters (Wayland, if I
remember right, will use normal GPU rendering to render the mouse). The
UNIX-HATERS Handbook has a section about X which says that displays used to
have 2-3 planes (IIRC, 2 planes + a cursor plane; I do recommend reading the
book, as it is funny).

I remember some talk about better kernel buffer management just for things
like this. Memory management in the kernel is one of those actually hard
things.

------
eecc
So basically a fork bomb. I think Linux can still buckle under one, nothing
that obscene...

~~~
masklinn
A fork bomb exhausts all system resources. This is not that: it doesn't
actually use any resources; rather, it locks the system out of them by
serialising both process termination and UI updates.

~~~
eecc
Well, a fork bomb doesn't really use anything except that it makes process
table management extremely time-consuming. Not quite the same, but still
conceptually similar.

------
aramadia
When I saw his workstation specs, I thought, that's the exact same one I have
at work! Then I checked the bottom: yep, he's at Google too.

~~~
willvarfar
Does yours run Windows? Do you get to choose? If you could choose, which OS
would you use? Any clear favourites with your peers for development? Just
collecting anecdotes ;)

------
BurningFrog
> the C++ compiler needs to see full class definitions to, e.g., know the size
> of an object and its inheritance relationships, we have to put the main meat
> of every class definition into a header file!

Without knowing _anything_ about modern C++ and its compilers, this seems
fixable.

I'm thinking a header compiler "hint" indicating the size of an object. When
compiling the full class, you get an error if the number is wrong.
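
For what it's worth, something close to this can be approximated by hand
today. A minimal sketch, assuming a made-up `Widget` class: the header part
only declares a buffer of a hinted size (`kImplSize`, an arbitrary guess here),
the real members live in an `Impl` struct that only the implementation file
sees, and a `static_assert` breaks the build if the hint is too small. All
names are hypothetical; this is the hand-rolled "fast pimpl" idiom, not a
compiler feature and not something the article proposes.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// ---- What would live in the header (hypothetical widget.h) ----
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;

    const std::string& name() const;
    int bump();  // increments and returns an internal counter

private:
    struct Impl;  // only declared here; users never see the members

    // The size/alignment "hint". 64 bytes is an assumption for this demo.
    static constexpr std::size_t kImplSize = 64;
    static constexpr std::size_t kImplAlign = alignof(std::max_align_t);

    alignas(kImplAlign) unsigned char storage_[kImplSize];
    Impl* impl_;  // points into storage_ once constructed
};

// ---- What would live in the implementation file (hypothetical widget.cpp) ----
struct Widget::Impl {
    explicit Impl(std::string n) : name(std::move(n)) {}
    std::string name;
    int counter = 0;
};

Widget::Widget(std::string name) : impl_(nullptr) {
    // "You get an error if the number is wrong": checked at compile time.
    static_assert(sizeof(Impl) <= kImplSize, "size hint in header is too small");
    static_assert(alignof(Impl) <= kImplAlign, "alignment hint in header is too small");
    impl_ = new (storage_) Impl(std::move(name));  // placement new, no heap allocation
}

Widget::~Widget() {
    assert(impl_ != nullptr);
    impl_->~Impl();  // placement new requires an explicit destructor call
}

const std::string& Widget::name() const { return impl_->name; }

int Widget::bump() { return ++impl_->counter; }
```

Code that includes only the header part can still put a `Widget` on the stack
or inside another object, because its size is fixed by the hint. The cost is
that the hint has to be maintained by hand and calls into `Impl` can no longer
be inlined by users of the header.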

------
paulio
He's also written some stuff regarding Visual Studio perf.

https://randomascii.wordpress.com/2014/04/15/self-inflicted-denial-of-service-in-visual-studio-search/

Close to my heart as I use Visual Studio all day. Horrid piece of software.

I'll probably get downvoted for this.

~~~
SmellyGeekBoy
I live in two worlds at the moment - supporting and maintaining a large PHP
application (for which I use phpStorm) and developing .NET applications (for
which I, obviously, use VS).

I've been a VS user for nigh-on 10 years now and always found the experience
really fantastic. I think it's certainly more coherent than the IntelliJ-based
IDEs (which, to be fair, are also very good).

Anyway, I'm genuinely surprised to see VS described as "horrid" - what
specifically do you dislike about it?

~~~
paulio
"what specifically do you dislike about it?"

It's not the tooling I dislike, it's the performance. No matter what I throw at
it I still get the same experience. It feels like 95% of everything it does is
blocking the UI.

~~~
int_19h
Improving perf in VS is hard without massive rewrites. The fundamental problem
is that this is originally a COM app (as in, heavily using COM to componentize
itself) designed back in the mid-90s. Consequently, you get all the wonders of
things such as STA apartments, and code that insists on running thereon.

As it gets rewritten, new managed bits don't care about all that stuff. But so
long as there's one bit of legacy code anywhere in the particular flow that
needs to run on STA (usually it's the UI thread), you get this whole "you have
20 cores and 60 logical threads, but all those threads need to sync on STA, so
everything is serialized and slow" thing.

Even for the new code, the problem is that all those old COM APIs that it
needs to interact with (not just for VS itself, but also for the sake of
backwards compatibility with third party extensions) are usually synchronous.
So if you want background processing, you need to spawn a thread - but, of
course, threads aren't free, either.
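
To make the "everything is serialized" part concrete, here is a minimal
sketch, assuming a toy stand-in for an STA rather than real COM: one thread
owns the component, and calls posted from any number of other threads are
queued and run on that one thread, one at a time. `SingleThreadApartment`,
`post` and `run_callers` are hypothetical names, and unlike real cross-apartment
COM calls, `post` here does not block the caller until the call returns.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <set>
#include <thread>
#include <utility>
#include <vector>

// Toy stand-in for an STA: every posted call runs on one dedicated thread.
class SingleThreadApartment {
public:
    SingleThreadApartment() : worker_([this] { run(); }) {}

    ~SingleThreadApartment() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join();  // run() drains the queue before it exits
    }

    void post(std::function<void()> call) {
        assert(call);  // an empty std::function would throw when invoked
        {
            std::lock_guard<std::mutex> lock(mutex_);
            calls_.push(std::move(call));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> call;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return done_ || !calls_.empty(); });
                if (calls_.empty()) return;  // shutting down, nothing left
                call = std::move(calls_.front());
                calls_.pop();
            }
            call();  // strictly one call at a time, whatever the core count
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> calls_;
    bool done_ = false;
    std::thread worker_;  // declared last so the members above exist first
};

struct Result {
    std::size_t calls_run;
    std::size_t distinct_threads;
};

// Many caller threads post work; report how many threads actually ran it.
Result run_callers(int callers, int calls_per_caller) {
    std::set<std::thread::id> executing;  // touched only on the apartment thread
    std::size_t calls_run = 0;            // likewise
    {
        SingleThreadApartment apartment;
        std::vector<std::thread> threads;
        for (int i = 0; i < callers; ++i) {
            threads.emplace_back([&] {
                for (int j = 0; j < calls_per_caller; ++j) {
                    apartment.post([&] {
                        executing.insert(std::this_thread::get_id());
                        ++calls_run;
                    });
                }
            });
        }
        for (auto& t : threads) t.join();
    }  // apartment destructor finishes the queued calls here
    return {calls_run, executing.size()};
}
```

Calling `run_callers(8, 100)` posts 800 calls from 8 threads, yet all of them
execute on the single apartment thread, which is why adding cores does not
help while any step in the flow has to go through it.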

------
alacombe
So what now?

Should the author wait to get some traction on Hacker News and hopefully be
noticed by devs at Microsoft, or is there some way to give Microsoft this data
directly, skipping Tier-1 support?

The author seems to be working at Google, so he might get some leverage from
that, but what about Random Joe? Keep enjoying the bug "forever", I guess?

~~~
ximeng
There are many many MS bugs that last for years. Pretty much any time you
search for a bug in MS software you will find someone in a support forum
giving generic advice (known issue that we will work on, not a bug, please
reinstall Windows and applications) and users complaining.

Typical example:
https://excel.uservoice.com/forums/304921-excel-for-windows-desktop-application/suggestions/10374741-stop-excel-from-changing-large-numbers-actually
A basic problem that has been there for many, many years. For this one there
are workarounds, but for many others there are not.

The only real long-term solution is supporting competitive products.

~~~
intoverflow2
>The only real long-term solution is supporting competitive products.

Not really viable when the only OS that has software parity is OS X and the
only OS with hardware parity is Linux.

If we drew a Venn diagram it would be a straight line of circles with Windows
sitting in the middle. To switch away you have to make a choice between less
software or less processing/GPU power.

~~~
ximeng
Yes. That's why Microsoft gets away with not being responsive to customers.
However using alternative systems (e.g. Mac, Linux, Google docs, LibreOffice)
where possible does help even if only a little.

There are also advantages to the free solutions: no requirements to manage
licenses and some improved functionality (e.g. seamless sharing with Google
docs).

The fact that MS is including bash on Windows shows that the pressure is
working.

------
microcolonel
The terrifying result of having ABI compatibility with the first commercially
successful system of its kind from the '80s.

Granted, they aren't doing themselves any favours with their new
straitjacket-style application ABI.

~~~
willvarfar
You are trotting out the same tired line without reading the article
carefully? ;)

This is a regression. The article shows it worked fine up to Windows 7.

~~~
Aeolun
7 is the best

~~~
option_greek
Regardless of how many Windows or other OS versions I use later on, I will
remember Win7 fondly. It's just so... less cumbersome. Maybe the drastic
changes that went into its preceding and succeeding versions have a role to
play in this.

~~~
krylon
There is this old quote that Algol 60 was an improvement not only on its
predecessors but also its successors.

The same could be said about Windows 7, I guess. ;-)

