
What Outranks Thread Priority? - benaadams
https://randomascii.wordpress.com/2020/04/14/what-outranks-thread-priority/
======
pjc50
Excellent writeup of a useful technique. I'm going to have to learn how to do
this myself one day. Of course, if you're not inside Microsoft it may be very
hard to fix any of it.

It really does show how the CPU gains of the past decades have been
immediately wasted by badly organised processes doing work inefficiently which
is then immediately discarded, while simultaneously blocking more important
work.

------
smallstepforman
The Windows kernel scheduler causes much trouble for our simulation software,
which runs 7 out of 8 cores at full pace and then mysteriously gets suspended
by the kernel when left to run overnight. No, Microsoft, it’s not OK to suspend
a user task, and it’s not OK to reboot my box for an update while I’m running
simulation software. If I end up having to reboot to a non-Microsoft OS to run
the simulations, I may end up never rebooting back to Win10 (once a solution
to our external 3rd party tools is found).

~~~
yummypaint
This is triggering some bad memories. I haven't used Windows for anything
serious in almost a decade, but it's pretty mind-blowing they still haven't
fixed their basic reliability and availability issues. It's especially
farcical because it flies completely in the face of the marketing claims used
to push commercial software over FOSS. I work at a facility where running
experiments effectively costs $1000/hour, and we have learned the hard way to
not let Windows anywhere near anything remotely critical. We don't even use
Windows to host GUIs for talking to the back end. About 10 years ago we made a
long series of attempts to remedy these problems of Windows machines rebooting
themselves when told not to, losing irreplaceable overnight data, etc.
The only solution that can be trusted long term is total exorcism and
switching to an OS that is written to help users compute instead of to make
money for MS shareholders. If you absolutely can't use WINE, see if you can run
it in a VM to at least mitigate some of the nonsense. Plus you have the option
of taking snapshots to hard restore the machine. /rant

------
PaulDavisThe1st
For the *nix among us, the equivalent is the scheduling class (SCHED_FIFO,
SCHED_RR, SCHED_OTHER). A given (kernel) thread's priority only applies within
its scheduling class; if there are ready-to-run threads in a higher "ranked"
scheduling class, threads in the lower "ranked" classes will not run.

This is widely used when using Linux (for example) as a soft real-time system.
The same basic concepts apply to macOS (since they really come from POSIX and
even Mach inherited them).
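
A minimal sketch of that rule in Python, as a toy model rather than how any
real kernel is implemented. The `Thread` type, the `pick_next` helper, and the
`CLASS_RANK` table are all hypothetical names made up for illustration, and the
table assumes the two real-time policies share one tier that outranks
SCHED_OTHER. Treating SCHED_OTHER as having a simple numeric priority is also
a simplification.

```python
from dataclasses import dataclass

# Assumed ranking for this toy model: the real-time policies share one
# tier, and that tier outranks the default time-sharing policy.
CLASS_RANK = {"SCHED_FIFO": 1, "SCHED_RR": 1, "SCHED_OTHER": 0}


@dataclass(frozen=True)
class Thread:
    name: str
    policy: str    # one of the CLASS_RANK keys
    priority: int  # only meaningful against threads in the same tier


def pick_next(ready):
    """Return the ready thread that runs next, or None if none are ready."""
    if not ready:
        return None
    # Class rank is compared first; priority only breaks ties within a rank.
    return max(ready, key=lambda t: (CLASS_RANK[t.policy], t.priority))


ready = [
    Thread("ui", "SCHED_OTHER", 19),
    Thread("audio", "SCHED_FIFO", 1),
    Thread("logger", "SCHED_OTHER", 5),
]
print(pick_next(ready).name)  # audio
```

The "audio" thread wins even though its priority number is the lowest in the
list, because its class is compared before its priority. Take it out of the
ready list and "ui" runs next.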

------
justforfunhere
Great Post!

You would think that unlocking the screen would be the highest-priority task
for a general purpose OS after it has woken up from sleep, but it's surprising
to see that other processes manage to outrank it.

What's interesting to me is that this LockApp.exe was waiting to get the CPU
for a full 19 seconds after it was ready. 19 seconds should be an eternity, I
guess.

Since the screen was locked and LogonUI.exe had not got the chance to run yet,
what other processes were holding on to the CPU at that time? Any ideas on
that?

~~~
tambre
>Any ideas on that?

As he explains, processes with a GUI trying really hard to render despite it
not being possible, as the screen hasn't been unlocked yet. Presumably another
Windows bug.

Also hardware drivers initializing stuff, including spinning for an eternity:
his mouse and keyboard driver apparently spun for 700ms.

------
albertzeyer
There is a whole series of very interesting (even exciting) analyses by this
author (randomascii, Bruce Dawson). Most of them are using his own tool
UIforETW
([https://github.com/google/UIforETW/](https://github.com/google/UIforETW/)).

I wonder, what alternative tool would you use on Mac? I guess Xcode
Instruments in some variant? And this is based on DTrace, right? Would that
give you all the same possibilities and traces as UIforETW?

And what alternative tool would you use on Linux? perf? eBPF? Would that give
you the same information as UIforETW? And what GUI would be similar to
UIforETW or Xcode Instruments?

~~~
brucedawson
It's important to note that UIforETW is just my tool for recording ETW traces.
The underlying ETW tracing and WPA (the trace viewer) are all Microsoft's.

That said, UIforETW definitely fixes some significant issues with recording
and managing ETW traces.

------
umvi
As someone who spends most of the time in *nix... Windows just sounds
nightmarish to debug. The few times I've had to debug something in Linux
kernel space I was super glad I had the full source code.

All that effort, and then... "the problem magically went away and we'll
probably never know the true root cause"

~~~
rrss
Windows has really nice tracing and WPA is pretty powerful. I've not yet seen
a tool that is similarly easy to use on Linux. You can get most of the raw
data from perf, but the tooling for analyzing the data isn't as good. I'd love
to hear about a tool for Linux that allows doing Dawson's wait analysis easily
if anyone knows of one.

I agree about the lack of source - it's really nice to be able to see exactly
what the kernel is doing on Linux.

~~~
tambre
>I agree about the lack of source - it's really nice to be able to see exactly
what the kernel is doing on Linux.

Debug symbols do help a ton on Windows.

~~~
rrss
That's definitely true, and the debug symbols story is better on Windows (in
my experience) than other platforms - AFAIK Linux has no real equivalent to
symbol servers.

~~~
tambre
Debian debug symbol packages are pretty nice [0].

But if you want a repository for your own debug symbols, a multitude of symbol
versions and ease of use then you're out of luck.

[0]:
[https://wiki.debian.org/DebugPackage](https://wiki.debian.org/DebugPackage)

------
glitchc
Fascinating analysis! Thank you. I have noticed this effect is especially
prominent on corporate laptops configured with monitoring features. I notice a
similar effect on entering standby. Wonder if it’s related to another Rank 2
demotion of a critical process.

------
rk06
My laptop's startup time dropped by a few seconds after I tried the mitigation
steps.

I hope Bruce looks into other issues too.

