Windows Timer Resolution: The Great Rule Change (randomascii.wordpress.com)
102 points by nikbackm 9 months ago | 93 comments



I tested my own system using Bruce's "measure_interval.cpp" program (on Windows 1909):

- Slack (sometimes) sets the global timer to 1ms when it is in the foreground, but restores it when in the background

- Spotify sets the global timer to 1ms, no matter what. Even if it isn't playing.

- Skype sets 1ms, if started at Startup (which it defaults to), even though I am logged out and it just has a tray icon. But when I manually start it, it doesn't (always) set it to 1ms.

- VSCode will set it to 1ms when you are interacting with it, but will eventually revert to 15.6ms if left alone (even if it is still in the foreground).

- Firefox doesn't appear to set it (on its own; I presume that if I opened a tab that was using a low setTimeout or requestAnimFrame it might).

Spotify is interesting. A lot of people probably have that app, and since it sets 1ms unconditionally, it would have been setting fast-timer mode prior to the 2004 update, which could inadvertently "speed up" whatever games people were running.

That includes my own game, which uses a foreground sleep of as low as 1ms to try to hit its time target, and I don't call timeBeginPeriod. I guess I'll find out when I get the 2004 update.


Half the apps here are Chromium based though.


> A program might depend on a fast timer resolution and fail to request it. There have been multiple claims that some games have this problem (...)

Yup, I wrote such a (small, freeware) game 15+ years ago. I wasn't aware of timeBeginPeriod at the time, but I observed that for some inexplicable reason, the game ran more smoothly when Winamp was running in the background. :-)


<rant>

Our models of computer timers are woefully inadequate. These things execute billions of instructions per second. Why shouldn't we be able to schedule a timer at sub-millisecond resolution?

Answer: we can. But the APIs are very old and assume conditions no longer present. Or something like that. Anyway, they don't get the job done.

Everybody seems to start a hardware timer at some regular period, then simulate 'timer interrupts' for applications off that timer's interrupt. If you want 12.5ms but the ol' ticker is ticking at 1ms intervals, you get 13ms or so; depending on where in an interval you asked, it could be 12.

Even if nobody is using the timer, it's ticking away wasting CPU time. So the tendency is to make the period as long as possible without pissing everybody off.

Even back in the 1980's, I worked on an OS running on the 8086 with a service called PIT (Programmable Interval Timer). You said what interval you wanted; it programmed the hardware timer for that. If it was already running, and your interval was shorter than what remained, it would reprogram it for your short time, then when it went off it reprogrammed it for the remainder.

It kept a whole chain of scheduled expirations sorted by time. When the interrupt occurred it'd call the callback of the 1st entry and discard it. Then it'd reprogram the timer for the remaining time on the next.

It took into account the time spent in the callback, and the time to take the interrupt and reprogram. And it achieved sub-millisecond scheduling even back on that old, sad hardware.

And when nobody was using the timer, it didn't run! Zero wasted CPU.
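
For illustration, a minimal user-space sketch of that "sorted chain of expirations" idea (hypothetical names, nothing like the original PIT code); a real implementation would program a hardware one-shot timer instead of spinning in a loop:

  #include <chrono>
  #include <cstdio>
  #include <functional>
  #include <queue>
  #include <vector>
  
  using Clock = std::chrono::steady_clock;
  
  struct Expiration {
      Clock::time_point due;              // absolute deadline
      std::function<void()> callback;     // what to run when it fires
      bool operator>(const Expiration &o) const { return due > o.due; }
  };
  
  // The chain of scheduled expirations, sorted by time (earliest first).
  std::priority_queue<Expiration, std::vector<Expiration>, std::greater<Expiration>> chain;
  
  void schedule(Clock::duration interval, std::function<void()> cb) {
      chain.push({Clock::now() + interval, std::move(cb)});
      // A real PIT-style service would reprogram the hardware one-shot timer
      // here if the new deadline is earlier than the currently programmed one.
  }
  
  // Called from the (here, simulated) timer interrupt: fire everything that is
  // due, then program the hardware for the time remaining until the next deadline.
  void on_timer_interrupt() {
      while (!chain.empty() && chain.top().due <= Clock::now()) {
          auto entry = chain.top();
          chain.pop();
          entry.callback();               // callback time is absorbed by re-checking now()
      }
      // if (!chain.empty()) program_one_shot(chain.top().due - Clock::now());
      // else: no clients, the timer stays off -- zero wasted CPU.
  }
  
  int main() {
      schedule(std::chrono::milliseconds(3), [] { std::puts("3 ms timer fired"); });
      schedule(std::chrono::microseconds(500), [] { std::puts("0.5 ms timer fired"); });
      while (!chain.empty()) on_timer_interrupt();   // stand-in for real interrupts
  }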

Imagine how precise timers could be today, on our super duper gigahertz hardware.

But what do we get? We get broken, laggy, high-latency, late timer callbacks at some abominable minimum period. Sigh.

</rant>


Aren't most modern kernels tickless[1]? They effectively work the way you describe: the scheduler will sleep as long as possible instead of just "ticking" a NOP if nothing is scheduled.

The problem is that what you describe is very hard to guarantee in a non-realtime OS. Modern CPUs are pretty damn fast, that's true, but they can't task switch on a dime. The delay between an IRQ being triggered and the task being scheduled can vary pretty wildly, especially if the CPU is already under heavy load.

You can still try your luck with nanosleep() and see how that treats you. It might work. It probably won't always work.

In general, if you need to react that fast to some event, you probably want to avoid sleep-based polling at all costs. That's why most OSes are fine relaxing timer constraints for better overall performance. If you can't react to an external interrupt and you really need to be responsive, busy-polling is still probably your best bet.

[1] https://en.wikipedia.org/wiki/Tickless_kernel


> Why shouldn't we be able to schedule a timer at sub-millisecond resolution

It's not just the API; the whole operating system needs to be aware of the need to operate at such frequency, because even if the clocks are faster, the software gets interrupted all the time. What good would a nanosecond API do if an interrupt can stall the CPU for a microsecond?

Hence the need for a whole class of realtime operating systems. The next question would then be: why aren't all computers using a realtime operating system? Because with the relaxed time constraints we get great returns in speed and cost.


Linux implemented hrtimers [1] a while ago and they can easily achieve sub-ms precision. You can just try some benchmark with nanosleep() on your system. It works in a similar way to the PIT you described, although I don't think it tries to offset the delay from interrupts.

What I want to say is that we already have precise timers usable on modern hardware; as for why Windows doesn't use that properly, I have no idea, but I can guess that's some technical debt in the scheduler.

[1] https://www.kernel.org/doc/ols/2006/ols2006v1-pages-333-346....
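
For anyone who wants to try, here's a small benchmark sketch along those lines (plain clock_gettime + nanosleep; exact numbers will of course depend on kernel config and load):

  #include <stdio.h>
  #include <time.h>
  
  // Current monotonic time in nanoseconds.
  static long long now_ns(void) {
      struct timespec ts;
      clock_gettime(CLOCK_MONOTONIC, &ts);
      return ts.tv_sec * 1000000000LL + ts.tv_nsec;
  }
  
  int main(void) {
      const struct timespec req = { 0, 100 * 1000 };   // ask for 100 microseconds
      long long worst = 0, total = 0;
      for (int i = 0; i < 1000; ++i) {
          long long start = now_ns();
          nanosleep(&req, NULL);
          long long elapsed = now_ns() - start;
          total += elapsed;
          if (elapsed > worst) worst = elapsed;
      }
      printf("asked for 100 us: average %lld us, worst %lld us\n",
             total / 1000 / 1000, worst / 1000);
      return 0;
  }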


If you want to be really mad, consider that modern x86 chips can fire a timer interrupt from the timestamp counter (TSC). Early TSCs weren't frequency stable, but most (all?) now use some fixed frequency, something around the bus speed, and the most recent models even tell you what the frequency is. With that, you could have sub-microsecond timers, if your system isn't too busy and/or you have the right priorities set for preemption.
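
For the curious, a rough user-mode sketch of estimating the invariant TSC frequency by calibrating against QueryPerformanceCounter (programming the TSC-deadline interrupt itself is kernel territory):

  #include <intrin.h>
  #include <stdio.h>
  #include <Windows.h>
  
  int main(void) {
      LARGE_INTEGER qpf, qpc0, qpc1;
      QueryPerformanceFrequency(&qpf);
  
      QueryPerformanceCounter(&qpc0);
      unsigned long long tsc0 = __rdtsc();
      Sleep(200);                          // calibration window
      QueryPerformanceCounter(&qpc1);
      unsigned long long tsc1 = __rdtsc();
  
      double seconds = (double)(qpc1.QuadPart - qpc0.QuadPart) / (double)qpf.QuadPart;
      printf("estimated TSC frequency: %.1f MHz\n", (double)(tsc1 - tsc0) / seconds / 1e6);
      return 0;
  }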


I remember that! Worked at Intel's biggest chip customer (very early on). They asked us what we wanted in the next generation. Nobody said much, so I made a manifesto - bus data breakpoint interrupts, jump history fifo, and a timestamp counter and count-match interrupt register! It showed up in the next spec, but as a privileged instruction sadly. And at clock speed. I wanted it at a fixed frequency for I/O timing, and I wanted it for everybody - drivers, apps, kernel, everybody.


Used the exact same method to schedule events in an OS-less embedded system.

I feel your frustration.


I once spent ages trying to determine why a Python unit test that sorted timestamps constantly failed on Windows. In the test, we compared the timestamps of performed operations, and checked to confirm that the operations happened in sequence based on their timestamp (I'm sure many of you see where this is going). On Windows, the timestamp for all the actions was exactly the same, so when sorted, the actions appeared out-of-order. It was then that I discovered Python's time library on Windows only reports times with a resolution of ~1ms, whereas on Linux the same code reports times with a resolution of ~10us. That one was actually super fun to track down, but super disappointing to discover it's not something that's easily remedied.

(For those about to suggest how it should have been done, the application also stored an atomic revision counter, so the unit test was switched to that instead of a timestamp.)


On Windows 8 or later you can use GetSystemTimePreciseAsFileTime to get higher precision timestamps.

https://docs.microsoft.com/en-us/windows/win32/api/sysinfoap...
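
A minimal sketch of what that looks like (Windows 8+), taking two precise timestamps back to back and comparing them against the coarse GetSystemTimeAsFileTime:

  #include <stdio.h>
  #include <Windows.h>
  
  // FILETIME to a 64-bit count of 100-ns units since 1601.
  static unsigned long long to_u64(FILETIME ft) {
      ULARGE_INTEGER u;
      u.LowPart = ft.dwLowDateTime;
      u.HighPart = ft.dwHighDateTime;
      return u.QuadPart;
  }
  
  int main(void) {
      FILETIME a, b, coarse;
      GetSystemTimePreciseAsFileTime(&a);
      GetSystemTimePreciseAsFileTime(&b);     // should differ from 'a' by well under 1 ms
      GetSystemTimeAsFileTime(&coarse);       // coarse clock only advances at the timer tick
      printf("precise delta: %llu x 100ns\n", to_u64(b) - to_u64(a));
      printf("precise vs coarse: %lld x 100ns\n", (long long)(to_u64(b) - to_u64(coarse)));
      return 0;
  }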


At work we have an application that calls `timeBeginPeriod(1)` to get timer callbacks (from `CreateTimerQueue`) firing at 5ms resolution, but we are not seeing the behaviour described in the article. We observe no change to the timer resolution after calling `timeBeginPeriod(1)`, which unfortunately is a breaking change for our app.

The lack of information and response from Microsoft on this has been quite frustrating.
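
For reference, a stripped-down sketch of that kind of setup (hypothetical code, not our actual app): timeBeginPeriod(1) followed by a 5 ms timer-queue timer, printing the spacing between callbacks so the effective resolution is visible.

  #include <stdio.h>
  #include <Windows.h>
  #include <MMSystem.h>
  #pragma comment(lib, "winmm.lib")
  
  static DWORD prev;
  
  // Timer-queue callback: print milliseconds since the previous invocation.
  static VOID CALLBACK tick(PVOID param, BOOLEAN timedOut) {
      DWORD now = timeGetTime();
      printf("%lu ms\n", (unsigned long)(now - prev));
      prev = now;
  }
  
  int main(void) {
      timeBeginPeriod(1);                  // request a 1 ms interrupt period
      prev = timeGetTime();
      HANDLE timer = NULL;
      // 5 ms periodic timer on the default timer queue.
      CreateTimerQueueTimer(&timer, NULL, tick, NULL, 5, 5, WT_EXECUTEDEFAULT);
      Sleep(200);
      DeleteTimerQueueTimer(NULL, timer, INVALID_HANDLE_VALUE);   // wait for callbacks to drain
      timeEndPeriod(1);
      return 0;
  }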


Yeah, I just came to the same conclusion.

Win10 2004 timer changes are breaking.

Timer frequency adjustments (timeBeginPeriod) don't seem to affect Windows timers in the same process like they used to do!

Again. In Win10 2004, if you do timeBeginPeriod(1), timers in the same process (other than using the deprecated multimedia timer) seem to only trigger every 15-16 ms or so instead of 1 ms.

This is bad.

As a sidenote about Win10 2004, after timeBeginPeriod(1), Sleep(1) seems to take about 2 ms per call. Or at least it seems to take about 1950 ms to call Sleep(1) 1000 times when I tested it. Confusing.
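
That measurement is easy to reproduce; a sketch like this (QueryPerformanceCounter around 1000 Sleep(1) calls) should show roughly 1 second with timeBeginPeriod(1) active on older builds, and noticeably more on 2004:

  #include <stdio.h>
  #include <Windows.h>
  #include <MMSystem.h>
  #pragma comment(lib, "winmm.lib")
  
  int main(void) {
      timeBeginPeriod(1);                  // ask for 1 ms timer resolution
      LARGE_INTEGER freq, start, stop;
      QueryPerformanceFrequency(&freq);
      QueryPerformanceCounter(&start);
      for (int i = 0; i < 1000; ++i)
          Sleep(1);
      QueryPerformanceCounter(&stop);
      timeEndPeriod(1);
      double ms = (double)(stop.QuadPart - start.QuadPart) * 1000.0 / (double)freq.QuadPart;
      printf("1000 x Sleep(1) took %.0f ms (%.2f ms per call)\n", ms, ms / 1000.0);
      return 0;
  }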


Yeah.

There is a ticket open in the `Feedback Hub` app for it that I've been checking, but there seems to be no acknowledgement from Microsoft about this, which is one of the more annoying aspects.


Ah yes, this reminds me of how on my previous project I was in charge of writing a server to mix audio for multiple clients in real time. The server worked well on my local Windows 10 machine, but when deployed to a cloud instance of Windows Server 2016 it ran very, very poorly, only just fast enough to process data in time.

That's when I discovered that doing a "process more data if there is any, if not - sleep(1)" loop is a very bad way of doing it, as on Windows Server 2016 "sleep(1)" means "sleep 16ms". It all worked fine once the timer resolution was changed to 1ms, but yeah, the default value will screw you over if you have anything this time sensitive and are using sleeps or waits on windows.


I don't even get why you would Sleep() in such a scenario. If you want to fire every 1ms, wouldn't you use an actual timer? Like one of SetTimer/SetThreadpoolTimer/CreateTimerQueueTimer/timeSetEvent/CreateWaitableTimer maybe? Why would you rely on Sleep() for periodicity?


> I don't even get why you would Sleep() in such a scenario.

If you don't want two instances of the process running at once for any reason, a check-do-sleep loop can be easier to reason about than debouncing an interrupt-based approach. Less efficient, but often efficient enough, especially if you are just throwing something together quickly with a view to it being a PoC or a temporary utility.


When you're talking about Windows timer accuracy, Sleep is the simplest example to use. The same issues and the same resolution affect all of the other mechanisms you mention. Talking about those separately would just introduce unnecessary complexity to the discussion without any advantage.

So no one would rely on Sleep for periodicity. Getting reliable timing performance out of Windows is... challenging.

So ultimately it's just simpler to talk about Sleep. Windows developers who have to deal with it understand that the same mechanism (which used to be determined by the global timer frequency, which you could adjust by calling timeBeginPeriod) affects all of the different timers.


> Same issues and resolution affect all of the other mechanisms you mention.

Really? On versions prior to 20H2? This gives me 16ms resolution for SetWaitableTimer but 1ms for timeSetEvent:

  #include <tchar.h>
  #include <Windows.h>
  #include <MMSystem.h>
  #pragma comment(lib, "winmm.lib")

  static unsigned int prev_time = timeGetTime();

  // APC callback for the waitable timer: print ms elapsed since the previous callback
  VOID CALLBACK callback(LPVOID lpArgToCompletionRoutine, DWORD dwTimerLowValue, DWORD dwTimerHighValue)
  { unsigned int now = timeGetTime(); _tprintf(_T("%u\n"), now - prev_time); prev_time = now; }

  // Multimedia timer callback: forwards to the same printing logic
  void CALLBACK callback(UINT uTimerID, UINT uMsg, DWORD_PTR dwUser, DWORD_PTR dw1, DWORD_PTR dw2) { return callback(NULL, 0, 0); }

  int _tmain(int argc, TCHAR *argv[])
  {
   enum { msec = 1 };
   // Test 1: waitable timer with a 1 ms period, delivered via APCs during SleepEx
   HANDLE handle = CreateWaitableTimer(NULL, FALSE, NULL);
   LARGE_INTEGER due = {};
   SetWaitableTimer(handle, &due, msec, callback, NULL, FALSE);
   for (int i = 0; SleepEx(100, TRUE) == WAIT_IO_COMPLETION && i < 30; ++i) { }
   CloseHandle(handle);
   // Test 2: multimedia timer (timeSetEvent) with a 1 ms period
   MMRESULT id = timeSetEvent(msec, 1, callback, NULL, TIME_PERIODIC | TIME_KILL_SYNCHRONOUS);
   Sleep(100);
   timeKillEvent(id);
  }


You're just using timeSetEvent (which is an obsolete API, if I may add) instead of timeBeginPeriod. Which ultimately changes the Windows timer interrupt frequency to the specified value.

So, sure, if you change the Windows timer interrupt frequency, that's what you get. It'll of course also affect Sleep while active. Back to square one.


Try adding timeBeginPeriod(1)/timeEndPeriod(1) around the waitable timer code and you'll see it still wakes up every 16ms.


Pretty interesting.

And very alarming.

On Win10/1809 timeBeginPeriod does affect SetWaitableTimer resolution, as I'd have expected.

On Win10/2004 timeBeginPeriod doesn't seem to affect SetWaitableTimer resolution?! This could certainly break things!

The SetWaitableTimer documentation suggests that timeBeginPeriod is expected to affect the resolution. Of course, depending on hardware... but in this case both tests were running on the same hardware.

"You can change the resolution of your API with a call to the timeBeginPeriod and timeEndPeriod functions. How precise you can change the resolution depends on which hardware clock the particular API uses. For more information, check your hardware documentation."

https://docs.microsoft.com/en-us/windows/win32/api/synchapi/...


Might be due to my use of APCs rather than waiting on the object. Not bothered enough to try it myself but I guess you can try WaitForSingleObject on the timer to see if it behaves similarly...


On 2004, I tried timeBeginPeriod(1) and calling Sleep(1) 1000 times. Perplexingly it took a bit under 2 seconds.

On 1809 same took about 1.3 seconds.

I'm at a loss... Windows timers have yet again managed to do really weird, unexpected things.

I really hope this mess doesn't affect device drivers, or there could be a world of hurt coming up.


I think it's due to my use of APCs like I said. Try replacing SleepEx(100, TRUE) with WaitForSingleObject(handle, INFINITE).

Edit: Never mind. Seems it might not be the APCs. Maybe some callee of Sleep(Ex) is just rounding up the wait interval to a time slice somewhere. Not sure... would need some digging.


I didn't call SleepEx, but just plain old Sleep. I don't see how APCs would have anything to do with this.


I can 100% guarantee that if your system time resolution is 16ms, that timeSetEvent won't actually trigger in 1ms, no matter what it reports.


Well that's not what I'm seeing... can you paste what you get?

  > clockres && Temp

  Maximum timer interval: 15.625 ms
  Minimum timer interval: 0.500 ms
  Current timer interval: 15.621 ms
  0
  15
  16
  ...
  16
  15
  2
  1
  2
  ...


No, because I don't have access to these Windows Server 2016(virtualized on GCP) machines anymore, and the behaviour on Windows 10 isn't the same so I can't check. All I can say is that we have spent weeks investigating this and if it was possible to wait 1ms with the timer resolution set to 16ms we would have done so.


Well I don't know what Windows Server 2016 virtualized on GCP is doing, but if you believe me that I copy-pasted this, you should be wary of making such sweeping claims about Windows...

Re: your troubleshooting: wild guess, but one thing to look into (if you ever encounter it again) might be changing the "Processor scheduling" setting in sysdm.cpl -> Advanced -> Settings #1 -> Advanced to "Programs" instead of "Background services". Not sure if that affects timers, but I can't think of much else (besides the hypervisor or hard-coded kernel differences) that would cause this.


One shouldn't really test Windows timer behavior on virtualized platforms, it's bad enough as it is on bare metal.


Right, but this wasn't some sort of academic exercise - we were deploying a server for a commercial product and ran into this problem because we had to get it running in that scenario. We can of course argue that audio processing shouldn't be done on a virtualized operating system full stop, but that's not a part I had any control over.


Somewhere around 2003 I discovered that the best way to do a short sleep, regardless of timeBeginPeriod() and friends, is to use select() with a short (1ms or 0.5ms) timeout. It worked on Win2K and WinXP, even when Sleep() didn't go below 1ms and wasn't dependable for less than 1ms even with timeBeginPeriod.
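
Something along these lines (a sketch of that trick; note that on Windows, select() wants at least one real socket in a set, so a throwaway UDP socket is created just to make the call legal):

  #include <stdio.h>
  #include <WinSock2.h>
  #include <Windows.h>
  #pragma comment(lib, "ws2_32.lib")
  
  // Sleep for roughly 'microseconds' using select()'s timeout.
  static void short_sleep(SOCKET dummy, long microseconds) {
      fd_set readfds;
      FD_ZERO(&readfds);
      FD_SET(dummy, &readfds);                // Windows select() rejects empty fd sets
      struct timeval tv = { 0, microseconds };
      select(0, &readfds, NULL, NULL, &tv);   // first argument is ignored on Windows
  }
  
  int main(void) {
      WSADATA wsa;
      WSAStartup(MAKEWORD(2, 2), &wsa);
      SOCKET dummy = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
  
      LARGE_INTEGER freq, start, stop;
      QueryPerformanceFrequency(&freq);
      QueryPerformanceCounter(&start);
      for (int i = 0; i < 100; ++i)
          short_sleep(dummy, 500);            // ask for 0.5 ms each time
      QueryPerformanceCounter(&stop);
  
      double ms = (double)(stop.QuadPart - start.QuadPart) * 1000.0 / (double)freq.QuadPart;
      printf("100 x 0.5 ms select() sleeps took %.1f ms\n", ms);
  
      closesocket(dummy);
      WSACleanup();
      return 0;
  }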


Literally all of those methods will only wake up at the system's timer interval at the earliest - if that happens to be 16ms, there is nothing you can do to sleep/wait less than 16ms. Even WaitForObject calls can only wait for the minimum equal to the system's timer resolution.


The resolution or behavior of other APIs is beside my point. I'm saying, even if in an ideal world Sleep did wait exactly how much you told it to, I don't get why you would feel justified in using it to trigger something every N ms. Even if N = 16ms. Your code itself might take 3ms to run one time, 6ms another time, or 9ms a third time. Sleeping 16ms every time just won't get you periodic behavior even if Sleep is perfectly accurate. Right? i.e. Sleep() just seems like the wrong API regardless of everything else...


I don't think I understand your concern here. The idea wasn't to do something every N millis, the loop was "process work, if there isn't any more work to do, sleep(1), repeat". Like you said, the "work" part can take any amount of time - but when it does run out, the thread needs to sleep/wait. We have experimented with several APIs for this, including the methods you suggested, and it had no effect on actual time spent waiting for more work to appear - sleep(1) had the exact same effect as simply using WaitForObject and the work producer signalling on that object - the times were literally identical.

Edit: reading your comment again I think I see the misunderstanding - sleep wasn't used to do something periodically, if that's what you want to do then indeed, sleep is the wrong tool.


You shouldn't use time in this scenario at all, but wait on a condition. It's unreasonable to expect a non-realtime kernel to offer this precision with timers. Doing serious audio work on non-realtime kernels, you have to expect such issues.


Again, on windows using waits on kernel objects has the exact same result. A signaled object will only wake up the waiting thread at the timer interrupt, so it doesn't matter if you use sleep(1) and check if there is work to do, or if you use a kernel object and wait on it - the time spent sleeping will be exactly identical.


The wait time will be the same but the amount of wasted CPU won't.

It's not a lot but it adds up. If it's only on your servers then whatever, but if the code goes to clients then please wait on an event.


Well, there's a fine line here.

Correct me if I'm wrong, but if it's hardware based IRP completion (like DMA done, network card received a packet or whatever), your thread could wake up very soon after I/O completes, long before the next timer tick.

But this is not something you can use for timing, unless of course your hardware is built to do so.


Correct. You've clearly had to deal with precise Windows timing as well!

To add, it's pretty shocking there's nothing better in the kernel mode either. It's just as bad as in the user mode.

You actually have to generate an interrupt on a hardware device (like a PCI card) to get good timing performance out of Windows!


I guess I'm not understanding your use case, if it's not periodic invocation. Why you were calling Sleep(1)? Why was a 16ms sleep problematic in that case?


Because once a thread is done with all available work, it has to sleep/wait; the only other alternative is busy waiting (awful idea). So if you see that there is no more available work and call sleep(1), and some more work comes in 2ms later, that work might not be picked up for the next 14ms. That is the core of the problem.

Like, you are focusing on the sleep(1) part too much - what I want to do is "wait until there is more work available to process, and then wake up and start processing as soon as physically possible". On Windows the "as soon as physically possible" part is unfortunately 16ms in most cases, and that's just not acceptable for something as precise as audio work.


> what I want to do is "wait until there is more work available to process"

If this is merely to free up CPU when you have no work to do, then the actual timing of 1ms vs 16ms doesn't seem like the issue, right? Shouldn't you be instead waiting on a condition that gets set when work is queued? Like an event object, or an I/O completion port, or something that indicates work is ready? Instead of polling for the condition?

> On Windows the "as soon as physically possible" part is unfortunately 16ms in most cases, and that's just not acceptable for something as precise as audio work.

Aren't you contradicting yourself here? If you're doing audio don't you want periodic multimedia timers? Which are built for this purpose and actually also do deliver the resolution you need?


>>Shouldn't you be instead waiting on a condition that gets set when work is queued? Like an event object, or an I/O completion port, or something that indicates work is ready? Instead of polling for the condition?

Again, as I said multiple times - the WaitForObject won't wake up any sooner than Sleep(1) will. Only a real proper hardware interrupt could wake up quicker - but that requires special hardware that just wasn't available in that scenario. If it's easier to think about that way, then please assume I said that WaitForObject "sleeps" way more than necessary due to the low timer resolution on Windows.

>>Aren't you contradicting yourself here? If you're doing audio don't you want periodic multimedia timers?

No, because like I said at the very beginning, I was writing a server application processing audio coming in from the clients. So essentially every time packets came in from client machines they had to be processed and sent back - that's why I had multiple threads "waiting" for work and they had to pick it up as soon as possible, with no guarantee how often that work was going to appear. It's not the same as picking up audio from a hardware device on a client machine which yes, has to be done periodically. It's a simple producer/consumer queue problem, but with it being audio data, 16ms wait times to process packets was not acceptable - pretty much no other work is that time sensitive(except for video I guess), yet it wasn't the kind of work that has to be done on a specific periodic timer.


I use CreateEvent/SetEvent/WaitForSingleObject in almost the same exact way you describe/desire. Producer/consumer type pattern, all in software. I have one or more threads waiting to process data, and they use the WaitForSingleObject function to wait for the producer to set the event. I have tested this and found I can effectively send > 100k events/"wakeups" per second through this mechanism. I tested this by having two threads sequentially wake each other up in ping-pong fashion. I use this to process data coming in off SDRs, usually at rates > 30MS/s. This code runs on 10's of thousands of PCs, I would have heard about issues by now if these worked at 1ms or greater resolution.
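
For anyone who wants to check this on their own machine, here's a sketch of that kind of ping-pong test (hypothetical code, not the poster's): two threads, each waiting on its own auto-reset event and signalling the other's, counting round trips over roughly one second.

  #include <process.h>
  #include <stdio.h>
  #include <Windows.h>
  
  static HANDLE evA, evB;
  
  // Worker: wait for A, signal B, forever.
  static unsigned __stdcall ponger(void *unused) {
      (void)unused;
      for (;;) {
          WaitForSingleObject(evA, INFINITE);
          SetEvent(evB);
      }
      return 0;                                      // never reached
  }
  
  int main(void) {
      evA = CreateEvent(NULL, FALSE, FALSE, NULL);   // auto-reset events
      evB = CreateEvent(NULL, FALSE, FALSE, NULL);
      _beginthreadex(NULL, 0, ponger, NULL, 0, NULL);
  
      long long rounds = 0;
      DWORD start = GetTickCount();
      while (GetTickCount() - start < 1000) {        // run for roughly one second
          SetEvent(evA);
          WaitForSingleObject(evB, INFINITE);
          ++rounds;
      }
      printf("%lld round trips per second\n", rounds);
      return 0;
  }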


> the WaitForObject won't wake up any sooner than Sleep(1) will. Only a real proper hardware interrupt could wake up quicker - but that requires special hardware that just wasn't available in that scenario. If it's easier to think about that way, then please assume I said that WaitForObject "sleeps" way more than necessary due to the low timer resolution on Windows.

Above is not correct.

SetEvent/WaitForSingleObject takes a few microseconds to wake the thread. Not milliseconds, microseconds.

WaitForSingleObject doesn't use a timer if you set infinite timeout.

It does not "sleep more than necessary due to the low timer resolution", because it does not use the timer.

Even if you set a timeout, it doesn't use the timer when the wake is triggered by SetEvent. The wake is immediate.

Wake triggered by SetEvent in a different thread doesn't use interrupts or need special hardware. It's purely kernel scheduling inside Windows.

For a real-world measurement, see for example:

>> "Windows event object was used but it takes 6-7us from SetEvent() to WaitForSingleObject() return on my machine

The person is annoyed that it takes 6-7 microseconds, and they want to get it lower.

Here's a Microsoft example of how to have one thread wake another:

https://docs.microsoft.com/en-us/windows/win32/sync/using-ev...

There are other ways for one thread to quickly wake another thread on Windows, but SetEvent/WaitForSingleObject is the easiest in a general purpose thread.

What I suspect happened in tests where it appeared that "WaitForObject 'sleeps' way more than necessary" is that there was a race condition in the event trigger/wakeup logic between the two threads, causing the worker thread to depend on the timeout behaviour of WaitForSingleObject instead of being woken reliably by SetEvent. That is actually a very common bug in this kind of wakeup logic, and you have to understand race conditions to solve it. When the race condition is solved, SetEvent wakeups become consistently fast.


> with no guarantee how often that work was going to appear.

But you don't need (and should neither need nor want!) such a guarantee to do this the way I'm saying! It sounds like you might just be unfamiliar with overlapped I/O? ReadFile(), WSARecvMsg(), etc. all take OVERLAPPED structures that let you pass a HANDLE to an event object that gets signaled. There's also RegisterWaitForSingleObject, WSAAsyncSelect, WSAEventSelect, CreateIoCompletionPort... you name it. Heck, if you just have a thread select() the old-fashioned way, I think it should still wake up when the data comes, without introducing a delay at all. Nowhere should it matter how fast the data is coming, or to force you to wait more than you need to. Am I missing something?


> Am I missing something?

I think what you're missing is that all of the overlapped I/O completions you describe originate from a hardware interrupt from some device, like a network adapter.

We were talking about timing, not about waiting for I/O completion.


> I think what you're missing is that all of the overlapped I/O scenarios completions you describe all originate from a hardware interrupt from some device, like a network adapter.

The comment literally says "packets came in from client machines". And they wanted to service these immediately, not 16ms later. Which is exactly what I'm describing.

Timing requirements don't arise out of the void. Either they're externally mandated based on the wall clock (which is generally for human-oriented things like multimedia) in which case you should probably use something like a periodic multimedia timer, or they're based on other internal events (I/O, user input, etc.) that you can act on immediately. In neither case does Sleep() seem like the right solution, regardless of its accuracy...

Very few real-world exceptions exist to this. In fact the only example I can think of off the top of my head (barring ones whose solution is busy-looping) is "I need to write a debugger that waits for a memory location to change and lets me know ASAP", in which case, I guess polling memory might be the way to go? But it would seem you would want event-based actions...


Well, (and I don't mean it sarcastically) clearly you know more about this than me or any of our engineers do, because we couldn't find a way to do this. Yes, I knew that packets coming in trigger a hardware interrupt that wakes up the thread processing them - but as we had multiple other threads actually decoding this data, there isn't a good way to wake those up from the receiving thread. The sleep(1) and check for work was the best solution, as every other method/API we tried had the same problem of ultimately being limited by the system timer interrupt, and we couldn't get around it in any way.

I don't want to say that there definitely isn't a way to do this - but we never found it.


I see. It's hard to say if I'm missing something but given what I've heard so far I think what you want is I/O completion ports (though there are other ways to solve this). They're basically producer-consumer queues. I highly recommend looking into them if you do I/O in Windows. They're more complicated but they work very well. Here's a socket example: https://www.winsocketdotnetworkprogramming.com/winsock2progr...
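
To make it concrete, here's a bare-bones sketch of using a completion port as a plain producer/consumer queue (no sockets involved: PostQueuedCompletionStatus is the producer side, GetQueuedCompletionStatus is the consumer side, and the consumer wakes up without any timer involvement):

  #include <process.h>
  #include <stdio.h>
  #include <Windows.h>
  
  static HANDLE port;
  
  static unsigned __stdcall worker(void *unused) {
      (void)unused;
      for (;;) {
          DWORD bytes;
          ULONG_PTR key;
          OVERLAPPED *ov;
          // Blocks until the producer posts something; no polling, no Sleep().
          if (!GetQueuedCompletionStatus(port, &bytes, &key, &ov, INFINITE))
              break;
          if (key == 0)                    // key 0 is used here as the "quit" message
              break;
          printf("worker got work item %llu\n", (unsigned long long)key);
      }
      return 0;
  }
  
  int main(void) {
      // A completion port not associated with any file handle: just a wakeup queue.
      port = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0);
      HANDLE thread = (HANDLE)_beginthreadex(NULL, 0, worker, NULL, 0, NULL);
  
      for (ULONG_PTR i = 1; i <= 5; ++i)   // producer: hand off five work items
          PostQueuedCompletionStatus(port, 0, i, NULL);
      PostQueuedCompletionStatus(port, 0, 0, NULL);   // tell the worker to quit
  
      WaitForSingleObject(thread, INFINITE);
      CloseHandle(thread);
      CloseHandle(port);
      return 0;
  }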


I've written Windows device drivers, so I think I could say I'm rather intimately familiar with Windows I/O request (IRP) processing.

Sounds like people in gambiting's company are the same.

I/O completion ports are an API for getting better I/O efficiency between the kernel and userland. Batching, better scheduling, and avoiding the WFMO bottleneck. It's great, but it doesn't really have anything to do with timers.

The bottom line here is that Windows timer behavior has changed. This is terrifying.


It "has nothing to do with timing" in the sense that it doesn't let you choose how much delay to incur. It most certainly does "have something to do with timing" in that it lets you do I/O immediately, without polling or incurring system timer interrupt delays... which is precisely what his description of the problem required. It's kind of bizarre to see you trying to interject into it to argue with me as if you know his situation better than himself while even contradicting his descriptions in the process. I feel like the discussion already concluded on that topic. I'll leave this as my last comment here.


> without polling or incurring system timer interrupt delays... which is precisely what his description of the problem required

Except sometimes polling is your only option. Your scenario works great, if the hardware supports that use case and you're not using some intermediate software layer that prevents you from taking full advantage of the hardware.

Other scenario is when Windows is being used to control something time critical. I fully agree one shouldn't do that, but sometimes you just have to. The new timer change really hurts here.

In other words, we don't live in the perfect world, but still need to get the software working.

I also feel like we're somewhat talking past each other.

> It's kind of bizarre to see you trying to interject into it to argue with me as if you know his situation better than himself while even contradicting his descriptions in the process.

I do know highly similar cases I've had to deal with. Not audio processing, but similarly tight deadlines. It was very painful to get it working correctly.


Unfortunately, sometimes timing requirements that you have to adhere to do arise "out of the void", because not all hardware is perfect.

Also sometimes you just need to do timing sensitive processing. I guess you could argue one shouldn't use Windows for that, but unfortunately sometimes we developers don't really have a choice. Doing what you're told pays the bills. :-)


I think you misunderstood. If you want "timing sensitive processing", that's what you can and should expect to use events or multimedia timers for. That way you can actually expect to meet timing guarantees. Using Sleep() and depending on it for correctness in such situations (regardless of its accuracy) is kind of like using your spacebar to heat your laptop... it doesn't seem like the right way to go on its face, even if there was something to guarantee it would work: https://xkcd.com/1172/


> that's what you can and should expect to use events or multimedia timers for

I don't want to use a deprecated API call, timeSetEvent.

> Using Sleep() and depending on it...

Absolutely no one was using Sleep for timing. It just used to be that all Windows timer mechanisms are really the same thing, and Sleep(1) was an easy way to quickly test actual timing precision.

Obviously Windows 10/2004 changed all this.


> I don't want to use a deprecated API call, timeSetEvent.

That's not what dataflow is saying to use. The phrase is "events or timers", not "timer events".

gambiting describes this scenario:

The hardware (network) does interrupt the CPU and supplies an incoming packet to the receiver thread, which does not use a timer.

Then the receiver thread wants to send it to a worker thread.

It sounds like gambiting's implementation has the worker threads polling a software queue using a sleep-based timer, to pick up work from the receiver thread.

Unless there's something not being said in that picture, the receiver should be sending an event to a worker thread to wake the worker instantaneously, rather than writing to memory and letting the workers get around to polling it in their own sweet, timer-dependent, unnecessarily slow time.

Given that it's described as audio data which must be processed with low latency, it seems strange to insert an avoidable extra 1ms delay into the processing time...

This is kind of the point of events (and in Windows, messages; not timers or timer-events). So that one thread can wake another thread immediately.

The whole "if your hardware supports it" reply seems irrelevant to the scenario gambiting has described, as the only relevant hardware sounds like network hardware. Either that's interrupting on packet arrival or it isn't, but either way, a sleep(1) loop in the worker threads won't make packet handling faster, and is almost certainly making the responses slower than they need to be, no matter which version of Windows and which timer rate.(+)

(+) (Except in some exotic multi-core setups, and it doesn't sound like that applies here).


>>Unless there's something not being said in that picture, the receiver should be sending an event to a worker thread to wake the worker instantaneously

I have kind of answered this before - you can signal on an event from the receiver thread, and it doesn't wake up the waiting threads instantaneously, even if they are waiting on a kernel-level object. The WaitForObject will only return at the next interrupt of a system timer.....so exactly at the same point when a Sleep(1) would have woken up too. There is no benefit to using the event architecture, because for those sub-16ms events it doesn't actually bypass the constraint of the system timer interrupt.

>>So that one thread can wake another thread immediately.

The problem here is that the immediate part isn't actually immediate, that's the crux of the whole issue.


> you can signal on an event from the receiver thread, and it doesn't wake up the waiting threads instantaneously, even if they are waiting on a kernel-level object.

This isn't normally the case. You should expect this not to be the case because programs would be insanely inefficient if this happened. My suspicion is there was something else going on. For example, maybe all CPUs were busy running threads (possibly with higher priority?) and so there was no CPU a waiting worker could be scheduled on. But it's not normally what's supposed to happen; it's pretty easy to demonstrate threads get notified practically immediately and don't wait for a time slice. Just run this example and you'll see threads getting notified in a few microseconds:

  #include <process.h>
  #include <tchar.h>
  #include <Windows.h>

  LARGE_INTEGER prev_time;

  // Waits for the event, then prints how many microseconds elapsed since SetEvent was called
  unsigned int CALLBACK worker(void *handle)
  {
   LARGE_INTEGER pc, pf;
   QueryPerformanceFrequency(&pf);
   WaitForSingleObject(handle, 5000);
   QueryPerformanceCounter(&pc);
   _tprintf(_T("%lld us\n"), (pc.QuadPart - prev_time.QuadPart) * 1000000LL / pf.QuadPart);
   return 0;
  }

  int _tmain(int argc, TCHAR *argv[])
  {
   HANDLE handle = CreateEvent(NULL, FALSE, FALSE, NULL);
   uintptr_t thd = _beginthreadex(NULL, 0, worker, handle, 0, NULL);
   Sleep(100);                    // let the worker reach WaitForSingleObject
   QueryPerformanceCounter(&prev_time);
   SetEvent(handle);              // wake the worker; it measures the latency
   WaitForSingleObject((HANDLE)thd, INFINITE);
  }


From my other comment here, I think you have a race condition or something like that in your code, or as the peer comment suggests scheduling contention on the CPU, which is causing this effect in your test.

For a real-world measurement, see:

>> "Windows event object was used but it takes 6-7us from SetEvent() to WaitForSingleObject() return on my machine

The person measured 6-7 microseconds.

The system timer is not interrupting that fast.


SetEvent/WaitForSingleObject is effectively a thread yield from the SetEvent thread to the waiting thread. The scheduler does that when SetEvent is called, so microsecond-level waiting time is expected.

You don't need to wait for timer tick.

IOW, parent was talking about timer wait, not event wait. Timer events are only processed at timer interrupt ticks.


> IOW, parent was talking about timer wait, not event wait. Timer events are only processed at timer interrupt ticks.

No they weren't. Read parent again, my emphasis added:

>> you can signal on an event from the receiver thread, and it doesn't wake up the waiting threads instantaneously, even if they are waiting on a kernel-level object.

>> The WaitForObject will only return at the next interrupt of a system timer.....so exactly at the same point when a Sleep(1) would have woken up too. There is no benefit to using the event architecture, because for those sub-16ms events it doesn't actually bypass the constraint of the system timer interrupt.

They described their threads in another comment. There's a receiver thread which receives network packets, and sends them internally to processing threads. The processing threads use Sleep(1) in a polling loop, because they believe the receiving thread cannot send an event to wake a processing thread faster than Sleep(1) waits anyway, and they believe the system would need special interrupt hardware to make that inter-thread wakeup faster.

IOW, they are using a timer polling loop only because they found non-timer events take as long to wake the target thread as having the target thread poll every 16ms using Sleep(1).

But that's not how Windows behaves, if the application is coded correctly and the system is not overloaded, (and assuming there isn't a Windows bug).

My hunch is a race condition in the use of SetEvent/WaitForMultipleObjects in the test which produced that picture, because that's an expected effect if there is one. But it could also be CPU scheduler contention, which would not usually make the wake happen at the next timer tick, but would align it to some future timer tick, so the Sleep(1) version would show no apparent disadvantage. If it's CPU contention, there's probably significant latency variance (jitter) and therefore higher worst-case latency due to pre-emption.


Again: Sleep(1) is just a shorthand to discuss this topic. The exact same behavior affects every other Windows timing mechanism as well.


"sleep(1)" means "sleep 16ms"

You sure? According to e.g. the article it results in "sleep anywhere between 1 and 16mSec". But this illustrates part of the problem here: arguably problematic use of Sleep() where other solutions are more appropriate and people not reading the documentation and/or not realizing what it means.


Well, yes, I suppose technically that is correct. In our use case the thread would wake up, check if there was any work to do, and if not then sleep(1). If you're doing it continuously (there is nothing to do) the thread would be sleeping for 16ms pretty much exactly every time - which meant that any time any work came in it wouldn't be picked up immediately, but anywhere between 1-16ms (which was unacceptable for this use case).


Seems like this was reported over 4 months ago! [1]

[1] https://developercommunity.visualstudio.com/content/problem/...


The author might want to disable the WordPress "pingback" feature, as it seems abused. WTF is that? It seems bots are copying content, swapping random words, and reposting it on some generic-looking sites. What's even the purpose of this?


From a technical point of view, this is an interesting change, and I'm not sure if it's a bug or not. From a scientific point of view, I definitely bristled at "cleaned up to remove randomness"... :P


> It shouldn’t be doing this, but it is

In my opinion this still remains the conclusion, as it has been for the past decades. I can't remember when I read up on Sleep() behavior and timeBeginPeriod(), but I remember that what I read was enough to make clear you just shouldn't rely on these (unless you're 100% sure the consequences are within your spec and will remain so), not least because the workarounds are also widely known (IIRC, things like using WaitForSingleObject if you need an accurate Sleep).


> things like using WaitForSingleObject if you need accurate Sleep

You won't get any more timing accuracy with WFSO.


Yeah I see now other comments mention this as well; any idea if this has been changed then, or was the info I seem to recall having read wrong? (and is there another way to achieve it?)


Too late to edit my other reply, but note that according to sibling comment from jlokier elsewhere, you will.


I don't know if something is wrong with my internet or machine, but a lot of graphs/pictures seem to be missing on this page.


I've always thought that Sleep(n) means sleep at least n... not sleep around n, or sleep exactly n. What am I missing here?


"Sleep at least n" is too broad of a spec to be useful. It basically guarantees that users of that function will come to expect or rely on behavior that is not covered by that spec, such as expecting that their program will be woken up again in a relatively short period of time rather than next week, or that the program will actually sleep for a non-trivial amount of wall time instead of immediately waking back up. Microsoft deserves at least some of the blame when application programmers come to rely on behavior that isn't officially part of the spec, and Microsoft definitely deserves some of the blame when they change that behavior without even doing those programmers the courtesy of documenting the change.

"Windows 10" is too much of a moving target, and it's about time Microsoft stopped trying to pretend to the public that it's all one operating system. If they're not going to usefully document potentially breaking changes, they should at least do us all the courtesy of bundling these changes up into a release that increments the major version number, and gives us the option of not upgrading for a few years instead of having the software as a service model forced upon us.

(Given what I've written above, it probably won't surprise you to find out I recently spent a decent chunk of time trying to find a workaround for a different undocumented change brought by Windows 10 version 2004).


> a different undocumented change brought by Windows 10 version 2004

Any chance you could share this so those of us who haven't dealt with it yet might have a heads-up?


About the game-fixing utilities: while it is annoying that these won't work at the moment, they should still be able to work by installing a hook that attaches itself to the game's process and calls timeBeginPeriod (several other unofficial game patches work like this already).


> and the timer interrupt is a global resource.

Shouldn't this at least be per-core rather than global? Then most cores can keep scheduling at a low tick rate and only one or two have to take care of the jittery processes.


That's an interesting read. I recall reading about some airline communication system that used to freeze when a 32-bit counter in Windows overflowed. Would the way Windows implements timer interrupts have anything to do with this?


Does the timer behave in the same way on Windows Server?


No, but some timing-related aspects are different on multi-CPU-socket systems.

Like it might be much (even 1000x) slower to read QueryPerformanceCounter due to cross-CPU synchronization issues.


This seems deliberate... It's trying to prevent one application 'randomly breaking' when another application is running.

Seems like a good move to me - just a bit of a shame a few applications might break.


> One case where timer-based scheduling is needed is when implementing a web browser. The JavaScript standard has a function called setTimeout which asks the browser to call a JavaScript function some number of milliseconds later. Chromium uses timers (mostly WaitForSingleObject with timeouts rather than Sleep) to implement this and other functionality. This often requires raising the timer interrupt frequency.

Why does it require that? Timeouts should normally be on the order of minutes. Why does Chrome need timer interrupts to happen many times per second?


This isn't the browser's network timeouts, this is the JavaScript function setTimeout() which can be used in scripts and which, as mentioned, takes an argument in milliseconds. Chrome needs to be able to support whatever setTimeout() call a website might make, and those are rarely going to be minutes long; setTimeout() is just used for "call this function after a specified amount of time". The fact that the word "timeout" is used in the name does not mean that it has any relation to network timeouts!


There is still the question why the script needs higher than standard timer resolution.


Higher than Windows' standard timer resolution, which is abysmal.


JavaScript setTimeout is used for animations.


This is exactly why requestAnimationFrame is a thing though.


Browsers need to try and properly display webpages written before requestAnimationFrame was introduced.


Lots of old code exists though.



