Computer latency: 1977-2017 (danluu.com)
588 points by kens on Dec 24, 2017 | 161 comments

I'm working on a fast text renderer for xi-editor and am similarly quite concerned about end-to-end latency. I've also experimented with a high-speed camera, but my main testing rig is an Arduino that injects USB HID events and then measures light from the screen. I expect to publish results soon. One difference from Dan's results is that mine don't take the keyboard's latency into account, as the input is synthesized.

The good news is that we can get even better results than the Apple 2 by choosing the pieces carefully. This is not my final result, but I'm seeing about 19 ms from USB to light on a 2017 MacBook Pro 13" connected to a 144 Hz monitor.

Given the unexpectedly large contribution of the keyboard, I think there is a _tremendous_ business opportunity in empirically validated actual low latency keyboards. Gamers will eat these up, but I would buy one in an instant for coding.

There are gaming keyboards with 1000 Hz polling rates.

Those claimed polling rates don’t necessarily translate into low latency. An article that showed up here a while ago touched on that. https://danluu.com/keyboard-latency/

It appears that the main source of latency in these tests is key travel time, which does not accurately represent the speed of the keyboard. Gaming keyboards often have a lot of key travel before the key triggers. This is deliberate, to give better control of the key and to allow for "floating", where the key is held halfway pressed and quickly switched between actuated and non-actuated states.

For a better test, in my opinion, the key actuation point should be determined and the timer started at that point. Of course this depends on what you want to test. But to say that a gaming keyboard is slow, just because there is more key travel, is inaccurate.

The test mentions the floating and comes to the same conclusion as I do: it takes time to get to the floating position, and that adds to the latency.

But this probably wouldn't feel laggy, right? Your brain is going to expect full key-down to be the point at which text appears, especially if that point is well expressed mechanically (e.g., a click).

This would be like saying a physical kick drum or hi-hat is laggy because there's a delay between when your foot starts moving and when the sound happens. (Which would be silly!)

I think what you'd want here is for the tactile click to be at the actuation point. My understanding is that many or most mechanical keyboards today can't claim this.

Or maybe there is some optimal separation a key should have between the actuation point and the tactile click point, to account for the latency of the human nervous system, which would do an even better job of reducing the effects of latency than if the two events were at the same point?

Emphasis on "empirically validated actual low latency keyboards." Dan Luu has tested those purportedly high performance gamer keyboards before and found them mostly lacking.

Cool hack with Arduino HID. What light sensor are you using?

My parts list is here: http://a.co/fltiKey , and the light sensor is one of those Chinese MH-Sensor-Series Flying Fish boards with a photodiode. The difference between space and '#' at a normal terminal/editor font size is around 20 counts (out of 1024 total), which is plenty to detect reliably, and the response time is easily <1 ms, as cross-checked against the 1000 fps camera.
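
The detection step in a rig like this can be as simple as watching for a sustained deviation from the baseline ADC reading. A minimal sketch in Python; the (millis, count) sample format, margin, and hold count are my assumptions, not the actual rig's code:

```python
def detect_latency_ms(samples, baseline, margin=10, hold=3):
    """Return the timestamp of the first reading that deviates from
    `baseline` by at least `margin` ADC counts for `hold` consecutive
    samples (to reject noise), or None if no change was seen.

    `samples` is a list of (millis_since_injection, adc_count) pairs."""
    run_start, run = None, 0
    for t, count in samples:
        if abs(count - baseline) >= margin:
            if run == 0:
                run_start = t
            run += 1
            if run >= hold:
                return run_start
        else:
            run = 0
    return None
```

With a ~20-count swing between glyphs, a margin of 10 leaves comfortable headroom over sensor noise while still triggering on the first changed frame.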

Wouldn’t a text renderer live outside of the xi backend? So you’d have to write one for each front end? Maybe not?

Yes, this is in the front end, not the core. My current implementation is in Swift, using OpenGL, but I'm contemplating writing a cross-platform one as well (maybe Rust/Vulkan because that would target most newer devices other than Apple).

At least in a virtual terminal window, I'm fine with any of the reported latencies as long as the input stream is handled deterministically.

What I can't stand is typing "f" then pressing "<Enter>" in the url area of a browser, and the browser interpreting the url as "f" because "f" won the race against autocomplete's return value of "https://foo.com". Just hand "f" off to autocomplete and navigate to whatever the return value happened to be.

Of course it's a browser, so I'm sure someone could exploit my desired behavior to somehow mine cryptocurrencies in a hidden iframe.

Also, when you type into a search field and the thing you want comes up as a suggestion: you go to click/tap it, and at exactly that moment a different suggestion takes its place as the remote site catches up. Twitter is very bad for this.

One of the reasons I go hunting for the "search suggestions" toggle the moment I use a newly installed browser.

UI determinism to me is vital, no matter how ugly it may look to designers.

This is a UI/UX felony. Anybody who writes an interface that does this should go to UI jail.

And then you have to remove the search for "f" from your history or else it will try to autocomplete it in the future. I hate that! Happens to me all the time.

This seems like Chromium-derived behavior, as I have never seen Firefox do it, but I noticed it on Vivaldi after only a short time using it.

Frankly, the url bar, once I kill search suggestions, is the one thing that Mozilla has gotten right in recent years.

Type something in it, and it will do a sweep of history and bookmarks, trying to match the entered string within both url and page title.

Chromium-derived browsers limit themselves to matching the url, and only as if you are typing it as new.

The behavior I described above happened in Firefox ESR on Debian Stretch.

If it has been fixed in the newfangled Firefox I'd love to know. (Not sure how to install Nightly on arm or I'd try it myself.)

Firefox definitely does it.

Older Firefox versions (not sure if it was fixed or if it simply no longer shows up on faster hardware) had an "interesting" race condition: if I hit enter and ctrl+t to spawn a new tab fast enough, the url would load in the new tab rather than the old.

There's a cool blog post by the JetBrains guy about text editor latency: https://pavelfatin.com/typing-with-pleasure/

He implemented a zero latency mode which bypasses many layers in the rendering stack, which apparently makes IntelliJ one of the fastest text editors.

Having used IntelliJ -- well, it's faster than a lot of other IDEs, but "one of the fastest text editors"? Maybe vim is spoiling me but everything that runs outside of a terminal feels glacial to me.

Yes, as stated in the blog article, the optimizations they made put it in the same ballpark as Vim.

It's probably faster than Vim if you're using a slow terminal emulator.

"The JetBrains guy" ... That's just one guy?

Hah, that's what you get for rephrasing comments. It used to say "the JetBrains guy who added zero latency mode".

Apparently imperceptible differences in latency can be felt vividly through break of habit.

A couple of years back, I didn't have any experience with online games and was introduced to a popular one by one of my friends. As we were playing some introductory matches, a 400 ms latency connection felt instantaneous to me, while he felt super annoyed and advised me to change my ISP. Some months after changing my ISP and regularly playing the game, I could instinctively feel the difference between a 60 ms connection and a 120 ms connection.

I invented a few terms for it: latency sensitivity, or hyper-sensitivity. I thought I had some weird latency intolerance that not many people have; it turns out I am not alone.

Most tests and sites show our maximum click response time at somewhere between 200 ms and 300 ms, with most people scoring in the 250 ms+ range. And yet we can still feel the difference. This latency problem sometimes causes me breathing and heartbeat problems when things don't align with the response time I expect. Sometimes doing things too quickly isn't the best option either.

It is the same with Apple's Retina display: given 20/20 eyesight, typical viewing distance, etc., you can still see the difference between 300 ppi and 400 ppi.
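
As a sanity check on the ppi claim: 20/20 acuity is usually quoted as one arcminute, i.e. about 60 pixels per visual degree. A quick sketch (the 12-inch viewing distance is an assumption) shows 300 ppi already sits near that limit, so perceiving 300 vs 400 ppi implies closer viewing or better-than-20/20 acuity:

```python
import math

def pixels_per_degree(ppi, distance_inches):
    # One degree of visual field spans about distance * tan(1 degree)
    # inches on the screen; multiply by ppi to get pixels per degree.
    return ppi * distance_inches * math.tan(math.radians(1))

print(pixels_per_degree(300, 12))  # ~63 ppd, just past the ~60 ppd 20/20 limit
print(pixels_per_degree(400, 12))  # ~84 ppd
```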

And the iPad's ProMotion is really buttery smooth, though it still isn't perfect. I am not sure what frame rate we need to reach the point where we can't tell the difference.

I wish latency were taken more seriously in UX and design. Hopefully Apple will lead the pack again.

> Most tests and sites show our maximum click response time at somewhere between 200 ms and 300 ms, with most people scoring in the 250 ms+ range. And yet we can still feel the difference.

I think when you interact with a UI (or a physical object, for that matter), you're projecting an expectation into the future relative to your sense perception of the action. So the ~250ms delay applies to both the action and reaction, thereby effectively canceling out.

>Hopefully Apple will lead the pack again.

Like when they made a calculator that fails to calculate correctly if you push buttons too fast?


Definitely applicable because it's talking about UI latency... and Apple.

The value of habits/muscle memory should never be underestimated.

And it's why I feel that KDE did the Unix world a disservice when they went from emulating Windows to trying to be their own thing in KDE 4. As long as KDE behaved closely to Windows out of the box, one could more easily get people to give desktop Unix a try over the longer term, because the initial experience would not be as jarring.

Similarly observe the backlash Microsoft got for Windows 8.

Sadly the FOSS world seems overrun with designers and busybodies that want to pad their resume these days.

I'm happy to see someone thinking about latency the way I do. This Microsoft study (https://youtu.be/vOvQCPLkPt4) keeps lingering in the back of my mind.

The most annoying latencies I've seen so far are scrolling on Android and Trackpad mouse moving on Linux (x11).

Outlook 2016 is the pits. Microsoft should pay attention to their own research.

Yep. That's the same video I was thinking of.

So it's not just my imagination. My 2014 MBP feels reasonable, but my 2017 XPS 15 (top spec) Linux machine feels way, way slower. I've wondered if I somehow made poor choices in my Ubuntu config (maybe so), but now I realize it could just be in the hardware.

This reminds me of the current state of the web. There's been such a push to add more and more to pages, largely without regard for performance, that many sites take 10+ seconds to load the first time on 4G or wifi on a new, top spec Android phone. This is one area where Apple still has one of Steve Jobs's best contributions - an attention to user feel and perception. As much as I prefer an Android, iPhone just _feels_ faster.

I'd say Ubuntu out of the box is very bad at defaults (bells and whistles turned up, Unity or Gnome don't help in the least)

For minimal latency I would go for Xfce/Gnome 2 without compositing

Thanks. I'll give it a try!

Try the tty (ctrl+alt+f1) and see how quickly your inputs respond; it's amazing how much latency X11 and (especially) Wayland add.

Apparently this is it. tty is very fast, although since I have a 4k screen the drawing/scrolling rate is very slow.

I recently switched from whatever Ubuntu 17.10 is using (Gnome on Wayland?) to i3 on X.org, and there was no real performance increase that I could see.

Do you have any suggestions on how I can get a tty-level experience (with the occasional ability to use a web browser), but with low latency? I obviously can't use the tty when the screen is at native resolution (the text is minuscule).

> although since I have a 4k screen the drawing/scrolling rate is very slow.

Add "vga=normal nomodeset" to your kernel parameters, and revel in the speed (and blockiness of the font). :)

Personally, over the years I've found LXDE to be the most pleasant environment. It puts all those Gnome 3, Wayland, KDE atrocities to dust. I haven't tried any other recent DEs, since once I got my hands on LXDE I had no need.

Just now I checked the input response in tty and terminator, and I just don't feel any difference at all.

That's mostly not X11, but fancy DEs with fancy DE-oriented apps. Try some lightweight window manager with xterm.

This, definitely. Even on Gnome, using xterm is far more responsive than the default gnome-terminal. I switched years ago because of how much a difference it made for me.

Also try KDE Plasma with default settings, see my other comment.

I have posted this in the recent latency thread too, but let me repeat it here: I have measured 44 ms on Linux WITH compositing, from mechanical input to screen output. I have a PS/2 keyboard but that doesn't actually matter too much - the keyboard model might matter.

Hardware and software info: measurement at 90 fps (Moto G5 slow-motion camera), so the true value could have been up to 55 ms due to granularity. Filmed keyboard and screen together. Ubuntu 16.10 with self-compiled KDE stack on master branch. Fujitsu KBPC-PX keyboard with PS/2 connection (it can also do USB). AMD R9 280X GPU with AMDGPU kernel driver and mesa master / llvm trunk. kwin (X11) window manager with compositing enabled. Editor: kwrite (kate should be the same; it's a more featureful UI around the same internals). AMD Ryzen 7 1800X CPU. 16 GB RAM.
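
The granularity caveat works out like this: counting frames in slow-motion footage only resolves latency to one frame interval, so a 4-frame observation at 90 fps is a 44-55 ms window. A sketch:

```python
def latency_bounds_ms(frames_observed, camera_fps):
    # The change became visible somewhere within the last frame, so
    # frame counting bounds the true latency to a one-frame window.
    frame_ms = 1000.0 / camera_fps
    return frames_observed * frame_ms, (frames_observed + 1) * frame_ms

low, high = latency_bounds_ms(4, 90)  # ~44.4 ms to ~55.6 ms
```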

Possibly one of the reasons why I like fat Linux desktop machines with AMD graphics so much is the great latency results.

Another interesting sorta-related thing I've noticed is that good applications only refresh the necessary part of the screen. The Weston compositor has a debug mode that shows colorful rectangles over repainted areas. When typing in gnome-terminal, the repaints only happen on affected characters.

AFAIU this doesn't do a whole lot in practice: OpenGL does not support partial repaints - or maybe it does with obscure extensions. What it may do is avoid updating the screen at all when a window updates only a part that is hidden behind something else.

A quite valuable and IME underused Wayland feature is clients sending an opaque region and the compositor using it to draw windows ("surfaces") without alpha blending where it isn't necessary. Alpha blending uses significantly more fill rate than plain overdrawing. Opaque regions can also be used to determine that a whole window or a region of it doesn't need to be drawn at all. I had implemented that as an experimental optimization in a customer project once. In the end it didn't work out because the 2D acceleration in Vivante GC2000 is broken garbage - they even pulled the documentation of it, but still advertise the feature in marketing material.

It isn't an obscure feature. You can implement partial redraws in OpenGL in the obvious manner, by selecting a single-buffered visual.

This of course has all the expected downsides, like tearing, and modern GPUs are fast enough that there's rarely any point--but it's certainly possible.

I had this one in mind: https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_...

Note that Kristian Høgsberg, for years the main developer of Wayland, is among the authors. So I guess it is actually not so obscure in the Wayland context. It tells you how many frames old the current back buffer is, so you can repaint as necessary to take it from the state n frames ago to the state now.

One could render content into a FBO and simply redraw the FBO texture to screen when necessary. If only a partial update is needed, the scissor box can be set. No extensions are used in this case.

Correction: the distribution is actually Ubuntu 17.10

Tip: use the same number of decimal places as your measurement precision!

I've heard PS/2 was faster (because it's a direct ISA/PCI device / hardware interrupt?), whereas USB can introduce more latency because it has no real-time guarantee and it's packet-based.


>There are a number of interesting results, but the point relevant to this question is that there was a fairly significant variance between keyboards, and all the USB keyboards tested had a longer effective scan interval (18.77 ms - 32.75 ms) than the PS/2 keyboards (2.83 ms - 10.88 ms).

The appendix "why latency matters" is interesting, but a more current example is VR headsets, where latency in adjusting the display to track head motion is directly correlated with motion sickness symptoms.

The dreaded standard HN automobile analogy: latency is like turbo lag. Stomp on the gas pedal in my wife's Prius and it spins the tires instantly (well, anti-spin kicks in instantly anyway), whereas my uncle had a turbo Firebird in the mid '80s: stomp the gas pedal and like 3 seconds later the turbo spins up like a sci-fi warp drive and kicks you in the butt, but for the first 3 seconds it almost feels like the car stalled. Kinda weird. Obviously it's been over three decades, so possibly the latency on a turbo Firebird was 2 seconds or 5 seconds; it doesn't really matter beyond the point that it dramatically affects drivability.

Modern turbo engines are much better but there's nothing like an electric drive train for responsiveness. The difference is that the users actually fell in love with some of the quirks unlike with latency. Petrol heads like to master things like double clutching and heel and toe downshifting and have fun doing them even though they're just ways to get smoothness despite the limitations of the drive train.

This is really interesting, but you have to wonder whether it's measuring equal things. The measurement is:

> These are tests of the latency between a keypress and the display of a character in a terminal

What about a mouse pointer move on a current laptop or desktop, a scroll gesture on a trackpad or a current touch device? Those feel absolutely instant to me - it seems like the optimization for latency has just moved to the more common modes of interaction. To do anything at all on an Apple //e, you had to type. To do most things with a current computer, you point.

In most systems the mouse cursor is rendered with a hardware sprite like in old games consoles. In some GUIs it's actually switched to software rendering when you resize windows so it stays synchronized with the laggier redraws.

Right, I get that, my point is increases in 'keypress to character in terminal' latency don't warrant conclusions like "Almost every computer and mobile device that people buy today is slower than common models of computers from the 70s and 80s." Low-latency interaction is still there, it's just not in typing because typing is a less important mode of interaction.

On those old 8 bit machines, a mouse would trigger an interrupt, which would then draw the screen / move the sprite. Latency is likely to be one frame.

You take that back!

That makes me wonder: how many hacks have we implemented just for the cursor to feel the way it does?

> To do anything at all on an Apple //e, you had to type.

No, with an Apple IIe Mouse Card, you could use e.g. MousePaint:

> If there is one comment to make about MousePaint, it's that for a graphically intensive Apple II application, even on a 64K Apple IIe (which ran at 1 MHz), the user interface was surprisingly responsive. The mouse was remarkably smooth, and scrolling around the pixel canvas was seemingly effortless. Menus also appeared instantaneously, with no noticeable drawing lag.


That doesn't count. With a card, you could turn your Apple //e into an x86 PC as well.

At one point, I had a paper that demonstrated that Windows 95 never went idle; it was always polling, apparently for mouse events.

When I'm back at my PC I'll have to test this for myself. I saw a great GDC talk (my favorite one, actually) about measuring "real" latency for debugging lag complaints for Call of Duty. https://www.gdcvault.com/play/1023220/Fighting-Latency-on-Ca... - this link shouldn't need GDC Vault access.

I agree, this is one of my top 5 GDC talks.

For measuring, if you have an iPhone, there’s an app “very snappy” that works well for it, you can scrub frame by frame and mark the events to get the difference (no affiliation, just a happy user).

You're probably referring to the app 'Is It Snappy?'.

I was, yes. Good spot.

What are the other 4 in your top 5?

https://www.gdcvault.com/play/1022186/Parallelizing-the-Naug... - a studio took a well known ps3 game and ported and heavily optimised it for PS4, and this is the tech overview.

http://www.gdcvault.com/play/1021825/Automated-Testing-and-I... Automated testing and instant replays

http://www.gdcvault.com/play/1022195/Physics-for-Game-Progra... Networking for physics programmers

http://www.gdcvault.com/play/1020583/Animation-Bootcamp-An-I... Procedural animation for indies.

All are an hour long but great watches.

Thanks! If only going to GDC itself were as productive :)

One thing the article doesn't mention is that modern devices push two orders of magnitude more pixels with one order of magnitude more color bits per pixel. That requires much higher throughput, which causes much of the added latency on the display end.

That doesn't really have much to do with it. The big difference is that now we are compositing 2D UI using GPUs, which require double buffering if you want to avoid tearing.

Older 2D accelerators had hardware support for beam-avoiding blits that allowed for single-buffered compositing. You could still see unsynchronized blits between distinct buffers, though. We could easily build 2D accelerators that work like this today, and they would use less power than GPUs.

while that's true, for the use case "terminal" it doesn't matter, since the information transmitted from CPU to brain doesn't change.

Well, ssh sometimes certainly feels sluggish to me. If you have your aliases configured and are typing at a high speed, you should be basically limited by your fingers.

Terminal use at this level of focus is inseparable from thinking. You do not think to compose a command, nor contemplate moving your fingers to type and execute; it just happens.

Therefore I think latency is also severely important at the terminal level.

This is not an apology for high latency, although, on the other hand, typing at the speed of thought is the main cause for rm -rf / and other similar mishaps.

Or a sign that you need more scripts/aliases

Try mosh, it displays keystrokes locally before the response from the server comes in.

> That requires much higher throughput, which causes much of the added latency on the display end.

That's not the whole reason. More pixels pushed would only increase latency if the CPU/memory/etc. weren't faster than in the older computers.

People demand 60fps, so systems prioritize throughput over latency. See all the layer-based systems like coreanimation.

Most old console and arcade games ran at 60fps without any abstraction layers. 60fps is the bare minimum, not a special requirement. You can already buy 240Hz monitors.

Sure, but they pushed way fewer pixels with way fewer colors.

The main benefit of layer-based systems is latency, not throughput. Prerendered layer contents give you lower latency (and more consistent latency, which is usually more important perceptually) when scrolling, because you only have to adjust the positions of layers.

Yeah, this is actually a super-complicated tradeoff. Whether layers are a win or not is largely a function of how much of the screen can be cached in the layer and how much needs to be re-rendered, so in the scrolling use case a function of how fast you're scrolling, and of course how long it takes to re-render the content. Once your scroll speed exceeds the rate at which new content can be rendered, the experience degrades.

My conclusion is that if being able to re-render the content consistently in under one frame time is feasible, it's a win over layers. Fortunately, this is possible if you're willing to write fast code to render on the GPU.

There's a lot more complexity to this based on whether compositor latency is optional. Generally if you go fullscreen you can bypass the compositor. This is feasible for games but much less so for terminals and editors. I think that in some cases on Windows a swapchain can be promoted to a hardware overlay, and there you also get the opportunity to save one frame of latency. I think we'll see more of this in the future.

I'll write about this in more detail (with measurements) soon.

Fast scrolling of prerendered layers is what I mean with throughput. You can move layers around quickly as long as their contents don't need rerendering.

Linux would be faster with a PS/2 keyboard, non-compositing WM, and plain ASCII Athena Xterm with a bitmap font.

Why will Linux be faster with PS/2 keyboard rather than USB keyboard?

Because PS/2 is interrupt driven whereas USB relies on the CPU polling for events.


It's mostly irrelevant. USB uses polling internally, but the polling frequency is high and an interrupt is generated on the bus side when something has happened. I have taken latency measurements on PC hardware for a serial communication situation, and USB latency was a very small part of the problem. 1-2 milliseconds typically.
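
For scale, the polling contribution is easy to bound if you assume keypresses land uniformly within the poll cycle (the 125 Hz figure below is a common low-speed HID interval, not universal):

```python
def usb_poll_delay_ms(poll_rate_hz):
    # A keypress lands at a random point in the poll cycle, so the added
    # delay is uniform on [0, interval): mean half an interval, worst one.
    interval = 1000.0 / poll_rate_hz
    return interval / 2, interval

mean, worst = usb_poll_delay_ms(125)    # typical keyboard: 4.0 ms, 8.0 ms
mean, worst = usb_poll_delay_ms(1000)   # gaming keyboard: 0.5 ms, 1.0 ms
```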

It isn't irrelevant because USB packets still have to wend their way through the host stack after reception. Interrupt driven IO is always lowest latency.

There is no separate interrupt line that devices can use to interrupt the host. All USB traffic is started by the host.

Yes, but with sufficiently high polling frequency it doesn't matter very much. (It is true that each hub may add latency, as your sibling comment noted.) Regarding the "interrupt on the bus side", I may have expressed myself badly - I meant PCI(e) bus.

> plain ASCII

Is Unicode support in xterm really adding the latency?

UTF-8 adds a fair amount of work to every operation; UTF-16 and -32 multiply the amount of data transferred.

But in practice, I'd expect non-ASCII to be associated with more expensive features, which has more of an effect.

> 16187 mi from NYC to Tokyo to London back to NYC

This seems wrong. https://www.submarinecablemap.com/ has ~30,000 mi (48,000 km) from NYC -> London -> Japan -> Seattle.

> than sending a packet around the world (16187 mi from NYC to Tokyo to London back to NYC, maybe 30% more due to the cost of running the shortest possible length of fiber).

The author gave "as the crow flies" numbers - you might want to contact them to point out their 30% is an underestimate.
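
Either route converts to a latency figure directly; here is a sketch assuming a fiber refractive index of ~1.47 (light travels at roughly 2/3 c in glass):

```python
C_KM_PER_S = 299_792        # speed of light in vacuum
FIBER_INDEX = 1.47          # typical silica fiber (assumption)

def one_way_fiber_ms(miles):
    # Convert miles to km, then divide by the signal speed in fiber.
    km = miles * 1.609344
    return km / (C_KM_PER_S / FIBER_INDEX) * 1000

print(one_way_fiber_ms(16_187))  # great-circle route: ~128 ms
print(one_way_fiber_ms(30_000))  # actual cable route: ~237 ms
```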

The difference between Android and iOS is pretty huge and it's the reason why I'm using an iPhone instead of my previous Android device. I feel like scrolling on Android is way too delayed.

Would love to see someone do this on an Amiga running Cygnus Editor. That was the fastest code editor I've ever used.

Yeah - i too wonder how an Amiga, Atari ST, or even early Mac would fare.

Would be cool for someone to reproduce these results over at the Living Computer Museum in Seattle. They have heaps of old systems that you can play with.

How about the input latency of Parc Alto? PDP-7? :-)

When I had my old CRT-connected Linux box -- I would kill the X server process from the text console, and it seemed that the monitor went blank even before I pressed the Enter key.

I had forgotten about that, but I've totally had this experience too. A sufficiently low-latency response feels like it has negative latency.

While the high speed camera rig sounds fantastic, we actually got great results using the Mark 1 eyeball and good electronic stopwatches.

I was highly skeptical at first, to put it mildly, but won over by the accuracy and consistency of the results. Caveats: it’s manual effort and you probably should have the same person measuring specific areas for repeatability.

The reason I thought it couldn’t work was that I knew about human reaction times, or at least thought I knew. The thing is: most of those reported times are for unanticipated events. If you are prepared, you can get much more precision than you’d think at high levels of accuracy.

Another trick: side by side comparisons (though that requires two identical machines, so not always feasible).

I don't understand what you're proposing here. How can you measure 30-100 ms latencies with a stopwatch? My guess is that the standard deviation of human reaction time is at least 40 ms.

Reaction time is irrelevant here, because you're anticipating a response to your input, not triggering the stopwatch in response to unexpected events. Human sensitivity to timing is much better. Great drummers can hit the beat to within a few milliseconds. If you've watched the movie "Whiplash", think of the "rushing or dragging" scene.

See http://journals.plos.org/plosone/article?id=10.1371/journal....

Following a beat will give you very consistent times, but they can still be off! Try it with a stopwatch: try to stop it at exactly 10 seconds (you are free to look at the clock) and you'll probably be 10-30 ms off. Then try stopping it at an arbitrary time, but the same one every time, and you will probably stop it at exactly the same time every time!

Suppose the human is allowed to reject trials where they feel they botched the timing?

Let's say the clock flashes an LED every second, and you are trying to press a mechanical button that will make an audible click on the 10th LED flash.

To decide whether or not to accept a trial, the human has to judge whether or not the audible click and the flash of the LED were simultaneous. I have no idea how good we are at that, but I'd expect that we are better at that than at reacting to things.

Even better would be to make the button also flash an LED, so that there is no error from differences between aural and visual processing speed and latency.

I found an interesting paper on using a computer to test latency of professional skeet shooters.


The most interesting thing I learned about was the three modes of Saccadic latency. I wonder if there are aspects of computer GUIs that work for or against the three modes. Or decades later in 2017 if the three mode idea is still current thinking in the field. The variation of Saccadic modes does exceed your prediction of 40 ms.

I would assume it would be easier to test latency of FPS video game players rather than simulating skeet shooting using a FPS-like computer system. In fact that would be a novel scoring, ranking, or handicapping system for a FPS.

My experience is that a skilled clocker can have 10 ms precision if you take away the reaction times. With a bad clocker, we're talking 100-400 ms off. Try, for example, stopping a stopwatch at exactly 10.00 seconds.

Four trials on an iPhone:

  10.00 (on the dot)
I’ll have to see if it can measure and display milliseconds.

Some stats for this series:

mean: 9.9475

std: 0.0826

stderror: 0.0584

How do timing side channel attacks work over a network with 100ms latency when you need nanosecond accuracy to execute them? By taking several million samples.
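
The statistics behind that: averaging n samples shrinks the standard error by sqrt(n), so the required sample count grows with the square of jitter over signal. A sketch (Gaussian-jitter assumption; the example numbers are illustrative):

```python
import math

def samples_needed(jitter_ns, difference_ns, z=3.0):
    # Standard error of the mean is jitter / sqrt(n); to resolve a mean
    # shift of `difference_ns` at z sigmas, need jitter/sqrt(n) <= diff/z.
    return math.ceil((z * jitter_ns / difference_ns) ** 2)

print(samples_needed(100_000, 100))  # 100 us jitter vs 100 ns signal: 9,000,000
```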

> I was highly skeptical at first, to put it mildly, but won over by the accuracy and consistency of the results.

How did you measure accuracy?

By also hand-timing things we could measure automatically, for example. One case I didn’t believe was a consistent, but tiny regression that would only show up hand-timed. I was utterly convinced it must be some systematic measuring error. Then we ran two computers side-by-side, and there it was, clear as day! And then we began the investigation in earnest and found it.

> I was highly skeptical at first, to put it mildly

Same way I'm feeling right now. Any more reading on this (or easy experiments) you might be able to point us to? :-)

I'd be curious as to what game the Gameboy Color was tested on, because that 80ms figure seems off to me. I don't know much about the joypad hardware so I'm not sure how long it would take from pressing the button to delivering the updated joypad state to the CPU, but I can tell you how long it should take once it hits the CPU:

In most games, the joypad input is scanned once per game loop, which is generally the same thing as a graphical refresh, which is set by the hardware at 60Hz. So depending on at what point in the cycle you press the button, the worst case latency for the software to detect the change is ~16ms.

Supposing this button press resulted in some in-game action, it would then take visible effect at the next screen refresh.

Now, it's common that a game loop looks like this:

    Update graphics from current game state
    Check inputs
    Update game state
    Wait for next frame
This is because the graphics ram (VRAM) can only be updated during a ~1ms window after the start of a frame. An interrupt tells you when this period begins. So the most common way to code it is to rush to do all your graphics work quickly, then spend the rest of the frame on other things while you wait for the next one.

As a result, it wouldn't be surprising if our input was delayed an extra 16ms while we wait for the frame AFTER the one where our input is detected.

So I would have expected a latency around 16-32ms - to get 80ms, either the button hardware would have to have added 50ms of latency, or the game they were using was very strangely coded, or the action the game took simply did not have any visible change on screen despite internal state changing - for example, an animation whose first few frames are the same as the standing still animation.
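A toy model of that timing (my own sketch, assuming the once-per-frame polling and next-frame display described above):

```python
FRAME_MS = 1000 / 60  # one Game Boy screen refresh, ~16.7 ms

def latency_ms(press_offset_ms, extra_frames=1):
    # press_offset_ms: how long after the once-per-frame input poll
    # the button press lands. The software notices it at the next
    # poll, and the change becomes visible extra_frames refreshes
    # after that poll.
    wait_for_poll = FRAME_MS - press_offset_ms
    return wait_for_poll + extra_frames * FRAME_MS

best = latency_ms(FRAME_MS - 0.01)  # pressed just before the next poll
worst = latency_ms(0.0)             # pressed just after a poll
```

That gives roughly 16.7ms best case and 33.3ms worst case, which is where the 16-32ms expectation comes from.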

As an interesting additional note: If they were testing the Gameboy the same way as other mobile devices - scrolling the screen - it would be even faster. In the Gameboy scrolling is implemented in hardware - you draw tiles to a 32x32 tile memory region, but the screen is only 18x20 tiles. You can update a scroll X and Y register to pan the visible screen over the 32x32 area (this is how everything from sidescrollers to Pokemon do smooth scrolling, since tiles can only be placed on an 8x8 pixel grid). This register, if updated immediately after detecting the scroll input, would take effect immediately, even if it was midway through drawing a frame (resulting in a tearing effect which in most cases wouldn't be noticed but could be abused to produce interesting special effects). So the latency then would be 0-16ms + whatever the hardware adds.

Of course, you could pare this down further by writing specialised code that checks the joypad input rapidly instead of once per frame, or use the joypad interrupt to be immediately notified upon a button press (note this interrupt suffered from significant limitations, so no-one used it, and as far as I can tell the Gameboy Color doesn't have it at all, much to my chagrin since I'd figured out how to use it in a good way). That would likely violate the spirit of the comparison though, since you could equally program any of the other contenders to be a dedicated "fast scroll" machine.

Very late reply, just for posterity: I was mistaken about the GBC not having a Joypad interrupt. Turns out I just had a bug, and apparently misread or misunderstood something somewhere along the line that made me think that was the cause.

I've switched back to Terminal from iTerm on the Mac because of latency. I hate even small lag.

Try alacritty. It's primitive and written in Rust, but it's super quick (to be fair, I have never done any measurements). At least I can use vim within it without any noticeable lag.

Thanks but ..

    $ brew search alacritty
    Closed pull requests:
    alacritty HEAD (new formula) (https://github.com/Homebrew/homebrew-core/pull/8727)

Would be interesting to do this on a modern system running MSDOS.

I especially love input latency caused by javascript event listeners triggering on every character entered into input boxes, like currently on YouTube :/

Interesting. Someone should start selling a special Linux gaming distribution coupled with the right hardware. I wonder how Alienware and the likes compare.

From a big systems perspective, circa 2002: http://bitmasons.com.s3-website-us-east-1.amazonaws.com/pubs...

On seeing the date range in the title, I was hoping that latency had been pronounced dead. Sadly, this doesn't seem to be the case, but at least we're back down to latencies we saw in the 1970s.

I wish he'd checked some BSDs while at it. They do feel so much more responsive. Do not know whether there's any factual basis to it.

This is the same guy who did the keyboard latency test (https://danluu.com/keyboard-latency/) that was on the frontpage of HN.

IIRC there was some criticism there: he didn't include the mechanical switch type and its actuation point, and IIRC the text was very unprofessional (capitalization errors).

John Carmack said years ago, "I can send a packet across the world in less time than it takes to put a pixel on the screen."

And you pretty much have to, with the poor design of web apps like JIRA!

Edit: Someone disagrees? I have definitely had text entry into a JIRA box slow to a crawl — when everything else ran fine — because the AJAX calls had to close out before it could accept the input. Not being snarky; you can definitely be in a position where you have to wait for your packets to finish round trips before you get your keystrokes on the screen.

Just YouTrack it :) - much more responsive than JIRA. (I don't work for either Atlassian or JetBrains, but we use both products at work.)

Or Phabricator.

Other weird things, like it's faster to send stuff around a data center than it is to write to local disk (not just latency either, but sometimes throughput too). Actually depending on how good the network is, it's probably faster to send stuff to a different data center than it is to write to local disk.

It gets really crazy when you realise that it is faster to distribute localized data in a compute cluster over multiple nodes over the network for IO than to just dump them to the local disk.

Fun coincidence: Dan links to a post of his about how fast networks change storage from the article we’re discussing. https://danluu.com/infinite-disk/

You forgot the last part: “how f’ed up is that?”


But in reality, when you're waiting 5 seconds for a modern webpage to load all the fancy JavaScript (in response to that small packet), pixel rendering latency is not such a problem.

This article rules. And it led me to the website of the Recurse Center, which I'm going to apply to.

Where can I find more articles, such as this, with objective and practical perspectives on computing fundamentals?

I've seen a lot of cool stuff come out of people from the Recurse Center. If you go, you should write about what it's like and what you do. It sounds like something I (and many others) would really enjoy, but it sounds really difficult to take a "semester" or whatever off from everything, run out to some code camp, and do that until you're done.

Yes, Recurse Center is awesome. I did a "batch" there this fall and enjoyed the experience a lot. I got a chance to meet Dan there and we had some great conversations about latency, among other things. I recommend it, if you can get the chance.

Awesome. I don’t live in New York and that part might be challenging to pull off without taking a job there, due to cost of living, but it sounds like exactly the kind of opportunity I’m looking for and ready for so I’m going to apply and feel it out. If you have any tips, I could use them!

80ms latency on a Gameboy? That's 10fps. I call shenanigans.

Review the difference between latency and throughput. If you think of the machine as producing a sequence of pictures, and your input as affecting those pictures, it takes time for that input to show up. If you have a hypothetical machine that produces 1000 pictures per second then you can count the pictures between input and change and get the latency in ms.
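To make the distinction concrete (a back-of-the-envelope sketch, not from the article):

```python
FPS = 60
FRAME_MS = 1000 / FPS  # ~16.7 ms between pictures

# 80 ms of end-to-end latency at 60 fps means the response shows up
# roughly 5 pictures after the input -- the machine still draws 60
# fresh pictures per second, so throughput is unaffected.
frames_behind = 80 / FRAME_MS
```

So 80ms of latency doesn't mean 10fps; it means the picture you're seeing reflects input from about five frames ago.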

GBA ran at 59.97fps with input applied next-frame. That's 16.6ms latency - much less than what OP "measured"

Well, that is a mystery. It's possible Dan is in error. If so, I'm sure he'd like to correct it.

I speculate that it's possible to measure a different value. First, it may include key travel. Second, a given program may choose to sit on an input for more than one frame (perhaps he picked a "bad" game for this test). Anyway, I highly suggest reaching out to him, because I agree this is an interesting discrepancy.

Missing rows in the tables are human reaction time (~250 ms) and human perceived-as-instantaneous time (~100ms).

Beyond a lot of wasteful programming, we also understand better today what we are engineering for than we did in 1977.

I don't believe these metrics, because I can certainly tell the difference between a 10ms ping and a 60ms ping in an online game. Yet there is this cat in my neighborhood that makes me think I'm deluding myself, and that I'm actually a quarter-second behind the real world.

It is a long-haired black cat. It doesn't like us. But it does like taking a shortcut through our front yard. If it sees us, it freezes. It waits, mid-stride. Then when it's satisfied we're not a threat, it starts moving again. If I move my arm or shift my body, it freezes again, sizing me up, sure that this is the day -- after all these years -- that I'm finally going in for the kill.

What's weird is I am convinced that this cat knows when I'm going to start moving. It halts before I move. By the time my head is moving or my arm is waving, it is already frozen and on high alert.

My working theory is not that this cat is telepathic or that it has defocused temporal perception. Rather, I think that because the cat is smaller than I am, its eyes are closer to its brain, its nerve impulses don't have to travel as far, and it really and truly knows before I do that I'm moving. Otherwise I can't explain this cat's superhuman reflexes or its blatant disregard of general relativity.

It seems likely to me you're just experiencing the stopped clock illusion, especially since you mention moving your head.

When your eyes focus on something new, your brain deletes the blur from your memory and fills it in with what you start looking at. The cat could notice and stop during your refocus, and because you can't see it during that period, all you see is it already stopped and perceive it as having been stopped the whole time.

Stopped cat illusion. That might be it. I will experiment.

Did you ever touch the cat? Can other people see the cat too?

Did you show this story to different people? Can other people see those different people?

I, for one, do not actually exist. Sorry.

Cats are certainly known to have good reflexes.

>human perceived-as-instantaneous time (~100ms)

Nonsense. You can test this yourself. Open a fast terminal, e.g. xterm (libvte-based terminals have a low framerate and are unsuitable).


    sleep 0; echo "test"
    sleep 0.05; echo "test"
The difference is immediately obvious.

Also, it's immediately obvious to anyone who's ever ssh'd to a server on another continent. The added ~100 ms latency makes for a very different "feel".

That is describing a different issue.

This test shows your visual system can detect a 50ms difference between animations. Indeed, the human visual system can also tell the difference between 60 FPS (16ms) and 120 FPS (8ms), and even higher! This doesn't mean that you can tell the difference between striking a key and seeing the character drawn on the screen after 8ms instead of 16ms.

Typing (i.e. pressing many keys in sequence instead of only a single key) could then be interpreted as animation, which by this logic would make such a difference in latency detectable again.

~0.1s is the limit after which there is a distinct delay between action and response. This is not the same as the point at which things feel instantaneous.

> 0.1 second: Limit for users feeling that they are directly manipulating objects in the UI. For example, this is the limit from the time the user selects a column in a table until that column should highlight or otherwise give feedback that it's selected. Ideally, this would also be the response time for sorting the column — if so, users would feel that they are sorting the table. (As opposed to feeling that they are ordering the computer to do the sorting for them.)


I’m not denying that. I’m saying that anything over 100ms is meaningfully deficient, and anything under that is nice to have but not distractingly bad.

... for single events.

A latency of 100ms in online games is definitely not perceivable as “instantaneous.”
