Typing with pleasure, and low latency (pavelfatin.com)
141 points by fanf2 12 months ago | 68 comments

> Because sampling rate is fast enough to misinterpret contact bounce as keystrokes, keyboard control processor perform so-called debouncing of the signals by aggregating them across time to produce reliable output. Such a filtering introduces additional delay, which varies depending on microcontroller firmware. As manufacturers generally don’t disclose their firmware internals, let’s consider typical debouncing algorithms and assume that filtering adds ~7 ms delay,

Since it's debouncing rather than outright spurious activation, it's possible to latch/register/sample on the initial switch closure and only delay recognition of the subsequent release. This seems like a good idea since the press is almost always where a user's attention is focused, and the timing of the release being important is usually associated with holding the key much longer than the bounce period (e.g. modifier keys, cursor/player movement). Anyone know if keyboards are doing this?
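A minimal sketch of the asymmetric scheme described above, assuming a polled switch and a 5 ms release window. The class name, method name, and timing value are hypothetical and not taken from any real keyboard firmware. The press is reported on the first closed sample; the release is reported only once the contact has stayed open for the whole window.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EagerDebouncer:
    """Report a press immediately; delay only the release (illustrative sketch)."""

    release_ms: float = 5.0              # assumed release window, not a measured value
    pressed: bool = False
    open_since: Optional[float] = None   # time the contact was first seen open

    def sample(self, closed: bool, now_ms: float) -> Optional[str]:
        """Feed one raw reading; return "press", "release", or None."""
        if closed:
            # Any closed reading cancels a pending release.
            self.open_since = None
            if not self.pressed:
                self.pressed = True
                return "press"           # no waiting on the initial closure
            return None
        if self.pressed:
            if self.open_since is None:
                self.open_since = now_ms
            elif now_ms - self.open_since >= self.release_ms:
                self.pressed = False
                self.open_since = None
                return "release"
        return None


def run(samples):
    """Replay (time_ms, closed) readings and collect (time_ms, event) pairs."""
    debouncer = EagerDebouncer()
    events = []
    for now_ms, closed in samples:
        event = debouncer.sample(closed, now_ms)
        if event is not None:
            events.append((now_ms, event))
    return events
```

The trade-off is that a single noisy closed reading registers as a keypress, so this only makes sense if the switch bounces but never activates spuriously, which is the premise of the comment.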

I spent a few years working on embedded keypad products. Often there is a hardware keypad scanner that takes care of much of the details, not allowing the programmer this level of control. Your driver configures the rows and columns (inputs and outputs), the scan rate, debounce time, etc, and the hardware will monitor for edge transitions, then perform a scan, capture the result in a register and assert an interrupt. It will likely have a double (or more) buffer mechanism to prevent dropped events due to software latency.

In low power/mobile situations this allows the host CPU to sleep as much as possible and even the scanner, which walks the rows (or cols) need not be running all the time.

While many SoCs have an integrated keypad interface, there are also keypad controller ICs (some GPIO expanders have the ability to double as keypad controllers) that free the programmer from having to manually implement the scanner logic. I suspect these types of chips would be found in a typical USB keyboard. Software may still have to deal with additional debouncing and ghost keys depending on the complexity of the IC.

Edit: I suppose one could configure the keypad controller to apply no debouncing and then perform a software debounce the way you describe.

Computer keyboards typically contain only one chip, which has a combined USB-PS/2 port, a scanner and an 8051 to glue things together. DIY keyboards normally just use any microcontroller with a built-in USB controller.

This was my thought too, but most keyboards have a matrix and use chips with fewer inputs, which makes the problem harder. There are theoretical ways to write it better (you would need to have extra state for all buttons), but most firmware I have seen just samples at a rate such that the period is longer than the longest bounce.
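A rough illustration of why sampling with a period longer than the longest bounce needs no per-key state. The contact model and function names are hypothetical: the key closes at 3 ms and chatters until 7 ms. At most one sample can land inside the bounce window, so the sampled signal changes state only once.

```python
def bouncy_press(t_ms: int) -> bool:
    """Hypothetical contact: closes at 3 ms, chatters until 7 ms, then stays closed."""
    if t_ms < 3:
        return False
    if t_ms < 7:
        return t_ms % 2 == 1   # alternates closed/open while bouncing
    return True


def sample_events(raw, period_ms: int, duration_ms: int):
    """Poll raw(t) every period_ms and report each state change as (time, state)."""
    events = []
    state = False
    t_ms = 0
    while t_ms <= duration_ms:
        level = raw(t_ms)
        if level != state:
            state = level
            events.append((t_ms, state))
        t_ms += period_ms
    return events


fast = sample_events(bouncy_press, period_ms=1, duration_ms=20)   # sees the chatter
slow = sample_events(bouncy_press, period_ms=5, duration_ms=20)   # period > 4 ms bounce
print(len(fast), slow)
```

The cost of the slow scan is latency: a press can go unnoticed for up to one full sampling period.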

TL;DR of a very good, if long ;) article:

* Use a responsive editor (makes the most difference).

* Use a low-latency keyboard, if possible.

* Choose programs that add global keyboard hooks wisely.

* Turn off unnecessary “image enhancers” in your monitor.

* Enable stacking window manager in your OS (e.g. in Windows 7, 8).

TESTED EDITORS RESULTS, Average latency, ms

    IDEA-zl     11.9
    GVim        19.0
    Gedit       23.9
    Emacs       44.7
    Sublime     52.1
    Atom        90.0
    Netbeans   119.1
    Eclipse    133.6

What is IDEA-zl? (Google turned up nothing)

Does that mean IDEA 2017 is zero-latency?

Looks like it:

> So today, after almost 6 months of extensive testing, we are enabling zero latency typing as the default setting in all IntelliJ-based IDEs, including CLion.


Note: those are the worst-case timings, power-saving mode in a VirtualBox.

IDEA without the zero-latency mod is 198.8.

MacOS: VSCode and iTerm got 17.x ms, Terminal.app is at 7ms.

> Regardless of keyboard type, key switches are mechanically imperfect and are subject to contact bounce — instead of a clean transition, the switch rapidly bounces between on and off states several times before settling. Bounce time depends on switch technology, for example, for Cherry MX switches bounce time is claimed to be less than 5 ms. Though exact probability distribution is unknown, basing on related empirical data, we can assume that average bounce time is about 1.5 ms.

This is amazing. I thought I knew everything about mech keyboards, but this opens a new perspective.

Cherry MX switches are much bouncier than some types of switches. (This explains why some keyboards with MX switches end up dropping or duplicating keypresses when their controller uses a poor debouncing algorithm.)

For instance old “complicated” Alps switches (circa 1990) have an extremely clean switch from off to on, with almost no bouncing.

Here are some results for VSCode: https://github.com/Microsoft/vscode/issues/27378

Update: on my system Emacs has an average of 6ms vs VSCode 17ms.

Wish I could type with pleasure and low latency on my Android phone. Even on my HTC One M9 it's cringeworthy!

That's the main draw I have to try out the BB KeyOne - I can still type faster on my physical keyboard on my Dell Venue Pro than I can my Nexus 6, but if I could swipe on my BBK1, who knows?

The jitter and latency on Android phones are terrible, but I feel like they're more accurate than the iPhone for some reason.

My $20 raspberry pi with a cheap USB keyboard has neither problem though...

The lag behind your finger on Android is one of the reasons I have an iPhone. It's really noticeable.

Yeah, especially since the M8 really killed it in terms of latency: https://forum.xda-developers.com/showthread.php?t=2706200

It seems difficult to make the screen worse than the previous model.

"John Carmack explains why it's faster to send a packet to Europe than a pixel to your screen"


Posting Google's web cache, since the site appears to be down:


Not all heroes wear capes. Thanks!

Also check out the author’s companion article “Scrolling with pleasure”: https://pavelfatin.com/scrolling-with-pleasure/

I thought it was a fascinating read.

> As we can see, Aero introduces at least one frame delay (~16.7 ms for 60 Hz refresh rate) and leads to time discretization.

Yep. There is one built-in frame of latency in the Windows composition engine. It's even documented here: https://msdn.microsoft.com/en-us/library/windows/desktop/hh4...

Rather than performing composition as late as possible, which would be beneficial for latency, Windows performs composition as early as possible, at the start of the frame. This introduces a completely unnecessary 16.67ms extra latency into everything you do. There is no supported way to disable the compositor on Windows 8-10 (though the article links to a scary-looking hack that apparently works in Windows 8,) so you're stuck with this.
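For reference, the 16.67 ms figure is just one refresh interval at 60 Hz. A small sketch of the arithmetic follows; the function names are made up for illustration, and the one-frame default reflects the claim above rather than any measurement.

```python
def frame_ms(refresh_hz: float) -> float:
    """Duration of one refresh interval in milliseconds."""
    return 1000.0 / refresh_hz


def compositor_delay_ms(refresh_hz: float, frames: int = 1) -> float:
    """Latency added by a compositor that holds output back by `frames` frames."""
    return frames * frame_ms(refresh_hz)


for hz in (30, 60, 144):
    print(f"{hz:>3} Hz: {compositor_delay_ms(hz):.2f} ms per composited frame")
```

The same one-frame delay costs twice as much on a 30 Hz display and well under half as much at 144 Hz.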

I really hope this situation will be improved with future updates to Windows 10. Microsoft are still making improvements to the compositor; for example, window resizing is much smoother in the Creators Update, but as far as I can tell, the one-frame delay still exists.

The zero latency typing change was the biggest "I didn't know I wanted that" feature I can think of for years.

It made the thing a pleasure to use and I didn't hate it before, I just didn't know what I was missing.

Anybody else seen a massive increase in the latency of graphical Emacs on recent versions of OS X? Please don't reply if you only ever use Emacs inside a terminal emulator: I'm very much attached to my mouse.

The latency problem began when I upgraded from 10.8 to 10.11. (I use Mitsuharu Emacs, but would be stunned if the problem were absent from plain-FSF Emacs.)

P.S. Has anyone gotten the typometer application described in the OP to work with a graphical emacs on OS X? Whenever I open the application, my Emacs just freezes till I close the application.

Yes, it works for me. I'm using the Mitsuharu version. Results: text mode - 6.7ms (SD=0.3); fundamental mode - 6.7ms (SD=0.2); js2 mode - 13.2ms (SD=11.6); org mode - 13.9ms (SD=14.2)

I'm glad someone's worrying about this!

Interestingly, this topic is quite old. The original mainframe systems had channel controllers for I/O which did a lot of processing locally, which included echoing and even local editing, freeing the CPU for "real" work. This approach was thrown out when minicomputers arrived; this is why the Unix IO system looks the way it does and why C, in a then-noteworthy departure from most languages of its time, didn't include I/O operators.

Even in the pre-TCP ARPANET, network latency on interactive connections was an important topic (this is when the main backbone was a single 56K line IIRC). The MIT SUPDUP protocol (Super Duper remote access alternative to Telnet) included a local editing protocol for connections to remote machines. Even non-line-mode applications could interact with it, essentially running part of the interface remotely, all in the interest of zero latency.

This is very cool. I primarily write in Word and TextMate. Word is noticeably slower (greater latency) in many if not most circumstances. As file size grows, in particular, Word seems to slow more.

This is especially noticeable when adding text in the middle of a longer document. It seems as if it is laying out many subsequent pages in a blocking fashion, even if those pages are not visible.

I'd think that would have more to do with painting the screen than actually processing the keystroke. A few ms difference wouldn't be noticeably slower unless you were watching for it specifically.

What other measure of keystroke latency matters for typing than time from keystroke to display? I certainly am not impressed just because my keystrokes got placed in some invisible software buffer really quickly.

I guess my contention is that we all should realize a giant program like Word is going to exhibit areas of latency that simpler programs can be more efficient in.

Idk, I don't expect Word to paint as quickly as vim because why would it? It's huge.

Word is what I use for editing rich text, and I can tell you from firsthand experience that doing the same in vim or emacs is much more of a chore and less intuitive.

Idk, seems like comparing a wrench to a hammer from my perspective, that's all.

> I guess my contention is that we all should realize a giant program like Word is going to exhibit areas of latency that simpler programs can be more efficient in.

But that's just lazy design. The immediate effect of typing a character (i.e., showing up on screen) hasn't changed in decades. Yes Word may do other stuff, but none of that other stuff is in the critical path for typing latency.

Think of a database like Oracle. Oracle does lots of stuff, but its critical latency path (committing simple transactions to the log) is as fast or faster than "simpler" ACID databases.

Considering that current Word doesn't do that much more than Word2000 but is undoubtedly slower, why shouldn't people complain about lag?

I don't experience any lag with current Word except for very large files. It's slightly less responsive than typing in vim, but most operations feel instant.

This timing is pretty coincidental for me. I never really thought all that much about refresh latency, etc. (which I realize is weird since I do play a fair number of games), but I rebuilt my home office a few weeks ago, and for the time being all I have to connect my 2014 MBP to a 4K monitor is an HDMI cable... and the 2014 MBP can only do 4K@30Hz over HDMI. Let's just say the keyboard/mouse lag is... infuriating. And it gets WAY worse the less you scale the external monitor.

I found this too when I tried running my PC with a 30Hz display. I was surprised how bad it was. Windows '9x's default mouse sampling rate was 35Hz, and that was perfectly tolerable. 25Hz/30Hz games are playable. 60Hz will be better, but there's no reason a 30Hz monitor has to be an absolute outright disaster. And yet...

Obviously over time a bunch of extra frames of latency have snuck in, and at a refresh rate of 60Hz it's just not noticeable enough for enough people to have proven worth fixing.

(I've read of a lot of people finding 60Hz monitors more annoying to use after they've spent some time with 144Hz. So roll on 144+Hz... perhaps either we'll all upgrade, and the cycle will repeat, or our eyes will be retrained and we'll start to demand more from our existing equipment.)

I had the same issue when I got my 4K monitor! I initially blamed the monitor but fortunately switching to DisplayPort solved the issue.

Bit depressing to read about such low latencies given every day I have to work with 500ms+ latencies due to remote access to the VM I work on.

Have you tried mosh [1]? And if the latency is too big to fix with mosh, or the firewall prevents using it, I have found that using an Emacs shell buffer and lsyncd to transfer commands and files on pressing enter or save made remote development possible over links with over 1s of latency.

[1] - https://mosh.org

"(Why you should trust Mosh with your remote terminal needs: we worry about details so obscure, even USENIX reviewers don't want to hear about them.)"

Oh, mosh...

Several (Ok, many. I feel old now. Satisfied?) years ago, doing remote development over a VPN, Emacs + Tramp mode was a lifesaver.

Some suggestions:

1. mosh instead of ssh

2. run the editor locally and edit files remotely (either using the editor's built-in support for that or something like sshfs)

I have found that using lsyncd works much better than sshfs for remote editing.

lsyncd - https://github.com/axkibe/lsyncd

Hmm I tried using his tool on my Mac and it did not seem to work for any of the editors I tried (system vim in a terminal, TextEdit, MacVim).

Does anyone know if there is an existing tool for these kinds of measurements on a Mac?

Doing a few google searches mostly turns up this article. But maybe my googling skills are weak.

I got it to work after a few tries. Used it with VSCode, terminal, iTerm, Hyper Terminal, and TextEdit.

It took a few trials, and I had to disable transparency. I think it also doesn't like blinking cursors, and if (...) is turned into its own glyph, you should start the line with some dots of your own to prevent that.

I wonder how they managed to get this low typing latency on JVM...

Contrary to folklore, the Oracle JVM is one heck of a workhorse.

I don't doubt it. It's just that I was curious. I wonder if the same is possible for eclipse.

And don't even think about a bluetooth keyboard!

Atom 1.1, though. We're at version 1.20...

The article was published on Sunday, December 20th, 2015.

Although Atom 1.3 had been released by the time this was published, I do not know when the author recorded their data. Atom 1.1 was from Oct 2015, so that's not too out of date.

Sure, I'm saying it's out of date now...

Can't relate. Coding on neovim in tmux on a remote machine obviously with a very slight lag.

>Error establishing a database connection

TL;DR: Use Windows, as the latency penalty for using Linux is excessive. I'm hoping whatever multiserver µkernel OS replaces Linux actually does better.

Latency is where µkernels have historically fallen down.

That's a surprise. Does anybody have a link that talks about this?

This paper has a neat table in it. Third page.


For contrast, Linux takes whole microseconds.

Note that, for Linux, "IPC" is kind of loaded since it tends to refer to TCP/Unix sockets/signals/etc., which aren't used as much as IPC on a microkernel system. The right comparison would probably be with system call overhead.

That table shows 0.09us (or 90ns) for a 1-way IPC; a cheap system call on my laptop (using https://raw.githubusercontent.com/tsuna/contextswitch/master... because I'm lazy) is about 59ns, 2-way.

I'm misunderstanding something. I interpret "Latency is where µkernels have historically fallen down." as implying that µkernels have historically been worse than Linux. But, that table shows them historically being sub-microsecond.

I'm unable to find references, so I'm reconstructing this from memory and the paper (http://sigops.org/sosp/sosp13/papers/p133-elphinstone.pdf). I have never had any experiences with L4 or any of its relatives; all of mine were with Mach and its derivatives, along with reading about the Scout microkernel (http://www2.cs.arizona.edu/projects/scout/).

As I recall, the problem is that raw IPC costs are a red herring. It's possible to get the IPC costs almost arbitrarily small, if you're not actually toting any data or if you don't have memory protection domains separating the components (as in Scout).

If you are toting data around, such as reading or writing to a filesystem server, you have three options:

* Copy it. That's kind of expensive.

* Share it. Copy-on-write magic, for example. Unfortunately, that requires fiddling with the VM system, to set up a mapping between user-space and filesystem-server-space, for example. Fiddling with the VM system can be surprisingly expensive, too.

* Pre-establish a shared-memory buffer. This is what L4 does(?), if I'm reading section 3.2.2 correctly. It may be much better than the other options, I have no experience there.

(Excellent paper, by the way. I'm hoping to get to do something with L4 at some point; microkernels are neat and it seems like it doesn't suck.)

I guess "high uptime" wasn't on their priority list

I find the fastest editors I use (Sublime, Notepad++) to be very stable and have uptime of weeks. On the other hand, the Electron nonsenses have to be killed quite often to clear memory and bugs.

I was talking about the website, I guess
