
Why Is It So Hard to Detect Keyup Events on Linux? - robertelder
http://blog.robertelder.org/detect-keyup-event-linux-terminal/
======
cryptonector
I mean, that there's no key up event in tty context goes back to the beginning
of time, and is not Linux's fault, and not even Unix's (nor VMS's) fault, or
anyone's fault, because this all works the way it does because of how
terminals worked, and they worked the way they did because it was _simple_.

If, in the late 70s, key up events in tty context had been important, then
terminal vendors would have developed an escape sequence system for expressing
those. But it wasn't, so they didn't.

So, yes, it's absolutely impossible to get key up events in tty context, and
at this point it will almost certainly stay that way forever, as it's too late
to retrofit terminal emulators, drivers, and applications to understand
whatever protocol for communicating key up events. (The right way to do this
would be to develop an escape sequence protocol that the tty/pty drivers could
decode and turn events into out-of-band events to be delivered via ioctl()s,
that way applications that don't care about key up events don't see them and
don't need to be modified. But who is going to do all that work?)
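
To make that decoding step concrete, here is a minimal Python sketch. The escape sequence format (ESC [ keycode ; state _k) and the name split_stream are invented purely for illustration; no terminal emits this and no tty/pty driver understands it. It only shows the idea of pulling key events out of the byte stream so that an unmodified application never sees them.

```python
import re

# Hypothetical wire format, invented for this sketch only:
#   ESC [ <keycode> ; <state> _k    where state is 1 (down) or 0 (up)
KEY_EVENT = re.compile(rb"\x1b\[(\d+);([01])_k")


def split_stream(data: bytes):
    """Separate in-band text from hypothetical key events.

    Returns (text, events): text is what an unmodified application
    would read; events is a list of (keycode, is_down) tuples that a
    driver could deliver out of band.
    """
    events = [(int(m.group(1)), m.group(2) == b"1")
              for m in KEY_EVENT.finditer(data)]
    text = KEY_EVENT.sub(b"", data)
    return text, events


text, events = split_stream(b"w\x1b[17;1_ka\x1b[17;0_k")
print(text)    # b'wa'
print(events)  # [(17, True), (17, False)]
```

A real decoder would also have to cope with a sequence split across two reads, which this sketch ignores.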

~~~
gameswithgo
this kind of highlights one of the differences between FOSS and paid software.
“not our fault” “too late to fix it”

versus a manager forcing you to fix the damn problem nobody cares whose fault
it is.

of course this difference is not innate, just a tendency.

~~~
cryptonector
Eh? No, no one can fix this. Oracle couldn't fix this in Solaris. Microsoft
can't fix this either (in Windows, in Linux, not at all).

It's not a question of money, or managerial will-power. You're asking to boil
a great lake, and if you have the money for it, you will almost certainly find
better uses for it. There is no market demand for a "fix" here, so there
cannot be any managerial will-power either.

Nor is this due to Unix being open source back in the 70s, because it simply
wasn't open source (and save for the erstwhile OpenSolaris, it still isn't).
It wasn't entirely a research vehicle either, as AT&T used Unix as the OS for
its phone switches. VMS too could not have had this feature, and for the same
reasons -- though I'm not familiar enough with VMS, I feel absolutely certain
that VMS did not have this either. And the hardware terminals (they were _all_
hardware terminals back then, as there were no graphical terminals in which to
emulate a terminal) were decidedly not open-source, and their vendors were in
it for the profit.

This, and all the wonky, hacky things to do with terminals, goes back to the
way things were almost half a century ago, long long before "FOSS" even came
to be.

Software business models have absolutely nothing to do with why you can't have
key up events in ttys.

EDIT: And incidentally, windowing systems happened on the scene around
1984-1985. By 1987 even SVR2 had a windowing system in the AT&T Unix PC. But
it was too late to retrofit key up events into ttys -- already by then there
was over a decade of legacy, and hardware terminals continued to be a thing
for some time yet. This is a crucial lesson in engineering: even relatively
small amounts of legacy have profound effects on what is feasible, so often
you have to get things right in the very beginning (precisely when it's
hardest to get things right!).

~~~
zamalek
> No, no one can fix this.

The nice thing about escape sequences is that they are ignored if not
understood.

> Microsoft can't fix this either (in Windows

Because it's not an issue in Windows? The Windows console, as primitive as it
is, supports monitoring the status of the keyboard (which can be used to infer
key up events).

~~~
cryptonector
MSFT is adding a pty driver and support for Unix-style applications.

------
dooglius
The OS is abstracting away details about where the characters come from, which
is a keyboard in this case. The characters could also come from other sources,
such as a file, or a speech-to-text program getting input from a microphone.
Key-up events make no sense for these sources. Essentially, the author wants
to go down a layer of abstraction to use his keyboard not as a character input
device but as a grid of buttons. By default, this will require special
permissions on most linux distros (and rightly so, as it allows for
keylogging), but this is a matter of changing one's udev configuration; root
is not inherently required.

In any case, the stated goal "to remotely navigate a robot over an SSH
connection using the 'w', 'a', 's', 'd' keys" is misguided to begin with; what
happens when your connection drops and the robot can't be stopped?

Addendum: has the author thought about the case where the user is using a
keyboard layout where the WASD keys are not together, or where the user is
using a non-latin-alphabet keyboard? As someone who uses a dvorak-based
layout, I am annoyed at how often developers screw the key/character
distinction up and assume everyone uses qwerty.

------
egwynn
The terminal is the wrong tool for this job. I believe the author realizes
this in their exploration of the topic, but I think it still bears saying
explicitly. This task is difficult not because of some big design mess-up, but
because this use-case is well outside the design constraints of the technology
he turned to first.

EDIT: I’d also like to mention that “Linux” has nothing to do with this. One
would face the same issues using a Windows SSH client connecting to a Solaris
SSH server.

~~~
empath75
Right. If you want to control something in real time, use a real time
protocol.

~~~
egwynn
Honestly I probably shouldn’t have mentioned SSH, because it also doesn’t
really deserve any blame here. People can / do tunnel real-time traffic
through SSH without any problem. The real root problem is wanting to get so
much detailed information from a tty-like interface. In this case it just
happens to be via SSH. The author would have the same issue with telnet, rsh,
etc.

~~~
zaphar
He could also totally still use SSH for this. He just has to tunnel something
else over it instead of using a TTY. SSH itself is somewhat agnostic to
whether you use a TTY or not.
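
As a rough sketch of "something else", assume the sender already has key down/up events from a local source (a GUI toolkit or an input device) and pipes them into a receiver started with something like `ssh -T host receiver`. The message layout and the function names below are made up for illustration.

```python
import json


def encode_event(key: str, down: bool, seq: int) -> bytes:
    """Serialize one key event as a single JSON line."""
    msg = {"seq": seq, "key": key, "down": down}
    return (json.dumps(msg) + "\n").encode("utf-8")


def decode_events(chunk: bytes):
    """Parse a buffer of complete JSON lines back into dicts."""
    return [json.loads(line) for line in chunk.splitlines() if line.strip()]


wire = encode_event("w", True, 1) + encode_event("w", False, 2)
for event in decode_events(wire):
    print(event["seq"], event["key"], "down" if event["down"] else "up")
```

Because the events travel as plain bytes on stdin/stdout, no TTY is involved on either end and the key up information survives the trip.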

------
vesinisa
That raw keyboard events are not delivered through SSH connection is entirely
expected. At its core, it is a text-only communication protocol. At the
discretion of the client terminal, there could be an ANSI escape to enter mode
where raw key events are delivered, akin to unbuffered input. But that is
nevertheless beyond the scope of what SSH offers.

~~~
kurthr
Yes, delivering timed key events would be a security issue.

~~~
mehrdadn
Huh, why?

~~~
chrisseaton
If each interactive key stroke is sent in its own packet, then you may be able
to guess which keys the user is pressing based on just the timing of packets
coming in, and so guess what they are typing without decrypting the stream.

[https://www.usenix.org/legacy/events/sec01/full_papers/song/...](https://www.usenix.org/legacy/events/sec01/full_papers/song/song.pdf)

~~~
mehrdadn
But what does timestamping have to do with sending each thing in its own
packet? Did you mean something else by 'timed'?

~~~
noselasd
You, as a malicious third party, capture the encrypted packets between a client
and a server. You can't decrypt those packets.

However if each keystroke is sent in its own packet, you note the timestamp of
the packets and then you can infer which keys the user has pressed from the
difference in time between the packets.
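
A tiny sketch of what the observer actually gets to work with, using made-up capture times (the function name is hypothetical too):

```python
def inter_arrival_ms(timestamps):
    """Gaps between consecutive packet timestamps (seconds in, ms out)."""
    return [round((b - a) * 1000) for a, b in zip(timestamps, timestamps[1:])]


# Made-up capture times for five one-keystroke packets.
captured = [0.000, 0.120, 0.310, 0.395, 0.640]
print(inter_arrival_ms(captured))  # [120, 190, 85, 245]
```

The payloads stay encrypted; only these gaps leak, and they are the input to the kind of analysis the linked paper describes.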

~~~
mehrdadn
I get that, it's just not what I originally read the comment to mean. It
sounds like you guys say "timed" to mean "sent in real-time", whereas I read
it to mean "includes timestamps", that's why I was confused. (Because in my
mind I think of things like "timed automata" which are automata accompanied
with time variables, not automata that execute in real-time.) But otherwise
yeah, that makes sense.

------
nerdponx
This weirdly seems like the "right" implementation to me. Somehow I feel like
a TTY generally doesn't need or deserve to know when keys are pressed and
released.

That said, is there a more end-to-end summary out there of how keyboard input
is handled in GNU/Linux? I have the vague understanding that USB HID scancodes
are translated into keycodes, which are sent along to X applications or a TTY,
but where and how each step happens is still a bit mysterious to me.

~~~
amelius
> Somehow I feel like a TTY generally doesn't need or deserve to know when
> keys are pressed and released.

Why? What if I'm playing a game on that TTY?

~~~
cryptonector
Mainly because there exists no protocol by which the terminal (really,
terminal emulator) can indicate these events, and the tty/pty drivers don't
know how to decode that non-existent protocol to spare existing apps having to
be modified to understand it.

This all goes back to the 70s, when people first hooked up typewriters to
computers as terminals: everything was far too simple to make it possible to
add such a protocol, and nobody cared about key up events, and they weren't
going to care for a long time because the hardware and software all had to be
rather simple (exceedingly so by our standards today).

~~~
Dylan16807
You made a nice post but that's not the question being asked. The historical
record is different from the question of whether a terminal needs/deserves it
_this_ decade. ("need" obviously being non-literal)

~~~
cryptonector
The submission's title is "Why Is It So Hard to Detect Keyup Event on Linux?".
I knew the answer to that very specific question, so I offered it :)

As to whether we need to evolve this functionality today, I would hazard that
the answer is "no". I did sketch out how it would be done, if we really need
it and people want to implement it. But my guess is that there are too few
people with the desire and funding to work on this problem. Reading TFA today
is the first time I've ever heard of anyone wanting key up events in the tty
-a tty that has some 45+ years of history- so I'm a bit skeptical that this
will somehow become such a blindingly obviously highly desirable feature _now_
that it might get done.

~~~
Dylan16807
> The submission's title is "Why Is It So Hard to Detect Keyup Event on
> Linux?". I knew the answer to that very specific question, so I offered it
> :)

You didn't reply to the top level article, you replied to a specific comment,
which was asking a different question.

And "not enough desire to make the change now that we're so entrenched" is
still different from "does it deserve to know, if we were doing it right". And
I would say the latter is true. (Still ignoring "need" because "need" is
really vague.)

~~~
cryptonector
Fair enough. That post said:

> Why? What if I'm playing a game on that TTY?

So my answer was still on point.

~~~
Dylan16807
The predicate to that specific "Why" was "doesn't need or deserve to know",
not "is so hard to detect on linux".

------
adontz
ReadConsoleInput [https://docs.microsoft.com/en-us/windows/console/readconsole...](https://docs.microsoft.com/en-us/windows/console/readconsoleinput)

INPUT_RECORD [https://docs.microsoft.com/en-us/windows/console/input-recor...](https://docs.microsoft.com/en-us/windows/console/input-record-str)

KEY_EVENT_RECORD [https://docs.microsoft.com/en-us/windows/console/key-event-r...](https://docs.microsoft.com/en-us/windows/console/key-event-record-str)

    
    
        bKeyDown
        If the key is pressed, this member is TRUE. Otherwise, this member is FALSE (the key is released).
    

Also, please note that INPUT_RECORD contains a union of key, mouse, window
buffer size, menu and focus event records. I do not want to say the interface
is better thought out per se, but it is definitely richer.

~~~
int_19h
The reason why Windows has this stuff is because it never was particularly
supportive of remote terminals. Going back to DOS days, when writing a console
app, it was either the kind that only needed print, or else you basically had
full control over the screen area, changing individual characters, colors etc
in a random access way. The first approach would use DOS output functions that
allowed for things like stdout/stderr to work. The second approach couldn't be
redirected properly.

Windows inherited that model, and mostly just kept developing it until lately.

------
gnachman
This is functionality that the terminal emulator reasonably should provide to
enable games or other interactive applications. I believe that would solve the
author's complaints. Some work has been done along these lines in both Kitty
and iTerm2. Not everyone likes the idea because it breaks the basic
abstraction of a terminal. I kinda like it, though, and I'm optimistic that
the situation will improve in the coming years.

------
jchw
The title really should be "On a TTY" and not "On Linux" - the reality is,
it's not that hard. The TTY is a TTY - it's designed for typing, not general
input. You could always forward your input events through another channel,
even over SSH if you wanted.

------
jesuslop
IIRC /dev/input/event* gives you that
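
A minimal Python sketch of reading it, assuming a 64-bit Linux layout for struct input_event and a device node you have permission to open. The path /dev/input/event0 and the function names are placeholders; which event node is the keyboard varies per machine.

```python
import struct

# struct input_event on 64-bit Linux: struct timeval (two longs),
# then u16 type, u16 code, s32 value.
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)
EV_KEY = 0x01
KEY_STATES = {0: "up", 1: "down", 2: "repeat"}


def parse_event(raw: bytes):
    """Decode one input_event; return (code, state) for key events, else None."""
    _sec, _usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, raw)
    if ev_type != EV_KEY:
        return None
    return code, KEY_STATES.get(value, "unknown")


def watch(path="/dev/input/event0"):
    """Print key transitions from a device node (the path is an assumption)."""
    with open(path, "rb", buffering=0) as dev:
        while True:
            raw = dev.read(EVENT_SIZE)
            if len(raw) < EVENT_SIZE:
                break
            event = parse_event(raw)
            if event:
                print(*event)


# Synthetic event, so the parser can be tried without device access:
# KEY_W is code 17 in linux/input-event-codes.h, value 0 means release.
print(parse_event(struct.pack(EVENT_FORMAT, 0, 0, EV_KEY, 17, 0)))  # (17, 'up')
```

Note that the codes are key positions, not characters, so this sidesteps the terminal entirely and only works on the machine the keyboard is plugged into.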

------
solarkraft
Meta:

I'll again have to criticize this submission's title. It shouldn't be "Why Is
It So Hard to Detect Keyup Event on Linux?" (it's not a problem with the
kernel), it should be something along the lines of "Why can't I detect key up
events via SSH?".

And the answer to that is simple: That's not what it's designed for.

Or, even better: Instead of concentrating on the complaints part of the
article, focus on the part in which you're providing value to your readers:
"Detecting keyboard events without a display server" or, if you want to get in
on the long headlines trend: "Detecting key up events in a TTY environment is
hard. Here are some ways".

~~~
tinus_hn
No, the submission title is the title of the linked page. That is correct.

~~~
solarkraft
It's not uncommon to correct click-baity titles in a submission. By
"submission" I'm also referring to the original article.

~~~
tinus_hn
Your comment does not claim click bait. Your comment claims you disagree with
the blog title. The rule is: Don’t editorialize.

~~~
solarkraft
"... unless it is misleading or linkbait", which I indeed think to be the case
here. I clicked because I thought frameworks like Wayland and X or something
else inherent about the architecture of modern Linux systems made it difficult
to detect key up events, but it just turns out the author was trying to use
something that doesn't have any concept of key events, leading to the feeling
I was misled.

------
AnthonBerg
Would it work if the sender immediately started sending a fast stream of
repeating characters over? Then the keyup on the receiver is when the stream
stops.

------
zwetan
Very interesting, in a different context I had to tackle a pretty similar
problem with Redtamarin [0]

Traditionally under the CLI you will manage key input with a readline() command
or something similar to kbhit() and depending on your needs you'll use
getchar() then track if either a CR or LF is entered for the "end of command",
also EOF.

This is blocking, so nothing else can happen, and depending on how you do it,
you can only read single byte chars and not multibyte chars (like CJK input)

something like

    
    
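        // Loop until the user presses Enter.
        while( run )
        {
            i = kbhit(); // non-zero when a key is waiting

            if( i != 0 )
            {
                key = String.fromCharCode( getchar() );

                if( (key == "\n") || (key == "\r") )
                {
                    run = false; // CR or LF ends the command
                }
                else
                {
                    buffer += key;
                }

                i = 0;
            }
        }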
    

another way to do it

    
    
        while( run )
        {
            b = fgetc( stdin ); // one raw byte at a time

            if( b == EOF )
            {
                if( kbuffer.length == 0 )
                {
                    return null; // EOF before any input was read
                }

                run = false;
            }
            else if( b == LF )
            {
                run = false; // newline ends the command
            }
            else
            {
                kbuffer.writeByte( b );
            }
        }
    

which has two great advantages: it can read multibyte input (thanks to
fgetc(), which reads the raw byte) and it detects EOF (CTRL+D under POSIX,
CTRL+Z under WIN32). It still blocks forever, though. With fgetc() the
detection of EOF is done automatically for you (while getchar(), getc() etc.
do not detect that EOF)

now because Redtamarin is based on AVM2 and AS3, there is one part of the API
which tries to reimplement the Flash API with such things as KeyboardEvent
that should be able to be non-blocking but still work in a CLI environment

That KeyboardEvent should detect keyUp or keyDown but yeah it is hard to
detect and if you try to do that for multiple platforms like Windows / macOS /
Linux it gets nightmarish

In a little experiment [1] I found out you can do "stupid things" that
actually work, like spawning a child worker (AVM2 uses pthread), blocking on
the user input (like above) and then sending a message back to your main
worker so that it receives a "key event", all of which allows listening for
input asynchronously

But then, what about telling keyUp and keyDown apart? I decided to ignore
keyUp because in fact it does not really matter on the CLI, or at least I
don't see any use for it. For a GUI, yes, I can see the use cases, but for a
CLI? Not so much

Purely on the CLI (no X Server) you don't really listen for key events you
read the stdin stream, the only special events are signals like SIGINT SIGHUP
etc. or special kind of signals like EOF.

The other things you can alter are the buffering of that stdin, the raw
mode/cooked mode and echo off/on.
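
For what those knobs look like outside Redtamarin, here is a small Python sketch using the standard termios/tty modules on a POSIX system. It runs against a pseudo-terminal pair instead of a real keyboard so it can be tried anywhere; read_key is a hypothetical helper name.

```python
import os
import pty
import select
import termios
import tty


def read_key(fd, timeout=0.1):
    """Return one byte from fd if it arrives within timeout, else None."""
    ready, _, _ = select.select([fd], [], [], timeout)
    return os.read(fd, 1) if ready else None


# Demonstrate on a pseudo-terminal pair so no real keyboard is needed.
master, slave = pty.openpty()
saved = termios.tcgetattr(slave)
try:
    tty.setcbreak(slave)          # turn off line buffering (ICANON) and echo
    os.write(master, b"w")        # pretend the user pressed 'w', no Enter
    print(read_key(slave))        # b'w'
    print(read_key(slave))        # None: nothing further was typed
finally:
    termios.tcsetattr(slave, termios.TCSANOW, saved)
    os.close(master)
    os.close(slave)
```

Even in this mode the program only ever sees characters arriving; there is still nothing in the stream that says a key went up.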

So for a use case like navigating something using the 'w', 'a', 's', 'd' keys
you just need to go async to listen for chars input (which key is pressed) and
probably a mix of timeout on the last key pressed and a "diff" between the
"prev key" and "last key".

If last key pressed is 'w' then go up, if you keep receiving a 'w' key you
keep going up, if prev key was 'w' and the last key is different you change
direction.

And to detect when to stop going up if the last key pressed was 'w', you just
keep the time when this key was pressed; if 1 second has elapsed and no more
key events are received you stop going up. Ergo you don't need to detect
keyUp, but maybe I'm missing something.
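
A minimal sketch of that logic in Python (not Redtamarin code; the class and method names are made up), with timestamps passed in explicitly so it can be exercised without a terminal:

```python
class ReleaseByTimeout:
    """Treat a key as released when no repeat arrives within `timeout` seconds."""

    def __init__(self, timeout=1.0):
        self.timeout = timeout
        self.key = None
        self.last_seen = None

    def on_char(self, key, now):
        """Record a received character; return the key now considered held."""
        self.key = key
        self.last_seen = now
        return self.key

    def poll(self, now):
        """Return the held key, or None once the timeout has elapsed."""
        if self.key is not None and now - self.last_seen >= self.timeout:
            self.key = None
        return self.key


tracker = ReleaseByTimeout(timeout=1.0)
tracker.on_char("w", now=0.0)
tracker.on_char("w", now=0.5)      # auto-repeat keeps the key alive
print(tracker.poll(now=1.2))       # w   (0.7 s since the last 'w')
tracker.on_char("a", now=1.3)      # different key: change direction
print(tracker.poll(now=2.4))       # None (1.1 s of silence, so stop)
```

The timeout has to be longer than the keyboard's repeat delay, otherwise a held key looks released before its first repeat arrives.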

    
    
      [0]: https://github.com/Corsaair/redtamarin
      [1]: https://twitter.com/redtamarin/status/900794336031510530

------
zackmorris
This is quite possibly one of the best critiques of YAGNI that I've ever read.

I grew up in the waterfall (as opposed to agile) era of the 80s and 90s. Back
then, it was top priority to catch mistakes as early in development as
possible. This is the top article I could find on the concept, which seems to
stir a lot of debate:

[https://developers.slashdot.org/story/03/10/21/0141215/softw...](https://developers.slashdot.org/story/03/10/21/0141215/software-defects---do-late-bugs-really-cost-more)

The gist of it is that if a bug costs $1 to fix during development, it costs
$10 to fix during testing and $100 to fix once it's in production. Maybe
someone can find the original quote.

Had I been working on terminals in the 70s, I would have been the annoying
person in the back of the room who raised their hand and said "what about key
ups?" There would have been a lot of muttering, much debate about how to store
keymap bit arrays and what might happen if they get out of sync, every edge
case would be explored, and in the end my opinion would be noted somewhere and
character streams would have moved forward as the "simpler" implementation.

But it's not simpler, because perfectly valid use cases were excluded.
Basically, their decision meant that we couldn't have video games in the
terminal. Kind of a big deal, if you ask me.

After so many decades of this, it's hard for me to drink the Kool-Aid on new
frameworks, even if they're ones we use every day like C++ (operator
overloading was maybe a bad idea in hindsight), git (can't store folders
without .gitkeep), Angular (two way binding - oops), React (front end PHP),
even HTML/CSS/Javascript (difficulty encoding our own tags as
components/nondeterministic inheritance across browsers/mutability). These
frameworks are great, but it takes a certain level of suspension of disbelief
to buy into them.

Give me 5 minutes with any technology and I'll find the conceptual flaws and
bugs that impose major hurdles on its conceptual simplicity, utility and
reusability. Basically everything I touch breaks. It's like a knack that makes
me a good programmer, but also a Debbie Downer.

[https://www.youtube.com/watch?v=MZF6EK7x4Dk](https://www.youtube.com/watch?v=MZF6EK7x4Dk)

P.S. The workaround for the keyup thing is probably to set the key repeat
threshold and repeat delays to 0 and check for repeated keydown events each
main loop, setting the keymap entry to true if the key is still down (false
otherwise). It's not perfect because it can't easily detect multiple keys down
or modifier keys, but that was one of the ways we did it in classic Mac OS
anyway.

~~~
int_19h
What makes you believe those were valid use cases for those in charge back in
1970s? Just because some hobbyist wants to run a real time game on a terminal
worth thousands of dollars doesn't mean that others are going to accommodate
that.

The most likely reason why they never bothered with key-up is because they
were an evolution of the teletype, and inherited its protocols. That was the
use case, not a generic human interface.

But even if there was some explicit consideration of this idea, I'm pretty
sure it'd die almost immediately, because carrying information about
keystrokes would double the bandwidth of an average session, and we're talking
about the time when typical transmission speed was measured in baud, and
4-digit speeds were considered fast (the first terminal supported a maximum of
2400 baud). You need a much more profligate culture to add support for things
like that "just in case".

------
shmerl
What about Wayland?

~~~
mbrumlow
This is on a tty. So outside of X as well.

Both X and Wayland have key down and up events.

~~~
setpatchaddress
Yep. All GUI systems that I'm aware of have explicit keyDown/keyUp events.
They're essential for a number of things.

If the Unix console was being designed today, I expect it would also have
them, along with a sane system for reporting terminal capabilities and
probably a zillion other things.

It's too bad none of the console modernization projects have really gained
traction. The best we seem to be able to do is fish + UTF-8 + maybe Powerline.

