- It reads all events because it's meant to interface with your keyboard directly, for global hotkeys, macros, disabling or remapping keys, etc. Interacting with windows and per-application hotkeys are explicitly not goals of the library at the moment.
- It reads /dev/input because the library was developed to work in as many environments as possible, including headless installations like Raspberry Pis that may not have a graphical environment or even a monitor. There's an open change to _try_ to communicate with the X server first and fall back to /dev/input, but event suppression is not working reliably with this yet.
- It could read /dev/input by just being in the `input` group, but then `dumpkeys` doesn't work and you are stuck typing numeric scan codes instead of key names.
Due to a series of unfortunate personal circumstances I've been unable to give the proper maintenance the library requires, but the issues and thank yous have never stopped, and I'm slowly getting back to it again.
Little known trick: if you run the library as a standalone module (`python -m keyboard`), it prints a JSON object for each detected event. You can save them to a file and pipe them back (`python -m keyboard < events.txt`) to replay them like a macro.
Little known trick 2: I also created `mouse`, the companion library for my second favorite peripheral. It's the same thing, but for mouse events.
I figured it'd be worth a shot to ask you this here, since I just caught you: have you considered adding some support for "rehooking"/"restarting" the library, so that callers can restart your library whenever they detect that their environment has changed (e.g. input devices have been (dis)connected, etc.)?
I've contributed with some info to this issue https://github.com/boppreh/keyboard/issues/264 as to how we're dealing with this in some of our projects. Funnily enough, I didn't know about `python -m keyboard`, and had I known about it, it would definitely have saved me some time!
Once again, thank you very much for your hard work!
I've been thinking about how to handle environment changes for a while, and it's pretty hard to reliably detect and respond to them in all supported systems.
But I really like the idea of a manual "re-init" method as a temporary measure. This should be trivial on Windows, and doable on Linux after some refactoring.
Thanks for the idea, I'll take a look tomorrow.
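A minimal stopgap for the detection half of that idea, assuming polling /dev/input is acceptable (the actual re-hook call would be whatever the library ends up exposing; nothing here is its real API):

```python
import os

def snapshot_devices(path="/dev/input"):
    """Take a snapshot of the device nodes currently present."""
    return frozenset(os.listdir(path))

def needs_rehook(previous, current):
    """True when devices were (dis)connected since the last snapshot,
    i.e. when the caller should tear down and re-create its hooks."""
    return previous != current
```

The caller would loop, take periodic snapshots, and trigger its own re-init whenever `needs_rehook` fires.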
This is basically the long and the short of it; it's a very old API, and retrofitting key-up detection reliably is a serious problem.
Perhaps a better distinction is "data stream of characters" vs. "data stream that can express key events (and perhaps other stuff as well)".
Even on Linux the low level kernel API for getting keypress events works this way.
Do you know of any source documenting which terminal emulators support this feature other than kitty?
Basically Kovid Goyal, the author of kitty, volunteered to write a spec, based on the work he did for kitty, but got fed up with what he perceived as pointless bikeshedding. The authors of other terminals represented in the terminal-wg (e.g. iterm2, mintty and vte -- which powers terminator, gnome-terminal and others) kept going for a bit, but there is no shared spec yet and I haven't checked if anyone else other than Goyal has implemented something at this point (I thought vte had, but I didn't immediately find it when I looked). It also looks like it might take some time till some de-facto standard emerges.
And you don't need root if your program's user is in the input group on most distros. You're opening a device node in /dev/input/XXX, the standard UNIX permissions model applies...
(That said, I wouldn't be surprised if debouncing were done in the driver. I just think it's the wrong place for it.)
It's complicated. Being in the input group is part of it, but to actually get the access you need to interface with the freedesktop-derived multiseat stack, which is as clunky as everything else that comes from the fd.o folks.
Linux exposes your keyboards as /dev/input/eventN devices and (with the exception of oddball devices or drivers without the hardware capability to detect them) every single one of them presents clean down/up pairs on every press, every time. If you want to read the device state, read it from the kernel.
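A minimal sketch of reading those down/up pairs straight from an event node (assumes the 64-bit Linux `struct input_event` layout; the device path is just an example, and opening it needs the usual /dev/input permissions):

```python
import struct

# struct input_event on 64-bit Linux: struct timeval (two longs),
# then unsigned short type, unsigned short code, int value.
EVENT_FORMAT = "llHHI"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)
EV_KEY = 0x01  # key press/release events

def parse_event(raw):
    """Decode one input_event record into (type, code, value)."""
    _sec, _usec, etype, code, value = struct.unpack(EVENT_FORMAT, raw)
    return etype, code, value

def key_events(device="/dev/input/event0"):
    """Yield ('down'|'up', keycode) pairs; value 1 = press, 0 = release.

    Value 2 (autorepeat) is deliberately filtered out.
    """
    with open(device, "rb") as f:
        while True:
            etype, code, value = parse_event(f.read(EVENT_SIZE))
            if etype == EV_KEY and value in (0, 1):
                yield ("down" if value else "up"), code
```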
The problem is that the author doesn't want to detect the device state. The author wants to write a program to run at the command line. And the command line is an environment descended from a long line of tty environments going back to the original teletypes of the 1960's. And those devices were never designed to expose a "keyboard" as a "device". They were a source of "characters" only.
Now, over time countless people writing countless terminal emulators have tried to address this shortcoming, in countless not-quite compatible ways. And that's been about as successful as you'd expect. You can do it, but...
But again, it's not about device support. Your command line program isn't and has never been connected to a "keyboard" device via any useful abstraction. The same feature that allows you to pipe input into it instead of typing at it makes this problem hard.
It's perfectly reasonable to set up a pipe that sends binary data through SSH:
cat somefile.bin | ssh someuser@somehost "cat - > /tmp/somefile.bin"
For the author's original example you could write `read_raw_input.py` and then pipe it through ssh to your controlling program on the other side say, `receive_input.py`:
/usr/bin/read_raw_input.py | ssh someuser@somehost "receive_input.py"
On the other hand, if I were to do this I'd write my Python code so that it communicates with the other end directly rather than rely on the SSH daemon, but that's just me (even if it was using SSH/paramiko internally).
If it's something that truly requires that sort of experience to work properly, ok, but if it's just a sort of "I want a fancy text UI that requires key-up events", this process is a bit much to ask of your users.
What you really want is for SSH to implement support for a remote keyboard, so that the application on the other end can simply read local-machine key events, and this is as far away from "upgrade the old way" as possible.
The interfaces for advanced UIs are Wayland and X. TUIs are meant for compatibility, not fancy abilities.
> TUIs are meant for compatibility
Nonsense. Our machines don't even get GUIs installed.
Otherwise it wouldn't work on the kernel console (the only thing available without a GUI installed), over SSH, over serial links, even still on dedicated terminals and teletypes.
If some tooling doesn't work for us (which means over a terminal), then we don't consider it. If something doesn't expose functionality over a terminal, that functionality doesn't exist.
> If some tooling doesn't work for us (which means over a terminal), then we don't consider it.
Who is "we"? Even the most die-hard TUI fan uses them through a GUI terminal emulator, making that argument fall entirely apart.
Gvim works for visual editing
Yeah, but those are only readable by root.
It's as sane as asking why you can detect joystick movements or the kettle boiling.
It bothers me how little people understand about the terminal and "the command line", despite using them every day.
If people read man bash from top to bottom they would probably get a feel for what is going on. Learn how changing the title of your tty works and there is not much left to _not_ understand.
>Something worth noting to avoid confusion is that if you run the example python keyboard example in the introductory paragraph of this article over an SSH connection, the code will still work and run, but when you type characters, nothing will happen.
>What gives?!? That's because it will be detecting 'local' keyboard events from the machine you just SSHed into! If it's a cloud server like EC2 or something, there probably aren't any keyboards attached to it!
"What gives?!?"? Is this supposed to be a joke? What else did you expect? Expecting your local keyboard to connect to the remote machine is as insane as expecting your keyboard to read the keys you want to press from your mind.
The problem he is running into is that SSH only forwards a character stream to the remote machine and that is entirely an SSH problem and the solution to forwarding the keyboard has been X forwarding. What he is asking is that SSH should be able to forward the keyboard without X.
He even mentioned a workaround that involves compensating for the lack of this feature by forwarding the keyboard himself.
>Well, you have to build your own client/server application where a client/server listens locally on the machine with the keyboard, and then forwards these events to the remote machine where you're running the applications that needs to respond to these events.
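A minimal sketch of that workaround: forward key events as newline-delimited JSON over a plain socket. The event format here is made up for illustration; a real implementation would feed in events from whatever local hook it uses.

```python
import json
import socket

def send_events(events, sock):
    """Serialize events as newline-delimited JSON and send them."""
    for ev in events:
        sock.sendall((json.dumps(ev) + "\n").encode())

def recv_events(sock):
    """Yield events parsed back out of the newline-delimited stream."""
    buf = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # peer closed the connection
            break
        buf += chunk
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            yield json.loads(line)
```

The sender side runs on the machine with the keyboard; the receiver runs next to the application that needs the events.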
My suggestion is that he should send an email to the openssh developers to add keyboard forwarding without X.
This isn't really a good experiment, as it assumes that everything that can be sent over SSH will always be sent. This is not the case in the terminal world, where by default you get a processed character stream and have to explicitly enable anything else. For example, kitty's keyboard protocol mentioned elsewhere in this thread needs to be activated by sending the escape sequence CSI > 1 u to the terminal on the other side of the SSH connection.
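For reference, a sketch of building those sequences in Python (per the kitty protocol: CSI > flags u pushes a set of flags, CSI < u pops them; whether the terminal on the other end honors them is exactly the compatibility problem being discussed):

```python
CSI = "\x1b["  # Control Sequence Introducer

def enable_kitty_keys(flags=1):
    """Push keyboard-protocol flags (1 = disambiguate escape codes)."""
    return f"{CSI}>{flags}u"

def disable_kitty_keys():
    """Pop the previously pushed flags, restoring normal key handling."""
    return f"{CSI}<u"

# Typical use: write enable_kitty_keys() to the terminal before reading
# raw input, and ALWAYS write disable_kitty_keys() on exit, or the
# terminal is left stuck in the enhanced mode.
```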
This isn't that much work if you write it as a trivial Web app using WebSockets. Then you can leverage the browser's OS abstraction layer instead of having to write your own.
Put differently, they're making the transition from Linux as a commodity used to build the thing they're dreaming of to Linux as a critical design component.
For example, he calls out in the beginning:
> They are interested in performing some real-time based task that is controlled using keyboard presses. In my case, the goal was to remotely navigate a robot over an SSH connection using the 'w', 'a', 's', 'd' keys.
As the author discovers, that's fundamentally not how SSH works. This sort of behavior could be achieved using other mechanisms, but it's not even really an issue of the tty/pty. They're just trying to map a functional model from a different operating system to Linux.
As the discussion of keyboard handling moves into Python code, I would have, again, gone a different path.
When the author took a turn into Python, my first thought was that this is trivial for evdev to handle (https://en.m.wikipedia.org/wiki/Evdev).
Next thing I know they touch on the kernel mechanisms managed by evdev... and pivot across to X11 (which would seem to make sense until one realizes that the transition from X11 to Wayland is far further along than a layperson might imagine).
In the end, a good write-up which shows a lot of "raw power" on the part of the author. With some additional tutelage, exploration, or guidance they could really take their understanding to the next level.
For the folks who are in the weeds (like me) it is a good guided tour. Seeing into the "beginners mind" (and taking it to heart) can provide perspective as to how to make software more intuitive.
(Typed and butchered from my phone).
It can grab joystick/gamepad events, though, since it reads those directly. In that case you have to enumerate and then specify which gamepad/joystick device to use so it can open the correct /dev/ path.
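A sketch of that enumeration step on Linux (the /dev and /proc locations are the standard ones, but treat the details as illustrative):

```python
import glob
import re

def list_gamepads():
    """Enumerate joystick nodes; on Linux they appear as /dev/input/js0, js1, ..."""
    return sorted(glob.glob("/dev/input/js*"))

def parse_device_names(devices_text):
    """Pull human-readable names out of /proc/bus/input/devices content,
    so a user can pick a device by name instead of by node path."""
    return re.findall(r'N: Name="([^"]*)"', devices_text)
```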
I have a planned change to use this as the default, and fall back to /dev/input on non-X systems. It's not quite there yet, especially the capability of suppressing key events.
Carl Clover (Blazblue character) attacks on keydown, while Carl's doll attacks on key-up. Expert players change the rhythms of their key-down / key-up to make combos possible.
In general, these kinds of fighting game characters are called 'Negative Edge' characters. There's a large number of them in many different fighting games. I know it's present in Street Fighter, Marvel vs Capcom, Mortal Kombat, and Injustice. Even SSB:U (shield-release / perfect shields) is a negative-edge event, showing just how common negative edges have become in modern games.
I personally am only really good at Blazblue: and thus, I know that Carl Clover, and to a lesser extent Taokaka and Lichi, are negative-edge characters. It's an entire "character design" philosophy, to make certain characters feel much different from others. But it's all over the place.
I HATE playing negative edge characters. If I realize a character has negative-edges, I run the heck away from them. Nonetheless, I accept that fighting games are fun because there's "always a character" that matches someone's personality.
If someone else has fun playing negative edge characters, I want to welcome them into the community, and therefore want to support negative-edge gameplay in a game. It's not about "me", it's about "the community of players".
All of Carl's own moves are on key-down.
So think of what your fingers must do to consistently pull off combos like: https://youtu.be/D8gPPB9YD6s?t=65
You get the benefit of playing two characters vs. the opponent's one character, but it means having to think about both characters (as well as having a wonky control scheme for the 2nd character). Furthermore, Carl is designed to play with the doll, so you only deal the same damage as everyone else if you successfully pull off these combos.
But having two characters means you can set up unblockables more easily (doll hits high, Carl hits low), or weird pressure strings / frame traps that are unavailable to most typical characters. Which is where Carl's unique advantage really comes in. So it's not a "damage" thing, it's more about the mind-games you play with the opponent when it's 2-vs-1.
Terminology: A, B, C, and D are the four attack buttons in the game. Numbers represent which direction you push (2A means A-button while holding "Down". 2 is down on the numpad. 8 is up, 6 is right, 4 is left). Every fighting game community has invented their own terminology. But knowing this should give you enough information to watch that combo-tutorial adequately.
A, B, C map to Carl, D maps to the doll. Holding D means that the doll starts to move left-right (4 or 6 with your left hand), letting go of D means that the doll attacks (negative edging). A, B, and C on Carl play as a typical fighting game character.
Carl is... not easy... to play.
My point: negative-edge gameplay is common in fighting games (and maybe other video games). It's reasonable to expect that a video game designer would use that technique.
I've played terminal action games before. Let's say you're programming one and have some useful function mapped to the negative edge. How do you implement that in the terminal?
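Given a source of key-up events (e.g. evdev, which a plain tty does not provide; the event tuples here are illustrative), the negative-edge logic itself is simple to express. The hard part is that a character-stream terminal never delivers the release:

```python
def negative_edge_actions(events, key):
    """Given (key, is_down) events, yield an action on each release of `key`.

    This is the 'negative edge': the move comes out on key-up, not key-down.
    """
    held = False
    for k, is_down in events:
        if k != key:
            continue
        if is_down:
            held = True
        elif held:  # release of a key we saw go down
            held = False
            yield "attack"
```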
Megaman's charge shot (and most "charge shot" video games, including Samus / Metroid, Rocket Knight Adventures, etc. etc.)
Maybe those platformers are more common? Either way, the negative edge is all over video games; it's a good control scheme.
The Windows terminal has supported key-up events... for as long as I can remember, anyway. I think a number of Windows programmers who are looking at Linux's command-line API wonder why it's so hard to get something that's common on Windows.
>In my case, the goal was to remotely navigate a robot over an SSH connection using the 'w', 'a', 's', 'd' keys. Real-time tasks like this require extremely high responsiveness to key events for palatable performance.
Yeah, I don't think a SSH connection is the right tool either, but there's the answer (and basically the entire article follows from that).
Is this same case with Wayland
The process is running as root and reading directly from /proc; the article acknowledges that it's essentially a keylogger.
So the compositor chooses which surface to deliver events to based on its own desires (like letting pointer focus enter background apps) and the user's input. I think there is a protocol (used by Xwayland?) to allow a client to get events from any window if the compositor/user allows it.
Why would you ever let a non-focused application subscribe to any key combinations?
* Media playback controls
* Hotkeys to take screenshots or record videos
* Quake style terminals
* Ingame chat and game invites
* Color pickers
* Screen controls like F.lux
* Window management hotkeys
* Push-to-Talk in group voice calls
And that's just what I personally use on my own machine.
It's a massive security issue that any installed software can listen to any input activity or view/affect other windows, and that does need to be reined in. But to claim that there's no legitimate utility to global hotkeys is absurd. We need robust permissions, not a completely crippled experience.
* What key events I care about. (I.E. what the keybind should be named in the settings UI)
* What script or message to pass back to me when that event is activated.
* What the default keycombo should be, if any.
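A sketch of what that DE-owned registration API could look like, with the three pieces above: a named action, a callback, and a default combo that is only a suggestion. Every name here is hypothetical:

```python
class HotkeyRegistry:
    """Hypothetical DE-side registry: apps declare named actions,
    the DE owns all key-combo assignment and dispatch."""

    def __init__(self):
        self._actions = {}   # action name -> callback
        self._bindings = {}  # key combo -> action name

    def register(self, name, callback, default_combo=None):
        """An app registers an action; the default combo is only a suggestion
        and never overrides a combo the user already assigned."""
        self._actions[name] = callback
        if default_combo and default_combo not in self._bindings:
            self._bindings[default_combo] = name

    def rebind(self, combo, name):
        """The user, through the DE's own settings UI, can override bindings."""
        self._bindings[combo] = name

    def dispatch(self, combo):
        """Called by the DE when it sees the combo; apps never see raw keys."""
        name = self._bindings.get(combo)
        if name:
            self._actions[name]()
            return True
        return False
```

The key property is that the application only ever receives "your action fired", never the global key stream.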
Providing a CLI interface for automation or binding a hotkey is IMO more powerful and useful, but it's not discoverable. Doubly so for controlling it over D-Bus.
The logical conclusion, given that there is both a substantial use case and a desire to limit applications' access to global input, is that a permission system ought to have been built into Wayland, such that applications could request not only global access to the keyboard but also permission to be notified when a particular key press happened.
Since this feature was a staple of desktop operating systems for decades it ought to have been part of the plan from the start.
The current model where everything running can listen to everything you do is not acceptable, but that doesn't mean there's no usecases that strongly benefit from global hotkeys.
Looks to me like something the DE should manage for you, not the application.
But everyone seems to disagree with me!
The answer to both these questions is: an application. Think of applications hooking into global shortcuts as plugins for your DE.
> But everyone seems to disagree with me!
The original post you responded to was suggesting that your DE, or at least something outside of your application, provide an API to register specific global hotkeys that you can listen for. In response, you said:
> Why would you ever let a non-focused application subscribe to any key combinations?
So to me when reading this thread, it looks like you're the one disagreeing with this idea. This is the first post of yours I've seen where you suggest this, and the thread begins with you shooting down the entire concept as ridiculous.
Unless you're suggesting that the DE comes with a preconfigured set of global hotkeys and cannot be altered or extended, nor handled differently by different applications. In which case yes, I strongly disagree with that.
There's a reasonable middleground between a free-for-all and nothing but what the DE already thought of, and that's a well-defined API boundary alongside per-application, per-feature permissions. Bonus points if the DE handles all key-combination assignment in a consistent UI and applications can only register a suggested keycombo, a context for when the hotkey should be activated, and a function to respond to the event.
Yes that's what I'm suggesting. Works fine for almost all use-cases like on macOS.
I guess it's just not a popular opinion lol!
I'm not sure what your hangup here actually is.
> > Unless you're suggesting that the DE comes with a preconfigured set of global hotkeys and cannot be altered or extended
> Yes that's what I'm suggesting.
Okay, let's use the example of music playback. If your DE handles music playback controls, how does it tell the music player when to stop and start, or skip to next track?
Does it hardcode a list of music players, each of which provides its own bespoke API, and the DE calls into the application?
Or does it provide a way for any music player to call into the DE, and listen for those play/pause/skip events?
And if it does that, then why is the list of events hardcoded? Why not allow the music player to say "I also play podcasts, and I want to provide a hotkey to speed up/slow down the playback speed, or provide a separate hotkey to skip ahead 30 seconds that's distinct from skipping the entire episode"? Why not allow Discord to provide a push-to-talk hotkey so that people on a group call can actually hear each other without dogs barking and keys clacking and people coughing?
Look at the VSCode Extension API for instance:
* Extensions are at least partially isolated from the main process and each other. I don't know how far this goes or how secure, but for our purposes they could be as isolated as you want.
* They can register a command which when activated, performs some action from within the extension.
* Those commands show up in the command list at the top of the screen when you hit "Control + Shift + P" (which is itself a command subject to all the same rules) and users can select whichever command they want to run.
* Those commands can also be assigned to a hotkey.
* Extensions can suggest a default binding for this hotkey, but users can go into the VSCode settings page and assign whatever keycombo they desire.
* Users can remove keybindings for commands that have them by default.
* Users can assign keybindings to commands that don't have them by default.
* All keybind editing is performed through the same UI, which is owned by VSCode and not the extension.
* VSCode itself is the thing listening to those key events, and it calls into the extension to activate it.
* There is a concept of context for when these keybinds apply, such as when a markdown document is active in the editor. Extensions provide these contexts by default and users can override them as they choose.
* Extensions themselves don't even start up until an activation event is met. These can be opening a file of a certain language, running a command, etc...
What about this model is objectionable in your mind? Who benefits from the lockdown you propose where no new ideas can be tried until the DE authority deigns to allow it?
This problem has been solved in the DEs for quite some time now; global hotkeys are probably the very last thing you want to reach for, when all other options are exhausted.
And when that happens, I don't know about other DEs, but the way you're describing with extensions is mostly how GNOME Wayland already works. If you want to intercept a key, you ship an extension for that, and display a UI to rebind the keys, or you can place additional entries in the system's keyboard settings panel. What you can't do is have random unprivileged processes intercept keys without the user's permission.
It sort of works in the mobile space (+/- accessibility services, which are the way to get that functionality if you need it), but that's only because mobile devices are consumption-oriented; there's only so much you can do with them, and they eschew utility in order to streamline you into being monetized by third-party services.
There is also more to this, showing the Wayland security model is incoherent. If debugging is enabled on your system I can simply debug your root terminal and start injecting commands into memory.
In Wayland the situation is also not so much different from Windows, you would just reconfigure and extend the DE itself. You are already relying on them to provide most of the UI to an extent.
For your latter point, Wayland is much different. In Windows there are ways to inject behavior into other programs or the UI elements. This exists so you can change the UI. Wayland's idea of security is very much against that, at least in the way it can be done on Windows/X11.
Input injection is being worked on in a different library, that brings a standard API that's supposed to work the same across Wayland, X11, and with sandboxed contexts: https://gitlab.freedesktop.org/libinput/libei
I know this is a rare use case, but the DE and the app need to work together to allow global shortcuts. I, for example, have a keyboard with many useless media keys that I configured to run some specific scripts, and some I could very easily set up as global shortcuts for some KDE apps. Not sure if other DEs would offer to set up global shortcuts for you.
Probably something most users will use is screen recordings and screen readers; these applications need to have global shortcuts (and access to the screen and window elements).
My success rate for finishing Hyper Hexagonest is falling; being able to inspect recent attempts might reveal where my timing is drifting.
(it's only ~25%-ish finished, unfortunately)
Alternative shortcut managers might also run into trouble (say, clipboard managers with customizable shortcuts) although those could work with a simple permission prompt for each new shortcut. Very annoying and obtrusive to the user, but the security principles remain.
There are also tools that extend some window managers with i3-like shortcuts and configurations; the config files are parsed on the fly and they need to work in the background.
Then there are tools like AutoHotkey that can be great tools for productivity through scripting and custom shortcuts. A lot of AHK's functionality can be replicated using the standard hotkey API, but not everything.
There's also diagnostic tools (see Windows Steps Recorder) that record all keystrokes and generate a step by step report about what happened with screenshots as a guide.
IMO global key capture should still be possible with the right capabilities set because there are valid use cases for it. Requiring additional permissions is fine IMO, but completely removing the option to do this is a pain.
Example: I've bound several scenes in OBS to keyboard shortcuts so that I can switch scenes no matter what application I'm in. I'm pretty sure OBS does this by listening to all keyboard events on the root window. It's not very efficient, but it works.
Now imagine having to support a global shortcut daemon per DE that exists. I suppose someone would write an abstraction, but really, it's something that FreeDesktop.org should provide (if it doesn't already).
Most of the examples are either global listeners or outside of a window environment.