- Screenshot, screenshare, or remote desktop/VNC server without having to use different protocols for different compositors
- Use a third party app to control monitor outputs (resolution, orientation, framerate, layout, etc.) generally, for example arandr or xrandr (wlroots does support this)
- Be able to do something like KeepassXC's "Autotyping", where it fills in a text field with the password from your password manager.
- Have a third party app give a selection of open windows. I think all compositors have some API for this, but they are all different. (for example rofi's window switcher mode).
Generally, I think the problem isn't so much things that can't be done in wayland at all, as that the way to do them differs across compositors. With X it was pretty straightforward to mix and match apps between different DEs, and most apps worked on all of them. In wayland there are more apps that only work on gnome, only work on kde, or only work on wlroots based compositors, and some apps that duplicate parts of the code for all three.
For screenshots or screensharing, please use the xdg-desktop-portal API. That should work correctly in X11, Wayland, and within a sandbox. For generating fake inputs, a library is being worked on called libei, that also should work in X11, Wayland, and within a sandbox.
For controlling monitor outputs and open windows, those APIs are different because each compositor has a different set of features for what those mean. Generally you should not be using random applications to configure monitor outputs, only your compositor's configuration tool is going to be reliable there. Applications that need to set custom resolutions can use the viewporter or use fullscreen-shell.
Worst case scenario if you need to duplicate parts of the code for all three, I don't think that's a problem. Just put it in a library if it needs to be re-used, the only alternative there is that upstream maintains the relevant bits of that duplicate code, which I'm sure you can see why they would be reluctant to do that.
> For screenshots or screensharing, please use the xdg-desktop-portal API. That should work correctly in X11, Wayland, and within a sandbox.
This does not allow one to write a screenshot application; it is a protocol that allows Flatpak applications to request that the compositor take one, and as far as I know it only supports taking one of the entire screen.
Most compositors also have the ability to take such screenshots outside of Flatpak; this simply moves it inside of it.
The “screenshot issue” with Wayland is not that compositors can't take one; it's the lack of the a.p.i.'s necessary to write software that can take one.
> For generating fake inputs, a library is being worked on called libei, that also should work in X11, Wayland, and within a sandbox.
I could find nothing on this on the internet.
> For controlling monitor outputs and open windows, those APIs are different because each compositor has a different set of features for what those mean. Generally you should not be using random applications to configure monitor outputs, only your compositor's configuration tool is going to be reliable there. Applications that need to set custom resolutions can use the viewporter or use fullscreen-shell.
And that one “should not do this”, even though it is supported and used by many on X11, would be one of the missing features that lead to Wine developers not being interested in a Wayland port.
> Worst case scenario if you need to duplicate parts of the code for all three, I don't think that's a problem. Just put it in a library if it needs to be re-used, the only alternative there is that upstream maintains the relevant bits of that duplicate code, which I'm sure you can see why they would be reluctant to do that.
The problem is that these per compositor a.p.i.'s are unstable.
GNOME extensions and KWIN scripts work by allowing one to directly hack the internals of the compositor in unstable ways; they are much like kernel modules and they break from version to version.
What you are saying is not correct; the xdg-desktop-portal API is not tied to flatpak, it has support for individual windows, it's generic and can be re-used by any sandbox, and it can also be used outside a sandbox. (Snap I believe is using it and it should be not too hard to get it working in other tools like firejail either) Using it has the benefit that your application will also work inside a sandbox and doesn't need to have additional code paths for X11 and wayland and for the sandboxed case. See here for the description of libei, it's not ready yet but is being worked on, and should have a similar design that allows it to work correctly in a sandbox: https://who-t.blogspot.com/2020/08/libei-library-to-support-...
In almost all cases, Wine should use viewporter or fullscreen-shell to set custom resolutions for specific programs. The only reason you would need to give Wine access to an xrandr-like API is if you wanted to run your monitor configuration tool from Windows, which I strongly doubt people want to do. If you are using GNOME/KDE, the GNOME/KDE control panel is always going to be the most reliable way to configure that.
Yes, you would need to track upstream and keep up with their unstable APIs in order to write such a library, but that's no different than if you asked upstream to do it; they would just be doing that in their tree. (In some cases this is what they do anyway with the xdg-desktop-portal.) In any case you would need to be more specific about what it is you want, because just saying "implement everything from X11 exactly the way X11 does it" is not useful. That's never going to happen, so let's focus on what it actually is that is needed.
The supported options are whether it should be interactive, and whether the dialog should be modal. There is no option to indicate what kind of screenshot you want (window, monitor, full screen, rectangle). You can hope that the compositor's implementation lets the user select what they want, but the compositor is free to take the screenshot however it wants; the wlroots xdg-desktop implementation, for example, currently just always takes a screenshot of the whole screen. The API is oriented more towards GNOME's approach of considering the screenshot dialog to be the responsibility of the compositor, and doesn't work well for sway's approach of considering the screenshot dialog to be the application's responsibility.
The Screencast protocol at least allows you to specify if you want to capture a monitor or a window, although not all compositors support both (for example xdg-desktop-wlroots only supports the monitor type). Hopefully in the not-too-distant future most compositors will support it; atm, at least in my compositor of choice, it doesn't work yet.
And in both cases the xdg-desktop-portal API doesn't really work well with apps like Flameshot or Peek, where you determine what to capture by resizing the window of the app.
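To make the point about the limited options concrete, here is a rough Python sketch that only builds the `gdbus call` command line for the portal's Screenshot method, plus the source-type bit values that the ScreenCast `types` option uses. The helper names (`screenshot_call`, `gvariant_options`, `SourceType`) are made up for illustration, and I'm assuming the two options described above are the only ones that matter here. Note that the call itself would only return a request handle; the actual result comes back later on a Response signal, which this sketch doesn't handle.

```python
import enum
import shlex


class SourceType(enum.IntFlag):
    """Bit values for the ScreenCast portal's `types` option (illustrative name)."""
    MONITOR = 1
    WINDOW = 2
    VIRTUAL = 4


def gvariant_options(interactive: bool, modal: bool) -> str:
    """Render the two Screenshot options as GVariant text for an a{sv} dict."""
    def as_bool(value: bool) -> str:
        return "<true>" if value else "<false>"
    return "{'interactive': %s, 'modal': %s}" % (as_bool(interactive), as_bool(modal))


def screenshot_call(interactive: bool = True, modal: bool = False,
                    parent_window: str = "") -> list:
    """Build (but do not run) a gdbus invocation of the Screenshot portal.

    There is nowhere in the options to say "this window" or "this rectangle";
    that choice is left to whatever UI the compositor's backend shows.
    """
    return [
        "gdbus", "call", "--session",
        "--dest", "org.freedesktop.portal.Desktop",
        "--object-path", "/org/freedesktop/portal/desktop",
        "--method", "org.freedesktop.portal.Screenshot.Screenshot",
        parent_window,
        gvariant_options(interactive, modal),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, since it needs a session bus.
    print(shlex.join(screenshot_call()))
    # A screencast client that accepts either a monitor or a window would
    # pass this combined value as `types`.
    print(int(SourceType.MONITOR | SourceType.WINDOW))
```

So with screenshots the app can ask "interactive or not" and that's about it, while with screencasts it can at least express monitor vs. window, subject to what the backend actually advertises.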
> For screenshots or screensharing, please use the xdg-desktop-portal API. That should work correctly in X11, Wayland, and within a sandbox.
As Blikkentrekker said, it isn't sufficiently flexible for making a screenshotting app; it is only useful for requesting that some other (possibly compositor native) application take a screenshot and return it to you. Flameshot, for example, evaluated using xdg-desktop-portal and determined it wasn't suitable (see https://github.com/flameshot-org/flameshot/issues/446#issuecomment-774372329). It is also notably missing a way to request a screenshot/screenshare of a single window or of a region of the screen rather than the whole screen.
> For generating fake inputs, a library is being worked on called libei, that also should work in X11, Wayland, and within a sandbox.
Well that's still a work in progress isn't it? My response is about what currently doesn't work, not what will work in the future.
> Generally you should not be using random applications to configure monitor outputs, only your compositor's configuration tool is going to be reliable there.
1. For minimal compositors such as sway, creating an advanced monitor configuration tool that handles hotplugging outputs, or with a GUI is out of scope for the project.
2. Being able to script changing monitor configuration, and bind those scripts to hotkeys is pretty important to me. Thankfully that is pretty trivial with sway, but isn't really possible if the only way to manage monitors is the "compositor's configuration tool". And even if it is, such a script isn't portable between compositors.
3. Maybe it is different with wayland, but my experience in X11 was that configuring monitors with xrandr or arandr was much more reliable than any of the DE's designated monitor configuration tools. And even if reliability is no longer an issue, users might prefer a different UI.
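For what it's worth, the kind of script I mean in point 2 looks roughly like the sketch below. It assumes sway's `swaymsg output ...` command and uses made-up output names and profiles (`eDP-1`, `HDMI-A-1`, `DOCKED`); the real names come from `swaymsg -t get_outputs`. The function names are my own, and none of this is portable to GNOME or KDE, which is exactly the complaint.

```python
import subprocess

# Hypothetical profile: laptop panel off, external monitor at 1080p.
DOCKED = {
    "eDP-1": {"enabled": False},
    "HDMI-A-1": {"mode": "1920x1080", "position": (0, 0)},
}


def output_commands(layout: dict) -> list:
    """Translate a layout dict into one swaymsg argv per output."""
    commands = []
    for name, cfg in layout.items():
        if not cfg.get("enabled", True):
            commands.append(["swaymsg", "output", name, "disable"])
            continue
        cmd = ["swaymsg", "output", name, "enable"]
        if "mode" in cfg:
            cmd += ["mode", cfg["mode"]]
        if "position" in cfg:
            x, y = cfg["position"]
            cmd += ["position", str(x), str(y)]
        if "transform" in cfg:
            cmd += ["transform", str(cfg["transform"])]
        commands.append(cmd)
    return commands


def apply_layout(layout: dict, dry_run: bool = True) -> None:
    """Print the commands, or run them when dry_run is False (needs sway)."""
    for cmd in output_commands(layout):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    apply_layout(DOCKED)
```

Binding that to a hotkey is one line in the sway config, but the same script does nothing useful on a compositor that doesn't speak sway's IPC.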
> Worst case scenario if you need to duplicate parts of the code for all three, I don't think that's a problem. Just put it in a library if it needs to be re-used, the only alternative there is that upstream maintains the relevant bits of that duplicate code, which I'm sure you can see why they would be reluctant to do that.
Maybe libraries would help, although afaik, such libraries don't currently exist. And some compositors don't have public APIs for some of this functionality at all, and don't really want third party apps to have access, although it is frequently possible to use internal APIs if you can deal with it changing in backwards incompatible ways without any notice. I don't really understand your argument about "upstream" maintaining duplicate code. If there were standardized APIs, there wouldn't be any need for duplicate code.
I did neglect to mention before that all of these are, to some extent, privileged operations, and it does make sense from a security perspective to limit which applications can do these things. However, a big missing piece of wayland is a standard way to grant certain applications elevated privileges. From what I can tell most compositors either take the approach of not letting third party apps do a privileged action, or letting all third party apps do a privileged action. Granted it's a difficult problem for a variety of reasons, but the "third party apps aren't allowed to do privileged operations" stance obviously breaks things that worked in X.
An example would be that X11 has protocols to allow the manipulation of currently selected text. In Wayland no such protocol exists as it's entirely agnostic of how clients handle input and render text.
The result is that on X11 I have a hotkey that automatically normalizes strings to their unicode normal form, something that is not as easily implementable in Wayland, or at all.
A far more basic example: to bypass the chore of opening new card packs in Hearthstone, I can in X11 quite easily send an endless string of space keypresses to an application window without that window even being open or focused. This is not generally possible in Wayland. Looking it up, a tool called ydotool exists to bypass this limitation, which requires root access and directly accesses the input devices to do so.
In Wayland, you would use the same method to manipulate the selection. Read it out from the server, do your transformation, then generate fake inputs to rewrite the text. Running ydotool requires elevated access with suid because that's a privileged operation -- if you're running X11 without a sandbox that has a similar drawback as it allows all clients to intercept inputs and send fake inputs to any window.
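A minimal sketch of that read, transform, retype flow for the Unicode normalization case might look like this. It assumes `wl-paste` (from wl-clipboard) for reading the primary selection and `ydotool type` for the fake input, both of which need to be installed and, for ydotool, given the privileges discussed above; the function names are made up, and the read/write steps are passed in as parameters so the transformation can be tried without either tool.

```python
import subprocess
import unicodedata


def normalize(text: str, form: str = "NFC") -> str:
    """Return text in the given Unicode normalization form."""
    return unicodedata.normalize(form, text)


def read_primary_selection() -> str:
    # Assumption: wl-clipboard is installed and the compositor exposes
    # the primary selection to it.
    result = subprocess.run(
        ["wl-paste", "--primary", "--no-newline"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def type_text(text: str) -> None:
    # Assumption: ydotool is installed and allowed to create input events.
    subprocess.run(["ydotool", "type", text], check=True)


def rewrite_selection(read=read_primary_selection, write=type_text,
                      form: str = "NFC") -> str:
    """Read the selection, normalize it, and retype it only if it changed."""
    original = read()
    normalized = normalize(original, form)
    if normalized != original:
        # Typing over a still-selected span replaces it in most text fields.
        write(normalized)
    return normalized


if __name__ == "__main__":
    # Dry run with stand-ins for the two external tools.
    typed = []
    rewrite_selection(read=lambda: "e\u0301", write=typed.append)
    print([hex(ord(ch)) for ch in typed[0]])
```

Whether the retyping step behaves well depends on the focused application keeping the text selected, so this is closer to a workaround than an equivalent of what X11 offers.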
Independent of Wayland or X11, why should a user not be able to take inputs for their own window (as in, the owning process is run as the same user)? Why does it require a tool that needs extra privileges (that presumably works by doing it at a lower level)?
That's a genuine question — as in, I'm mostly thinking about Windows and how window message handling works with UAC windows. You can send messages to your own windows all day without needing extra permissions, but not other users' windows. I assume that's what you're referring to with the X11 sandbox; is it reasonable to set up Wayland _without_ such a sandbox?
The short reason is that if the user can do it, so can any application running as the user.
I don't know about you, but I'd prefer my applications not be able to inject and read inputs arbitrarily, though it may be that even stricter sandboxing is needed to make that a reality; Wayland being stricter than X11 is just one step along the way.
> though it may be that even stricter sandboxing is needed to make that a reality; Wayland being stricter than X11 is just one step along the way.
It is not a situation of “it may be”; it is a situation of “it is”.
Right now, it is useless as the security boundary on Unix and any other operating system is fundamentally the user and malicious software that runs as one's user can modify every file and process one owns anyway.
I read one comment a while back by a developer that illustrated the fruitlessness of Wayland by saying that it is essentially a lock on a door, that stands in the middle of room, that one can simply walk around, claiming to add security.
That would be true if someone refused to use any form of sandboxing; however, there are multiple sandboxing solutions available that the "door" works in combination with. That is the only real way to make this kind of security work on an ordinary Linux distribution, since the approach used by Android, where a new user is created for each application, is not really feasible.
There are also ways to sandbox X11, it's a bit harder to do, but you do have some options on how you'd like to do things.
Right, so that would be a place where something like ydotool would come in handy too because fake inputs coming from there wouldn't be coming from the XTEST device.
Can you explain what you think these are? In my experience, this is not really the case.