> Web MIDI API - Allows websites to enumerate, manipulate and access MIDI devices.
This API is actually a bit horrifying from a security perspective. In addition to allowing you to use MIDI keyboards as input devices on websites, it also allows websites to send binary firmware updates to MIDI devices. The reason is that it's common to use custom firmware to backup/restore settings and enable neat effects and functionality on MIDI devices.
Mozilla's engineers have reasonably pointed out that an attacker utilizing Web MIDI could use MIDI devices as a stepping stone to launch an attack against the user's PC outside of the web sandbox. One such attack might be by reprogramming the device to appear as a standard USB computer keyboard and "typing" commands to the host.
At least one well known manufacturer has vouched for the technical safety of their musical instruments, noting that they're physically designed in such a way that the MIDI firmware can't alter USB firmware. But there's no way to know that every MIDI device has been similarly well designed.
As neat as Web MIDI is, I think Mozilla and Apple probably made the right security call here.
Fun fact: for quite a long time Chrome skipped over the user permission step in the Web MIDI spec, always allowing access and silently giving ad networks a list of connected USB MIDI devices with no user consent:
Not sure why that's more odd than other crazy fingerprinting techniques actually in use. Keep in mind no midi devices would need to be present for fingerprinting. Different failure modes, etc.
Especially in the porn industry where the end users are likely using incognito mode or a VPN.
I still don't understand how WebMIDI would be used for fingerprinting of the vast majority of users who don't have any MIDI devices connected to their machine.
Because that's what you want when fingerprinting: the few users who do have one connected probably give you quite an accurate fingerprint for those users.
I'm sure there are fingerprint libraries that include every possible API that the browser provides. Does MIDI provide a good fingerprint alone? Probably not, but it can serve as a few more bits of information thrown into the mix when implementing fingerprinting. It's not like it would take many engineer hours to add it to an otherwise already functional fingerprinting system.
It's far-fetched to think that Google added Web MIDI in this way just for a couple of bits of entropy which are essentially worthless. No ad network cares about identifying something like 0.01% of people, if even that. Yes, it's very valuable entropy if you want to identify those people specifically, but who actually wants to do that?
The point is not to identify the people using Web MIDI but to identify individual users, regardless of what information exactly identifies them. To that end, every single piece of entropy helps. A good approach to it in general is to opportunistically consume every available API that can possibly divulge identifying information.
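To put rough numbers on the two positions above, here is a small sketch. It assumes the 0.01% share quoted earlier purely for illustration, and the function names are made up. A rare trait is worth many bits for the few users who have it, but very little averaged over everyone, which is why both sides have a point.

```typescript
// Surprisal: bits of identifying information gained when an observer sees
// a trait that a fraction `p` of users share.
function surprisalBits(p: number): number {
  if (!(p > 0 && p <= 1)) throw new RangeError("p must be in (0, 1]");
  return Math.log(1 / p) / Math.LN2;
}

// Expected information from a yes/no signal across the whole population
// (binary entropy). This is what the signal is worth "on average".
function binaryEntropyBits(p: number): number {
  if (!(p >= 0 && p <= 1)) throw new RangeError("p must be in [0, 1]");
  if (p === 0 || p === 1) return 0;
  return -(p * Math.log(p) + (1 - p) * Math.log(1 - p)) / Math.LN2;
}

// Illustrative figure only: 0.01% of visitors expose a MIDI device.
const shareWithMidi = 0.0001;
const bitsForThoseUsers = surprisalBits(shareWithMidi); // roughly 13.3 bits
const bitsOnAverage = binaryEntropyBits(shareWithMidi); // roughly 0.0015 bits
```

A fingerprinting script that adds many weak signals together only needs each one to be cheap to collect, so the low average does not stop anyone from including it.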
A lot of people also do have a virtual MIDI device installed whether they know about it or not. The name of this device differs between different operating systems and operating system revisions.
Well, in this case, a porn site wants to tie what it shows you to what you liked last time you visited. They aren't after your identity per se. They are after a conversion. And since you might be using incognito mode (no lingering cookies), they care about fingerprinting for that.
Edit: I see the disconnect now. I'm not saying Google/chrome added the midi API for fingerprinting. I'm saying the screenshot way up this thread is an example of a site using it for that purpose.
I get different types of failures and messages from different versions of Chrome, Firefox, and IE. None of which have any midi devices. Those errors, or the structure of the resulting object if it succeeds, are all fingerprint inputs.
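A minimal sketch of that idea, with no devices required. The outcome type and function name are hypothetical and not part of the Web MIDI spec; the point is only that "missing API", "rejected with error X", and "granted with these ports" are all distinguishable strings.

```typescript
// Hypothetical description of what a page observed when it asked for MIDI
// access. None of these type names come from the Web MIDI spec.
type MidiProbeOutcome =
  | { kind: "unsupported" } // API missing entirely
  | { kind: "rejected"; errorName: string } // request failed with an error
  | { kind: "granted"; portNames: string[] }; // access object returned

// Flatten an outcome into a stable string that could be mixed into a
// larger fingerprint. Port names are sorted so ordering does not matter.
function midiProbeSignal(outcome: MidiProbeOutcome): string {
  switch (outcome.kind) {
    case "unsupported":
      return "midi:unsupported";
    case "rejected":
      return `midi:rejected:${outcome.errorName}`;
    case "granted":
      return `midi:granted:${outcome.portNames.slice().sort().join("|")}`;
  }
}
```

Even the "granted with zero ports" case yields a value that differs from the other two, so the signal exists whether or not anything is plugged in.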
Yeah, ran it in Chrome, the browser didn't say a thing whatsoever and I see a MIDIAccess object in the JS console. Nice to know the browser just allows this entire API by default.
I would guess quite a few browsers or operating systems would implement at least one virtual MIDI device, so that sites wanting to play MIDI would work. Those virtual devices wouldn’t all be identical.
It might be a way to detect bots, even on headless browsers, that pretend to be Chrome but don’t implement the MIDI API. I’m sure crawlers are the bane of the porn industry.
Besides, I know they want to turn the browser into an OS, but it's not one.
It's sandboxed from the OS and limited to some use cases, which is the point. I don't want something capable of hot loading code from any website to have the capabilities of my OS.
I have dreamed of a browser that lets me create desktop shortcuts to webapps and then pretend as if each webapp is its own fully independent application, while all webapps would still run in the same browser instance.
There are. I used Eudora up to 2005. Incidentally, I can't look at my email history before 2005, because, you know... formats become obsolete, hard drives die, etc.
Do those clients work on my mac, my chromebook, my windows box, and my android phone?
Call me crazy, but I prefer web apps for that kind of stuff. I'm also glad I don't have to download an app to use Hacker News.
As an independent developer, I am quite pleased that I can target one platform, the web, without having to deal with all the mess of multiple native apps, and worry that people won't run my simple app because they don't trust me not to delete their hard drive, and so on.
>Do those clients work on my mac, my chromebook, my windows box, and my android phone?
Yes.
>Call me crazy, but I prefer web apps for that kind of stuff. I'm also glad I don't have to download an app to use Hacker News.
Web means HTTP; email is POP3/SMTP/IMAP. Different protocols, different programs. That you can use a website to view and send emails is not the default case and is merely an interface to those protocols.
I understand how email works. "Default case" is a matter of interpretation. Most people today use web based email (at least on computers as opposed to mobile devices), and it is much easier for most people to set up and get working than using a native client. The vast majority never think about wire protocols. I have implemented both HTTP and SMTP in C etc back in the day, but that is not relevant here.
Regardless, I said my preference is to use web based email, that's all.
Because the relevant issue is most certainly how people use it. People use web browsers to read their email. Why is what the RFCs say important to this?
Because the discussion above was about email. A browser is not a mail client (as in MUA). Unless a browser implements the RFCs regarding email, it's only a web browser.
>How about email apps such as Gmail or Yahoo mail?
So the answer is: It doesn't matter as those are not email apps. They are email frontends for a service that implements mail. If people think differently - it's their wording, but still a wrong one.
So you're just spinning on the definition of "email"? As opposed to recognizing that a particular activity people do (which I call "using email" but maybe you have a different word for), is very often done using a web browser.
Why you'd think debating the semantics of the word "email" is relevant to the discussion is beyond me. It makes me almost wonder if you are attempting to parody a certain type of pedantic technical person.
It's kind of sad though because it's still 10X easier to build a browser app than a native app simply because of the wealth of highly-usable stuff written for JS.
Every time I have to build a GUI in Linux I just build a webapp that connects to a TCP backend on localhost because that way I can just build a beautiful UI in HTML/JS/CSS and I don't have to deal with the mess that is GTK, QT, TCL, TK, and all that crap.
Android programming is another can of worms and I'm frankly fed up with Gradle updates breaking my project every update and needing 79+ files, multiple kludgy steps for signing APKs, zipalign (wtf) and other crap just for a Hello World.
What would be nice to have is "installable" apps that use the webkit rendering engine but have full access to the system including directly opening TCP ports and direct access to /dev. These would have to be trusted apps obviously. Websites that load code without consent should be restricted, of course.
> Every time I have to build a GUI in Linux I just build a webapp that connects to a TCP backend on localhost because that way I can just build a beautiful UI in HTML/JS/CSS and I don't have to deal with the mess that is GTK, QT, TCL, TK, and all that crap.
I suspect this has more to do with the state of native Linux development tools versus JavaScript dev tools than it has to do with the general case. Dev tools for Windows and iOS/macOS are fairly straightforward. Not sure about Android, since my burning hatred of Java has removed my desire to mess with that platform entirely. (I know Kotlin exists, still not interested)
Update: I'm basing my comment about Linux on what the quoted comment describes, not making my own assertion about how good or bad it is.
> Dev tools for Windows and iOS/macOS are fairly straightforward.
Can't speak much for Windows, but the Apple dev story is pretty good. Platform SDKs are deep and capable, if sometimes not well documented, and there's a clear "right" way to do most things. One can build a "world class" app with nothing but Swift+UIKit and few or no third party libraries. SwiftUI is rapidly improving this too, bringing a fully native "modern" reactive approach that works across all Apple OSes.
Xcode can be a cantankerous beast at times but it's been getting a lot better in recent releases.
> Not sure about Android, since my burning hatred of Java has removed my desire to mess with that platform entirely. (I know Kotlin exists, still not interested)
It's slowly improving but still very much a mess. Jetpack Compose looks to be poising itself as the SwiftUI of Android and that will no doubt improve things, but I have doubts that Android development will ever be as nice as iOS development is.
Well, IntelliJ IDEA is a fantastically better IDE than Xcode from a functionality standpoint, and Kotlin + Compose is, IMHO, better than SwiftUI. Compose isn't a SwiftUI clone, it's a pure-functional memoization framework with compiler support.
> IntelliJ IDEA is a fantastically better IDE than Xcode from a functionality standpoint
This is not my experience. I've never quite cared for all the finicky setup needed to get things right, and it always feels a bit laggy. (Though IntelliJ is a world better than Eclipse).
It's definitely more laggy, but in terms of everything else, it's light years ahead of Xcode:
1) code navigation, editing, refactoring
2) integration with Git/GitHub, Issue Trackers, Cloud Providers
3) external build system support
4) multiplatform editing (java, js, typescript, kotlin, python, etc)
5) integration with testing, continuous integration, deployment
6) code analysis, finding problems
7) tons of special support for DSLs and third party frameworks
The only thing Xcode is better at IMHO is UI building and macOS instrumentation. If you're writing tons of code that doesn't have a UI, especially for cloud backends, I don't think you'd use it.
Even simple things, like language injection, work wonders in the editor window. Having the IDE know how to syntax color, check, and code complete SQL, CSS, HTML, Regex, etc inside strings is a huge help.
I spent 15 years on Emacs, and the last 15 on IntelliJ, and having an editor that slices and dices code in a myriad of ways with easy-to-use automation, and actually indexes the code and builds a deeper understanding of the totality of your project, is well worth it. I just wish it was less laggy in the UI.
If I was writing an iOS app, sure, I'd use Xcode, but I can't envision it being general purpose for anything else, unlike the versatility of other competitors like VSCode.
> If I was writing an iOS app, sure, I'd use Xcode, but I can't envision it being general purpose for anything else, unlike the versatility of other competitors like VSCode.
I would never expect Xcode to be the best general purpose editor/IDE. It's good for Swift/iOS development, which is what it's built around. Personally, much of the time I'd rather have a performant editor than one that has a million bells and whistles.
> Dev tools for Windows and iOS/macOS are fairly straightforward
Are they? It's probably been a decade since I touched a native GUI (and I was without a mentor and working on already-old software) so I legitimately don't know. Using something like Visual Studio's form builder was fine enough, but it was not a very expressive toolset as I recall.
Web you can get started "instantly". Your browser covers most of the tooling you need and you can tweak any live site.
I don't like that that's how it is. From an abstract perspective I'd rather not be working on web because it seems like we're trying to make a better car by building a bicycle inside of it. But the low bar for entry is hard to beat.
> Web you can get started "instantly". Your browser covers most of the tooling you need and you can tweak any live site.
Obviously you can splat some HTML into a browser and get instant results, but once you start talking about apps with even moderate amounts of interaction, the simplicity of web apps falls away quickly. If you are talking about a sophisticated app, I don't think the complexity is any less when you are using JavaScript/React versus Swift/UIKit. The big win I've seen for JavaScript/web apps is the fact that you are reasonably platform independent. Obviously, if you use Chrome and Chrome-specific APIs, that falls away too.
I had to get back into C# a year or so ago to build a commercial internal production management application. WPF with XAML was exceptionally pleasant to work with and the final product just worked with few issues.
It felt weirdly like writing Vue-like code.
Microsoft nailed it imo, VS2017 and C# have come a long way since I used to do .NET 3.5 stuff.
I really liked C#, its HTTP and async/await stuff was excellent.
Now try implementing a swipe tab UI in GTK with animations and embedded video. In HTML you can probably import someone's awesome library .js file and be done with it in 5 minutes. Video is a snap. In GTK you have to deal with some idiotic factories, pipelines, sinks, faucets, and other god-knows-what abstractions. In HTML it's just <video>.
What is the problem if the user gives permission to do so?
I don't get this "let's not allow this web API because it's dangerous" argument. You only move the problem to an application that the user installs on their PC.
If the permission mechanism is correct there is no danger: a web application wants to access my MIDI interface or my USB or Bluetooth or whatever, and it can. Isn't it the same for mobile applications and permissions?
So maybe, and I say maybe, we could stop having to ship an entire Chromium engine with Electron just for a web application to access devices or files on the computer.
Sure but for a long time web has been touted (rightly or wrongly) as this safe sandboxed convenient distribution platform. The expectation is very much still that a website shouldn’t be able to harm your computer and that expectation isn’t going to change any time soon.
On top of that, less technically inclined people don’t even expect native applications to be able to harm their computer and we haven’t been able to change that expectation.
That's why I have almost no Apps on my phone. Just a lot of different browsers.
I think the new mobile Vivaldi browser is the first with an option to switch off the 3D sensor for the website.
Not only brick the attached device, but also potentially fully compromise either the attached device or the device you are on. It doesn't seem responsible to put that kind of thing behind a permission dialog. Users are not in a position to make the judgment. The web should have a higher standard of security and privacy.
Because users are users and they will inevitably do the wrong thing. Normally not such a big deal, but with the interconnected world a compromised user is a big problem. It's used as a stepping stone to compromise others, cause problems to other systems by using them as slaves in a botnet or simply using them to send spam.
Users need to be protected against themselves as long as they can't take responsibility for their actions.
If the user is going to be tricked on the web, they can be tricked in other ways. If the web doesn't support MIDI, users will just download MIDI malware as an app.
By your logic, the web should not have video support either, because users are users and it will inevitably be misused.
If you were serious about addressing this: We need clear and robust and granular permission dialogs on web and native apps. Ideally they'd be consistent across web/native, which would help users trust their software, and understand the permissions they give.
One point not discussed here: by relying so much on the App Store and Apple's walled garden, what do you do when China or another country asks Apple to remove certain apps?
Bypassing a geo block on websites is easy, and there isn't a single source of truth on the web like the App Store is for its users. Can Apple explain why they took down apps at the request of China in HK, and how do you think that will play out when no web apps can work reliably on Apple devices?
Censorship is a huge problem for app stores. They censor anything sexual but sexuality is part of human nature. They censor anything politically charged but it's part of human nature too. I hope the antitrust fine plays out.
Apple can protect the privacy of people by making it harder for them to be vulnerable by choice. People here point towards stupid users when saying that a normal user won't be able to connect USB and enable a feature. Why can't the same happen with browsers or apps on iOS? Why the $99 fee just to side load apps? Why the need for a Mac?
Just admit it's for profit seeking reasons. Ads in your app store are a proof of that.
It's still possible to partially implement it while keeping it safe by excluding sysex messages. Probably more than 90% of the end users will never need sysex messages so why bother?
I’m pretty sure GP meant why bother implementing sysex messages.
Personally, I think that’s the solution. Provide access to the most common and safe features while disallowing the unsafe one. You’d still get far more functionality than you have now and only sacrifice a little in exchange for safety (vs not having any at all).
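As a rough sketch of what "disallow the unsafe one" could look like at the browser boundary: the gate below is hypothetical and not how any browser is known to implement it. It relies only on the MIDI 1.0 byte layout, where 0xF0 starts a System Exclusive message, 0xF7 ends it, and data bytes are always below 0x80.

```typescript
const SYSEX_START = 0xf0; // "System Exclusive" status byte
const SYSEX_END = 0xf7; // "End of Exclusive" status byte

// Conservative check: data bytes are always below 0x80, so 0xF0/0xF7 can
// only appear as SysEx delimiters. Flag a buffer if either is present.
function containsSysex(bytes: number[]): boolean {
  return bytes.some((b) => b === SYSEX_START || b === SYSEX_END);
}

// Hypothetical gate applied before forwarding bytes to a device.
// Returns true if the message was forwarded, false if it was dropped.
function forwardIfSafe(bytes: number[], send: (b: number[]) => void): boolean {
  if (containsSysex(bytes)) return false;
  send(bytes);
  return true;
}
```

Ordinary note and controller messages pass straight through, so playing a keyboard into a web synth would keep working while firmware-style payloads never reach the device.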
I'm curious now that I read this: does this mean that Web MIDI can also access the virtual MIDI devices created under ALSA on Linux? Which means that if you have it routed to other software using JACK or aconnect, websites could send MIDI notes directly into whatever software your MIDI device is routed through?
So theoretically, could a JACK-aware payload be created that runs in the background, say disguised as a VST or LADSPA plugin, so that when a user browses a malicious site, the site could recognize the malicious MIDI device, create connections to other software, and gain access through possible buffer overflows or other things?
It seems like a stream of MIDI notes could itself possibly cause a buffer overflow in certain programs. MusE and Rosegarden are a bit buggy as is and frequently crash for me. From what I can tell, there's a lot of MIDI-aware audio software that likely contains a bunch of avenues for exploits, and when you throw virtual MIDI devices into the mix, capable of doing far more than hardware MIDI devices...
I don't see why the API should even allow enumerating devices. Put that behind a browser dialog. Webpages don't get to enumerate directories/files on your system - they pop up a file picker dialog. The same should happen with device selection. Once you've granted the site access to a device, it can ask you to associate a name with the handle, and provide whatever shiny device selection / device management UI it wants.
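A minimal sketch of the picker idea, assuming a browser-side broker that owns the real device list. The class, the handle format, and the device names are all invented for illustration; nothing here is an existing browser API.

```typescript
// Hypothetical broker living on the browser side. Real device names never
// reach the page; it only sees opaque handles for devices the user picked.
class DevicePickerBroker {
  private installedDevices: string[];
  private granted: { [handle: string]: string } = {}; // handle -> real name
  private nextId = 1;

  constructor(installedDevices: string[]) {
    this.installedDevices = installedDevices;
  }

  // Called when the user picks a device in browser-owned UI.
  grant(realName: string): string {
    if (this.installedDevices.indexOf(realName) === -1) {
      throw new Error("unknown device");
    }
    const handle = `midi-handle-${this.nextId++}`;
    this.granted[handle] = realName;
    return handle;
  }

  // What the page is allowed to enumerate: handles only, never names.
  visibleToPage(): string[] {
    return Object.keys(this.granted);
  }
}
```

Before the user picks anything the page sees an empty list, so there is nothing to fingerprint, and afterwards it sees only a counter-based handle rather than a vendor or model string.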
Better yet, reasonably abstract the raw MIDI protocol away with something that has suitable security and privacy properties for the web, and translate it to MIDI on the host. In any case, the current proposal does not seem to cut it.
I agree that it would be nice to have a limited amount of the Web MIDI spec, but bear in mind the staggering few people who actually have MIDI devices, and the even more staggering few of those people who want to use whatever website has midi support for its features.
To answer your question about why not to just limit the API: because that would be another data point to use to fingerprint users, and because the amount of engineering time that would have to go into Web MIDI support (including testing, security auditing, etc.) would never be worthwhile compared to putting those same developers on something that might be beneficial to vastly more users.
(Also note that Firefox made the same decision to implement nothing at all.)
This is just a philosophical standpoint, but I think supporting the interests of a small group of experimental pioneers is generally more useful than their numbers would imply.
The idea that "staggering few" is a negative disappoints, given it has almost always been the "staggering few" who've progressed humanity.
Yeah. It's crazy to me that they haven't decided to create some abstract virtual midi instrument.
That you can instruct the browser to pass through to your devices.
It's a shame, but it proves just what a futile affair it is to retrofit ancient protocols that never had to consider host security. Especially because MIDI is so elegantly simple!
Sending binary firmware updates (sysex) is not a necessary part of the API... they don't have to implement that, and if they do, they can ask for additional permissions.
Allowing you to use a keyboard as an input device is incredibly powerful, and even that can be handled much as camera and microphone are: you give the site permission.
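The spec already separates the two cases on the request side: `navigator.requestMIDIAccess()` takes an options object with a `sysex` flag. What a browser does with that flag is a policy choice, and the sketch below is one hypothetical two-tier policy, not a description of any shipping browser. The type and function names are invented.

```typescript
// Hypothetical record of what the user has already granted to a site.
type MidiGrant = "none" | "basic" | "basic+sysex";

// Hypothetical policy: note input/output needs one grant, SysEx needs a
// second, separate grant. Returns what the browser should do next.
function decideMidiRequest(
  current: MidiGrant,
  wantsSysex: boolean
): "allow" | "prompt-basic" | "prompt-sysex" {
  if (current === "none") return "prompt-basic";
  if (wantsSysex && current !== "basic+sysex") return "prompt-sysex";
  return "allow";
}
```

Under this policy a site that only wants note input never triggers the scarier SysEx prompt, which keeps the common case close to the camera and microphone flow.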
If you get the MIDI device to act as a keyboard it’s not for typing things in the browser but in the OS... you are out of the sandbox so it’s possible to download and install any payload.
I don't understand what you mean. We're talking about a piano keyboard by the way, not a typing keyboard. The browser uses it via the MIDIAccess API, and simply sets up a callback for MIDI codes, such as noteOn, noteOff, etc. I have used it extensively.
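For anyone who hasn't used it, the callback side is roughly this. In the real API the bytes arrive as a `Uint8Array` on the `data` property of the event passed to a `MIDIInput`'s `onmidimessage` handler; the sketch below takes a plain number array so it runs on its own, and the `NoteEvent` type and `decodeNoteMessage` name are made up.

```typescript
type NoteEvent =
  | { type: "noteOn"; channel: number; note: number; velocity: number }
  | { type: "noteOff"; channel: number; note: number; velocity: number };

// Decode a 3-byte Note On / Note Off message. Returns null for anything else.
function decodeNoteMessage(data: number[]): NoteEvent | null {
  if (data.length !== 3) return null;
  const [status, note, velocity] = data;
  if (note > 0x7f || velocity > 0x7f) return null; // data bytes are 7-bit
  const kind = status & 0xf0; // upper nibble: message type
  const channel = status & 0x0f; // lower nibble: channel 0-15
  if (kind === 0x90 && velocity > 0) {
    return { type: "noteOn", channel, note, velocity };
  }
  // Note On with velocity 0 is conventionally treated as Note Off.
  if (kind === 0x80 || kind === 0x90) {
    return { type: "noteOff", channel, note, velocity };
  }
  return null;
}
```

Nothing in this path involves SysEx or typing keyboards; it is just three bytes per key press or release.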
https://github.com/mozilla/standards-positions/issues/58