Details about MIDI 2.0 (midi.org)
281 points by dmoreno 32 days ago | 147 comments

This angers me: >To implement MIDI-CI and MIDI 2.0, you need a manufacturers SysEx ID. A SysEx ID by itself is $250 a year

That would be a re-hash of the hobbyist USB VID/PID fiasco. MIDI synthesizers are one of the main things non-professional people have been building. So many amateurs and educators whip up MIDI synthesizers in a few hours in a workshop or after work. And that is thanks to MIDI being a dirt-simple, unlicensed standard.

>You will also have access to the MMA Github which has code for MIDI 2.0 to MIDI 1.0 translation (and vice versa)

This makes me sad and angry at the same time.

My introduction to MIDI was as a cheap-ass soldering-iron wielding 13 year old hooking together components for my Commodore VIC-20. I credit that work for my subsequent career as a scientist, programmer, and mathematician.

$250 a year. To solve a solved problem?

Everyone generates a UUID. Part of the CI setup is mapping the 128-bit UUID to a short identifier. Or some other compression scheme. Hell, even 64-bit random identifiers will almost-surely not collide on a bus.
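The "almost-surely will not collide" claim can be checked with the standard birthday approximation. This is a minimal sketch, assuming every device picks its identifier uniformly at random; the function name and the bus size of 16 devices are illustrative choices, not anything from the MIDI-CI spec.

```python
import math


def collision_probability(num_devices: int, id_bits: int) -> float:
    """Approximate chance that at least two random IDs on one bus collide."""
    id_space = 2.0 ** id_bits
    pairs = num_devices * (num_devices - 1) / 2.0
    # Birthday approximation: p ~= 1 - exp(-pairs / id_space).
    # expm1 keeps the result accurate when the probability is tiny.
    return -math.expm1(-pairs / id_space)


if __name__ == "__main__":
    # 16 devices is an assumed bus size, chosen only for illustration.
    for bits in (7, 32, 64, 128):
        print(f"{bits:3d}-bit IDs: {collision_probability(16, bits):.3e}")
```

Under these assumptions a 64-bit random identifier collides with a probability far below one in a quadrillion, while a 7-bit identifier would collide more often than not.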

So much for "hobbyist-friendly".

And totally unnecessary.

I was just listening to the MIDI 2.0 webinar from the MIDI Association, and as I understood it there are some free SysEx and CI IDs for non-profits / hobbyists, but if you make money out of it you should register. I guess mainly to prevent ID collisions.

It's pure money grabbing. They could just as well avoid collisions by specifying a UUID for the SysEx ID, and then everybody could generate their own independently.

I guess you can use the free IDs, and then place the uuid in the stream or packet. Everybody happy.

In the USB case that simply translates to rogue VID/PID by vendors that don't want to pay the fee.

Steinberg tried the same thing with VST (2, not 3). No one uses (relies on) it.

At the end of the day specs are only as good as adoption. If there is a reliable workaround (or even easier way to do things) than using the UUID then people will do that instead.

Can amateurs club together and form an organisation to administer allocations from a common ID range? Or just publicly declare that they are going to squat on a particular range?

Haha! All IDs that start with 0x0FEE ;p

Some opensource projects already squat on 0xF055

How would they enforce the SysEx ID charge if someone built a synth for personal use? I've written a number of MIDI programs (sequencing utilities) and don't remember ever implementing a SysEx ID for anything.

Also, it's pretty naive of these guys to assume their Github repo will stay hidden behind a $250 charge for very long.

$250 a year is a pretty small amount, probably about what it costs to keep a registry of SysEx IDs so that manufacturers don't create conflicts.

What's the alternative? This is a bit like MAC addresses - if network gear manufacturers didn't have some type of system for de-conflicting addresses it would be chaos when you bought 2 pieces of gear that conflicted with each other.

In the classic Mac days it was up to each developer to make up a FourCC for their application. I don't remember there ever being any conflicts.

I feel MIDI is severely underutilized outside music. It's a simple, standard protocol that relays switches being turned on/off, and knobs being turned high or low. The only standard alternative that I know about would be the USB HID interface, which has its limitations.

What are some cool uses of MIDI you have seen outside music-making?

How about controlling a band consisting of robots playing real instruments? [0]


> Compressorhead are built out of metal scrap, most of the movings are made with electro pneumatics. It is controlled via Midi. There is an interface that changes the Midi signals in switch signals for the electro pneumatic valves…,” Kolbie’s voice drifts off. He has soon got his face in a Robotics for Dummies manual. [1]

[0] https://en.wikipedia.org/wiki/Compressorhead

[1] https://web.archive.org/web/20181016144145/http://www.gibson...

I read the first line of your post and I seriously expected the YouTube link to be this:


All controlled by MIDI, unsurprisingly.

Thank you so much for that rabbit hole.

There are better protocols outside of MIDI like Open Sound Control aka OSC (http://opensoundcontrol.org)

I used to do a lot of experiential retailing experiences and OSC was our preferred control scheme.

"Better" is a matter of requirements. You can interpret a stream of bytes of MIDI with some tens of lines of C code and a couple of bytes of interpreter state. The physical interface is less than a handful of components. For OSC you need an IP stack.
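To make the "tens of lines and a couple of bytes of state" point concrete, here is a minimal sketch of a MIDI 1.0 byte-stream parser, written in Python rather than C. It handles channel voice messages and running status only; SysEx and system common messages are skipped, and the class and function names are made up for this example.

```python
# Data bytes expected after each channel voice status nibble (0x8 to 0xE).
DATA_LENGTH = {0x8: 2, 0x9: 2, 0xA: 2, 0xB: 2, 0xC: 1, 0xD: 1, 0xE: 2}


class MidiParser:
    """Parser state is just the running status byte and the pending data bytes."""

    def __init__(self):
        self.status = None
        self.data = []

    def feed(self, byte):
        """Consume one byte; return a (status, data...) tuple when a message completes."""
        if byte >= 0xF8:
            # Real-time bytes can appear anywhere and must not disturb the state.
            return None
        if byte >= 0xF0:
            # System common / SysEx cancels running status; this sketch skips them.
            self.status, self.data = None, []
            return None
        if byte & 0x80:
            # New channel voice status byte.
            self.status, self.data = byte, []
            return None
        if self.status is None:
            return None  # stray data byte, e.g. a SysEx payload
        self.data.append(byte)
        if len(self.data) == DATA_LENGTH[self.status >> 4]:
            message = (self.status, *self.data)
            self.data = []  # keep the status so running status works
            return message
        return None


def parse(stream):
    """Parse an iterable of byte values into a list of complete messages."""
    parser = MidiParser()
    return [m for m in (parser.feed(b) for b in stream) if m is not None]
```

For example, `parse([0x90, 60, 100, 62, 0])` yields two note-on messages from a single status byte, which is the running-status behaviour that keeps the wire format so compact.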

Except a lot of MIDI in control type situations (lighting, user triggers) is run over ethernet these days. So if all things are equal on that front, OSC is a far superior design.

It almost doesn't even matter because outside of musical gear, OSC is becoming the established standard to control everything from DMX lighting to VJ software to OF/Cinder interactive installs to some more performance oriented software like Ableton Live (via LiveOSC).

I've written software MIDI sequencers (all the way back in the early 90's even) and you'll never be able to convince me MIDI is better than OSC.

Nah, OSC is just a message format. I see it routinely used across a serial port for instance.

(disclaimer: I develop an OSC-based sequencer, https://ossia.io)

Sorry, a brief glance at the OSC website would have corrected that misconception.

That said, I only assumed as much because I don't know of any OSC client software that uses anything other than IP networks to facilitate device to device communication. I googled for serial OSC software and found some nonetheless. Does your software support OSC over serial?

> Does your software support OSC over serial?

Not yet, sadly! I've got a lib for that, however: https://github.com/jcelerier/oscour

what is experiential retailing?

Usually some type of creative interactive retail experience, anything from wall displays or projections that users can interact with to simple kiosks.

I haven't listened in on the data but it seems likely that MIDI Maze only uses the electrical standards, i.e. a proprietary protocol over MIDI ports

It's a midi port. It does midi. Hence the limit of 16 players. Every player is using a dedicated midi channel.

I agree that it's a MIDI port. That's not under contest.

Because every claim that it's MIDI I've seen has failed to substantiate it, I searched around a bit today and found the developer of MIDI-Maze II and contacted him. Only the physical layer of MIDI is used. He furthermore said that MIDI doesn't allow a ring network anyway, in which case this must also be true for the first MIDI-Maze.

So no, the player limit has nothing to do with MIDI channels. Maybe be more clear about whether what you're saying is an unsubstantiated guess or something that you actually know. Your message reads a lot like you know it for a fact.

I stand corrected then I guess. I definitely read before that it uses plain midi messages.

I haven't done a write-up yet, but my partner and I use MIDI as part of our custom lighting setup, which we bring to gigs and other events. A MIDI controller is hooked up to a raspberry pi running some Python software. We use the `mido` library for decoding MIDI. We then drive a bunch of RGB LED strips using DMX. Video: https://www.instagram.com/p/BtjLSUeFjtJ/.
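A rough sketch of what the MIDI-to-DMX step in a setup like this might look like. The poster's actual code is not shown, so everything here is an assumption: the CC-to-channel table is hypothetical, and plain integers stand in for the `control` and `value` fields that a `mido` control-change message carries, so the example runs without any library.

```python
DMX_SLOTS = 512

# Hypothetical mapping from MIDI CC numbers to DMX channels (e.g. R, G, B).
CC_TO_DMX_CHANNEL = {20: 1, 21: 2, 22: 3}


def cc_to_dmx(value):
    """Scale a 7-bit MIDI value (0-127) to an 8-bit DMX level (0-255)."""
    if not 0 <= value <= 127:
        raise ValueError("MIDI data bytes are 7-bit")
    # Bit replication maps 0 -> 0 and 127 -> 255 with no floating point.
    return (value << 1) | (value >> 6)


def apply_control_change(levels, control, value):
    """Update the DMX level table for one control-change message."""
    channel = CC_TO_DMX_CHANNEL.get(control)
    if channel is not None:
        levels[channel] = cc_to_dmx(value)
    return levels


def build_frame(levels):
    """Build a DMX512 data frame: start code 0x00 followed by 512 slots."""
    frame = bytearray(1 + DMX_SLOTS)  # frame[0] stays 0x00, the start code
    for channel, level in levels.items():
        if not 1 <= channel <= DMX_SLOTS:
            raise ValueError("DMX channels run from 1 to 512")
        frame[channel] = level
    return bytes(frame)
```

The resulting frame would still need to be handed to whatever DMX interface is in use; that part depends entirely on the hardware and is left out.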

Is there a reason you chose to do that rather than use something more standard like OSC or ArtNet?

The main reason I’ve seen VJs use MIDI is that they can slave their rig to the performer’s Ableton Live setup. I’ve got friends that have set up timing triggers to pyro and displays / lighting (interfacing with the VJ’s rig) within a few hours of stepping off a plane for a gig that night.

DMX is very standard for lighting control

It’s comparable, and in some ways superior as a general control protocol, to Modbus. Can’t recall the specifics but MIDI seemed better defined and generally saner (IMHO).

A good example of non-music MIDI is Firmata; it’s a protocol based on MIDI and allows remote control of Arduino and other MCUs [1]. Seems both modbus and midi have transports over Ethernet now as well.

1: https://github.com/PatternAgents/Electronics_One_Workshop/wi...

I'm not in that industry, but isn't it also used for controlling stage lights in some cases?

MIDI will often be used to trigger the console to change cues which then uses DMX512 to control the lights.

Yep, am using QLC+ at the moment and it has profiles for all sorts of midi controllers. https://www.qlcplus.org/

DMX is standard for lighting, but although the actual protocol is rock solid, the parts around it are nowhere near as good as MIDI. Device to device is fine, but USB DMX controllers are expensive and rubbish in my experience using them to control a ~100 piece (dumb) lighting rig

That's DMX512.

I completely agree. We are launching a new web app/platform soon and have completely integrated the web midi and gamepad APIs, allowing you to control any piece of the software with any midi controller and/or gamepad, including devices like the new Microsoft adaptive controller, which supports many more types of analog switches.

One of my favorite parts of both APIs is that they don’t require focus on elements or the window/tab and are naturally multi-touch.

This allows you to rapidly speed through the interface and perform tasks.

I've used MIDI ports on an Atari ST to control tool changers and coolant pumps on an industrial CAD/CAM station.

They are being used in photography editing software: https://petapixel.com/2019/01/26/using-a-midi-controller-wit...

Back in the 90’s, MIDI was used by some firework detonators to sync the launches and explosions with music, etc.

You could use it for game control; a super-low-latency MIDI controller would also work as a game controller

Indeed: Pre-USB "game ports" doubled as MIDI interfaces. https://en.wikipedia.org/wiki/Game_port

The MIDI pins on game ports were an add-on on sound cards to unused pins from the game port, though. The original game port spec doesn't have anything to do with MIDI or UARTs, and joysticks and the like did not use those extra pins.

And iirc game ports were usually their own expansion card. It wasn’t until Sound Blaster decided to include it on their cards that we started seeing it integrated into the sound card (part of a sound card's job is analog-to-digital conversion, so why not do the ADC conversion needed for joysticks as well? Also, games normally want sound too, so it seemed like a nice way to bundle everything you need onto a single card).

I broke multiple joysticks on Wing Commander this way.

I agree there are many uses, and that is what I have done with IMIDI. You can also use it with television and all sorts of stuff. While most of the commands are the same as MIDI (a few are removed, and a few having to do with forwarding signals to connected devices are added), some of the meanings are a bit more general, the signal is two-way, and the format of the SysEx payload is different (and does not require any kind of registration) (although the framing is the same as normal MIDI, so if the signal is recorded by something that does not interpret SysEx payloads, the recording will still work). You can also forward signals to other devices; you can have a tree structure of devices and each device can have up to 128 children, selected by the use of the Device Select command (which is an optional implementation; compliant implementations are required to parse the command even if it is otherwise ignored; this is the same with all other commands, which must be parsed even if they are otherwise ignored). (My own intention for a computer design is to use IMIDI for all of the input devices, rather than USB or PS/2 or whatever (I dislike USB). And you can also use IMIDI to transmit captions along with a television picture (the picture using Digi-RGB, and the sound using ordinary analog audio ports), and/or to control the channel setting of an external TV tuner and to control the VCR, too).

Specifically, for exclusive messages in IMIDI, the first byte specifies the type of message and the channel (some exclusive messages might be channel independent, and this is also indicated), and the second byte is a namespace number. The four types of exclusive messages are: application message, namespace error, namespace request, namespace response. Note that the first byte has four bits for the channel, one bit for channel independence, and two bits for message type, filling up the seven available bits, and the namespace number is also seven bits. (If the channel independence bit is set, then an 11-bit namespace number is used, because the channel bits are used as additional bits for the namespace number in this case.)
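The two-byte header described above could be packed as follows. The comment gives the field widths but not the bit positions, so this sketch assumes them: message type in bits 6-5, the channel-independence flag in bit 4, the channel in bits 3-0, and, for 11-bit namespaces, the channel bits acting as the high four bits. The numbering of the four message types is likewise assumed.

```python
# Assumed numbering of the four exclusive message types.
MESSAGE_TYPES = (
    "application message",
    "namespace error",
    "namespace request",
    "namespace response",
)


def pack_header(msg_type, namespace, channel=None):
    """Pack the two header bytes; channel=None means channel independent."""
    if not 0 <= msg_type < len(MESSAGE_TYPES):
        raise ValueError("message type must fit in two bits")
    if channel is None:
        if not 0 <= namespace < (1 << 11):
            raise ValueError("channel-independent namespaces are 11-bit")
        # Flag bit set; the channel field carries the top 4 namespace bits.
        first = (msg_type << 5) | 0x10 | (namespace >> 7)
    else:
        if not 0 <= channel < 16:
            raise ValueError("channel must fit in four bits")
        if not 0 <= namespace < 128:
            raise ValueError("per-channel namespaces are 7-bit")
        first = (msg_type << 5) | channel
    return bytes([first, namespace & 0x7F])


def unpack_header(header):
    """Return (msg_type, namespace, channel); channel is None if independent."""
    first, second = header[0], header[1]
    msg_type = (first >> 5) & 0x03
    if first & 0x10:
        return msg_type, ((first & 0x0F) << 7) | second, None
    return msg_type, second, first & 0x0F
```

Both bytes always stay below 0x80, so they remain valid data bytes inside ordinary SysEx framing.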

Some namespace numbers may be predefined by the application in use. Other namespaces are negotiated by the use of the protocol mentioned above.

I think most of the use cases you think of are currently covered by the USB Test and Measurement Device standard and/or the instrument control protocols (GPIB, VISA).

There is OSC, which I think is underappreciated. It's super easy to implement, it's literally just UDP.
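As a rough illustration of how little is involved, here is a sketch of an OSC 1.0 message encoder supporting int32, float32, and string arguments. The function names are made up for this example, and bundles, blobs, and timetags are left out. Strictly speaking this is only the message format; the transport is a separate choice, with UDP being the common one.

```python
import struct


def _osc_string(raw: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC strings require."""
    raw += b"\x00"
    return raw + b"\x00" * (-len(raw) % 4)


def encode_osc(address: str, *args) -> bytes:
    """Encode one OSC message: address pattern, type tag string, arguments."""
    tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, bool):
            raise TypeError("bool arguments are not handled in this sketch")
        if isinstance(arg, int):
            tags += "i"
            payload += struct.pack(">i", arg)  # big-endian int32
        elif isinstance(arg, float):
            tags += "f"
            payload += struct.pack(">f", arg)  # big-endian float32
        elif isinstance(arg, str):
            tags += "s"
            payload += _osc_string(arg.encode("ascii"))
        else:
            raise TypeError(f"unsupported argument type: {type(arg).__name__}")
    return _osc_string(address.encode("ascii")) + _osc_string(tags.encode("ascii")) + payload
```

Sending the result is then one `sendto` call on a `socket.SOCK_DGRAM` socket from the standard library; the receiver's host and port depend on the application.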

I agree. I was watching the first video in the article and thinking there must be similar standard protocols for IoT orchestration. I mean, anything that can be controlled and needs orchestration must have some kind of similar protocol?

Lights, Props, Water Fountains, Drones. Anything that can be a show in Las Vegas.

DMX512 does that, too, and it seems even simpler to me.

MIDI is extensively used for stage light control.

not sure if it qualifies as "cool" but i see midi controllers being used a lot for live video mixing in clubs / events

edit: spelling

I have just started to play around in the world of electronic music and have had my first real exposure to MIDI. I just want to say ... wow. Something like this is so rare. I have pieces of equipment from different decades that can talk to each other using a $2 cable and no computer in between.

As a programmer I'm used to walled gardens and competing standards. MIDI is a breath of fresh air. I hope whatever 2.0 brings can keep this spirit alive.

Once upon a time almost all standards were like that. You got the spec, wrote some code and off to the races. Then proprietary stuff happened, then that got displaced by open standards and now we're back to proprietary. It won't be long before email and messaging are all proprietary.

Wonder what the next swing of the pendulum will look like?

> As a programmer I'm used to walled gardens and competing standards. MIDI is a breath of fresh air. I hope whatever 2.0 brings can keep this spirit alive.

At $250 a SysEx ID it seems MIDI 2.0 is exactly the opposite of that spirit.

> To implement MIDI-CI and MIDI 2.0, you need a manufacturers SysEx ID. A SysEx ID by itself is $250 a year, but it is included with your MMA membership.

A membership, not an open protocol.

There is MIDI competition, more open and some say better, it just never caught on outside open-source/hacker/maker circles.


OSC is fantastic in many ways, but it makes for a pretty inefficient (and unnecessary) replacement for the primary use case of MIDI, which is note on/off messages. It’s much better suited to some of the other layers that got bolted on top of MIDI, such as MIDI Show Control. For most musical purposes, MIDI v1 is perfectly sufficient, well-documented, optimized, and open (from the standpoint of being free to implement).

OSC absolutely has caught on in a lot of professional applications, just not the ones that MIDI was initially designed to serve. It’s huge in the world of theater, for example.

(Background: I wrote the initial OSC implementation for QLab [a theatrical show control application].)

My main beef with OSC is that there is no “there” there. It is hyper flexible at the cost of you needing to design your own meta-protocol. Like every single instrument that supports OSC has a different API with a mess of docs you need to read. Doesn’t really get me in the mood for making music. The DAW manufacturers had a very hard time creating user interfaces that studio engineers could use to configure OSC, and I think that was a main reason nobody ever used it as a synthesizer control protocol.

Very much so. It’s a useful layer to build an API on top of, but saying a device speaks OSC is like saying a backend service speaks HTTP.

OSC is awesome but it never caught on with the mainstream because nobody standardized common music production events like noteOn/noteOff or control change. Furthermore, the original developers never attempted to build any industry consensus.

This was probably partly intentional, as it was designed by the academic community to solve the needs of relatively esoteric music production systems (e.g. [1]), whereas MIDI was standardized by a broad consortium of manufacturers of professional-oriented music products.

[1] https://www.synthtopia.com/content/2009/03/24/slabs-touch-pa...

You'll like DMX and CV too.

I once had a dream that every new synth, sampler or drum machine I brought into my studio would automatically recognize the local MIDI network, join it, and pop up as a new device in my Cubase sequencer.

All wireless of course. Like Bluetooth but actually working. A man can dream.

PS: it would also stream multitrack audio over ASIO wireless and expose its inputs and outputs.

I think a lot of performing musicians actually prefer the reliability of a wired connection with no handshaking going on.


- perfectly synchronised

- zero latency

This may be hard to realise with lots of devices and wireless connections.

> This may be hard to realise with lots of devices and wireless connections.

Assuming that the latency of the wireless connections is at least relatively predictable, you could just introduce a suitable delay like with NTP time syncing, right? Of course, it's possible that the latency will be wildly unpredictable and in that case it's pretty much impossible, but that doesn't have to be the case.
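The "introduce a suitable delay" idea can be sketched as a fixed playout delay. This assumes sender and receiver share a clock (for instance after an NTP-style sync) and that each event carries its send timestamp; the function name and the millisecond values are illustrative.

```python
def schedule_playout(events, fixed_delay_ms):
    """Schedule each event at send time plus a fixed delay.

    events: iterable of (sent_ms, arrived_ms) pairs on a shared clock.
    Returns a list of (play_ms, late) pairs. An event that arrives after its
    target time cannot be played on schedule, so it is played on arrival and
    flagged as late.
    """
    scheduled = []
    for sent_ms, arrived_ms in events:
        target_ms = sent_ms + fixed_delay_ms
        late = arrived_ms > target_ms
        scheduled.append((max(target_ms, arrived_ms), late))
    return scheduled


if __name__ == "__main__":
    # Three eighth notes at 120 bpm with network delays of 3, 12 and 40 ms.
    print(schedule_playout([(0, 3), (250, 262), (500, 540)], fixed_delay_ms=20))
```

The trade-off is that jitter is converted into constant latency: a 20 ms buffer absorbs the first two events but not the third, and a larger buffer delays everything by that much more.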

In the parent's case this latency includes time from a human key press (piano key). The other cases you mention can be compensated for but of course the human input is non-predictable.


Also, my dream included vintage synths with just cv controls that would just connect to some dongle/wireless CV converter and boom, integrated. The dongle converted perfectly between all the various Yamaha, Roland and Moog standards.

I for one expect that MIDI 2.0 helps to boost the RTP MIDI protocol adoption (over ethernet). It has much more speed, longer distances, less latency, proper packet loss recovery (which is not a problem on local networks) and overall much better hardware ecosystem.

Even on WiFi it is useful although latency is very jittery.

Disclaimer: I'm working on a RTP MIDI implementation for linux (https://github.com/davidmoreno/rtpmidid)

> Even on WiFi it is useful although latency is very jittery.

Could you elaborate on why Wifi latency jitter would be an issue for MIDI?

From an outsider's perspective and with only a vague knowledge of MIDI, it doesn't seem to be something that should be any more sensitive to latency jitter than other realtime applications like audio/video.

Music performance is a synchronized effort with very precise timing. If there is lag, everything is just a bit late and a musician can (unconsciously) compensate up to some point. But, for example (if my math is correct), 120 bpm has 8th notes every 250ms.

If a drum beat is sometimes, and only sometimes, 100ms later than it should be, the result is not nice at all. And it is random jitter.

There’s also MIDI timecode, which has to send four messages per frame (up to 120 messages per second), and whose entire job is to convey timing. It’s basically unusable over WiFi.
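The message rate follows directly from the frame rate, since MIDI timecode sends four quarter-frame messages per frame. A small sketch of the arithmetic (the function names are illustrative):

```python
def mtc_messages_per_second(frames_per_second):
    """MIDI timecode sends four quarter-frame messages per video frame."""
    return 4 * frames_per_second


def mtc_message_interval_ms(frames_per_second):
    """Nominal spacing between consecutive quarter-frame messages."""
    return 1000.0 / mtc_messages_per_second(frames_per_second)


if __name__ == "__main__":
    for fps in (24, 25, 30):
        print(fps, mtc_messages_per_second(fps), round(mtc_message_interval_ms(fps), 2))
```

At 30 frames per second the messages are nominally about 8.3 ms apart, so jitter of even a few milliseconds is a large fraction of the spacing the receiver is trying to measure.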

Ok, thanks - I wouldn't have thought 100ms variation would be that noticeable.

As an amateur musician, I played with real time synthesis a lot. Latency, and especially jitter in latency is the biggest enemy. 100 ms is an eternity. 5-7 ms is still noticeable and anything above 10 ms becomes a nuisance. Some artists, especially drummers, hear 2 ms differences.

And it’s quite logical really. A reasonably fast tempo is 180 bpm. Playing sixteenth notes would separate them by about 83 ms. Then you have separation in swing style sixteenths, funk (which is often ahead of the pulse by a tiny fraction) and the real scale is around 20 ms. That’s comparable to 50 frames per second.
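The tempo arithmetic used in this sub-thread is easy to check. A minimal sketch, with illustrative function names, that converts a tempo and subdivision into the time between notes and expresses a given jitter as a fraction of that spacing:

```python
def note_interval_ms(bpm, notes_per_beat):
    """Time between consecutive notes for a tempo and subdivision.

    notes_per_beat is 1 for quarter notes, 2 for eighths, 4 for sixteenths
    (taking the beat to be a quarter note).
    """
    return 60_000 / bpm / notes_per_beat


def jitter_fraction(jitter_ms, bpm, notes_per_beat):
    """Jitter expressed as a fraction of the note spacing."""
    return jitter_ms / note_interval_ms(bpm, notes_per_beat)


if __name__ == "__main__":
    print(note_interval_ms(120, 2))           # eighth notes at 120 bpm
    print(round(note_interval_ms(180, 4), 1)) # sixteenth notes at 180 bpm
    print(jitter_fraction(100, 120, 2))       # 100 ms jitter against eighths
```

Eighth notes at 120 bpm come out at 250 ms and sixteenths at 180 bpm at about 83 ms, so a 100 ms hiccup is 40% of the former and longer than the latter.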

This is also the reason (among others) why orchestras need conductors. The right side would hear the left side 100 millis later due to the speed of sound.
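For a sense of scale, the acoustic delay across a stage can be estimated from the speed of sound. This sketch assumes roughly 343 m/s (dry air at about 20 °C); the stage widths are illustrative, not measurements of any particular orchestra.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees C


def acoustic_delay_ms(distance_m):
    """Time for sound to travel the given distance."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


def distance_for_delay_m(delay_ms):
    """Distance sound travels in the given time."""
    return SPEED_OF_SOUND_M_PER_S * delay_ms / 1000.0


if __name__ == "__main__":
    for width_m in (10, 15, 20, 34.3):
        print(f"{width_m} m -> {acoustic_delay_ms(width_m):.1f} ms")
```

A 100 ms delay corresponds to about 34 m, so narrower stages would see proportionally less; either way the figure is well above the 10 ms that players in this thread describe as noticeable.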

I am of the belief you need to be in the sub-millisecond range for jitter, in order to be capturing a performance authentically.

Whatever the range of perception is, there’s going to be another range where people can’t consciously describe what they are perceiving, yet it affects them.

If you are playing an instrument, eg a MIDI keyboard, I’ve found anything above around 10ms creates a noticeable lag between when you physically touch the key and when you hear the sound.

I notice if the soundtrack on a video is consistently off by 100ms, so I can believe a musician would notice occasional 100ms variation in a beat.

good musicians can notice the difference between 10 and 20 milliseconds of latency pretty easily in my experience

For example, with a video player like vlc, try moving the sound out of sync with the video with the hot keys.

100ms is 1/10th of a second. It’s an agonizing wait when you’re playing.

In broad terms, performing music is much more sensitive to latency than A/V playback is.

Think of it like playback vs a phone call. If you have 0.5s gap between audio and video, you hardly notice. But the same latency in a phone call makes it very awkward.

In music performance, the awkwardness would be multiplied by the number of spectators cringeing :)

> If you have 0.5s gap between audio and video, you hardly notice.

If your video comes later than the audio and depicts a human mouth talking, you would notice it even for a 1-frame delay (0.02s at 50 frames/s); but it's true that if audio is delayed relative to video, then 0.5s is hardly noticeable.

This is because in the real world audio always arrives late compared to images. Our brain is trained to compensate.

Personally I start getting annoyed when the delay is above +200ms or under -50ms.

Excellent demonstration at 6:52 here:


Even 5ms jitter is perceptible, and 10ms is obvious when compared directly with 0ms.

Audio and video streaming have buffering, which hides most jitter. MIDI doesn't have that at all.

As I read it, in brief, it adds much more resolution, and an introspection protocol using bidirectional communication which allows property exchange and profiles.

Both very welcome additions! Finally no stepping on CC and note velocity, and the DAW can really know about the synth / controller capabilities, as it knows already with VSTs.

And keeps MIDI 1.0 compatibility.

Now we just wait a decade for any DAW to support it, plus another million years if you're an Ableton Live user...

Reaper will probably have it quickly. They update it at least once a month and don't have any need to upsell to the next version or tier.

Midi 2 adds some much-needed features (per-note pitch bend, for instance), but what I expect to happen is that the major manufacturers are going to only implement the parts they care about.

I'd kind of like to see MIDI replaced outright with something built on a somewhat different (less piano-centric) abstraction; something more voice-oriented rather than note-oriented. Instead of having some number of fixed pitch notes that you turn on and off and settings that apply to all notes, you have voices that you can control independently (set volume, filter cutoff, control the envelope, disable and enable, and so on). You can do that now with the one-note-per-channel trick or MPE if it's supported, but it's kind of kludgy and only works with synths that are multitimbral to begin with.

> Property Exchange uses JSON inside of the System Exclusive messages.

For some reason this made me lol. The idea of cramming some JSON inside a SysEx seems crazy.

Hopefully the timing is actually improved in 2.0. Once you are past “hello world” getting midi gear to sync up has always been a nightmare.

One of ESR's relatively recent rants about protocol design is optimizing the right thing vs scaling a protocol for use over long time, so he's a big fan of using JSON in the next generation of NTP.

One quote summarizes a couple thousand words "We should be designing to minimize the cost of human attention."


When you can run all the world's stratum 1 traffic on a Raspberry Pi, you shouldn't be optimizing for bandwidth and speed, but for bug-free-ness, security, and ease of use. Likewise, back when a minimal computer system was a multicard S100 Z80 system, minimal MIDI made sense, but nowadays it should be like ad hoc WiFi with a REST API or something similar.

Text is not how we optimize for correctness. I can already see my cheap synth failing on scientific notation oddities, BOM, and other JSON gotchas.

Please read up on langsec.

JSON exposes implementors to all sorts of recursion, numeric precision, etc., where the only advantage is that you can consume the data through whatever JSON library already exists (but also having to handle that library's unique quirks).

When you look into it, while ESR says he's using JSON, he's really using a custom format that is readable by typical JSON libraries, but his format has very strict limitations, and his implementation is designed to error out early, rather than parsing an entire 10MB object tree before handing anything back to the application. He does not acknowledge this inconsistency in the article, only mentioning it late in the comment section.

Extensibility is valuable, but does that mean the format should support '"position": {"x":5,"y":[7.2,"XXIV"],"font-face":"Comic Sans"}', where every single object can have arbitrarily-many unrelated fields inserted? HTTP is better, with a flat key-value list, but if you build on top of HTTP itself, you now do not know if any of the systems on the network or libraries used might respond strangely to some obscure old header.

Personally, I think transmitting integers as text is at best rude, and somewhat comparable to Java allowing any reference to be null: Every client now has to handle an extra error condition on every parse. With binary integers, every bit pattern is valid, so you can perform a single range check to handle every type of bad input. Use 0x1234 as a magic number somewhere, and endianness errors are trivially debugged from a single sample packet.
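A small sketch of the binary approach described above, using Python's `struct` module. The packet layout (a 16-bit magic number followed by a 32-bit unsigned value) and the function names are invented for illustration; the point is that one range check covers every bad value, and a byte-swapped magic number is recognisable from a single packet.

```python
import struct

MAGIC = 0x1234
SWAPPED_MAGIC = 0x3412  # what MAGIC looks like if the sender used the other byte order


def encode_packet(value, byte_order=">"):
    """Pack the magic number and an unsigned 32-bit value (big-endian by default)."""
    return struct.pack(byte_order + "HI", MAGIC, value)


def decode_packet(packet, max_value):
    """Decode a big-endian packet, rejecting bad framing and out-of-range values."""
    if len(packet) != 6:
        raise ValueError("bad length")
    magic, value = struct.unpack(">HI", packet)
    if magic == SWAPPED_MAGIC:
        raise ValueError("byte-swapped magic: sender used the wrong endianness")
    if magic != MAGIC:
        raise ValueError("bad magic number")
    # Every 32-bit pattern is a valid integer, so one range check is enough.
    if value > max_value:
        raise ValueError("value out of range")
    return value
```

For example, `decode_packet(encode_packet(1000), 65535)` returns 1000, while a packet built with `byte_order="<"` is rejected with the endianness message instead of being silently misread.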

I'm trying to understand the 'bi-directional' part. There used to be an idea of MIDI out & MIDI in. So does this mean you only need one cable connected now?

It sounds that way - but they also say that connections over 5-pin DIN is/will be MIDI 1.0 only.

The future of MIDI is AVB and/or AES67 with OSC

You seem to be the only person commenting that gets it. I think that might be because anyone who cares just skipped this press release. Even calling this MIDI 2.0 is a deception. We have MIDI, it's great. The tools that follow exist already and MIDI 2.0 doesn't appear to add any value.

Whatever this hustle is, it can fuck right off in my book.

Wow. This took so long to happen that I assumed it had already happened, a decade ago. Early RFCs went out while I was in college.

MIDI 2.0 looks like garbage and it barely features what Open Sound Control has been doing forever. I'm interested, but the demos here were made by people that don't make music and suck at hardware, software, blogging, and video presentation.

Whatever this is, I'm hard pressed to care.

Is the connector still that only DIN+USB-Micro hybrid? It looks like it's expensive and fragile.

> Will MIDI 2.0 devices need to use a new connector or cable?

> No, MIDI 2.0 is a transport agnostic protocol.

> That's engineering speak for MIDI 2.0 is a set of messages and those messages are not tied to any particular cable or connector.

DIN connectors are not expensive, nor fragile (too young to have used cassette players with them?). They are however big. Which is why the industry started using mini-jack connectors. They are now officially sanctioned:


For backwards compatibility, all MIDI 2.0 features will still work over 5 Pin DIN cable.

According to the linked page, 5-pin DIN still only supports traditional MIDI 1.0 and there's no plan to change this currently.

...which means that almost everyone using midi 2.0 between multiple devices will be doing it over USB, which is a shame because very few hardware synthesizers or controllers can act as a USB host.

That's fine if you're connecting everything to a computer, but it's kind of a step backwards from what MIDI used to be, which was an easy way to connect almost any keyboard to almost any synthesizer made in the last three and a half decades or so.

I've wondered if CAN bus would be a good modern-ish alternative to DIN-5 and USB, but I don't know enough about it to say if it has some limitation that's not immediately apparent but which would become a problem. (On the plus side, it's much faster than plain DIN-5 midi, allows longer wires than USB, and it seems to be supported natively on a lot of cheap microcontrollers.)

I reckon CAN would be a great alternative. It has inbuilt support for message prioritization, so important stuff like timing sync messages could have higher priority. Also it's differential so long cable runs are not a problem and it has good noise immunity. Oh, and it's a bus so virtually unlimited devices on the same bus, in any topology.

Oh, and also I've always wanted to use my synths in the car!

It would be nice if the protocol of the future was wireless and could support the actual audio as well though, but all that adds extra complexity of course.

The old 5-pin DIN hardware doesn't have anything like the bandwidth needed for MIDI 2.0. In fact it can struggle with MIDI 1.0.
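The bandwidth limit is easy to quantify. MIDI 1.0 over 5-pin DIN runs at 31,250 baud with one start bit, eight data bits, and one stop bit per byte, so each byte occupies the wire for 0.32 ms. A small sketch (function names are illustrative):

```python
BAUD = 31_250       # MIDI 1.0 DIN serial rate, bits per second
BITS_PER_BYTE = 10  # 1 start bit + 8 data bits + 1 stop bit


def wire_time_ms(num_bytes):
    """Time the given number of bytes occupies a 5-pin DIN link."""
    return num_bytes * BITS_PER_BYTE / BAUD * 1000.0


def chord_time_ms(num_notes, running_status=True):
    """Time to send note-ons for a chord, with or without running status."""
    if running_status:
        return wire_time_ms(1 + 2 * num_notes)  # one status byte, then data pairs
    return wire_time_ms(3 * num_notes)


if __name__ == "__main__":
    print(round(wire_time_ms(3), 2))                          # one note-on
    print(round(chord_time_ms(10), 2))                        # 10-note chord
    print(round(chord_time_ms(10, running_status=False), 2))  # same, no running status
```

A ten-note chord therefore takes roughly 7 to 10 ms to leave the cable, which is already in the range that players in this thread describe as audible, before any larger messages are considered.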

> 16bit note velocity

For electronic drums I would have much preferred at least 24bit. Volume is by far and away a drum's most expressive dimension, so it will be limited by a 16bit velocity range. Adding velocity curves just masks the problem

Though of course 16bit is orders of magnitude better than the current 7bit range

>For electronic drums I would have much preferred at least 24bit. Volume is by far and away a drum's most expressive dimension, so it will be limited by a 16bit velocity range.

It won't be limited at all, real human players have no control as subtle as 256 levels, much less 65K levels... Nobody would even notice anything...

Not my experience. The expressive control is exponential, so either you clip on either end of the velocity spectrum, or you get discrete steps at the lower end.

I agree with the previous person that 256 levels of amplitude should be sufficient purely when it comes to velocity as long as these levels are spread appropriately (i.e. non-linearly). If the expressive control is exponential, that suggests to me that the data itself should be exponential.
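One way to spread a small number of velocity levels non-linearly is to treat velocity as a position on a decibel scale. This is only a sketch: the 60 dB dynamic range is an assumed figure, not something from the MIDI spec, and real instruments use their own curves. MIDI 1.0 velocity is 7-bit, so the default maximum here is 127.

```python
def velocity_to_amplitude(velocity, max_velocity=127, range_db=60.0):
    """Map a velocity to linear amplitude on an exponential (dB-linear) curve.

    Velocity 0 is silence; max_velocity maps to amplitude 1.0. The range_db
    value is an assumption about the instrument's usable dynamic range.
    """
    if velocity <= 0:
        return 0.0
    level_db = (velocity / max_velocity - 1.0) * range_db
    return 10 ** (level_db / 20.0)


def step_size_db(max_velocity=127, range_db=60.0):
    """Loudness difference between adjacent velocity values."""
    return range_db / max_velocity


if __name__ == "__main__":
    print(round(step_size_db(), 3))       # 7-bit velocity
    print(round(step_size_db(65535), 5))  # 16-bit velocity
```

With these assumptions a 7-bit velocity gives steps of a little under half a decibel across the whole range, with no bunching at the quiet end; a 16-bit velocity shrinks the step to under a thousandth of a decibel.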

I know literally nothing about drumming but I've no doubt there are plenty of other characteristics of a drum strike than velocity. Such as the mass being applied to the hit (e.g. is it being hit with the weight of the stick alone or is it the drummer's whole arm) or the location of the strike, or the release time.

Speaking as a producer who's programmed a lot of natural-sounding drum tracks...

There are plenty of variables other than velocity -- location on the drum head being the prime one, but there are others. Because of this, sampling an acoustic drum kit involves capturing a suitable number of random variations, and the end result is often gigabytes (i.e. hours of content) in size, even though each sampled hit is just a few seconds long. Not having enough variation in your sample set makes the programmed drums sound unnatural, since excessive repetition doesn't gel with how we experience acoustic drums.

Certain variables are more important than others, though. One notable sound is the 'rim shot' which means striking the drum head and rim simultaneously, which causes all kinds of constructive interference and results in a very powerful sound. It's the holy grail of rock drumming. In drum programming, rim shots are often a separate stack of samples with its own velocity layers, each layer with a set of random variations.

The new attributes in the spec should make it possible - although not entirely easy - to include 2D position information with note on messages.

I would have been happier with a more general note spec that left the number of attributes and their resolution open and system-definable. This would allow 2D/3D/4D/etc control of note events, super-high resolution pitch definitions for microtonal support, and so on.

Bandwidth really isn't an issue any more, so there's no reason to limit the spec to a low common denominator.

Even so - 2.0 is better than the limitations of 1.0. So that's progress.

3D positioning was added to 1.0, described in RP-049 dated 2009-07-23. The parameters' MSB is 61 (0x3D, nice)

It bears some resemblance to OpenAL source parameters which is little surprise as Creative seems to have written it. Some obvious differences:

- sources' positions are sent in azimuth/elevation/distance, i.e. spherical coordinates instead of rectangular

- the positions are always relative to the listener instead of often having a listener that moves around in a stationary 3D world

- the source is now allowed to be both spatialized and stereo with extra parameters for angular distance between the "speakers", the roll angle of the pair, etc.
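
For comparison with the rectangular coordinates OpenAL uses, here is a minimal conversion sketch. The axis and angle conventions are my assumptions (OpenAL-style axes with +x right, +y up, -z straight ahead; azimuth positive to the right, elevation positive upward) and are not taken from RP-049, so check the document before relying on them.

```python
import math

def spherical_to_rect(azimuth_deg: float, elevation_deg: float, distance: float):
    """Convert listener-relative azimuth/elevation/distance to (x, y, z).

    Assumed conventions (not from RP-049): azimuth 0 is straight ahead
    and increases to the right; elevation 0 is the horizontal plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    horizontal = distance * math.cos(el)   # projection onto the floor plane
    x = horizontal * math.sin(az)
    y = distance * math.sin(el)
    z = -horizontal * math.cos(az)         # -z is "forward" in this convention
    return (x, y, z)

print(spherical_to_rect(0.0, 0.0, 1.0))    # a source straight ahead
print(spherical_to_rect(90.0, 0.0, 2.0))   # a source to the right
```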

I located the PDF maybe on Google, maybe by accident more than a few years ago. (I think it was from MIDI.org even then) I had to make an account at MIDI.org in January just to look through the specs, and it was there. Now I can't find a link so I'm afraid it disappeared behind the MMA member paywall. <sigh> Here's to progress.

I'd agree with you for synthesizing, particularly with a master volume knob. But playing live my volume range is from audible to just me (So I can plan what people will hear), to easily the loudest thing going ;) Which is louder than you want to set your CD player :)

Also, I strongly believe in using dynamics throughout the set. The min-max range I use is one of the strongest impacts I have on what people feel. Tinkled whisper to roar

Though yes, like you, I love my other variables too like strike position, angle, etc

Remember that drums are two dimensional, but circular, so there is at least one more thing in their response: the timbre of the sound is affected by the radial location of the hit, which encourages different harmonics and hence a different sound. You can also hit parts other than the drumhead - and the rim and head (nearly) simultaneously.

You wouldn't be able to tell between a hit in velocity 1 or 2 or between 254 vs 255 is my point.

If the velocity is a^v, with a a number like 1.1 and v a number between 0 and 255, I would agree. However, what I’ve seen is that 2 is twice as loud as 1 and 256 twice as loud as 128. That means soft passages become discretized and uneven.
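
To make the difference concrete, here is a small sketch comparing the loudness jump between adjacent velocity values under the two mappings. The 60 dB total range for the exponential case is an arbitrary figure picked for illustration.

```python
import math

def linear_step_db(v: int) -> float:
    """dB jump from velocity v to v + 1 when amplitude is proportional to v."""
    return 20.0 * math.log10((v + 1) / v)

def exponential_step_db(dynamic_range_db: float, levels: int) -> float:
    """dB jump between neighbours when amplitude = a**v: the same everywhere."""
    return dynamic_range_db / (levels - 1)

quiet_step = linear_step_db(1)      # going from velocity 1 to 2
loud_step = linear_step_db(254)     # going from velocity 254 to 255
even_step = exponential_step_db(60.0, 256)  # assumed 60 dB over 256 levels

print(f"linear, 1 -> 2:     {quiet_step:.2f} dB")
print(f"linear, 254 -> 255: {loud_step:.3f} dB")
print(f"exponential, any:   {even_step:.3f} dB")
```

With the linear mapping the bottom step is about 6 dB, which is clearly audible, while the top steps are far below anything a listener could pick out. The exponential mapping spends the same 256 values evenly in dB.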

Why wouldn't 65536 separate velocities be enough, even for the most expressive drummer (or indeed for any other instrument)?

I'd bet money on it that you couldn't tell even if it was 10-bit. 16-bit gives you 65536 levels of volume, that's surely overkill for anything a human can control or recognize. You certainly can't consistently drum with 65,536 different levels of volume. You wouldn't be able to even do 256.

99% of digital recorded music only has 16 bit velocity for every sample. I think it'd be fine.

16bit is sufficient for listening, but it leaves no headroom for production. Modern music is often recorded in 24bit and only rendered to 16bit as the final step.

That is correct for recording with microphones, however that is largely to avoid having to be overly precise with setting levels in order to maximise the value of those final 16 bits.

This doesn't analogise well to digital instruments, as the minimum and maximum velocity/amplitude/whatever is already known and defined. You already know what the exact potential dynamic range is before the first note is ever played. The concept of headroom doesn't exist. Anything that you can think of which you might define as "headroom" is either beyond the input and/or output capabilities of the device and therefore irrelevant, or it should be within the normal scale.

No, because digital processing adds distortion to the signal, which means that having an input beyond perceptual accuracy still matters to a production needing to apply lengthy effects chains. It just matters less than the case of a quiet signal recorded at a linear bitdepth.

For this sort of reason many digital effects operate at an oversampled rate (i.e. they apply a FIR filter to represent approximately the same signal at a higher sample rate like 1.5x or 2x, do the processing there, then downsample again). Contemporary DAW software has tended towards doing processing in 64-bit floating point, while source recordings typically go up to 24-bit/192kHz, which is probably "way more than good enough" even if some room to inflate the specs further exists.

For playback, 16-bit/44.1kHz stereo is still as good as ever. You're limited more by the other parts of the signal path in that case.

This is concerning a velocity control signal, not a stream of audio samples. This control signal would be mapped to the internal processing of a synthesis or sampling engine, resulting in an audio signal whose data format (analog or 16/24/32-bit digital) comes down to the physical characteristics and design of the synth. Once it leaves the synth, a host can convert to whatever it needs to have enough headroom for further processing.

This is orthogonal to the issues remediated by oversampling (harmonic artifacts due to aliasing in non-linear processing like saturation, not distortion issues related to bit-depth).

16-bit velocity signals, if linearly mapped, mean 96dB of velocity sensitivity, which is pretty good. And it won't be linearly mapped, because exponential velocity mappings are already prevalent in the industry.
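
The 96dB figure is just the ratio between the largest and smallest nonzero step of a linear N-bit value, expressed in decibels. A quick sketch of the arithmetic, with 7 and 24 bits included for comparison:

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Dynamic range of a linearly mapped N-bit value: 20 * log10(2**bits)."""
    return 20.0 * math.log10(2 ** bits)

for bits in (7, 16, 24):
    print(f"{bits:2d} bits: {dynamic_range_db(bits):.1f} dB")
```

Each extra bit adds roughly 6dB, so the jump from 7 to 16 bits is worth about 54dB of extra linear range.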

I'm still having a hard time believing that the drum computer only having a 16-bit representation of the velocity of the stick hitting the pad will matter in its goal to produce a velocity-scaled (probably 24-bit) sample on its output port.

There's lots of way more interesting information to be represented that'd be more valuable than just a raw 24-bit velocity - the angle and location of the hit, how much pressure (not velocity) was being applied, how the stick moved while it was in contact with the skin, the material of the stick, etc., if you truly want to capture the dynamics of drumming.

“No” what?

Did you interpret the word “largely” to mean “entirely” by accident?

This post appears to be a generic diatribe on the facts about sound mixing. It does not appear to be a response to anything I wrote, which is not about sound mixing at all. My post was a rejection of the analogizing between MIDI data streams and sound recording.

You also need good mics and preamps otherwise some of your 24 bits might be filled with noise.

Good luck finding any analog gear period that can saturate 24 bits and not have noise in the LSBs, it just doesn't exist. That's not the point of 24-bit audio, it's not needing to worry about clipping. That's why knocking it down to 16-bit for playback is practically speaking fine.

velocity != bit depth of a recording

Your resolution problem with electric drums is not MIDI, it's your drums.

Remember that music on a CD has only 16bit depth. It's plenty of dynamic range, especially for a control signal that can map to any output amplitude range.

It's plenty of dynamic range only when you know in advance what the lowest and highest values are going to be so you can scale it correctly, or if you're using a logarithmic scale. Neither is as convenient as a higher bit depth linear scale.

Not as fun to read, outside the music industry. Lack of consistent samples has been one of MIDI's major issues, and even in games these days, nobody uses it, sadly.

MIDI is just a control protocol. You are thinking of General MIDI [1], which defined a vague set of sounds associated with a program number, that allowed playing a sequence on different engines with horribly mixed results as you said.

MIDI 2.0 brings additional capabilities to device interconnection, and AFAIK has nothing to do with GM (which not many people care about anymore)

1: https://en.m.wikipedia.org/wiki/General_MIDI

Lack of consistent samples is far worse without General MIDI. You could get a flute switched with a snare drum.

Lack of consistent samples is not an issue with professional use of MIDI.

Professionals (musicians, engineers, etc) don't use MIDI as a general-purpose playback method, they use it to control their samplers, synths, external effects units etc.

They don't need "consistent samples" because they provide their own samples, different for every song (plus pure synth sounds, etc).

This is going to be hard to explain without outlining a whole recording setup, but basically, a lot of the time, in a given musician's studio, the program numbers are going to refer to different things depending on how he chose to configure the patches (presets) on his synthesisers. In some cases, patches aren't even used, such as when an old Minimoog is retrofitted with MIDI and the notion of "Flute" or "Piano" becomes meaningless. In other cases, the synthesiser is a sampler and the samples were programmed by the musician at custom program numbers. In the case of software instruments in a DAW, patches are usually chosen by the DAW itself and no program message is sent to the synth, just note on/off, modulation, pitch bend and other controller messages, and if a switch of instruments is needed during a song, you just add a new track for that.
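
For readers who have never looked at the wire format, this is roughly what those messages look like in MIDI 1.0. The helper names are mine; the status bytes (0x90 note on, 0x80 note off, 0xC0 program change) are the standard ones. Note that a program change carries only a number from 0 to 127, and what sound that number selects is entirely up to the receiving device.

```python
def _check(channel: int, *data: int) -> None:
    # Channels are 0-15 on the wire; data bytes are 7-bit.
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15")
    if any(not 0 <= d <= 127 for d in data):
        raise ValueError("data bytes must be 0-127")

def note_on(channel: int, note: int, velocity: int) -> bytes:
    _check(channel, note, velocity)
    return bytes([0x90 | channel, note, velocity])

def note_off(channel: int, note: int, velocity: int = 0) -> bytes:
    _check(channel, note, velocity)
    return bytes([0x80 | channel, note, velocity])

def program_change(channel: int, program: int) -> bytes:
    _check(channel, program)
    return bytes([0xC0 | channel, program])

# Middle C on the first channel, then release it.
print(note_on(0, 60, 100).hex())
print(note_off(0, 60).hex())
# "Program 5" means whatever the receiving synth has stored in slot 5.
print(program_change(0, 5).hex())
```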

Basically, MIDI began as a protocol used between hardware devices. General MIDI is a product of the ROMpler era, a ROMpler being a sampler that can't be reprogrammed, i.e. what most people know as a "keyboard", and it's far from universal.

Musicians often don't care about standardising. They're hooking up all sorts of wonky instruments to their rig and reconfiguring it for one-off projects. There is often no need to recall specific patches after you've recorded the part you wanted.

MIDI is like HTTP, while GM is like, maybe, the named CSS colors.

edit: TCP -> HTTP.

>Lack of consistent samples

MIDI has nothing to do with samples itself.

>and even in games these days, nobody uses it, sadly.

Most recorded music, soundtracks, and almost 100% of game music is done with MIDI.

MIDI doesn't amount to the .mid files that people used to download / play on some player on their PC and which sound tinny.

It's a professional specification that's used in every single studio, and is the basis of every professional digital workstation for recording music and sequencing electronic instruments.

What you're describing is like confusing the JVM with Java applets, as if that's its only use.

It does not make sense to use it in games. It would need a good synthesis engine plus samples which consumes much more CPU than reproducing a MP3.

If your game goes the synthesis route (chiptunes and whatnot), MIDI is a communication protocol between devices, so using it to connect a game to its own synthesis engine makes less sense. But you may want MIDI so you can compose in a standard music composition program.

It is also used in other areas that need automation, such as lighting (DMX).

But I agree it would be nice to have better examples.

> It would need a good synthesis engine plus samples which consumes much more CPU than reproducing a MP3.

My 386 could do a decent job running DOS trackers (software synths) years before mp3’s were feasible on PC hardware (there was not enough storage, bandwidth or compute for MP3’s).

The Kosmic Free Music Foundation published a ton of music on one CD-ROM this way:


MIDI actually allowed game designers to decouple samples from the events that triggered them.
