
An update to 37-year-old MIDI - pseudolus
https://qz.com/1788828/how-will-midi-2-0-change-music/
======
core-questions
> In MIDI 1.0, all data was in 7-bit values. That means musical qualities were
> quantized on a scale of 0 to 127. Features like volume, pitch, and how much
> of the sound should come out of the right or left speaker are all measured
> on this scale, with 128 possible points. This is not a lot of resolution.
> For some really sophisticated listeners, they can clearly hear the steps
> between points.

This is extremely misleading. Sure, the velocity input into your synth is
going to be at 7-bit resolution, but as soon as the synth has it, it can play
anything it wants at whatever volume it wants, based on how you have
configured it. There's nothing about the external 7-bit implementation that is
really limiting the dynamics of the synth itself.

What I would find more useful is higher-resolution timing, plus more
'awareness' of the features of the device at the other end so that controls on
a surface can be mapped to synth parameters automatically.

~~~
kazinator
Also, consider the volume parameter.

The human ear's dynamic range is about 120 dB, which includes about 20-30 dB
of pain. With 128 values, we can map that with roughly 1 dB resolution.

16 bit audio ("CD quality") only has a 90 dB dynamic range.

We would almost never want a single instrument to have a 90 dB dynamic range,
but if we did, MIDI values could logarithmically encode it with a better than
1 dB per step resolution.
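
As a rough illustration (my own sketch, not anything from the MIDI spec), here
is how 128 steps could be spread logarithmically over a 90 dB range, giving
well under 1 dB per step:

```python
DB_RANGE = 90.0   # dynamic range to cover, in dB

def midi_to_gain(value: int) -> float:
    """Map a 7-bit MIDI value to a linear gain over a 90 dB range.
    127 -> 0 dB (unity gain), 1 -> -90 dB, 0 -> silence."""
    if value <= 0:
        return 0.0
    db = -DB_RANGE * (127 - value) / 126   # about 0.71 dB per step
    return 10 ** (db / 20)

print(midi_to_gain(127))  # 1.0
print(midi_to_gain(64))   # roughly -45 dB, i.e. about 0.0056
```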

In a mix, any instrument that is reduced by more than about 20 dB will
disappear.

When synthesized music (e.g. electronic drumming) lacks dynamics, it is not
because of the encoding of the raw volume parameter. It's due to other
factors, like poor synth patches. Poor synth patches use a small number of
samples, like say a snare drum being hit in three different ways, and they
stretch that over the dynamic range with some naive scaling. A real drum
doesn't work that way; it makes a different sound for each intensity with
which it is struck. You need samples of it being played at myriad volume
levels, sorted by intensity and mapped onto the intensity range.
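
A minimal sketch of that intensity mapping, assuming a handful of hypothetical
snare samples (the file names and layer boundaries are made up for
illustration):

```python
import bisect

# Hypothetical snare recordings at increasing strike intensities.
# Each entry: (upper velocity bound for the layer, sample file).
VELOCITY_LAYERS = [
    (24, "snare_pp.wav"),
    (48, "snare_p.wav"),
    (72, "snare_mf.wav"),
    (96, "snare_f.wav"),
    (127, "snare_ff.wav"),
]

def sample_for_velocity(velocity: int) -> str:
    """Pick the recorded layer whose intensity range contains this velocity."""
    bounds = [upper for upper, _ in VELOCITY_LAYERS]
    index = bisect.bisect_left(bounds, velocity)
    return VELOCITY_LAYERS[min(index, len(VELOCITY_LAYERS) - 1)][1]

print(sample_for_velocity(30))   # snare_p.wav
print(sample_for_velocity(110))  # snare_ff.wav
```

With more recorded layers (and less gain scaling per layer), the result gets
closer to how the real drum behaves.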

Some instruments don't even change intensity that much when they are played
louder; a lot of the perception of loudness comes from changing harmonic
content. If you fake it with one sample that is just volume-adjusted, it will
not sound right.

Synthesizers have tricks to help with this, like low-pass filters that respond
to velocity: hit the key harder, and more high frequencies go through. That's
one tool in the box for creating a more dynamic sound from scratch.
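
A toy version of that trick - mapping velocity to a low-pass cutoff so harder
hits let more highs through. The frequency range here is arbitrary, not taken
from any particular synth:

```python
MIN_CUTOFF_HZ = 500.0     # soft hits: dull, mostly fundamentals
MAX_CUTOFF_HZ = 12000.0   # hard hits: bright, full harmonic content

def cutoff_for_velocity(velocity: int) -> float:
    """Map MIDI velocity (1-127) to a filter cutoff frequency.
    Interpolating exponentially keeps the sweep perceptually even."""
    t = max(0, min(velocity, 127)) / 127
    return MIN_CUTOFF_HZ * (MAX_CUTOFF_HZ / MIN_CUTOFF_HZ) ** t

for v in (20, 64, 127):
    print(v, round(cutoff_for_velocity(v)), "Hz")
```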

~~~
atoav
The perceptibility of the 128 steps depends on the parameter they influence.
If the MIDI parameter influences e.g. some form of pitch, 128 values won't get
you very far before the steps become audible.

It also has to do with the pressure range of MIDI controllers: 128 steps are
few if you have to distribute them between "barely touching" and "hammering on
it with full force". When you play a real instrument, you will notice that the
range between the quietest and the loudest you can manage is usually huge. For
MIDI controllers this range is quite limited, so this is a good development.

~~~
rzzzt
You can distribute the values non-linearly, though; the difference between
smallest and largest value might be big, but I don't think I could hit a pad
or key in 128 different ways. Some controller software does offer a selection
of velocity profiles.
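
One way such a velocity profile could work - an illustrative power curve, not
any vendor's actual implementation:

```python
def apply_velocity_curve(raw: int, exponent: float = 0.6) -> int:
    """Remap a raw 1-127 velocity through a power curve.
    exponent < 1 lifts soft hits (a 'light touch' profile);
    exponent > 1 compresses them (a 'heavy' profile)."""
    normalized = max(1, min(raw, 127)) / 127
    return max(1, round(127 * normalized ** exponent))

for raw in (10, 40, 90, 127):
    print(raw, "->", apply_velocity_curve(raw))  # 10->28, 40->64, 90->103, 127->127
```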

~~~
oriolid
It depends on the instrument. 128 positions are plenty for piano. Of the
other instruments, drums are the one I know best, and there a single parameter
is just not enough: the result depends on velocity, the position where the
stick hits the drum head, and, for non-round-tip sticks, the stick angle. For
cymbals, in addition to hitting different parts of the cymbal, the stick tip,
shaft and shoulder give different sounds. And so on... There's a good reason
why loops sampled from acoustic drums are used even though drum synths exist.

~~~
PaulDavisThe1st
You do not need MIDI 2.0 to solve any of those problems.

~~~
oriolid
No, and I don't expect MIDI 2.0 to help. It was just a response to the idea
that a single parameter would be enough if it weren't limited to 7 bits.

------
kazinator
By the way, seems that MIDI playback support in operating systems and browsers
is petering out.

About nine or ten years ago, I had little trouble playing back MIDI files on
various platforms.

Recently, I sent an old .mid file that I produced almost a decade ago to
someone (at Google!) and they couldn't play it. After trying it myself, I was
shocked. Browser after browser, system after system; no dice.

I ended up converting to MP3 using Timidity -> Lame.
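
For reference, a small sketch of that conversion chain, assuming timidity and
lame are installed and on the PATH (flags from memory, so check your local man
pages):

```python
import subprocess

def midi_to_mp3(midi_path: str, mp3_path: str) -> None:
    """Render a .mid file to WAV with timidity, then encode it to MP3 with lame."""
    wav_path = mp3_path + ".tmp.wav"
    # -Ow selects RIFF WAVE output, -o names the output file.
    subprocess.run(["timidity", midi_path, "-Ow", "-o", wav_path], check=True)
    subprocess.run(["lame", wav_path, mp3_path], check=True)

midi_to_mp3("old_song.mid", "old_song.mp3")
```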

~~~
mmis1000
It is possible for current browsers to play MIDI using a JavaScript library,
much as you would use a native player to play MIDI; see MIDI.js.

So, for programming usage, it doesn't matter that much.

------
robbrown451
MIDI 2.0 is awesome and all, but I'd be happy if Firefox supported MIDI 1.0.
(only Chrome and Edgium and the like do, Firefox has been saying they will for
ages --
[https://bugzilla.mozilla.org/show_bug.cgi?id=836897](https://bugzilla.mozilla.org/show_bug.cgi?id=836897)
\-- and it seems to be extremely low priority)

~~~
unlinked_dll
Audio apps should stay out of the browser

~~~
monetus
Why?

~~~
unlinked_dll
Because "the audio thread waits for nothing." Garbage collection is
unacceptable, more than a single pointer dereference to get to state is
borderline unacceptable, and synthesizers (that an MIDI system would trigger)
are up there with some of the most computationally demanding software you can
develop even for toy projects. Especially so, even. A naively coded soft synth
in C++ talking directly with drivers can easily crap out with 4-5 voices of
polyphony.

Now onto why a browser is a bad place to do this. Your audio subsystem
supplies a buffer to you (or you supply a buffer to it, depending on the OS)
and you need to fill it in a fixed amount of time to avoid a buffer underrun,
or else increase the buffer size. At a 48 kHz sample rate and a buffer size
of, say, 512 (the default on macOS), you have about ten milliseconds to fill
it.
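
That budget falls straight out of the buffer size and sample rate - a quick
back-of-the-envelope check, not tied to any particular audio API:

```python
SAMPLE_RATE = 48_000   # Hz
BUFFER_SIZE = 512      # frames per callback

print(round(BUFFER_SIZE / SAMPLE_RATE * 1000, 1), "ms per buffer")  # ~10.7 ms

# Smaller buffers mean lower latency but a tighter deadline:
for frames in (64, 128, 256, 512):
    print(frames, "frames ->", round(frames / SAMPLE_RATE * 1000, 1), "ms")
```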

But you don't get all that time. As buffer sizes get smaller, the dominant
factor becomes how much time it takes to get data from kernel to userspace,
and from userspace back down to the kernel. In a browser you now have to go
from kernel to user space to sandbox, back down to user space, and down to the
kernel.

So getting low-latency MIDI input to a browser, rendering it to sound, and
getting it back out is basically a worst-case scenario in terms of latency.
Yes, you can do a lot when you don't care about latency, but then the question
is: why would you care about live MIDI input if you don't care about latency?

Firefox's audio engine (or at least WebRender) is actually fairly impressive -
I've heard that they can get sub-ms latency. But I don't really trust that you
can do that and do serious audio processing, which any non-sine synth is going
to entail.

~~~
nine_k
A midi event stream is _massively_ less demanding than a waveform stream. A
browser could at least be a source of such events that you would route to your
favorite hardware synth.

Interestingly, browsers have been capable of native audio and video playback
for years; somehow that wasn't such a problem, given the relaxed latency
requirements.

~~~
unlinked_dll
It's because streaming video and audio is simplex, while MIDI + rendering is
duplex. Round-trip latency for a decent app needs to be under 5 ms; it doesn't
matter if you have 200 ms+ for receiving packets from a video/audio source on
a web page, since you're not providing live input to get live output.

------
anigbrowl
It won't. Most people aren't interested in those musical subtleties; the ones
that are either like their music acoustic (from ballads to grand opera) or
listen to experimental electronic music like Autechre which liberated itself
from the MIDI straitjacket years ago.

It will likely change music performance to some extent, making techno and
other highly synchronous styles more fun and interesting to perform live than
is currently practical.

------
unlinked_dll
imho MIDI 2.0 hits the low-hanging fruit but doesn't go far enough toward
fixing the biggest problem in professional audio: deterministic rendering.
Basically, the same input should make the same output, which doesn't happen
even in totally digital systems - doubly so with live (or recreated) MIDI
events.

I'll give an example - you have your PC with $DAW_OF_CHOICE running and plug
in $MIDI_2_CONTROLLER to USB and enable a track and hit record, play your
stuff, and stop. When you play it back, unless the cosmic forces are
exceptionally on your side, _it will not sound the same as when you played
it_. It's subtle and sometimes ignored or even desirable, but it's there.

There are a lot of reasons why this problem hasn't been solved, some technical
and others artistic. But imho, a certain grade of equipment (namely,
recording/reproduction, of which the MIDI protocol is a key component) should
behave identically under the same conditions and be able to reproduce a
performance exactly. It's a goal that borders on absurd, but I think we could
do it!

Namely - enough of this "transport agnostic" horseshit. Give me a professional
event protocol that is sent synchronously with the audio, on the wire and on
the board. I want my codec chips interleaving MIDI messages (I don't care if
they're 16/24/32 bit) as a separate audio channel, even if it is undersampled
(e.g. send LRLRLRMLRLRLRM over I2S at the appropriate clock so the audio stays
in time and the MIDI is undersampled by 3x per word). I want MIDI events _in
my interrupt_ and in the callback, synchronized _exactly_ to whatever audio is
coming in at the same time.
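
A toy sketch of the framing being imagined here - one undersampled MIDI word
riding along after every few stereo frames. This is purely illustrative; no
codec actually speaks this format:

```python
def interleave_words(left, right, midi_bytes, frames_per_midi=3):
    """Emit the word stream L R L R L R M ...: one MIDI word after every
    `frames_per_midi` stereo frames, zero-filled when no event is pending."""
    midi_iter = iter(midi_bytes)
    stream = []
    for i, (l, r) in enumerate(zip(left, right), start=1):
        stream += [l, r]
        if i % frames_per_midi == 0:
            stream.append(next(midi_iter, 0))  # MIDI word shares the audio clock
    return stream

audio_l = [100, 101, 102, 103, 104, 105]
audio_r = [200, 201, 202, 203, 204, 205]
midi = [0x90, 0x3C]   # note-on status byte, then the note number
print(interleave_words(audio_l, audio_r, midi))
# [100, 200, 101, 201, 102, 202, 144, 103, 203, 104, 204, 105, 205, 60]
```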

And give me total, absolute, dictatorial control over the audio drivers,
using a ping/pong buffer that gets mapped to memory in the audio callback for
my professional application and mine alone. Minimize the kernel time spent
mapping physical memory to virtual memory, like ASIO drivers but without the
bullshit. I want as little overhead as possible between the frame buffer
coming in off DMA and my code, without the possibility of pwning the system.
Hell, give me a dedicated core!

~~~
rewgs
> I'll give an example - you have your PC with $DAW_OF_CHOICE running and plug
> in $MIDI_2_CONTROLLER to USB and enable a track and hit record, play your
> stuff, and stop. When you play it back, unless the cosmic forces are
> exceptionally on your side, it will not sound the same as when you played
> it. It's subtle and sometimes ignored or even desirable, but it's there.

I'm sorry, but...what?

I'm the rare non-professional-programmer on this site. I'm a professional
composer -- I spend my days writing/producing/mixing music on a computer.

Can you go into more detail about what you're talking about here? Because, no
offense, but if you were right, I think I would have noticed by now.

~~~
blattimwind
I wouldn't be at all surprised if the process is not bit perfect; the question
is: can you hear a difference on your own? Can you hear the difference if it
is pointed out ahead of time (through waveform comparison)?

~~~
rewgs
I would be absolutely surprised if it _weren't_ bit perfect.

We're dealing with 7 bits of MIDI data controlling, typically at most, 48,000
samples per second. These numbers are chump change for modern computers. The
bit depths can add some more complexity to that in terms of dynamic range (I'm
usually working at 32 bit float), but that doesn't apply as much to the
situation here.

Yes, real-time performance is a tough task (as is always stated, "the audio
thread waits for nothing"), but OP here is talking about recording MIDI data,
which -- once recorded -- acts as a _static input_ controlling either a
synthesizer or sampler. So let's break that down.

A digital synthesizer, unless designed with some amount of randomness
(typically for "analog-like" behavior purposes), by definition is the same
every time. The MIDI data in this case is going to be something like CC7,
controlling output volume; CC1, controlling some pre-defined parameter (e.g.
opening/closing a filter); etc. A solid representation of OP's example in this
case would be "CC7 controlling output volume of a synth over a 4 second
period, linearly from totally silent upwards to 0 dB." I fail to see how that
could possibly change from one playback to the next, unless, again, some
amount of randomness-with-same-MIDI-data is a feature of the synth's
programming.
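
To make that concrete, here is a sketch of a hypothetical randomness-free
renderer for exactly that CC7 ramp - the output depends only on the recorded
CC values, so every playback is bit-identical:

```python
SAMPLE_RATE = 48_000
DURATION_S = 4.0

def cc7_to_gain(cc_value: int) -> float:
    """Interpret CC7 linearly: 0 -> silence, 127 -> 0 dB (unity gain)."""
    return cc_value / 127

def render_volume_ramp() -> list[float]:
    """Per-sample gain for a linear CC7 sweep from 0 to 127 over 4 seconds."""
    total = int(SAMPLE_RATE * DURATION_S)
    gains = []
    for n in range(total):
        cc = round(127 * n / (total - 1))   # the same CC stream every time
        gains.append(cc7_to_gain(cc))
    return gains

print(render_volume_ramp() == render_volume_ramp())  # True: deterministic
```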

A sampler, at its most basic, is just playing back audio. Audio is a static
file; MIDI controls which audio is played back. Round-robin sampling, where,
say, C4 is recorded N times and one of those similar samples is chosen at
random so as to avoid the "machine gun effect" of _literally_ the same sample
being triggered repeatedly, could account for "it not sounding the same," but
like the "analog-like" programming of the synth above, that's on purpose, not
a flaw.
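
For contrast, a sketch of that round-robin behaviour - here the variation is
injected deliberately with a random choice, so identical MIDI input can
legitimately produce different audio (the sample names are made up):

```python
import random

# Hypothetical alternate takes of the same C4 note.
C4_SAMPLES = ["c4_take1.wav", "c4_take2.wav", "c4_take3.wav", "c4_take4.wav"]

def trigger_c4(last_index=None):
    """Pick a random take, avoiding an immediate repeat of the previous one
    (which would cause the 'machine gun effect')."""
    choices = [i for i in range(len(C4_SAMPLES)) if i != last_index]
    index = random.choice(choices)
    return C4_SAMPLES[index], index

last = None
for _ in range(5):
    sample, last = trigger_c4(last)
    print(sample)
```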

I routinely deal with situations where phase cancellation null tests would
reveal the kind of behavior that OP is talking about, and I simply have never
come across them. And that's not even going into sensitivity in listening,
which while subjective and impossible to prove, is something I put a lot of
faith in.

Sorry, unless OP can point me to a solid source laying out a further
explanation, I call horse shit.

------
dang
Recent & related:
[https://news.ycombinator.com/item?id=22180731](https://news.ycombinator.com/item?id=22180731)

------
TaupeRanger
Why in the world would you reference Adam Neely of all people? He is YouTube
famous, but by no means a foremost expert on any of this stuff. He literally
just reads the feature list in his video and adds some simple explanation
while filming himself walking around a convention. If you want to learn about
something like this, talk to one of the contributors or an actual
hardware/software developer.

Neely, like all YouTube explainer-celebrities, is primarily concerned with
getting views and having "production quality", while leaving the audience with
a vague sense of having learned something without actually having learned
anything at all. His most popular videos are chock full of non sequiturs and
made-up nonsense.

~~~
philligr
His comment about classical betrayed this a bit. It was pretty ignorant (in
the pure sense of uninformed) and confusing.

------
mkaic
This seems to just be an articleized version of Adam’s excellent video from
several months ago.

~~~
corndoge
Who is Adam?

~~~
mhh__
Adam Neely, musician and living Jazz meme.

------
Jeff_Brown
I'm happy about it, but if you're able to use Open Sound Control, it's better.
Rather than gibberish channel numbers, OSC lets you label your messages
meaningfully, as in "/trumpet/volume 100". And it lets you send lots of data
types -- numbers, strings, lists -- rather than only numbers.
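
Sending that example message from Python might look like this, assuming the
python-osc package and a synth listening on localhost:9000 (both assumptions
for illustration):

```python
from pythonosc.udp_client import SimpleUDPClient

client = SimpleUDPClient("127.0.0.1", 9000)  # hypothetical OSC-capable synth

# A human-readable address plus a typed argument, instead of a raw CC number.
client.send_message("/trumpet/volume", 100)

# OSC arguments can also be floats, strings, or lists of mixed types.
client.send_message("/trumpet/articulation", ["legato", 0.75])
```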

------
orangefarm
Is there any reason why music production in the cloud isn't the standard yet?

High-quality VSTs require a lot of CPU power. Even my 16-inch MBP easily
heats up once I add some more advanced VSTs.

I would rather pay X$ per month and have my music production work station in
the cloud and interact with it from any old device with a fast internet
connection.

Working with a buffer size of 512 samples, I currently have a latency of
11.6ms in Ableton. Adding another 10ms latency through the internet connection
wouldn't be a drama for me.

Working in the cloud would allow me to easily upgrade or downgrade my system
based on my needs, better collaboration with others, automatic backups, one-
click access to new VSTs and samples, etc.

This setup would probably be less ideal for people who actually have to
record a lot of 'real' instruments, but a lot of music today is created
entirely in the box with VSTs.

But I'm surely missing something here. Why hasn't this been a trend yet?

~~~
vortico
Audio over internet would be at least 300ms latency, not sure where you're
getting "10ms". Anything over 10ms is annoying, and 50ms is nearly unplayable.

~~~
orangefarm
Why would it be at least 300ms latency?

~~~
vortico
To send/receive a multi-channel audio/MIDI buffer to/from a server, you need
to go through at least a dozen protocols, including waiting for the speed of
light between you and your server. If you're in NY and your server is in LA
for example, that's already 30ms gone _just_ considering speed of light. Other
factors multiply this latency by an order of magnitude.
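
A quick sanity check of that figure (the distance is approximate, and light in
optical fiber travels at roughly two-thirds of c):

```python
NY_TO_LA_KM = 3_940      # great-circle distance, roughly
C_KM_PER_S = 299_792     # speed of light in vacuum
FIBER_FACTOR = 0.67      # propagation speed in fiber relative to c

one_way_vacuum_ms = NY_TO_LA_KM / C_KM_PER_S * 1000
round_trip_fiber_ms = 2 * NY_TO_LA_KM / (C_KM_PER_S * FIBER_FACTOR) * 1000

print(f"one way, vacuum:   {one_way_vacuum_ms:.0f} ms")    # ~13 ms
print(f"round trip, fiber: {round_trip_fiber_ms:.0f} ms")  # ~39 ms
```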

~~~
orangefarm
Okay, and you would say that if you optimised all of these factors you would
end up with a latency of around 300ms?

I just set my Ableton Live to 300ms and it was actually o.k. I think the
reason is that a lot of people don't actually 'play' their instruments these
days - at least in electronic music.

Instead, they program their drums by putting MIDI notes on the grid and then
listening to the result. The same with synths, etc. So when I work this way,
the 300ms latency is actually bearable. Of course it would be different if I
used drum pads to play my drums 'live'. But honestly I don't know many people
who do that, and when I watch tutorials on YouTube, almost no one is doing
that either. A lot of electronic music producers 'play' their instruments with
their mouse button.

~~~
vortico
Access Analog ([https://accessanalog.com/](https://accessanalog.com/)) is
already doing something similar, and their system is around 300-2500ms latency
([https://accessanalog.com/support/#1534876416634-f660a710-8f4...](https://accessanalog.com/support/#1534876416634-f660a710-8f47)).
A company cannot reliably offer much better latency than this unless they
have servers in all their customers' cities.

To most DAW users, 300ms is unacceptable, so any service that processes audio
on a server needs to make this caveat very clear in their documentation. The
problem with such a business idea is that local computers run DAWs just fine,
so very few people would seek remote audio processing.

------
ChuckMcM
My "turing test" is whether or not you can tell the difference between Stan
Getz playing it and a midi device playing it.

~~~
djaychela
There are three elements there though:

1) the capture of all the musical and performance input (a wind controller,
etc.)

2) the transmission of all that information in real time without latency

3) the conversion of that information into the appropriate sound.

MIDI is only part 2 of that. Advances in the other areas are needed for a
fully convincing facsimile, but given the incredible improvement in
sample-based libraries over the last 10 years or so (and even more so
comparing something like a Spitfire Audio library to a synth string patch), it
is possible that will be the case, given the investment of time, brains and
money.

------
kyriakos
Why not just use USB? Standardise all instruments on USB and use adaptors for
backwards compatibility.

~~~
robbrown451
Most MIDI seems to be done over USB already. MIDI 2.0 is mostly unrelated to
the actual connectors and such, it's more about the protocol.

~~~
mhh__
"MIDI" on your Desk may well be done over USB but MIDI as a hardware protocol
is absolutely rock solid and is the only choice for live music. Speaking as an
occasional backstage gremlin, I minimise the USB proportion of the signal path
because reduces the amount of testing required (MIDI is generally well
implemented on the hardware side, but not all USB midi interfaces are made
equal).

That and a lot of MIDI is instrument-to-instrument rather than to a PC (Or
more realistically a Mac, because you have to have a Mac to be a creative,
right...)

~~~
freeone3000
Windows audio latency increased to a 35ms floor in Vista and never recovered.

~~~
oriolid
This is why many music apps use ASIO drivers that skip Windows audio. ASIO is
awful in many ways, but latency is not one of them.

------
travbrack
I love how clickbait gets torn apart on this site.

