Does incoming MIDI go through the ASIO Sample Buffer?

I’m getting results in my latency testing that seem to indicate that MIDI coming in from a device passes through, or is at least delayed by, the ASIO sample buffer. I’ve only done this with MIDI over USB - have not tested MIDI over Serial/DIN.

Could anyone comment on this? Is it true? Is it a Cantabile thing?? Any way around it???

Details: The two tests below were looking at the time it takes to render MIDI to audio in Kontakt. I used two different buffer sizes in Tests 18A and 18B, and that’s when the difference showed up.

The time from the keyboard MIDI through the end of sound rendering, which (one would think) should stay the same, in fact went up from 53 samples to 1026 samples - a difference of 973 samples. One explanation is that the incoming MIDI from the keyboard goes through the ASIO sample buffer.

In this case, changing the buffer size from 48 to 1024 would account for a difference of 976 samples - which almost perfectly accounts for the difference (to within 3 samples).
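The arithmetic above can be checked with a few lines of Python. This is only a sketch of the numbers quoted in this post; the function names are made up for illustration, and the 48 kHz rate used for the millisecond conversion is an assumption, since the sample rate of Tests 18A/18B is not stated here.

```python
def expected_added_delay(small_buffer: int, large_buffer: int) -> int:
    """Extra delay in samples if incoming MIDI waits for one full buffer."""
    return large_buffer - small_buffer


def samples_to_ms(samples: int, sample_rate_hz: float) -> float:
    """Convert a sample count to milliseconds at the given sample rate."""
    return 1000.0 * samples / sample_rate_hz


# Figures quoted in the post (MIDI-to-end-of-render time, in samples).
measured_small = 53      # with a 48-sample buffer
measured_large = 1026    # with a 1024-sample buffer

measured_delta = measured_large - measured_small    # observed increase
predicted_delta = expected_added_delay(48, 1024)    # buffer-size hypothesis
residual = predicted_delta - measured_delta         # what is left unexplained

print(f"measured increase : {measured_delta} samples")
print(f"predicted increase: {predicted_delta} samples")
print(f"residual          : {residual} samples")
# Assumed 48 kHz, only to give a feel for the size of the delay.
print(f"predicted increase: {samples_to_ms(predicted_delta, 48_000):.2f} ms at 48 kHz")
```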

I am showing a red “ASIO?” box in the diagram below, which would account for these results:


A bit of an unrelated reply just to say I like the style of the flow diagrams!


AFAIK, this is the standard VST way of doing things - all MIDI input is processed in blocks in parallel to audio input, and then forwarded along the processing chain. MIDI plugins always get a buffer-load of MIDI commands to process, just like audio plugs get a buffer-load of samples to process - in fact, there is no architectural difference between MIDI and audio plugins - they are just plugins that process their input in chunks of “sample buffer size”.

Every MIDI command received or sent also has a “sample offset” timestamp within the audio buffer, so MIDI plugins are supposed to process a whole buffer at a time.

It looks like this mechanism is one of the key reasons for the sample buffer quantization.
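To make the block-plus-offset idea concrete, here is a minimal Python sketch of one way a host could do it. It is a simplified model, not Cantabile's or any VST host's actual code: the `MidiEvent` class, the function names, and the rule "deliver with the next block, keeping the sample offset" are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class MidiEvent:
    arrival_sample: int  # absolute sample index at which the message was fully received
    data: bytes          # the raw MIDI message


def schedule(event: MidiEvent, buffer_size: int) -> tuple[int, int]:
    """Return (delivery_block, sample_offset) under the assumed model.

    The event arrives while some block is being captured, so the plugin only
    sees it with the following block, at the same offset within that block.
    """
    arrival_block, offset = divmod(event.arrival_sample, buffer_size)
    return arrival_block + 1, offset


def render_sample(event: MidiEvent, buffer_size: int) -> int:
    """Absolute sample position at which the plugin acts on the event."""
    block, offset = schedule(event, buffer_size)
    return block * buffer_size + offset


note_on = MidiEvent(arrival_sample=100, data=bytes([0x90, 60, 100]))
for size in (48, 1024):
    delay = render_sample(note_on, size) - note_on.arrival_sample
    print(f"buffer {size}: event delayed by {delay} samples")
```

Under this model the delay is always exactly one buffer, whatever the arrival time, which is the same pattern as the 48 vs 1024 result in the first post.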




Good to know that this is expected behavior (and not something I screwed up in my testing). Thanks so much for putting an end to my spate of hair-pulling.

Using pieces of your reply in a web search, I came across this ancient Martin Walker / Sound on Sound article that describes this in more detail … Solving MIDI Timing Problems - very enlightening.

This is all reminiscent of when mainframes first evolved into “timesharing” kernels without interrupt capabilities. In my mind I was thinking of a MIDI event as an interrupt, rather than a buffered “get in line, buddy, and wait your turn”.

in chunks of “sample buffer size”.

I’m thinking it’s really “sample buffer time”, since the MIDI buffer would hardly be filled (compared to an audio buffer) …


Almost correct - the measure is sample buffer size for audio. MIDI events are simply tacked on to a sample offset when fully received, so for MIDI, you could say the buffers are time-related…


Oh … so if the MIDI command is not complete by the end of a scheduled buffer delivery, it gets held over into the next buffer cycle?

“Stand by to stand by”

“Mill about smartly, men”

Complete means “all three bytes received” in a normal short message. This is so that a plugin can really process a message and doesn’t need to worry about incomplete message fragments of e.g. only two bytes of a note-on message. Otherwise dealing with MIDI would be reeeeally messy for plugins…


Got it. Thanks!

And another wrinkle: MIDI over USB vs. Serial MIDI (DIN).

One would think that USB would be much faster than 3k bytes/sec, but there seems to be significant jitter associated with USB. Interesting discussion at

… but I’m feeling some real-world latency testing needs to be done to compare the two …
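For reference, the "3k bytes/sec" figure for DIN falls out of the MIDI 1.0 serial format: 31,250 baud with 10 bits on the wire per byte (start bit, 8 data bits, stop bit). A small Python sketch of that arithmetic follows; the function names are made up for illustration, and it says nothing about USB timing.

```python
DIN_BAUD = 31_250     # MIDI 1.0 serial bit rate
BITS_PER_BYTE = 10    # 1 start bit + 8 data bits + 1 stop bit


def din_bytes_per_second() -> float:
    """Maximum byte throughput of a DIN MIDI link."""
    return DIN_BAUD / BITS_PER_BYTE


def din_transfer_ms(num_bytes: int) -> float:
    """Time on the wire for num_bytes over DIN, in milliseconds."""
    return 1000.0 * num_bytes * BITS_PER_BYTE / DIN_BAUD


print(f"DIN throughput : {din_bytes_per_second():.0f} bytes/sec")
print(f"3-byte note-on : {din_transfer_ms(3):.2f} ms on the wire")
```

So a single three-byte note-on spends just under a millisecond on a DIN cable before the receiver can even treat it as complete.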

I just did some latency testing to compare (among many things) MIDI over USB vs Serial (DIN).

The short answer is that MIDI over USB was marginally better than Serial (DIN) MIDI. However, this was anything but a “real world” test.

More details:

I produced MIDI and audio by hitting Middle C on a Yamaha S08. Sound and MIDI were routed by various paths through an RME Babyface Pro FS on Host A (a Win10x64, i7-6gen, 2-core, 16GB, 2016 PC laptop) and recorded on a second Host (a Win10x64, i7-11gen, 8-core, 64GB, 2022 PC laptop) with an RME UCX II.

I recorded sound directly from the keyboard’s phones port, from the audio out ports through Host A using TotalMix Direct Monitoring as well as going through Cantabile (a DAW) on Host A, and also the audio generated from MIDI input to Kontakt running a full-featured sound library (The Grandeur). The Serial (DIN) MIDI cable was also tapped to produce an audio signature to mark the start and end of MIDI events as issued by the keyboard, using this Audio Tap Cable in this post

I recorded all sounds simultaneously into multi-track WAV files and analyzed them in Reaper. I repeated tests 10 times to get averages, jitter (min to max), and other statistics.
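For anyone wanting to repeat this, the per-test summary can be computed along these lines. This is a sketch only: the five latency values below are hypothetical placeholders, not my measurements, and `summarize` is a made-up helper name. Jitter is taken as max minus min, as described above.

```python
import statistics


def summarize(latencies_ms: list[float]) -> dict[str, float]:
    """Mean, sample standard deviation, and min-to-max jitter of repeated runs."""
    return {
        "mean": statistics.mean(latencies_ms),
        "stdev": statistics.stdev(latencies_ms),
        "jitter": max(latencies_ms) - min(latencies_ms),
    }


# HYPOTHETICAL values for illustration only; substitute the measured latencies.
hypothetical_runs_ms = [3.1, 3.4, 2.9, 3.8, 3.3]

summary = summarize(hypothetical_runs_ms)
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f} ms")
```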

In my tests, USB performed with 0.25 msec lower latency and 0.29 msec lower jitter than Serial (DIN) MIDI. However, these differences in average latency were not statistically significant (p=0.22). Also, the USB line was a dedicated line with no other traffic and handled single, isolated MIDI events, so this is not a “real world test”.

Latency when rendering Kontakt using USB MIDI was 4.02 msec with 1.90 msec jitter at 44.1kHz and reduced to 3.28 msec with 1.19 msec jitter at 48kHz. As many folks have noted, jitter is the killer.
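As a quick sanity check on those figures, the relative change from 44.1kHz to 48kHz works out as follows. The helper name is hypothetical; the inputs are just the numbers quoted above.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from before to after, as a percentage."""
    return 100.0 * (before - after) / before


latency_reduction = percent_reduction(4.02, 3.28)   # msec at 44.1kHz vs 48kHz
jitter_reduction = percent_reduction(1.90, 1.19)    # msec at 44.1kHz vs 48kHz

print(f"latency reduced by {latency_reduction:.1f}%")
print(f"jitter reduced by  {jitter_reduction:.1f}%")
```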

Here’s my test rig. It was also set up to test two other things: the latency saved by TotalMix Direct Monitoring compared with going through ASIO and the DAW (about 3+ msec RTL), and the effect of going from a 44.1kHz to a 48kHz sample rate. The second effect was bigger than I expected: since my Kontakt sample libraries are 48kHz, I get about an 18% improvement in latency.