How to reduce latency even more? Better interface... which one? LatencyMon questions

Longtime DAW user here… and I’m on a quest to reduce my latency for live performance. My rig is very stable, and I’ve gotten used to the latency somewhat, which Cantabile reports as 7.2 ms in and 9.2 ms out at the lowest buffer of 89 samples. No pops or other issues, just latency. But I’d like to cut my latency some more… ideally to around 8 or 10 ms round trip. Specs are below, along with the LatencyMon report. I’ve done all the optimizing, using Brad’s outline and many others, the latest of which is the Presonus guide (the main things I did from that were to turn off process scheduling and set No Paging File. I couldn’t find the IDE Controllers to enable DMA, but I’m guessing that’s because I have an SSD…?).

So I have a few questions:
-When I run LatencyMon at idle, without my interface connected, Cantabile not running, and all optimizations in place, I get a bad “in the red” report… but with very low ISR and DPC times, low page faults, and a high kernel timer latency of 11.6 ms. I can’t figure out what’s causing that. Yet with my interface connected and Cantabile (2 Solo) running with my full gig settings, I get an excellent “very low in the green” report of about 0.198 ms… if I’m converting the microseconds correctly. Is this because, even though my internal soundcard is disabled, it’s used somehow in the test, but when the Scarlett is in place it’s of course much faster than the internal?
-Is there something else I can do, based on the LatencyMon results, to optimize Windows for a significant latency improvement?
-I need to get another soundcard anyway, so I have a backup at gigs. What’s a good low-latency model that won’t break the bank, in the under-$600 range? I only need 2 in and 4 out, and I’d prefer to keep the footprint small… plus I have no need for 8 or 16 inputs, since I use a Presonus 16.0.2 mixer via FireWire for live multitracking with up to 16 channels. So half rack or similar size is preferred, although if I can get an appreciable improvement in latency I’d certainly go with full rack.
Any thoughts are appreciated!
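
For anyone checking the microsecond conversion above, here is a quick Python sketch. The variable names are made up for illustration; the two values are the ones quoted in this post and in the report below.

```python
# Values quoted in this post; variable names are illustrative only.
US_PER_MS = 1000.0


def us_to_ms(microseconds: float) -> float:
    """Convert microseconds to milliseconds."""
    return microseconds / US_PER_MS


gig_mode_us = 198.297646  # highest kernel timer latency in the report below
idle_ms = 11.6            # idle reading without the interface, already in ms

gig_mode_ms = us_to_ms(gig_mode_us)
ratio = idle_ms / gig_mode_ms

print(f"gig mode: {gig_mode_ms:.3f} ms")
print(f"idle reading is roughly {ratio:.1f} times higher")
```

So the 0.198 ms figure is the right conversion, and the idle reading is well over an order of magnitude worse.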

Laptop: Vaio VPCSA 490x, Intel Core i7-2640M 2.8 GHz (4 logical processors), SSD, 8 GB RAM, Win 7 SP1 x64.
Scarlett 2i4
Cantabile 2 Solo, 1 MIDI source typically running Kontakt 4 with the Sax Brothers sample library, 1 audio source (guitar) into S-Gear, and about 5 plugin effects total.

LatencyMon (with Scarlett and Cantabile in “gig mode”):
CONCLUSION


Your system appears to be suitable for handling real-time audio and other tasks without dropouts.
LatencyMon has been analyzing your system for 0:00:38 (h:mm:ss) on all processors.


SYSTEM INFORMATION


Computer name: OWNER-VAIO
OS version: Windows 7 Service Pack 1, 6.1, build: 7601 (x64)
Hardware: VPCSA490X, Sony Corporation, VAIO
CPU: GenuineIntel Intel® Core™ i7-2640M CPU @ 2.80GHz
Logical processors: 4
Processor groups: 1
RAM: 8107 MB total


CPU SPEED


Reported CPU speed: 2793 MHz

Note: reported execution times may be calculated based on a fixed reported CPU speed. Disable variable speed settings like Intel Speed Step and AMD Cool N Quiet in the BIOS setup for more accurate results.


MEASURED KERNEL TIMER LATENCIES


This value represents the maximum measured latency of a periodically scheduled kernel timer.

Highest measured kernel timer latency (µs): 198.297646


REPORTED ISRs


Interrupt service routines are routines installed by the OS and device drivers that execute in response to a hardware interrupt signal.

Highest ISR routine execution time (µs): 42.892231
Driver with highest ISR routine execution time: USBPORT.SYS - USB 1.1 & 2.0 Port Driver, Microsoft Corporation

Highest reported total ISR routine time (%): 0.131754
Driver with highest ISR total time: USBPORT.SYS - USB 1.1 & 2.0 Port Driver, Microsoft Corporation

Total time spent in ISRs (%) 0.184219

ISR count (execution time <250 µs): 78904
ISR count (execution time 250-500 µs): 0
ISR count (execution time 500-999 µs): 0
ISR count (execution time 1000-1999 µs): 0
ISR count (execution time 2000-3999 µs): 0
ISR count (execution time >=4000 µs): 0


REPORTED DPCs


DPC routines are part of the interrupt servicing dispatch mechanism and disable the possibility for a process to utilize the CPU while it is interrupted until the DPC has finished execution.

Highest DPC routine execution time (µs): 276.531686
Driver with highest DPC routine execution time: netbt.sys - MBT Transport driver, Microsoft Corporation

Highest reported total DPC routine time (%): 2.446203
Driver with highest DPC total execution time: USBPORT.SYS - USB 1.1 & 2.0 Port Driver, Microsoft Corporation

Total time spent in DPCs (%) 2.719615

DPC count (execution time <250 µs): 199141
DPC count (execution time 250-500 µs): 0
DPC count (execution time 500-999 µs): 5
DPC count (execution time 1000-1999 µs): 0
DPC count (execution time 2000-3999 µs): 0
DPC count (execution time >=4000 µs): 0


REPORTED HARD PAGEFAULTS


Hard pagefaults are events that get triggered by making use of virtual memory that is not resident in RAM but backed by a memory mapped file on disk. The process of resolving the hard pagefault requires reading in the memory from disk while the process is interrupted and blocked from execution.

Process with highest pagefault count: none

Total number of hard pagefaults 0
Hard pagefault count of hardest hit process: 0
Highest hard pagefault resolution time (µs): 0.0
Total time spent in hard pagefaults (%): 0.0
Number of processes hit: 0


PER CPU DATA


CPU 0 Interrupt cycle time (s): 4.518413
CPU 0 ISR highest execution time (µs): 42.892231
CPU 0 ISR total execution time (s): 0.284162
CPU 0 ISR count: 78904
CPU 0 DPC highest execution time (µs): 276.531686
CPU 0 DPC total execution time (s): 4.180351
CPU 0 DPC count: 196556


CPU 1 Interrupt cycle time (s): 0.017055
CPU 1 ISR highest execution time (µs): 0.0
CPU 1 ISR total execution time (s): 0.0
CPU 1 ISR count: 0
CPU 1 DPC highest execution time (µs): 223.655209
CPU 1 DPC total execution time (s): 0.002657
CPU 1 DPC count: 374


CPU 2 Interrupt cycle time (s): 0.102216
CPU 2 ISR highest execution time (µs): 0.0
CPU 2 ISR total execution time (s): 0.0
CPU 2 ISR count: 0
CPU 2 DPC highest execution time (µs): 40.877193
CPU 2 DPC total execution time (s): 0.012052
CPU 2 DPC count: 2216


CPU 3 Interrupt cycle time (s): 0.004641
CPU 3 ISR highest execution time (µs): 0.0
CPU 3 ISR total execution time (s): 0.0
CPU 3 ISR count: 0
CPU 3 DPC highest execution time (µs): 0.0
CPU 3 DPC total execution time (s): 0.0
CPU 3 DPC count: 0


Hi @twaw

Most of that latency must be in the sound hardware. If you do the math on an 89-sample buffer at (assuming) 44,100 Hz, that’s about 2 ms of audio cycle time. The rest must be in the hardware.
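
The arithmetic can be sketched in a few lines of Python. This is only a rough illustration: the 44,100 Hz rate is an assumption, the function and variable names are invented, and charging exactly one audio cycle per direction to software is a simplification rather than a statement about how any particular driver works.

```python
def buffer_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """Duration of one audio buffer (one audio cycle) in milliseconds."""
    return buffer_samples / sample_rate_hz * 1000.0


SAMPLE_RATE_HZ = 44_100   # assumed, not confirmed in the thread
BUFFER_SAMPLES = 89       # lowest buffer size mentioned in the original post

cycle_ms = buffer_ms(BUFFER_SAMPLES, SAMPLE_RATE_HZ)

# Latencies reported by Cantabile in the original post (ms).
reported_in_ms = 7.2
reported_out_ms = 9.2

# Simplification: treat one audio cycle per direction as the software share
# and everything left over as driver/hardware buffering and conversion.
remaining_in_ms = reported_in_ms - cycle_ms
remaining_out_ms = reported_out_ms - cycle_ms

print(f"audio cycle: {cycle_ms:.2f} ms")
print(f"left over on input:  {remaining_in_ms:.2f} ms")
print(f"left over on output: {remaining_out_ms:.2f} ms")
```

Under those assumptions the audio cycle comes to roughly 2 ms, leaving several milliseconds in each direction that the buffer size alone does not account for.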

That idle result is odd, I think. When Cantabile’s audio engine starts it makes a few requests of Windows that might affect kernel scheduling, e.g. it requests 1 ms timer accuracy and, on Vista and later, also requests the pro-audio thread priority boost on the real-time audio threads. I wonder whether the whole kernel scheduling changes when one or more programs do this.

It’d be interesting to run the same test with Cantabile running, but the audio engine stopped.

Sounds like your machine is pretty well optimized already.

I don’t have any suggestions for a backup interface - perhaps someone else here has some ideas.

Brad

Thanks for the quick reply Brad! Good idea, I’ll try the test with Cantabile up but engine stopped and will report back. I did find a cool utility called RTL Utility by Oblique Audio which seems to do a good job of measuring your card latency, and it’s easy to use… At 44.1 I got 13.4ms RT.

So here’s an interesting question: All specs aside, I wanted to thoroughly analyze latency for how it sounds when playing a gig (I know how it feels, and I’ve gotten quite used to the 13ms delay). So I did a “real world” test by recording a click while playing a palm-muted note on the guitar for the sharpest attack I could get. Oddly, on most of the clicks the very, very beginning of the guitar was actually just slightly ahead, by maybe 5-10ms. The slowest I got was 13 ms behind the click. What would cause it to be ahead? I’ve played to a click for over 30 years, so I don’t think I was anticipating it. Maybe it’s just the normal human error and variation?

Or maybe my approach is flawed? To do this, I ran Cantabile and my S-Gear guitar app with all EFX, running through my laptop gig rig, set up exactly as I always do, at 44.1 and 89 samples. I ran that through a mixer into my studio computer, recording in Sonar. Simultaneously, for the click I recorded a mic right up to a studio monitor, onto a separate track. Sonar in the studio computer was generating the click using its internal sound card. So both the guitar and the mic’d click are going into the studio computer at the same exact time, and any recording delays introduced in the studio computer or its sound card (an old Delta 1010) will be irrelevant when comparing my “gig latency”. So any latency in my gig rig should be very apparent.
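
For anyone wanting to repeat this kind of two-track comparison without eyeballing waveforms, here is a minimal Python sketch that estimates the offset between the click track and the guitar track from their first onsets. Everything in it is hypothetical: the function names, the simple threshold-based onset detection, and the synthetic sample data standing in for the two recorded tracks.

```python
from typing import Optional, Sequence


def first_onset(track: Sequence[float], threshold: float) -> Optional[int]:
    """Index of the first sample whose absolute level reaches the threshold."""
    for index, sample in enumerate(track):
        if abs(sample) >= threshold:
            return index
    return None


def offset_ms(click: Sequence[float], guitar: Sequence[float],
              sample_rate_hz: int, threshold: float = 0.1) -> float:
    """Guitar onset minus click onset, in ms (negative = guitar is ahead)."""
    click_onset = first_onset(click, threshold)
    guitar_onset = first_onset(guitar, threshold)
    if click_onset is None or guitar_onset is None:
        raise ValueError("no onset found in one of the tracks")
    return (guitar_onset - click_onset) / sample_rate_hz * 1000.0


# Synthetic stand-ins for the two recorded tracks (not real measurements):
# the click starts at sample 1000, the guitar note 441 samples later.
RATE_HZ = 44_100
click_track = [0.0] * 1000 + [1.0] * 50 + [0.0] * 2000
guitar_track = [0.0] * 1441 + [0.8] * 200 + [0.0] * 1409

measured_ms = offset_ms(click_track, guitar_track, RATE_HZ)
print(f"guitar relative to click: {measured_ms:+.1f} ms")
```

Averaging the result over many clicks would help separate the rig's fixed delay from note-to-note playing variation.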

I know that humans can’t really discern a delay much under 15ms, so maybe that’s the simple answer. But I’m still a bit surprised that most of the notes were slightly ahead… must be due to my human timing inaccuracy.

But the most enlightening thing about this is the simple fact that although when you first start using a rig like this you notice a bit of latency, in the real world it’s negligible, and no one, not even your drummer, will notice… assuming you’re in the 10-15ms range. My son, who uses a Kemper profiling amp (which therefore must have some latency too, although quite low), immediately noticed the latency but after a few minutes he said it didn’t affect his playing at all. And the whole time I was listening his timing sounded spot-on.

I also think this test will inform my buying decision… I could spend $800 for a Babyface Pro and theoretically get down to around 6 or 7 ms… or spend twice that and get to 4 or 5. Or stay in the $300-$500 range and get maybe 9 or 10. Hmmm… after this test, for my specific purposes, I’m not really seeing an advantage to going much beyond $500… 10 ms is fine.

Hi @twaw,

I think the average that humans can detect is around the 10ms mark but I know for some people it’s less and for some it can be quite a bit more.

Given that you seemed to adapt fairly quickly to 10ms delay I’d be aiming for something around that mark.

Brad

Thanks, Brad…makes sense.

Isn’t latency also due to PC hardware?

The way I understand it - not really. The software’s audio thread runs on a very regular audio cycle, which is the software latency. Any other latency is in the sound card: DAC+ADC, sound-card buffering etc…

I guess there’s going to be a little delay in moving data to the sound card, but I’d guess that’s in the order of nano/microseconds - not milliseconds.

Brad

Only insofar as a faster processor can provide faster throughput to the soundcard. The same soundcard on a Pentium 3 will not allow for any reasonable load at low latency. Having said that, a good ASIO driver makes all the difference. Fast CPU + Good ASIO driver = :heart_eyes:

Well, Focusrite claims the MkII version of their interfaces are faster than the old ones, but I’ve heard it’s all about the new drivers and nothing to do with the hardware. So I guess a well written driver is indeed going to make a difference. I have a Tascam US-144 still on a machine and it works better with ASIO4ALL than its own driver (it’s been unsupported for years now).

Last rehearsal I forgot my USB audio interface (Focusrite Scarlett 6i6) and I had to play with the (no-name) onboard soundcard of my motherboard via WASAPI, and it worked VERY well! No glitches at all and a latency of 5.87ms (displayed in Cantabile). After that I asked myself why I invested 200€ in buying an audio interface :wink:

I never used an audio interface until I had the need to run guitar and vocals, and the use of numerous outputs. Latency on ASIO4All was great.

Same here - on my previous laptop I just used the internal out with ASIO4ALL and it really did quite well. I also have the Scarlett 6i6 and I do feel safer running it, though. And anyway, I need the MIDI input and multiple outputs. So the $$ is justified…

WASAPI works quite well?
Windows 10 should be optimised for audio, and that was my main reason for upgrading.
After that I disabled most services and online connections :stuck_out_tongue_winking_eye:

I’ve recently upgraded my PCAudioLabs laptop system to Windows 10 Professional from Windows 8.1. Amazing night and day difference in performance. I’m using a MOTU UltraLite-mk3 Hybrid (192 samples @ 48kHz, 4ms total latency) and I notice that when using the WASAPI drivers I am getting better performance vs. using ASIO drivers. Cantabile’s load meter is noticeably lower with WASAPI. When using ASIO, Cantabile’s load meter was showing a peak of 35%. When doing that same test with WASAPI, the load meter was showing 27%. Here is a link to the PCAudioLabs website, which has a video and info regarding how Microsoft has geared Windows 10 towards audio optimization.
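
To put those load figures in perspective, here is a rough Python sketch. It assumes the load meter shows the share of each audio cycle spent processing, which is an assumption on my part and not confirmed in this thread; the names are invented for illustration.

```python
BUFFER_SAMPLES = 192
SAMPLE_RATE_HZ = 48_000

# Length of one audio cycle at the settings quoted above.
cycle_ms = BUFFER_SAMPLES / SAMPLE_RATE_HZ * 1000.0

asio_load_pct = 35.0    # peak load quoted for ASIO
wasapi_load_pct = 27.0  # load quoted for WASAPI

# Assumption: load % = fraction of each audio cycle spent processing.
asio_busy_ms = cycle_ms * asio_load_pct / 100.0
wasapi_busy_ms = cycle_ms * wasapi_load_pct / 100.0

# Relative reduction in load when moving from ASIO to WASAPI.
relative_drop_pct = (asio_load_pct - wasapi_load_pct) / asio_load_pct * 100.0

print(f"audio cycle: {cycle_ms:.1f} ms")
print(f"ASIO busy time per cycle:   {asio_busy_ms:.2f} ms")
print(f"WASAPI busy time per cycle: {wasapi_busy_ms:.2f} ms")
print(f"relative reduction: {relative_drop_pct:.1f}%")
```

Under that assumption the 8-point difference works out to a bit over a fifth less processing time per cycle.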

Same here. I’ve been working with WinReducer for months, but still don’t have a perfectly optimized ISO. I’ll let you know, anyway.

It is supposedly even better now than when the July 2015 PCAudioLabs article you referenced was released, but I’ve seen little in the way of benchmarks other than that article.

I had big problems with the MOTU WDM/WASAPI drivers for my Ultralite mk3 hybrid, and switched to using Voicemeeter and Voicemeeter Pro “Banana” for handling all such duties (media player playback, YouTube videos, other browser audio) as the MOTU WDM drivers were too unstable (and HATED when sources switched sample rates - a playlist having both 44.1k and 96k files in it would freeze up the system!)

I use the ASIO drivers just fine with no problems, just to clarify that detail. (See this discussion over at MotuNation where I explain some more.)

Interesting to hear of your performance improvements via those WDM drivers, though. I’ve never tried driving my Ultralite out of Cantabile or out of any DAW through those. I’ll give it a look, and perhaps compare performance with Voicemeeter, which can be addressed by software (AND address the audio interface) as either ASIO or as WDM/KS/MME and converts between them flawlessly.

Terry

Oh, live I use my MOTU too (if it doesn’t break between now and then, GRRRRRRR).
But at home it’s WASAPI, which works just well enough for recording.

I’m using the current build as seen in the attached image. I haven’t had one single issue running the latest build of Win10 and Cantabile 3.

Specs:
PC Audio Labs Rok Box MC M7 FW laptop
2.8 GHz Intel i7-4810MQ
32GB HYPERX DDR3 RAM
512GB mSATA SSD for OS
(2x) striped 1TB mSATA SSD for sample libraries

No issues here either, but looking forward to doing the WDM vs ASIO comparisons over the weekend! :slight_smile:

Terry

How much did you pay for this machine?