-
Thanks for this! Very interesting. I'd be curious to see if disabling the high level API would further reduce the size of the build (these options weren't documented so you probably weren't aware).
I've not yet done a proper optimization pass so that CPU test is interesting - not as bad as I would have expected. Indeed, it's hard to measure that stuff accurately though.
Overall, how did you find your experience with miniaudio compared to the others? Do you have any feedback for things to improve, or things that other libraries are doing better that I could consider adopting?
I also did a comparison of some of my competitors a few years ago. I didn't spend a huge amount of time on each one, and I focused more on the developer experience rather than performance and benchmarking. If you're curious, below is a copy and paste of the notes I wrote back then (some of these may well be invalid by this point).
Wwise
FMOD
SoLoud
OpenAL-Soft
irrKlang
-
I tried a few of them.
Design
Portaudio: has PaStreamCallbackTimeInfo. It is broken or useless on many systems, and invisibly so. Sometimes it fakes numbers with the native clock at the callback, in which case its double type makes it worse than the native uint64_t type.
It has PaStreamCallbackFlags. This is occasionally useful; it does report underruns on my system when they happen. I know it's broken on some systems. Again, invisibly broken, so you don't know when it's working or not.
Portaudio is unique in adding these features to the callback. But neither feature is in working condition. The docs make no note of this.
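One possible reading of the double vs. uint64_t point, as a minimal sketch: a double has a 53-bit significand, so it cannot represent every 64-bit tick count exactly. The function name below is hypothetical and nothing here is PortAudio API; it only shows the general precision limit, not what PortAudio does internally.

```c
#include <stdint.h>

/* Returns 1 if a 64-bit tick count survives a round trip through double,
 * 0 if precision was lost. Hypothetical helper, for illustration only. */
int survives_double_round_trip(uint64_t ticks)
{
    /* volatile forces the value to be rounded to a real 64-bit double. */
    volatile double as_double = (double)ticks;

    /* Values that round up to 2^64 cannot be converted back safely. */
    if (as_double >= 18446744073709551616.0) {
        return 0;
    }
    return (uint64_t)as_double == ticks;
}
```

Counts up to 2^53 round-trip exactly; above that, odd values start getting rounded to a neighbour. Whether this matters in practice depends on the clock's units and how long the system has been up.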
"We could define ma_device_uninit() such that it must be called on the same thread as ma_device_init(). We could also just not release the IAudioClient when performing automatic stream routing to avoid the deadlock. Neither of these are acceptable solutions in my view so we're going to have to work around this with a worker thread"
Portaudio just requires them to be on the same thread, and it's fine. (Although, Portaudio does not document this properly.)
Cubeb: requires the user to call CoInitializeEx, which is a barrier to getting started. It would be nice to #define CUBEB_PLEASE_CALL_CoInitializeEx_FOR_ME.
Cubeb has almost no configuration options, but the defaults are the right ones for being non-configurable. For example, it does not initialize buffers to 0. "What's right for Firefox" seems to be right for most consumer applications too.
Pipewire: built its own resampler that you may be interested in looking at. I haven't benchmarked it. Most resamplers demonstrate a misunderstanding of basic theory in their API and strive to approximate sincs. SoX's API is better, but it is not designed for low latency. Pipewire's does not have an asymmetric impulse response, but if Wim Taymans spent so much time building his own, maybe it has other advantages.
A quick overview of resampling theory:
A windowed sinc will have lower latency and better time uncertainty than a full sinc. Naively, the window should be Gaussian. Full sincs are for bandlimited, perfectly periodic signals, which audio signals are not. They have attacks and decays.
A gamma distribution window will have lower latency and better pre-/post- ringing than a Gaussian window. SoX gets this far, with its aliasing and intermediate phase parameters, although its aliasing reaches too far above Nyquist.
A rising frequency (in the style of a gammachirp) may have better frequency masking than a constant frequency. But it's not clear to me if this holds true for resampling where Nyquist > 20 kHz. Only researchers and hearing aid manufacturers pay attention to this.
Quality/Completeness
Portaudio: old school, in a bad way. Code, docs, build, examples, and infrastructure are ancient and have problems. There have been improvements in recent years, such as the move to CMake and GitHub. Supports WASAPI exclusive mode, but requires great contortions; I still hadn't gotten it working after 5 hours. Does not support IAudioClient3, so it will have larger latency than other libraries on Windows 10.
Much of Portaudio is ancient and written by people who did not know how to code well at the time. Those programmers are better now, but the code is still the old code. Maintenance is slow, but alive and community-driven.
Cubeb: no WASAPI exclusive mode. API is mostly fine. Maintainers are active, and seem reasonable and capable. Firefox-focused, but will accept non-Firefox-related patches if in good shape. The docs are in good shape and up-front about limitations.
Cubeb is less buggy than Portaudio, thanks to its use in Firefox.
RTAudio: don't use this. Its maintenance has fallen behind, and it was never in good shape to start with.
Libsoundio: appears to be unmaintained (a bus-factor casualty) and has many bugs.
Benchmarks
All tests are on Win 8.1, msys2 mingw-w64, full optimization, -s (strip symbols), WASAPI Shared, all non-WASAPI backends disabled. For miniaudio, I also disabled the other features: MA_NO_DECODING MA_NO_ENCODING MA_NO_WAV MA_NO_FLAC MA_NO_MP3 MA_NO_GENERATION.
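For reference, a sketch of what that miniaudio configuration might look like in the single-header build. The MA_NO_* switches are the ones listed above; restricting backends via MA_ENABLE_ONLY_SPECIFIC_BACKENDS plus MA_ENABLE_WASAPI is my assumption about how "all non-WASAPI backends disabled" was done, not something stated in the post.

```c
/* Feature switches listed in the benchmark setup. */
#define MA_NO_DECODING
#define MA_NO_ENCODING
#define MA_NO_WAV
#define MA_NO_FLAC
#define MA_NO_MP3
#define MA_NO_GENERATION

/* Assumption: WASAPI-only build selected this way. */
#define MA_ENABLE_ONLY_SPECIFIC_BACKENDS
#define MA_ENABLE_WASAPI

/* Single-header build: the implementation is compiled in this file. */
#define MINIAUDIO_IMPLEMENTATION
#include "miniaudio.h"
```

The defines have to appear before the implementation include to take effect.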
Compilation time:
miniaudio single header: 50 sec
miniaudio split, compilation without linking, omitting miniaudio.c: 3 sec
cubeb: 3 sec
portaudio dynamic: 3 sec
Compiled file sizes:
miniaudio: 441 KB
cubeb: 176 KB
portaudio dynamic: 22.5 KB exe + 327 KB dll. (not fair, but I can't be bothered to compile it statically now. It was 107 KB when I did)
Latency:
I don't have a good way to measure.
If I did measure, Portaudio would be the loser on Windows 10.
Cubeb has disabled IAudioClient3 for the past two years, so it will also be a loser on Windows 10.
CPU overhead:
miniaudio: 30-31%
cubeb: 23-26%
portaudio: about 1/4 of the processes often crash. If they do all survive, CPU usage is similar to miniaudio.
To test CPU overhead, I simultaneously create 400 processes that output 0.001 to playback, and check Task Manager for total CPU usage after settling. Then I retest in strided order 2 more times to check for consistency.
This is a bad methodology. It would be better to increase the number of processes until they start dropping frames. However, Audacity is the only recording tool I have, and it drops frames all by itself, while barely doing anything.
To get a sense for absolute overhead, divide by 400: the libraries only take roughly 0.06% to 0.08% CPU per process, and this is not a big deal.
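The division, spelled out as a small sketch (the function name is hypothetical; the inputs are the totals reported above):

```c
/* Average CPU share of one process, given the total across all processes.
 * Both the input and the result are in percent. */
double per_process_cpu_percent(double total_percent, int process_count)
{
    return total_percent / (double)process_count;
}
```

With 400 processes, a 30% total works out to 0.075% each and a 23% total to 0.0575% each. This is only an average; it says nothing about how the load is distributed between the processes and the system audio service.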