what other audio libraries are like #487

ad8e · 2022-06-21T15:51:46Z

I tried a few of them.

Design

Portaudio: has PaStreamCallbackTimeInfo. It is broken or useless on many systems, and invisibly so. Sometimes it fakes the numbers by reading the native clock inside the callback, in which case its double type makes it worse than the native uint64_t type.
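
One possible reading of the remark about the double type, as a minimal sketch: a 64-bit tick count does not always survive a round trip through a double, because a double holds only 53 bits of integer precision. The helper name is hypothetical and is not part of Portaudio; whether this bites in practice depends on how large the native clock values get.

```c
#include <stdint.h>

/* Hypothetical helper, not a Portaudio function: push a native 64-bit
 * tick count through a double and back, as a double-based timestamp
 * would have to. */
uint64_t roundtrip_through_double(uint64_t ticks)
{
    return (uint64_t)(double)ticks;
}
```

Values up to 2^53 come back unchanged; 2^53 + 1 comes back as 2^53, so the lowest bit of the native counter is lost.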

It has PaStreamCallbackFlags. This is occasionally useful; it does report underruns on my system when they happen. I know it's broken on some systems. Again, invisibly broken, so you don't know when it's working or not.

Portaudio is unique in adding these features to the callback. But neither feature is in working condition. The docs make no note of this.

"We could define ma_device_uninit() such that it must be called on the same thread as ma_device_init(). We could also just not release the IAudioClient when performing automatic stream routing to avoid the deadlock. Neither of these are acceptable solutions in my view so we're going to have to work around this with a worker thread"

Portaudio just requires them to be on the same thread, and it's fine. (Although, Portaudio does not document this properly.)

Cubeb: requires the user to call CoInitializeEx, which is a barrier to getting started. It would be nice to #define CUBEB_PLEASE_CALL_CoInitializeEx_FOR_ME.

Cubeb has almost no configuration options, but the defaults are the right ones for being non-configurable. For example, it does not initialize buffers to 0. "What's right for Firefox" seems to be right for most consumer applications too.

Pipewire: built its own resampler that you may be interested in looking at. I haven't benchmarked it. Most resamplers demonstrate a misunderstanding of basic theory in their API and strive to approximate sincs. SoX's API is better, but it is not designed for low latency. Pipewire's does not have an asymmetric impulse response, but if Wim Taymans spent so much time building his own, maybe it has other advantages.

A quick overview of resampling theory:
A windowed sinc will have lower latency and better time uncertainty than a full sinc. Naively, the window should be Gaussian. Full sincs are for bandlimited, perfectly periodic signals, which audio signals are not. They have attacks and decays.
A gamma distribution window will have lower latency and better pre-/post- ringing than a Gaussian window. SoX gets this far with its aliasing and intermediate phase parameters, although its aliasing reaches too far above Nyquist.
A rising frequency (in the style of a gammachirp) may have better frequency masking than a constant frequency. But it's not clear to me if this holds true for resampling where Nyquist > 20 kHz. Only researchers and hearing aid manufacturers pay attention to this.

Quality/Completeness

Portaudio: old school, in a bad way. Code, docs, build, examples, and infrastructure are ancient and have problems. There have been improvements in recent years, moving to CMake and GitHub. Supports WASAPI exclusive mode, but requires great contortions; I haven't succeeded after spending 5 hours. Does not support IAudioClient3. Thus, it will have larger latency than other libraries on Windows 10.

Much of Portaudio is ancient and written by people who did not know how to code well at the time. Those programmers are better now, but the code is still the old code. Maintenance is slow, but alive and community-driven.

Cubeb: no WASAPI exclusive mode. API is mostly fine. Maintainers are active, and seem reasonable and capable. Firefox-focused, but will accept non-Firefox-related patches if in good shape. The docs are in good shape and up-front about limitations.

Cubeb is less buggy than Portaudio, thanks to its use in Firefox.

RtAudio: don't use this. Its maintenance has fallen behind, and it was never in good shape to start with.

Libsoundio: hit by a bus, has many bugs

Benchmarks

All tests are on Win 8.1, msys2 mingw-w64, full optimization, -s (strip symbols), WASAPI Shared, all non-WASAPI backends disabled. For miniaudio, I also disabled the other features: MA_NO_DECODING, MA_NO_ENCODING, MA_NO_WAV, MA_NO_FLAC, MA_NO_MP3, MA_NO_GENERATION.

Compilation time:
miniaudio single header: 50 sec
miniaudio split, compilation without linking, omitting miniaudio.c: 3 sec
cubeb: 3 sec
portaudio dynamic: 3 sec

Compiled file sizes:
miniaudio: 441 KB
cubeb: 176 KB
portaudio dynamic: 22.5 KB exe + 327 KB dll (not a fair comparison, but I can't be bothered to compile it statically now; it was 107 KB when I did)

Latency:
I don't have a good way to measure.
If I did measure, Portaudio would be the loser on Windows 10.
Cubeb has disabled IAudioClient3 for the past two years, so it will also be a loser on Windows 10.

CPU overhead:
miniaudio: 30-31%
cubeb: 23-26%
portaudio: often about 1/4 of the processes crash. If they all survive, CPU usage is similar to miniaudio.

To test CPU overhead, I simultaneously create 400 processes that output 0.001 to playback, and check Task Manager for total CPU usage after settling. Then I retest in strided order 2 more times to check for consistency.
This is a bad methodology. It would be better to increase the number of processes until they start dropping frames. However, Audacity is the only recording tool I have, and it drops frames all by itself, while barely doing anything.
To get a sense of absolute overhead, divide by 400: each process takes only roughly 0.06% to 0.08% CPU, and this is not a big deal.
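
A rough sketch of the two pieces of the methodology above: the constant-value fill each test process performs, and the divide-by-400 arithmetic. Both function names are hypothetical, and the interleaved float buffer layout is an assumption; the post does not say which sample format was used.

```c
#include <assert.h>
#include <stddef.h>

/* Fill an interleaved float buffer with one constant sample value,
 * the kind of work each test process does per callback. */
void fill_constant(float *out, size_t frames, size_t channels, float value)
{
    assert(out != NULL);
    for (size_t i = 0; i < frames * channels; ++i) {
        out[i] = value;
    }
}

/* Average CPU share per process, given the total read off Task Manager. */
double per_process_cpu_percent(double total_percent, unsigned processes)
{
    return total_percent / (double)processes;
}
```

With the totals reported above, 30% across 400 processes is 0.075% each and 23% is about 0.058% each.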

mackron · 2022-06-21T23:03:24Z

Maintainer

Thanks for this! Very interesting. I'd be curious to see if disabling the high level API would further reduce the size of the build (these options weren't documented so you probably weren't aware).

MA_NO_RESOURCE_MANAGER
MA_NO_NODE_GRAPH
MA_NO_ENGINE
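
A sketch of how these switches might be combined with the flags from the original post's benchmark build. It assumes the single-header setup, where the options are defined before miniaudio.h is included in the one translation unit that defines MINIAUDIO_IMPLEMENTATION; with the split miniaudio.c build, the same macros would presumably be passed as compiler definitions instead.

```c
/* Flags from the original post's benchmark build */
#define MA_NO_DECODING
#define MA_NO_ENCODING
#define MA_NO_WAV
#define MA_NO_FLAC
#define MA_NO_MP3
#define MA_NO_GENERATION

/* High-level API switches listed above */
#define MA_NO_RESOURCE_MANAGER
#define MA_NO_NODE_GRAPH
#define MA_NO_ENGINE

#define MINIAUDIO_IMPLEMENTATION
#include "miniaudio.h"
```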

I've not yet done a proper optimization pass so that CPU test is interesting - not as bad as I would have expected. Indeed, it's hard to measure that stuff accurately though.

Overall, how did you find your experience with miniaudio compared to the others? Do you have any feedback for things to improve, or things that other libraries are doing better that I could consider adopting?

I also did a comparison of some of my competitors a few years ago. I didn't spend a huge amount of time on each one, and I focused more on the developer experience rather than performance and benchmarking. If you're curious, below is a copy and paste of the notes I wrote back then (some of these may well be invalid by this point).

Wwise

Was never able to figure out the low-level API for this. Very enterprise-y.
You need to install a launcher before being able to download the SDK, but you need to sign up for an account before you can download the launcher.
No API documentation whatsoever on the website - it's all in the SDK installation.
SDK installation is 2.6GB. Seriously?
They seem to put heavy emphasis on the authoring tool and the high level API.

FMOD

Website is where the documentation is located, but I didn't find a good example to get started.
Need to download a launcher to install an SDK, which you can only do after signing up for an account. Would be good if it was just a zip file without the need to install anything.
Resampling seems OK, but smooth pitch shifting is not that good.
Quite a good API, except it has ambiguity with their use of the term "Channel".

SoLoud

Website was the best reference to get started.
Annoying build system. Allows you to add files directly to source tree, but there are too many of them. Also requires multiple headers to access everything. Should only need something like #include "soloud.h".
The download comes with pre-built binaries for the DLLs, but not libraries for linking. Very annoying.
Resampler doesn't work properly, at least not with pitch shifting.
API seems quite good.

OpenAL-Soft

The programmers guide was a really good resource.
Requires a separate device and context initialisation step which is similar to miniaudio, however it does not support passing in a NULL device to the context.
Pitch shifting is not good quality with the sine wave test.
The API sits at a strange level. It's low level to the point that it doesn't do decoding nor does it do complex buffer management, but high level in that it does 3D audio. Feels awkward.
High memory usage.

irrKlang

Requires separate DLLs for the library itself, MP3 and FLAC.
SDK is a zip you can just download without needing to register or install anything. Very nice.
Example code on the front page is useful for getting started.
Not many backends. No PulseAudio nor WASAPI which are critical for good support on Linux and low-latency on Windows.
No authoring tool that I could see.
Documentation is messy and hard to read.
Not enough options for device initialization. Cannot specify format, channels or sample rate. Cannot specify latency.
No result codes.
There are limitations on where you can do certain things. For example, cannot set the pitch if there are effects being applied at the same time.
Does not support f32 WAV files. Needs to be integer PCM.
Annoying API flaw in ISoundEngine::play*() - Returns a pointer to an ISound object, but will only return non-NULL if specific boolean parameters are set to true. This is unnecessarily confusing. Fire-and-forget playback should be done using a different function.
Onboarding process is good.
Overall, API seems very good except for the point above about ISoundEngine::play*().

4 replies

ad8e Jun 22, 2022
Author

> Thanks for this! Very interesting. I'd be curious to see if disabling the high level API would further reduce the size of the build (these options weren't documented so you probably weren't aware).
> MA_NO_RESOURCE_MANAGER
> MA_NO_NODE_GRAPH
> MA_NO_ENGINE

Miniaudio: 441 KB -> 393 KB.

> Overall, how did you find your experience with miniaudio compared to the others? Do you have any feedback for things to improve, or things that other libraries are doing better that I could consider adopting?

If someone wanted a recommendation to get started with an audio library, I would recommend miniaudio first. I think I got it working on the first or second try. The cubeb first-time experience is miserable. So miserable, in fact, that I finally decided to fix it. Maybe if that pull request is accepted, I'll change my recommendation to cubeb. Cubeb has a bad reputation for its first-time experience, and it's well-deserved. But after that, it's smooth sailing.

Portaudio kind of sucks. It used to be RtAudio vs Portaudio. Then RtAudio's maintenance fell behind. I assumed Portaudio would be of decent quality, since it is widely used by the community and a de facto default. But when I looked into the code, it was not good. It's not in the same tier of awfulness as some other projects like FreeType, but it's hard to say good things about it. The first-time experience requires trawling through docs in a semi-disorganized state. I submitted a partial patch for the new user experience for Portaudio too.

After the initial hump, cubeb is quite easy to get into good shape. It has almost nothing to configure, and all the configuration is what it should be. These opinions are quite helpful - a developer new to audio has a knowledge barrier, needing to read each option and evaluate it. As an example for APIs, cubeb says, "if you are on Windows, WASAPI is your choice, and MME is a fallback, and this is what Firefox chooses". That makes getting started faster, since the opinion is clear and has a reasonable basis, so there is some confidence it's the right one. The code is well-tested; despite being Firefox-focused, that seems to help a lot more than Portaudio's wide developer diversity. Eventually, a pro audio user will want ASIO, but I notice miniaudio doesn't support ASIO either, and I'm also not yet at the point where I care about ASIO and maybe never will be. I really like cubeb's platform support page since it's very upfront about limitations. No surprises and hidden gotchas like Portaudio has.

One example of the quality difference is IAudioClient3. Neither Portaudio nor cubeb supports it. But cubeb implemented it, and disabled it because of corner cases that it has yet to fix. Portaudio hasn't implemented IAudioClient3, and even after it is implemented and stable, it will suffer from corner cases. So although the outcome is the same, the root cause illustrates a vast gulf in stability and user care.

I think it will be very hard to compete with cubeb. If I wanted to spend serious time improving an audio library, I would choose cubeb, because the maintainers are sane, active, capable, and welcoming of patches, the code is already in good shape and performant, Firefox provides important testing, and it hews closely to what I want (a minimal audio library, not needing features like mp3 or filters). Only some of these are advantages over miniaudio, but the Firefox testing is hard to overcome.

I would say that within the confines of what cubeb is able to do, cubeb is better than miniaudio apart from Windows 10 latency. Within the confines of what miniaudio is able to do, cubeb is not able to do all of those things, and miniaudio is better than Portaudio. Portaudio has some extra features the other two don't, but I did not have a good experience with some of those extra features.

I'll keep miniaudio around as an option in my project, since miniaudio has a Web Audio backend which I will need eventually. Faust and Sokol are still stuck on ScriptProcessorNode, and I don't believe I am more motivated or experienced than them in writing an AudioWorklet backend, which will be boring and low-impact. So I'll just use whatever Web Audio library is available and hope it solves the problem for me.

mackron Jun 22, 2022
Maintainer

Thanks for that feedback.

> If someone wanted a recommendation to get started with an audio library, I would recommend miniaudio first. I think I got it working on the first or second try. The cubeb first-time experience is miserable.

Yes, I've put a lot of work into miniaudio's first-time experience, especially its build system (or lack thereof). Its build system is king - nothing I'm aware of even comes close. The examples and documentation are also important for getting started and I'm always looking to improve that (suggestions welcome!).

> After the initial hump, cubeb is quite easy to get into good shape. It has almost nothing to configure, and all the configuration is what it should be. These opinions are quite helpful - a developer new to audio has a knowledge barrier, needing to read each option and evaluate it.

The miniaudio configuration system will try to use logical defaults, but it errs on the side of safety for things like clipping and pre-silencing the output buffer. I guess whether or not they're the best default values depends on the opinion and requirements of the individual. I also think it's better to allow things to be configured (where practical), so long as it's got a well-rationalised default value, rather than simply not allowing it at all. Maybe the miniaudio documentation can do a better job at listing what is the most important stuff (format, channels, rate, buffer size, callback)?

> As an example for APIs, cubeb says, "if you are on Windows, WASAPI is your choice, and MME is a fallback, and this is what Firefox chooses". That makes getting started faster, since the opinion is clear and has a reasonable basis, so there is some confidence it's the right one.

miniaudio also has that backend prioritization, but I don't think it's explicitly documented as such. Fair point - I've updated the documentation to make it a bit clearer: 431bea6

> Eventually, a pro audio user will want ASIO, but I notice miniaudio doesn't support ASIO either, and I'm also not yet at the point where I care about ASIO and maybe never will be.

I'm intending on at least looking at ASIO at some point, but it's fairly low priority for me. miniaudio supports custom backends, so people who are desperate for it could do it themselves. Indeed, that's how I'll be implementing it initially.

> I would say that within the confines of what cubeb is able to do, cubeb is better than miniaudio apart from Windows 10 latency. Within the confines of what miniaudio is able to do, cubeb is not able to do all of those things, and miniaudio is better than Portaudio.

At the start, miniaudio was indeed just a thin library that only did playback of raw audio data. But there were things that I consistently needed, like decoding and data conversion (and basically everything in the library, really), and I just wanted to have it all in one place. I think there's good rationale for every feature in miniaudio, even though it does contribute to the overall size (the name has not aged well!). At this point I think miniaudio and cubeb, and other thin audio libraries like it, are basically different classes of libraries, especially now with miniaudio's high level API coming online last year.

> Only some of these are advantages over miniaudio, but the Firefox testing is hard to overcome.

You're certainly right about the sense of security that comes from the Firefox user base and its broad test coverage. miniaudio is being used in more and more projects, the biggest being raylib, but nothing close to the scale of Firefox. As more projects using miniaudio come online I think that'll give people more confidence and test coverage. That'll come in time.

Just regarding your comment about the Web Audio stuff, miniaudio also uses ScriptProcessorNode. I think the AudioWorklet thing requires a separate JavaScript file? I just wasn't sure how to deal with that so just left it alone and stayed with ScriptProcessorNode.

ad8e Jun 22, 2022
Author

> The miniaudio configuration system will try to use logical defaults, but it errs on the side of safety for things like clipping and pre-silencing the output buffer. I guess whether or not they're the best default values depends on the opinion and requirements of the individual. I also think it's better to allow things to be configured (where practical), so long as it's got a well-rationalised default value, rather than simply not allowing it at all. Maybe the miniaudio documentation can do a better job at listing what is the most important stuff (format, channels, rate, buffer size, callback)?

The miniaudio documentation is probably fine; I'm not saying that its backend info or other docs have any problems. Rather, cubeb, which doesn't need docs, has an advantage in this sense: the new user benefits because he doesn't need to learn things, and the power user benefits because he has confidence he's configured it correctly and didn't spend effort doing so. That holds as long as the configuration is what he wants - if someone wanted to clip and couldn't do it himself, cubeb's lack of configuration would pose a large difficulty.

By the way, your ma_pcm_f32_to_s24__reference and related conversions would benefit from rounding rather than truncation. If you look at it in godbolt, it calls cvttss2si instead of cvtss2si. You can see a fix here, because C++ unfortunately does not support one natively; lround and cousins use a function call. Your clipping should be adjusted too; consider what happens with NaN comparisons. (a < b) and !(a >= b) are different.
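
A sketch of both points, under loud assumptions: this is not miniaudio's code, the function names are hypothetical, the 8388607 scale factor is assumed, and the result is returned as an int32_t rather than packed into 3 bytes. The rounding here uses the portable add-0.5-then-cast trick (round half away from zero), which is a stand-in and not the instruction-level cvtss2si fix referred to above.

```c
#include <math.h>    /* NAN macro only; no libm functions are called */
#include <stdint.h>

/* Clip to [-1, 1]. Written with negated comparisons so that NaN, for which
 * every ordered comparison is false, is caught by the first branch. */
float clip_nan_safe(float x)
{
    if (!(x >= -1.0f)) return -1.0f;  /* x < -1, or x is NaN */
    if (!(x <=  1.0f)) return  1.0f;  /* x > 1 */
    return x;
}

/* Naive clip for comparison: NaN fails both tests and passes through. */
float clip_naive(float x)
{
    if (x < -1.0f) return -1.0f;
    if (x >  1.0f) return  1.0f;
    return x;
}

/* Truncating conversion: the cast drops the fractional part. */
int32_t f32_to_s24_truncate(float x)
{
    return (int32_t)(clip_nan_safe(x) * 8388607.0f);
}

/* Rounding conversion (half away from zero), using only a cast. */
int32_t f32_to_s24_round(float x)
{
    float scaled = clip_nan_safe(x) * 8388607.0f;
    return (int32_t)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}
```

A sample that scales to about 1.75 becomes 1 when truncated and 2 when rounded. In this sketch NaN lands on the lower bound, which is an arbitrary choice; mapping it to 0 would be just as defensible. With the naive clip, NaN would reach the integer cast, which is undefined behavior in C.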

> Just regarding your comment about the Web Audio stuff, miniaudio also uses ScriptProcessorNode. I think the AudioWorklet thing requires a separate JavaScript file? I just wasn't sure how to deal with that so just left it alone and stayed with ScriptProcessorNode.

I have no intelligent opinion on which one is better. Given this condition, I know that Mozilla and co. recommend AudioWorklet. So if I have access to an audio library with AudioWorklet, I will use that over ScriptProcessorNode, while being blind to the merits of either one. But the only other credible audio library that supports Web Audio is sokol, which also uses ScriptProcessorNode. So while you may debate ScriptProcessorNode vs AudioWorklet on their merits, you hold a monopoly and I must use whatever you choose.

mackron Jun 22, 2022
Maintainer

> By the way, your ma_pcm_f32_to_s24__reference and related conversions would benefit by rounding rather than truncation.

Thanks. I've added this to my notes to take a look at.

> I have no intelligent opinion on which one is better. Given this condition, I know that Mozilla and co. recommend AudioWorklet. So if I have access to an audio library with AudioWorklet, I will use that over ScriptProcessorNode, while being blind to the merits of either one. But the only other credible audio library that supports Web Audio is sokol, which also uses ScriptProcessorNode. So while you may debate ScriptProcessorNode vs AudioWorklet on their merits, you hold a monopoly and I must use whatever you choose.

I'd be happy to support AudioWorklet if there's a clean way to do it. It's just that the whole separate .js file requirement makes it extremely awkward for this style of library. My understanding is that the advantage of AudioWorklet is that it allows the browser to run the audio processing on a separate thread rather than on the main thread. That would be something I'd like to support, so if anyone in the community is reading this thread and has suggestions on how to support it cleanly I'd be more than happy to listen.
