The ASIO SDK documentation contains a lot of information about how the ASIO Host Application and the ASIO Driver are supposed to interact with each other when it comes to buffer management and event callbacks. However, despite the precautions taken by the ASIO docs to try to carefully explain what's going on under various angles and using examples, there are still some aspects of it that are still arguably fairly confusing. This document provides one possible interpretation of the ASIO buffer management semantics that is, hopefully, correct and most consistent with the official documentation.
Part of the reason why this can be so confusing for an ASIO driver developer is because ASIO was apparently designed for hardware using Direct Memory Access (DMA), a scheme in which the hardware audio device reads/writes directly from/to the ASIO buffers in main memory. This is an extremely efficient approach that is well-suited for low-latency operation, but a lot of audio devices do not work that way, especially USB devices and devices accessed through an abstraction layer ("Universal" ASIO drivers). The way buffers and event callbacks work in ASIO feels natural and intuitive to someone writing a driver for a DMA-based device, but could be highly confusing when using a different approach which requires audio buffers to be copied to/from some other place for processing.
In order to understand what's going on, it is a good idea to take the perspective of how things would work in a DMA-based driver, in order to understand ASIO semantics and implement them in a non-DMA-based driver.
Note that this document assumes that you've read the ASIO SDK documentation first, and that you understand the most trivial concepts described there (such as the general principle of double buffering). This document only explains the details that could arguably still be confusing after reading the documentation; it does not address topics that the official documentation already makes clear.
We'll start with the input side of things because it's the simplest. The
bufferSwitch
callback documentation states:
The addressed input buffer is now filled with incoming data.
In other words, if bufferSwitch(0)
is called, it means that "input buffer 0 is
filled with recorded data".
The use of the passive form is unfortunate, but it should still be fairly obvious that the driver is responsible for filling the input buffer with incoming data (where else could it come from?).
The biggest problem in that sentence is the word "now", which is arguably ambiguous and could be interpreted in several different ways:
- The specified input buffer is guaranteed to be filled with incoming data
before
bufferSwitch
is called. - The specified input buffer is being filled with incoming data as the
bufferSwitch
call is taking place. - The specified input buffer will start to be filled with incoming data after
bufferSwitch
returns.
The correct interpretation is (1), on the following grounds:
- The driver sample code in the ASIO SDK fills buffer X before
bufferSwitch(X)
is called (seeAsioSample::bufferSwitch()
), which is inconsistent with (3). - If (2) was the correct intepretation, then the ASIO host application would
have no way of knowing when the input data is ready (there is no
inputReady
callback that the driver could use), and would have to wait until the nextbufferSwitch
call to be able to process the input, which would necessarily inflict additional latency of one buffer length. This contradicts other parts of the documentation, which state that a driver can achieve an input latency of a single buffer length ("block"); see theASIOGetLatencies()
documentation. - A DMA driver cannot implement (3) because it does not get to choose when to "receive" the buffer from the hardware.
- (3) forces the ASIO Host Application to process (or at least copy) the input
data before it can return from
bufferSwitch
, which is awkward ifdirectProcess
is false. - (2) and (3) would be awkward from an API design perspective, because they would imply that the ASIO Host Application should not be processing the specified buffer but the opposite one, which is counter-intuitive.
Therefore, the documentation could be reworded as follows:
The driver guarantees that the specified input buffer is filled with incoming data before
bufferSwitch
is called.
This also implies that, in order to minimize input latency, the driver should
call bufferSwitch
as soon as possible after a full block of input data has
been recorded by the audio hardware.
This answers the question of when the input buffer becomes readable, but it does not address a related question: for how long is the input buffer valid for; i.e. when should the ASIO Host Application stop reading from the buffer. This is addressed in the section regarding buffer validity, below.
On the output side, the bufferSwitch
callback documentation states:
The current buffer half index (0 or 1) determines the output buffer that the host should start to fill. The other buffer will be passed to output hardware regardless of whether it got filled in time or not.
This is where it gets really confusing. It can probably be assumed that "current" here means the buffer index that is specified as the parameter to the callback. The terms "should start to" could be interpreted in several different ways:
bufferSwitch
is called after the specified output buffer has been fully sent to the hardware.- The specified buffer is being sent to the hardware as the
bufferSwitch
call is taking place. - The specified buffer will be sent to the hardware immediately after
bufferSwitch
returns.
The correct interpretation is (1), on the following grounds:
- The host sample code in the ASIO SDK fills the specified buffer in
bufferSwitch
, which is inconsistent with (2). - (3) forces the ASIO Host Application to generate output data before it can
return from
bufferSwitch
, which is awkward ifdirectProcess
is false. - A DMA driver cannot implement (3) because it does not get to choose when to "send" the buffer to the hardware.
- (2) would be awkward from an API design perspective, because that would imply that the ASIO Host Application should not be processing the specified buffer but the opposite one, which is counter-intuitive.
- If (3) were true, then the driver could send the output data to the hardware
as soon as
bufferSwitch
returns, which means there would be little reason forASIOOutputReady()
to exist (see below).
Therefore, the documentation could be reworded as follows:
The driver guarantees that the specified output buffer has been fully sent to the hardware before
bufferSwitch
is called.
This answers the question of when the output buffer becomes writeable, but it does not address a related question: for how long is the output buffer valid for; i.e. when should the ASIO Host Application stop writing to the buffer. This is addressed in the section regarding buffer validity, below.
Upon entry of the bufferSwitch
callback, the following guarantees hold:
- The specified input buffer can be read from.
- The specified output buffer can be written to.
What is still not clear, however, is how long these buffers are valid for. Indeed, eventually the driver will start writing data to the input buffer (which means valid data cannot be read from it anymore), and will start reading data from the output buffer (which means the host must have stopped writing to it by then).
The ASIO SDK documentation does not address these questions explicitly.
Intuitively, a C/C++ programmer might assume that these buffers are only valid
for the duration of the bufferSwitch
call, because, in the absence of
documentation, one should assume that a pointer is only valid for the duration
of the call in which it is used. However, the way the ASIO documentation is
phrased, and in particular the existence of concepts such as directProcess
and
ASIOOutputReady
, casts doubt on that assumption. Section 5 "Audio Streaming"
states:
This is a provision for drivers which require an immediate return of the
bufferSwitch()
orbufferSwitchTimeInfo()
callback on interrupt driven systems like the Apple Macintosh, which cannot spend too long in the callback at a low level interrupt time. Instead the application will process the data at an interruptible execution level and inform the driver once it finished the processing.
In other words: it is possible to return from the callback first, and then
process the data (i.e. read and write buffers) later. This can only work if the
validity of the buffers is extended beyond the bufferSwitch
callback itself.
On the other hand, this seems to be contradicted by a previous statement from the same section:
Upon return of the callback the application has read all input data and provided all output data.
There are multiple ways we could try to reconcile these statements:
- In the latter statement, perhaps "all input data" and "all output data" refers to the other buffer half, not the buffer half that is specified in the callback that we are returning from.
- Perhaps the latter statement is merely a recommendation and is non-normative;
i.e. it is recommended that the host processes the data in
bufferSwitch
, but the driver should not rely on that. - Perhaps the API contract is different depending on whether
bufferSwitch
was called withdirectProcess
enabled or not. IfdirectProcess
is enabled, the buffers are only valid for the duration of the callback. Otherwise, they stay valid beyond the duration of the callback.
It is worth discussing what the consequences would be if the driver and the
application disagree on this contract, i.e. the driver starts reading/writing a
buffer too soon after the application returns from bufferSwitch
. Aside from
the obvious consequence of messing up the audio data, which will cause a likely
audible "glitch" (discontinuity), it also qualifies as a race condition (two
threads are reading/writing the same memory location at the same time), which is
undefined behaviour. This is a big deal - there is no such thing as a
benign race condition, and such a situation could compromise the
integrity of the process.
In general, it is prudent for a driver to assume that, in general, the
application might access buffer 0 as soon as bufferSwitch(0)
is called, up
until the application returns from bufferSwitch(1)
(and vice-versa). In fact,
there is one more case where a buffer is accessed outside of bufferSwitch
:
before streaming starts, when the application pre-fills buffer 1 before calling
ASIOStart()
(see section about priming, below).
Note that a driver that exposes DMA buffers cannot really use this approach,
because both buffers are under the control by the application for the duration
of a bufferSwitch
call, including the location that the hardware might
currently be reading/writing at. However, race conditions are inherent to such
drivers anyway, because their mode of operation if fundamentally designed around
the hardware and the application racing against each other.
ASIOOutputReady()
is an optional ASIO feature that allows the host to notify
the driver when the host has finished filling the output buffers that were
provided in the last bufferSwitch
call. The ASIO SDK documentation does not
mention any restriction as to the context in which this function can be used; it
should be safe to assume that it can be called from within bufferSwitch
(in a
reentrant manner) or from a separate thread between bufferSwitch
calls.
One could naively assume that simply returning from bufferSwitch
would achieve
the same effect, but that is not always true, because:
- It is not clear if the driver can safely assume that output buffer
X
is ready as soon asbufferSwitch(X)
returns (see the discussion about validity of buffers, above). - The application might have filled the output buffer some time before it
wants to return from the
bufferSwitch
callback. For example, if the application generates the output before processing the input.
The ASIO SDK documentation explains ASIOOutputReady()
using the concept of
"DMA buffers", which is confusing for drivers that are not DMA-based in the
first place. In a "pure" DMA-based driver where the hardware is reading the ASIO
buffers directly, then ASIOOutputReady()
does not provide any benefit because
the driver doesn't have to do anything when the output buffer is "ready": the
data is already in place and will be read directly by the hardware as the clock
advances. However, drivers that do not directly expose DMA buffers cannot work
that way: they need to know when the output buffer is filled because they need
to take action before the hardware can play it (such as copying the data,
post-processing it, or sending it elsewhere), and waiting too long before taking
such action could lead to an output buffer underrun.
If the application does not support ASIOOutputReady()
, then the driver needs
to wait at least until bufferSwitch
returns before it could process the output
buffer, increasing the likelihood of missing deadlines. Worse, if the driver
assumes that the host application can continue writing to the output buffer
even after bufferSwitch
has returned (see previous section), then it needs to
wait at least until the next bufferSwitch
call has returned before it can
process the output buffer of the previous call. If the driver waits this long,
then it has to keep an additional output buffer in the hardware queue, otherwise
buffer underrun will occur around bufferSwitch
time. This is why the
ASIOGetLatencies()
documentation states that the output latency is doubled
under this mode of operation. The point of ASIOOutputReady()
is to avoid this
problem by allowing the driver to do what it needs to do with the output buffers
before underrun might occur.
One caveat of ASIOOutputReady()
is that, if it is called outside of a
bufferSwitch
callback, it could race with that callback, and then it is not
clear if the ASIOOutputReady()
call applies to the previous output buffer or
to the one that was passed to the bufferSwitch
call that just happened. This
could mislead the driver into thinking that the application has stopped using
the new output buffer, resulting in race conditions. For this reason it might
be safer to make the driver block until the application has called
ASIOOutputReady()
before making the next call to bufferSwitch
.
The ASIO SDK documentation states in section 5:
The audio streaming starts after the ASIOStart() call. Prior to starting the hardware streaming the driver will issue one or more
bufferSwitch()
orbufferSwitchTimeInfo()
callbacks to fill its first output buffer(s). Since the hardware did not provide any input data yet the input channels' buffers should be filled with silence by the driver.
Section 6 states:
the same system time is used for the first two callbacks, as the hardware did not start yet
The ASIOStart()
documentation states:
The first call to the hosts'
bufferSwitch(index == 0)
then tells the host to read from input buffer A (index 0), and start processing to output buffer A while output buffer B (which has been filled by the host prior to callingASIOStart()
) is possibly sounding
What this means in practice is that the host should fill buffer 1 prior to
calling ASIOStart()
. On ASIOStart()
the driver fills all input buffers with
silence, then calls bufferSwitch(0)
. After bufferSwitch(0)
returns (which
indicates the application is done using buffer 1), the driver sends buffer 1 to
the hardware. The driver can then start the hardware and start reading into
input buffer 1. On ASIOOutputReady()
, the driver sends output buffer 0 to the
hardware. When the driver has finished filling input buffer 1, it can call
bufferSwitch(1)
. Upon return the driver can start filling buffer 0, and we are
now in steady-state.
The above assumes that the application is allowed to use buffers until it
returns from the next bufferSwitch
call (see section about buffer validity),
and that the application supports ASIOOutputReady()
. If the application does
not support ASIOOutputReady()
, the driver will have to call bufferSwitch(1)
earlier, before starting the hardware (passing silence again) so that it can
put another output buffer in the queue (see the section about
ASIOOutputReady()
for why it has to do this). Then the driver enters
steady-state, with one additional buffer in the output queue at all times. This
seems to be the scenario described in the example in section 6 of the ASIO SDK
documentation.
In the presence of such ambiguity around buffer validity, the best option for application developers and driver developers is to apply the Robustness principle to avoid race conditions:
- Paranoid driver developers should assume that the application will access
buffer 0 as soon as
bufferSwitch(0)
is called, up untilbufferSwitch(1)
returns (and vice-versa).- If
ASIOOutputReady()
is called, the driver can assume the application is not using the output buffer anymore, but be careful about such calls racing againstbufferSwitch
. You might want to wait forASIOOutputReady()
to be called before making the nextbufferSwitch
call.
- If
- Paranoid host application developers should assume that buffer 0 will cease to
be valid as soon as they return from
bufferSwitch(0)
(same for the other buffer).- In any case, applications should always call
ASIOOutputReady()
when they have stopped using an output buffer.
- In any case, applications should always call
An example sequence of calls could look as follows. If the application does not
support ASIOOutputReady()
:
- Application fills output buffer 1.
- One could argue the application could wait right up until step 6 before
doing that, but that would go against the documentation of
ASIOStart()
.
- One could argue the application could wait right up until step 6 before
doing that, but that would go against the documentation of
- Application calls
ASIOStart()
. - Driver initially fills the input buffers with silence.
- Driver returns from
ASIOStart()
. - Driver calls
bufferSwitch(0)
. - Application returns from
bufferSwitch(0)
. - Driver sends output buffer 1 to the hardware.
- Driver calls
bufferSwitch(1)
. - Application returns from
bufferSwitch(1)
. - Driver sends output buffer 0 to the hardware.
- Driver starts hardware playback and recording.
- Driver waits for incoming data to arrive, stores it in buffer 0.
- Driver calls
bufferSwitch(0)
. - Application returns from
bufferSwitch(0)
- Driver sends output buffer 1 to the hardware.
- Driver waits for incoming data to arrive, stores it in buffer 1.
- Driver calls
bufferSwitch(1)
. - Application returns from
bufferSwitch(1)
- Driver sends output buffer 0 to the hardware.
- Steady-state: go to step 12.
If the application supports ASIOOutputReady()
:
- Application advertises support for
ASIOOutputReady()
by calling it. - Application fills output buffer 1.
- Application calls
ASIOStart()
. - Driver initially fills the input buffers with silence.
- Driver returns from
ASIOStart()
. - Driver sends output buffer 1 to the hardware.
- One could argue that the application should call
ASIOOutputReady()
before the driver does that, for better consistency with steady-state operation. However, the application might disagree, leading the driver to wait forever.
- One could argue that the application should call
- Driver starts hardware playback and recording.
- Driver calls
bufferSwitch(0)
. - Application returns from
bufferSwitch(0)
.- Before or after this step, the application calls
ASIOOutputReady()
. Upon receiving this call the driver sends output buffer 0 to the hardware.
- Before or after this step, the application calls
- Driver waits for incoming data to arrive, stores it in buffer 1.
- Driver calls
bufferSwitch(1)
. - Application returns from
bufferSwitch(1)
.- Before or after this step, the application calls
ASIOOutputReady()
. Upon receiving this call the driver sends output buffer 1 to the hardware.
- Before or after this step, the application calls
- Driver waits for incoming data to arrive, stores it in buffer 0.
- Steady-state: go to step 9.
ASIO is a trademark and software of Steinberg Media Technologies GmbH