Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How OutputStream callback is supposed to stop on eof? #306

Closed
jupiterbjy opened this issue Jan 19, 2021 · 11 comments
Closed

How OutputStream callback is supposed to stop on eof? #306

jupiterbjy opened this issue Jan 19, 2021 · 11 comments

Comments

@jupiterbjy
Copy link

jupiterbjy commented Jan 19, 2021

Here's what I find confusing.

On documents about Stream callback, it says:

Note

The callback must always fill the entire output buffer, no matter if or which exceptions are raised.

While so, document don't mention about hot to stop a callback.
So I made a guess that am I supposed to keep buffer zero-filled all time even if eof was reached.

However, if an audio files reaches EOF and I need to stop the callback, how a callback is supposed to know whether it's EOF or not? If it's possible, then what's the designed / intended way of doing so?


Below are things I've tried of:

  • Checking if last one or few blocks are zero-filled
  • assert soundfile.SoundFIle.tell() != soundfile.SoundFIle.frames, check if current frame and total frames equals.
  • Check if last returns of soundfile.SoundFIle.tell() is same.

First one immediately fails if there's silence in audio files, and calculation cost may be expensive, thus violating following line:

The PortAudio stream callback runs at very high or real-time priority. It is required to consistently meet its time deadlines.

Second fails as SoundFile.buffer_read_into() stop advancing cursor earlier than frames count and keep zero-filling the buffer. I can't check if last frame was same in case of

# current frame increasing by len(buff) or callback frame parameter
buff/.tell()/.frames
1136 9332240 9338880
1136 9333376 9338880
1136 9334512 9338880
1136 9335648 9338880
1136 9336784 9338880
1136 9337856 9338880  # <- stops increasing
1136 9337856 9338880
1136 9337856 9338880
. . .
1136 9337856 9338880

From this, I thought of third. If current frame(aka cursor) position is same and there's no duplicated frames, then I could check if last frame was same or not, and this works so far. Code as follows.

https://github.com/jupiterbjy/CUIAudioPlayer/blob/4495df15ae36fa98a2c84a5cc7f619e2abc25770/Demo/SoundDeviceCallbackTest.py

But this is dependent on SoundFont Module thus I don't think this is any closer to designer's intention.

@mgeier
Copy link
Member

mgeier commented Jan 20, 2021

However, if an audio files reaches EOF and I need to stop the callback, how a callback is supposed to know whether it's EOF or not? If it's possible, then what's the designed / intended way of doing so?

Well that's your job as implementer of the callback function.

If you want the callback function to not be called anymore, you should raise CallbackStop or CallbackAbort (see https://python-sounddevice.readthedocs.io/en/0.4.1/api/misc.html#sounddevice.CallbackStop).

Checking if last one or few blocks are zero-filled

Why would a block be zero-filled? Unless the audio file contains zeros?

[...] and calculation cost may be expensive, thus violating following line:

The PortAudio stream callback runs at very high or real-time priority. It is required to consistently meet its time deadlines.

Checking for zero has very consistent timing and is quite fast, so this is not a problem.

Still, checking for zeros is not the right answer for different reasons.

assert soundfile.SoundFIle.tell() != soundfile.SoundFIle.frames, check if current frame and total frames equals.

The assert statement is not the right tool for this, but checking for the current and the last frame in the sound file sounds like a good idea.

Check if last returns of soundfile.SoundFIle.tell() is same.

I think with this you would always be one block too late.

Second fails as SoundFile.buffer_read_into() stop advancing cursor earlier than frames count and keep zero-filling the buffer.

Why do you think anything is zero-filled?

The documentation says "The rest of the buffer is not filled with meaningful data."
See https://python-soundfile.readthedocs.io/en/0.10.3post1/#soundfile.SoundFile.buffer_read_into.

BTW, you should definitely not ignore the return value!

You should think about all the possible situations that could happen after calling buffer_read_into():

  • zero frames have been read
  • fewer than blocksize frames have been read
  • exactly blocksize frames have been read

You should think about whether those are really all possible cases and what to do in each case.

Here is a similar situation in one of the examples:

if len(data) < len(outdata):
outdata[:len(data)] = data
outdata[len(data):] = b'\x00' * (len(outdata) - len(data))
raise sd.CallbackStop
else:
outdata[:] = data

@jupiterbjy
Copy link
Author

Thanks for reply, as always, it's definitely helpful to get direct opinion from dev!

For my addiction with zero-filling - when I leave SoundFile.buffer_read_into() filling meaningless data to buffer, there's nothing I can hear, so I thought it was zero-filling data. That was a hasty reasoning.

About the example play_long_file, I admit that I overlooked it after having two confusion on it.

Document says:

The third argument holds the number of frames to be processed by the stream callback. This is the same as the length of the input and output buffers.

But the example uses len(outdata) which has to be same as frames. This confused me that if buffer is dynamic or not, and whether I should prefer one over another.

Additionally, for what I've seen that example is set to use buffer dtype of float32, however - that example is filling buffer with byte strings. Noticing that, I tried with this callback.

def stream_cb(data_out, frames: int, time, status: sd.CallbackFlags) -> None:
    assert not status
    data_out[:] = b"\x00" * frames

This immediately caused error.

# Skipped scripts names
  File "E:\github\CUIAudioPlayer\SoundModule.py", line 177, in stream_cb
    data_out[:] = b"\x00" * frames
ValueError: could not convert string to float: ''

Since example scripts does not cover that if block, scripts just skips over and go else block and I couldn't test if it crashes or not. This was even more confusing to me.

Other than that, there's still more questions I'd like to ask, but I'm afraid it doesn't belong to this issue anymore. If you don't mind answering those, can I have your preferred email address, maybe?

@mgeier
Copy link
Member

mgeier commented Jan 21, 2021

For my addiction with zero-filling - when I leave SoundFile.buffer_read_into() filling meaningless data to buffer, there's nothing I can hear, so I thought it was zero-filling data. That was a hasty reasoning.

It may actually fill with zeros. Maybe it does it some times and sometimes not, maybe only on certain operating systems, but you should not rely on it.

You should check the documentation in such a case and not assume general behavior from a special case.

The third argument holds the number of frames to be processed by the stream callback. This is the same as the length of the input and output buffers.

But the example uses len(outdata) which has to be same as frames. This confused me that if buffer is dynamic or not, and whether I should prefer one over another.

The sentence "This is the same as the length of the input and output buffers." only appears in the docs for Stream, not in RawStream.

In the case of Stream, where NumPy arrays are used, len(outdata) == frames is true.

In the case of RawStream, outdata is a Python buffer object, that means that len(outdata) does return the length in bytes and not in frames!
In this case, len(outdata) == frames * stream.samplesize * channels is true.

https://python-sounddevice.readthedocs.io/en/latest/api/streams.html#sounddevice.Stream.samplesize

Probably we should clarify this in the docs, do you want to make a PR for that?

That's why I'm using len(outdata) in the example. I am interested in the length in bytes.

Additionally, for what I've seen that example is set to use buffer dtype of float32, however - that example is filling buffer with byte strings.

Yes, the example uses RawOutputStream, and in this case byte objects (or buffer objects) have to be used.

What are you using?

If you use OutputStream, you don't have to worry about using bytes.
You can simply use NumPy arrays and everything will be fine!

Since example scripts does not cover that if block, scripts just skips over and go else block and I couldn't test if it crashes or not. This was even more confusing to me.

It should cover the else block in the end if the file length isn't an integer multiple of the block size. Does it not do that?

You could try using a very short audio file for testing?

Other than that, there's still more questions I'd like to ask, but I'm afraid it doesn't belong to this issue anymore. If you don't mind answering those, can I have your preferred email address, maybe?

I will only answer usage questions if the question and answer is publicly accessible.
Feel free to create a new issue for each new question.

If you need to contact me privately, my e-mail address is visible in the Git commits.

@jupiterbjy
Copy link
Author

I completely overlooked the difference of buffer objects between callbacks of Raw and numpy variants. I'm terribly sorry about that. What a shame to have a headache on it for hours, now I think I'm going in right direction.

Since I tried that callback on OutputStream, proper callback should have been like this:

last_frame = -1
audio_ref  # SoundFile Object
channel = audio_ref.channels

def stream_cb(data_out, frames: int, time, status: sd.CallbackFlags) -> None:
    nonlocal last_frame
    assert not status

    if (written := audio_ref.buffer_read_into(data_out, dtype)) < frames:
        data_out[written:] = [[0.0] * channel for _ in range(frames - written)]
        raise sd.CallbackStop

    if last_frame == (current_frame := audio_ref.tell()):
        raise sd.CallbackAbort

    last_frame = current_frame

With this pause/resume is working greatly, and I can't be happier than right now. Also now I also see underflow happening too, I probably made a mistake previously.


About the PR, I think this module is overall well-documented, but I find that some methods' aren't enough.

The arguments are the same as in the callback parameter of Stream, except that indata and outdata are plain Python buffer objects instead of NumPy arrays.

Apparently "plain Python buffer objects" was a _cffi_backend.buffer, and I'm not sure if it was from memoryview - which once it's name was buffer() - or not. I couldn't find the part where the callback is called in source codes yet, and no one would be willing to type-hint with the object that requires importing "prohibited" internal module.

Like this, few parameters or arguments might need few more users are the one who is writing a callback, they, myself included, might have no clue about the exact object/interface that the signature is pointing at. And just like I couldn't, it might be hard to navigate thru source codes and find a complete signature for others too. If we know exactly what object/interface the signature is referencing about, we can simply add a type hint and IDE can help us rest of the way.

From this context, I'd like to add type hints for these documents on PR on top of the relationship between buffer, frames, samplesize, and channels - after figuring those out myself first.

I owe you a lot for all your helps!

@mgeier
Copy link
Member

mgeier commented Jan 23, 2021

I'm terribly sorry about that. What a shame to have a headache on it for hours, now I think I'm going in right direction.

Don't worry, no problem at all!

For me this is a sign that the documentation should be improved to make it less confusing for new users.

Your code example with the callback function looks good. I'm not sure whether the tell() call is really necessary. Wouldn't it work just as well without it?

And you can simplify the zero-filling:

data_out[written:].fill(0)

If I understood correctly, now you are using OutputStream (and no raw stream anymore), so why are you still using buffer_read_into()?
If you use NumPy arrays, you can (and IMHO should) use read(..., out=data_out), see https://python-soundfile.readthedocs.io/en/0.10.3post1/#soundfile.SoundFile.read.

It doesn't make a huge difference in this case, but the dtype handling would be automatic and less error-prone.

The arguments are the same as in the callback parameter of Stream, except that indata and outdata are plain Python buffer objects instead of NumPy arrays.

Apparently "plain Python buffer objects" was a _cffi_backend.buffer, and I'm not sure if it was from memoryview - which once it's name was buffer() - or not. I couldn't find the part where the callback is called in source codes yet, and no one would be willing to type-hint with the object that requires importing "prohibited" internal module.

Yes, that's the problem, the returned type isn't a public type.
And it doesn't have to.
That's the idea of the "buffer protocol" which is not defined by a Python class. It's much more low-level.
See https://docs.python.org/3/c-api/buffer.html.
I didn't really want to use this link in the URL, because I think it might be more confusing than helpful for most people.

I couldn't find the part where the callback is called in source codes yet

That's probably because the callback isn't called by the Python module!

The Python code only provides a function pointer called callback_ptr which is then magically called by the PortAudio library.

if callback is None:
callback_ptr = _ffi.NULL
elif kind == 'input' and wrap_callback == 'buffer':
@ffi_callback
def callback_ptr(iptr, optr, frames, time, status, _):
data = _buffer(iptr, frames, self._channels, self._samplesize)
return _wrap_callback(callback, data, frames, time, status)
elif kind == 'input' and wrap_callback == 'array':
@ffi_callback
def callback_ptr(iptr, optr, frames, time, status, _):
data = _array(
_buffer(iptr, frames, self._channels, self._samplesize),
self._channels, self._dtype)
return _wrap_callback(callback, data, frames, time, status)
elif kind == 'output' and wrap_callback == 'buffer':
@ffi_callback
def callback_ptr(iptr, optr, frames, time, status, _):
data = _buffer(optr, frames, self._channels, self._samplesize)
return _wrap_callback(callback, data, frames, time, status)
elif kind == 'output' and wrap_callback == 'array':
@ffi_callback
def callback_ptr(iptr, optr, frames, time, status, _):
data = _array(
_buffer(optr, frames, self._channels, self._samplesize),
self._channels, self._dtype)
return _wrap_callback(callback, data, frames, time, status)
elif kind == 'duplex' and wrap_callback == 'buffer':
@ffi_callback
def callback_ptr(iptr, optr, frames, time, status, _):
ichannels, ochannels = self._channels
isize, osize = self._samplesize
idata = _buffer(iptr, frames, ichannels, isize)
odata = _buffer(optr, frames, ochannels, osize)
return _wrap_callback(
callback, idata, odata, frames, time, status)
elif kind == 'duplex' and wrap_callback == 'array':
@ffi_callback
def callback_ptr(iptr, optr, frames, time, status, _):
ichannels, ochannels = self._channels
idtype, odtype = self._dtype
isize, osize = self._samplesize
idata = _array(_buffer(iptr, frames, ichannels, isize),
ichannels, idtype)
odata = _array(_buffer(optr, frames, ochannels, osize),
ochannels, odtype)
return _wrap_callback(
callback, idata, odata, frames, time, status)
else:
# Use cast() to allow CData from different FFI instance:
callback_ptr = _ffi.cast('PaStreamCallback*', callback)

The raw buffers for the callback are prepared there:

def _buffer(ptr, frames, channels, samplesize):
"""Create a buffer object from a pointer to some memory."""
return _ffi.buffer(ptr, frames * channels * samplesize)

Once the callback has been called by PortAudio, the user-defined callback function is called there:

args = args[:-1] + (CallbackFlags(args[-1]),)
try:
callback(*args)
except CallbackStop:
return _lib.paComplete
except CallbackAbort:
return _lib.paAbort
return _lib.paContinue

we can simply add a type hint and IDE can help us rest of the way.

For the "buffer" type it might not be easy, because is isn't a native Python type.

I don't have any experience with Python type hints, but I found this issue: python/typing#593

I owe you a lot for all your helps!

I'm glad I could help!

You can repay me by helping me improve the documentation and the examples!

@jupiterbjy
Copy link
Author

Just tried out SoundFile.read() and saw it also provides fill value which is neat! Now core logic is down to just 6:

last_frame: int = -1

def stream_cb(data_out, frames: int, time, status: sd.CallbackFlags) -> None:
        nonlocal last_frame
        assert not status

        data_out[:] = audio_ref.read(frames, fill_value=0)

        if last_frame == (current_frame := audio_ref.tell()):
            raise sd.CallbackAbort

Thanks for tip, I literally forgot that SoundFile implements file protocol.


For the "buffer" type it might not be easy, because is isn't a native Python type.

Just found out that developers decided not too for the similar reason, from Raymond Hettinger himself:

Author: Raymond Hettinger | Date: 2016-07-14 20:47
Looking at the patch, I'm thinking that this endeavor isn't at all worthwhile. We don't need an ABC for everything especially if there in no pure python API involved. I recommend closing this one until there is a proven need and a clear path forward.

on the last of thread is a github issue page you've linked, yet there's no update since then.

Until then, seems like all we can do is checking out docs for general bytes-like object or dir() it, I suppose.


Your code example with the callback function looks good. I'm not sure whether the tell() call is really necessary. Wouldn't it work just as well without it?

For what I've tried until now, checking if last frame is same with current File-object cursor position returned by SoundFont.tell() was the most stable method of determining end of audio stream - which I mentioned at opening of this issue as third method I've tried. Or that's at least the best method with my limited experience with SoundFile module.

And there's additional reason that's not shown up until now as it wasn't part of minimum reproducible example.

From actual code using the callback:

last_frame: int = -1
cycle = itertools.cycle((not n for n in range(stream_manager.callback_minimum_cycle)))

# to reduce load, custom callback will be called every n-th iteration of this generator.

def stream_cb(data_out, frames: int, time, status: sd.CallbackFlags) -> None:
    nonlocal last_frame
    assert not status

    data_out[:] = audio_ref.read(frames, fill_value=0)

    if last_frame == (current_frame := audio_ref.tell()):
        raise sd.CallbackAbort

    last_frame = current_frame

    if next(cycle):
        callback(audio_info, current_frame)
        # Stream callback signature for user-supplied callbacks
        # Providing current_frame and duration to reduce call overhead from user-callback side.

Results of SoundFile.tell() is also passed to external-supplied functions so one can calculate playback progress of current file - i.e. in seconds or progress bar like following:

Since py_cui has no exposed event loops or no public API for doing such, I'm using Sounddevice callback to drive CUI elements instead!

So far this isn't putting pressure on playback, but would be safe to guess the maximum time on callback should be under buffer_frame_count / sampling_rate seconds?

@mgeier
Copy link
Member

mgeier commented Jan 27, 2021

So far this isn't putting pressure on playback, but would be safe to guess the maximum time on callback should be under buffer_frame_count / sampling_rate seconds?

That's definitely an upper limit, but the actually available time might be significantly less.

The problem is that you should always consider the worst case, that the operating system is blocking for some reason which might lead to an audible artifact (a drop-out or a click or something like this).

You can of course never guarantee anything, but you should try to keep some additional time for such cases.

Doing GUI stuff in the audio callback is normally a no-go, but I'm not sure about TUIs.

Anyway, if you don't need very short latencies and if you use a conservatively large latency setting (e.g. 100 milliseconds), you should be fine.

@jupiterbjy
Copy link
Author

jupiterbjy commented Feb 7, 2021

Sorry for reviving this again, I just cloned module while ago to see how documents - more likely, how sphinx autodoc - are working in this module. Found out that I cloned /bastibe/python-soundfile folk, and was outdated.

There's bunch more like these folks with various commit histories. Is all other than /spatialaudio folk obsolete?

@mgeier
Copy link
Member

mgeier commented Feb 7, 2021

@jupiterbjy The Github name of the soundfile module has changed quite a few times, which is indeed confusing, but please note that these are (and always have been) two completely separate projects (both still active):

One (this here) is providing the sounddevice module, the other one is providing the soundfile module.

Both use Sphinx's autodoc extension.

@jupiterbjy
Copy link
Author

jupiterbjy commented Feb 7, 2021

Oh I made a big mistake, I copy-pasted completely wrong thing while writing, mybad!

I was trying to ask about the difference with your folk and this one. Originally I was planning to make a PR on your folk, as you were replying me on this issue.

Since then I saw your latest branch in yours is midi-synth, 5 months ago and outdated. But I saw your more recent commits on /spatialaudio/python-sounddevice dates after that and not even have a PR merge commits, indicating there was no new branches in your folk.

And that's why I was trying to ask if other folks - like yours - and this one was independent or not, as I thought committing directly to another one's repository was impossible unless accepted via Pull request.

@mgeier
Copy link
Member

mgeier commented Feb 9, 2021

I think you mean "fork" instead of "folk", right?

You should use the official repo (https://github.com/spatialaudio/python-sounddevice/) to make PRs.

My personal fork (https://github.com/mgeier/python-sounddevice/) should be irrelevant to you, but you can have a look if you want!

... and not even have a PR merge commits, indicating there was no new branches in your folk.

I actually do use my fork for making PRs here, but I often merge them using a "fast-forward" merge, that's why there are no merge commits.

It is mostly a matter of the maintainers' taste whether merge commits are used or not.

And that's why I was trying to ask if other folks - like yours - and this one was independent or not, as I thought committing directly to another one's repository was impossible unless accepted via Pull request.

You can make a PR on any fork and you will see if and how the owner of the fork reacts. In some cases it makes sense to make PRs on non-official forks, but if you want to contribute to the sounddevice module, you should make your PRs at the official repo.

And yes, you can of course only directly commit to repos where you have the relevant access rights. But by means of PRs you can make changes to any repo, as long as its maintainers accept them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants