Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Force silent frames on PCM device when no audio is playing instead of closing it (to fix Jack + zita-a2j behavior) #743

Open
farshield opened this issue Dec 13, 2024 · 11 comments · May be fixed by #745

Comments

@farshield
Copy link

Feature description

As far as I understand, starting bluealsad with the "keep-alive" option like this ...

bluealsad -S --keep-alive=-1 -p a2dp-sink

... only keeps the Bluetooth transport active. If there's no audio playing from the external device, the PCM is closed and not even silent frames are written. This is a bit like an analog capture card which only starts working when you plug in an audio cable.

Most of the time this is totally fine and maybe it is even the desired behavior (e.g. efficiency, power consumption in embedded devices)?

However, there are scenarios where it is desired to have silent frames output. Currently, it's not possible to import bluealsa into Jack with zita-a2j reliably.

Additional context

When there is no audio playing and you import bluealsa into Jack with zita-a2j, then once you start playing something, the audio is muffled and distorted.

Steps to reproduce:

  1. Run bluealsa as a2dp-sink on the host device:
/usr/bin/bluealsa -S --keep-alive=5 -p a2dp-sink
  1. Make the host device discoverable:
$ bluetoothctl --agent=NoInputNoOutput
system-alias TestDevice
discoverable on
pairable on
default-agent
  1. Connect phone to host device called "TestDevice".

  2. Start Jack server (the order for this doesn't matter, you can start Jack server before as well):

jackd -d alsa -P hw:1 -C hw:2 -i 2 -o 2 -r 44100 -p 64 -n 3 -D -S
  1. Assuming no audio is playing from the phone, start zita-a2j and connect outputs:
zita-a2j -v -j bluealsa -d bluealsa -c 2 -r 44100 -p 4096 -n 3
jack_connect bluealsa:capture_1 system:playback_1 && jack_connect bluealsa:capture_2 system:playback_2
  1. Start playing some music from the phone. The audio is muffled, distorted and you can barely hear anything.

However, if you start to play music before you start zita-a2j, then it is fine. BUT, if you pause the audio and resume after a few seconds, then you have the same problem.

You could argue that zita-a2j should also be aware of the PCM state, but I still think it's counterintuitive (at least for me), to not have silent frames when a device is connected but no audio is playing.

Anyone willing to provide ideas and insights on how this could be patched in bluez-alsa?

@borine
Copy link
Collaborator

borine commented Dec 14, 2024

I was looking into this same feature last year, but got sidetracked by other priorities. I've just rebased those old patches against the latest bluez-alsa master branch and uploaded to a branch on my forked repository here: https://github.com/borine/bluez-alsa/tree/hwcompat

By opening the bluealsa plugin with bluealsa:HWCOMPAT=silence I get reasonably good results with zita-a2j, alsa_in and also alsaloop. Please try out that branch to see if it helps in your use case.

I must stress that this is experimental, prototype, code and probably contains lots of bugs. It needs proper peer review and testing; but if you are able to help with testing and debugging (and also any general comments regarding the design and code would be welcome), then I will consider raising a PR to have it merged here.

@farshield
Copy link
Author

I did some preliminary tests and it seems to work well with zita-a2j:

zita-a2j -v -j bluealsa -d bluealsa:HWCOMPAT=silence -c 2 -r 44100 -p 4096 -n 3

Silent frames are inserted even when the transport channel goes too sleep. One can see such silent frames with:

arecord -Dbluealsa:HWCOMPAT=silence -c 2 -f s16_le -r 44100 | xxd

I'm still struggling with random audio popping and sporadic 10 seconds of silence. Logs:

# Jack server logs:
JackEngine::XRun: client = bluealsa was not finished, state = Running
JackAudioDriver::ProcessGraphAsyncMaster: Process error

# zita-a2j complains with:
Detected excessive timing errors, waiting 10 seconds.

However, this only happens when I run Jack with a low buffer size:

jackd -dalsa -dhw:USB -r 48000 -p 64 -n 3

Running it with -p 512 seems much better, however this will increase the latency for the other audio devices I need.
I can generally combine different devices with different buffer sizes well, but not with Bluetooth. If you know any way around this, let me know.

@borine In the meantime, thank you for your reply and I will take a look at your commits.

@borine
Copy link
Collaborator

borine commented Dec 16, 2024

@farshield thanks for trying this.

I am not a jack expert, not even a regular jack user, so I don't know if there is anything that can be done within the jack configuration to improve the latency. My testing has all been with a jack period size of 1024 (-p 1024).

The real problem is that bluetooth audio does not offer any timing data; the bluealsa plugin must simulate the period timer by counting frames - but obviously when the device is not sending any frames this fails, so as noted in the manual page the device timer is effectively stopped. My patch here uses poll() timeouts to detect breaks in the incoming stream, and the local system clock to time the generated silence frames. The resulting drift and jitter is truly awful, and to have any hope of a steady stream the period size has to be significantly greater than the jitter. So small period sizes are simply not workable.

@farshield
Copy link
Author

I've also been in contact with the author of zita-ajbridge (Linux Audio projects at Kokkini Zita). He has the following to say about this issue:

    The only information zita-ajbridge has to control the resampling
    is the period timing. Some heavy filtering is done on this, so
    a bit of jitter shouldn't matter too much.
    
    But two things ARE important:
    
    - Try to get the *average* (over a second or so) period time as
      close as possible to the expected value. Ideally that should
      mean using an absolute timeout, but poll() doesn't support
      that (AFAIK). But dynamically adjusting the timeout value to
      compensate for processing time between timeouts may help.
    
    - Try to make the transitions between 'inserting silence' and
      'receiving samples' as clean as possible. Period timing
      should appear to be continuous when changing states, other-
      wise the delay locked loop in ajbridge will 'unlock'.

@borine
Copy link
Collaborator

borine commented Dec 17, 2024

Bluetooth audio frames arrive in chunks; as each codec frame is decoded bluealsad dumps all of the resulting audio frames to a fifo in a single write. The chunks may not be delivered at constant intervals, it depends on fragmentation when packing codec frames into AVDTP transport frames and also air link issues such as corrupted transport frames requiring to be re-sent, etc. So there is naturally a large amount of jitter resulting from relying on the delivery time of individual frames to derive an implicit timing signal for period signalling. For example, when measured by the system clock, the first period of 4096 frames may arrive in 95ms, the next 150ms later, the third 45ms after that, and so on. Overall the average period time will be close to the expected 92.879ms (subject to timer drift between the local system and the phone), but the variation between individual periods (ie the jitter) is absolutely huge.

My patch attempts to smooth this out as far as possible by ensuring there is a "start threshold" of frames buffered in the fifo before starting to read into the ring buffer, and then using short sleeps (measured by the local system timer) between reads so that they occur at roughly equal intervals, as expected by the ALSA API. The poll() timeout is dynamically adjusted by calculating a "deadline" for each read, and if insufficient frames have not been received by that deadline then silence frames are inserted to make up the shortfall so that the period is complete.

As I understand it, the timer used by poll() has a granularity of 1ms, and the timeout may be late because of kernel scheduling priorities. So it is not realistically possible to reduce the jitter below roughly 2ms per period, even if the rest of my algorithm was jitter free (which I am sure it is not, I am no expert in this). The plugin sets a lower bound of 10ms on the period size, which should be enough to cope with a jitter of 2ms, but I would still advise to use at least 20ms. Jack permits only period sizes that are a power of 2, so in this example at 44100 rate I think a period size of 1024 is the best compromise between latency and audio quality.

As to why the hw device period size is having such a big effect, I can only assume that the zita-a2j resampler has to match this period deadline to write to the jack buffer, but cannot do this because it is constrained by the jitter on its read side (a jack period size of 64 frames at 48000 rate gives a period time of 1.5ms, sufficiently below the 2ms jitter value of the bluealsa device). This is just guesswork, I have not read the zita code.

@borine
Copy link
Collaborator

borine commented Dec 18, 2024

I can generally combine different devices with different buffer sizes well ...

I've just tested an intel PCH sound card as primary with jackd and a period size of 64:

jackd -d alsa -P hw:PCH -o2 -n3 -S -p64

with a usb device as secondary via zita-a2j and a period size of 1024:

zita-a2j -v -j usb -d hw:Device -c1 -r48000 -n3 -p1024

The result is the same as when using bluealsa; jackd just spews out errors:

JackEngine::XRun: client = usb was not finished, state = Triggered
JackAudioDriver::ProcessGraphAsyncMaster: Process error
JackEngine::XRun: client = usb was not finished, state = Triggered
JackAudioDriver::ProcessGraphAsyncMaster: Process error
JackEngine::XRun: client = usb was not finished, state = Triggered
JackAudioDriver::ProcessGraphAsyncMaster: Process error

So that issue is not specific to bluealsa, clearly zita-a2j cannot always sync a device with very large period to one with very small period. Given that it is not possible to achieve low latency with Bluetooth A2DP (that is a limitation of the profile definition and independent of any specific implementation) I think that you will never be able to use Bluetooth on your jack audio system.

The only way is to raise the overall system latency. With the latest changes I just uploaded to my 'hwcompat' branch I am now getting excellent results using zita-a2j with a period size of 1024 on both zita and jackd. And also with my own target application which is alsaloop. I am thinking about tidying up the code and raising a PR in the new year.

@farshield
Copy link
Author

First of all, thanks for explaining in detail how the Bluetooth audio works and for all the tests you did.

I've just tested an intel PCH sound card as primary with jackd and a period size of 64

I always had trouble running Intel PCH sound card with Jack at lower buffer sizes, even without zita-a2j.

However, combining my USB Headphones (hw:J380) with a Scarlet USB interface (hw:USB) works totally fine:

jackd -dalsa -dhw:J380 -r48000 -p64 -n3
zita-a2j -v -j scarlet -d hw:USB -c2 -r48000 -n3 -p4096

Connecting the Scarlet capture channels to the Jack system:playback_1/2 works well, without any audio cracking, at both 1024 and 4096, even when Jack server runs at 64. Does it have to do with both being on USB and having synced clocks? I will try later with different sound cards.

I am thinking about tidying up the code and raising a PR in the new year.

Much appreciated!

@borine
Copy link
Collaborator

borine commented Dec 19, 2024

I always had trouble running Intel PCH sound card with Jack at lower buffer sizes, even without zita-a2j

Ah, I see, not normally using jack I was unaware of problems with PCH.

I've now moved my testing to the only other machine I have available: a very old raspberry pi Zero W Rev 1.1 (single core, armv6l, 1GHz) with an IQAudio DAC. I've also plugged that same USB audio adapter into it. Now, repeating the same jack config as on the PCH laptop, I can run jack with a period of 64 on the IQAudio card and the zita-a2j on the USB with a period of 4096. I get xruns every few seconds, but not the flood that was occuring with the laptop. Running jackd with a period of 128 gets rid of the xruns. So I tried bluealsa with zita-a2j, and again with a jackd period of 128 I get very good sound, with only 1 xrun in a test running for 1 hour. However, with a jackd period size of 64 I get an xrun every two seconds or so. I don't see anything that I can do to improve that, but if anyone with more experience with jack can offer some suggestions I will try them out. Still, running such a weedy machine with bluetooth and such low latency for the sound card is, I think, very impressive.

@arkq
Copy link
Owner

arkq commented Dec 19, 2024

I am thinking about tidying up the code and raising a PR in the new year.

@borine that's OK for me, however, there is one thing.

  • "silence"
    Inserts silence for capture streams, or simply drops frames for playback
    streams, whenever the transport is not acquired, thus maintaining a
    continuous stream as far as the application is concerned.

This will cover the case when the transport is not acquired, but there are also cases when the transport is acquired but due to BT link instability, frames are not delivered in timely manner.

Also, have you thought about trying to implement the "inserts silence" on the server side? I'm talking about the source PCM (capture) only. Dropping frames for playback can be done on ALSA plug-in side. Such design will allow bluealsactl open ... to benefit from such feature (if enabled) as well. Anyway, I'm not exactly sure how to implement that... :D

I think, that somehow problematic might be that currently the BT read and PCM write in every IO thread is done in a sync way (or maybe it's not an issue...). I imagine that the io_poll_and_read_bt() would have to be modified to allow poll() timeout (or add timerfd to the poll) and the overall "IO module" would have to have some sort of accounting for how many silent frames have been written to PCM, so such number of frames could be latter dropped from the decoded audio (or maybe not...).

The are some obvious cases of BT frames drops causing missing PCM frames, and there are cases of late receives. I think that in case of BT frame drops we should have some sort of PLC (packet loss concealment) for every codec. Currently PLC is available only for mSBC, LC3-SWB and LC3plus. So, that's the area of some improvement. However, PLC is applied only when the next correct frame is received, so, for injecting silent frames, that's already too late. But, when PLC is applied, it might generate a lot of PCM frames is a short period of time. So, this "inserts silence" feature will have to account for such cases and drop excessive frames.

EDIT:
After reading what I've written, I'm not entirely sure whether adding such feature to the server would be a good idea after all... :D However, I think that the server should have PLC implemented for all codecs. And the consequence of that is the fact that we need "PLC" in case when transport is not acquired :D So, no accounting for inserted frames and things like that, but in case when pcm->fd_bt == -1 the poll should not block forever, but periodically write silence to PCM client. Such thing should be rather simple to implement. Everything beyond that should go into the ALSA plug-in. What do you think?

@borine
Copy link
Collaborator

borine commented Dec 19, 2024

in case when pcm->fd_bt == -1 the poll should not block forever, but periodically write silence to PCM client

I need more time to think that through, but my initial reaction is that it would have consequences for existing installations which must be considered? Take bluealsa-aplay: at present it releases the sound card pcm when the BT stream stops, making the device available to other applications. I rely on that to permit bluealsa-aplay to run permanently on my mpd music server. But if the stream never stops and just becomes silence, how will bluealsa-aplay know when to release the card pcm? Perhaps it could use the PCM "Running" property as a toggle switch, but what about third party applications?

So I think this depends on the nature of the client application and the local context in which it is used. For ALSA applications it absolutely makes sense to have audio frames from the moment that the ALSA pcm is started and continue until it is stopped, but I need time to consider the potential consequences for other use cases.

@arkq
Copy link
Owner

arkq commented Dec 19, 2024

my initial reaction is that it would have consequences for existing installations which must be considered?

Yes, that might be true, but it's not a big change, I think. Anyway, I will think about that as well.

Take bluealsa-aplay: at present it releases the sound card pcm when the BT stream stops, making the device available to other applications. I rely on that to permit bluealsa-aplay to run permanently on my mpd music server. But if the stream never stops and just becomes silence, how will bluealsa-aplay know when to release the card pcm? Perhaps it could use the PCM "Running" property as a toggle switch, but what about third party applications?

The current implementation has one flaw. When the BT link is not reliable, the bluealsa-aplay starts/stops PCM which might be not desired if BT transport is not release/acquired. So, the better option is to indeed change that implementation to rely on "Running" property (it's like an independent change).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants