-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kernel Oops related to snd_bcm2835 #2
Comments
hello_audio doesn't use ALSA (it uses OpenMAX IL). |
Hi Ben Cheers On Mon, Apr 16, 2012 at 6:49 PM, Ben O'Steen
|
@popcornmix aplay fails with a kernel oops as well -> https://gist.github.com/2400868 - I'll try from a freshly applied image @raspberrypi So as not to confuse this issue, I'll msg you the details. |
Can you try the start.elf from here: |
Oops is repeatable with mp3's, but ogg files just fail to play. Fresh image (13/04/2012 debian6) root@raspberrypi:/mnt/foo# aplay themetal.ogg root@raspberrypi:/mnt/foo# aplay 11_Everlong_Foo\ Fighters_The\ Colour\ And\ The\ Shape.mp3 Message from syslogd@raspberrypi at Apr 13 09:02:12 ... etc... Then, attempting to play anything will hang the process (as you might expect) |
Booted fine with new start.elf file modprobe worked aplay an ogg generated a lot of noise (aplay interpreted it as a raw file) - this was through the HDMI audio as I was using HDMI at the time. apt-get'd mplayer and mpd mplayer same ogg - found audio out, looked promising, but stalled when it came to play the file. Wouldn't let me Ctrl-C or Z out of it. Raw console log of the above here: https://gist.github.com/2402267 |
aplay works fine playing raw files (interpreting them as .wav files) - when given a .wav, it plays back without distortion :) Yay, progress! I've tried .wav playback with aplay with the new start.elf with both HDMI a/v and composite video + 3.5mm audio jack out - both work. Are there any diagnostics that I could run to help work out why compressed streams seem to hang? |
I see the same hang with mplayer. We were not supporting silence correctly, so I've fixed that. Anyway, update start.elf and snd-bcm2835.ko (you'll possibly need the other changed files updated) from https://github.com/raspberrypi/firmware and it should be a little better. |
Passing silent output every so often is a symptom that the decoding thread can't keep up. For example, mplayer does this for buffer underruns: http://mailman.alsa-project.org/pipermail/alsa-devel/2011-February/037231.html Is snd_pcm_forward being called or is it actually passing silent data? Mplayer and other apps use get_delay heavily to work out the latency of playback - is this giving them numbers they expect (which might be different from accurate ones ;)) |
Swapped over the new start.elf and snd_bcm2835 module - everything works as before and decoded music plays via mplayer for a while before it had an issue. Excellent :) I need to pick a higher quality audio file, but it sounded decent before it stopped.
|
I don't believe mplayer is not keeping up. When I discard the silence requests, it sounds correct and arm is only 20% busy. Basically we have a message passing interface to GPU (vchiq) where blocks of audio are sent (copied to GPU memory space). The GPU adds these to an audio output fifo (with resampling/mixing/volume scaling). I then call what I'm guessing are the right functions to update the ALSA concept of the read pointer, but I'm guessing this isn't quite right. If anyone knows about ALSA feel free to inspect the code. |
Thanks - I'll have a poke around the code and try to encourage others to do the same. I don't think it is a question of sending accurate timing info back to clients, just values that they would believe to be the correct ones :) Module Code: https://github.com/raspberrypi/linux/tree/rpi-patches/sound/arm |
Further info: Running mplayer through strace shows that when it does hang, it hangs halfway through displaying an ioctl call:
It hangs just before "<unfinished...>", only outputing that to the terminal after Ctrl-C is pressed. If Ctrl-C is not pressed it seems like it would hang there, no sound, no movement, for as long as it is left. Running mplayer through gdb, it replicates the same hang, but curiously if the program execution is stopped (either Ctrl-C or -Z in gdb) and then restarted (continue or fg respectively), the sound will often resume playing for a short, indeterminate time afterwards. Just to confirm what you said earlier about the read pointer callbacks, running a simpler, dumber program such as ogg123 to output the sound through the ALSA driver seems to be able to play without hanging. Likewise with a simple hello_alsa style program - I can push a dodgy sine wave through the interface for as long as I please :) Also: This looks to be a helpful (if old) guide to writing an ALSA driver. Thanks for the work you are doing on this, I know what a complete nightmare it is to work with the linux sound... ecosystem. I asked a friend about writing alsa drivers: "oh hell no! Run, save yourself. I've done that once and never again." |
I compiled pyo (Python digital signal processing module) for the Rpi with portaudio as its backend. Sound is generated nicely, but after 10 seconds or so the sound drops away and this error is shown in dmesg:
I'm certainly not an expert but I though this one was in line with the above... |
I get the same vcos based error myself when the audio goes - however wrapping the transmit code with some in and out debug printk's shows that the blocking call completes. (wrapping line 767 in sound/arm/bcm2835-vchiq.c in function bcm2835_audio_write) Eg paraphrasing the kern log: vchi_bulk_queue_transmit IN |
I believe the "bcm2835_audio_write: failed on vchi_bulk_queue_transmit" only occurs after kill/control-c which is expected - we close the vchiq connection when the launching (userland) app is killed, and all the blocking calls will return with error. |
Here is a log from running ogg123 which hangs for me. The key thing is I get one write with 30240 bytes. Then I get a start trigger which I have to process from a worker thread, as triggers can come from interrupt context. Before the start messages has been sent from worker thread, I get a silence call. Basically it's already failed at this point, after 1ms. If I play the silence, then it will sound glitchy. Why did ogg123 choose to play silence? [ 103.414369] snd_bcm2835_pcm_prepare:245 .. IN |
I don't think the silence comes from ogg123 directly - I am still trying to understand the chain of functions between what the client does and what is received by the driver. I'm using mplayer, as that has a reasonably well-isolated alsa plugin. Their subversion isn't HTTP browsable, so I've put up the key file to a gist: https://gist.github.com/2465839 it's libao2/ao_alsa.c within their sources. The one thing it does very regularly is call 'get_delay' which in turn calls 'snd_pcm_delay(alsa_handler, &delay) < 0)' Only if delay is below 0, '/* underrun - move the application pointer forward to catch up */' it calls 'snd_pcm_forward(alsa_handler, -delay);' - now, does this cause the silence to be added? Need to get a version of mplayer built from source to start debugging. There is also this:
So, this again points to the silence calls coming from ALSA itself, when the player signals that an xrun has occurred. |
I've had another look at this today (there's only so much ALSA anyone can stomach) and have got somewhere: Changing the bulk transfer flag back to BLOCKED_UNTIL_QUEUED Then, altering the pointer callback to recalc the 'pos' anew: This seems to get a nice smooth playback with many of the things I have tried (ogg123, mpg123, timidity, scummvm and so on) but still has a crackling issue with mplayer. Key thing is that it hasn't yet hung on me yet! This also seems to address issue#6 but I haven't tested that extensively at all. |
Thanks for looking. Just for your information VCHI_FLAGS_BLOCK_UNTIL_QUEUED for a bulk message means the data must remain valid until it gets sent. So possibly the unhappiness in ogg123 etc was because the bulk message was blocking the user task for too long. |
I don't think so - I had added in a fake lag into the pointer callback as an experiment earlier and that seemed to improve things, which led me to doing this. At what rate is the audio_vchi_callback meant to occur? turning on a few of those audio_debugs showed that the pos update seemed to happen at a slow rate - eg it would hop 3-4k after a period of time. The hangs certainly only occur when the blocking til read flag is on, as so far it's been stable playing music, and running games via SDL (compiled the Fuse spectrum emulator for it). |
audio_vchi_callback is called every 10ms |
I wonder if we need to either have a pool of buffers and take a Eben On Sun, Apr 29, 2012 at 12:55 PM, popcornmix
|
So far, all I've been trying to do is take the code (which should work) and ahem kludge it so that ALSA gets the numbers it seems to want. I've found this diagram helpful when talking about ALSA: Most of the following is for my benefit and for anyone following along, just talking through how I think things hang together. Please correct me! We've got three sequencial buffers between the alsa client and the playback:
alsa maintains 1), and it is passed the location of the hardware read pointer (hw_ptr) in the pointer callback.
Anecdotally of course, I've had hang-free playback by switching to the VCHI_FLAGS_BLOCK_UNTIL_QUEUED but this reuses the ringbuffer and I've not had much luck fudging a fake hw_ptr to avoid overwriting by certain apps (same ones that seem to exacerbate the hang issue previously too but more testing required.) For 3) We could measure bytes in, and every 10ms there is a callback to bcm2835-vchiq.c:audio_vchi_callback with the number of bytes consumed by playback. The overwriting of the queued buffer suggests that certain apps are filling up the buffer too fast - are those spurious silence calls adding bytes to the playback tally on the audio_vchi_callback? |
Added a few atomic counters to the alsa_stream to tally up no of sent, retrieved and silence requested bytes to try to confirm the silence problem. (Sent and silence bytes were tallied up in bcm2835-vchiq.c:bcm2835_audio_write and the total retrieved added to audio_vchi_callback) Sent bytes is a count of the bytes of audio sent that are not 'silence'. Difference is clear: Apps that sound fine = No silence calls. Apps that jitter = silence calls. https://gist.github.com/2563242 (First half is mplayer, second half is mpg123. Playing the same mp3 for ~10secs each. Note how the silence bytes (3rd col) matches the total bytes retrieved (2nd col) but lagging behind it by one or two updates. ) after ~10 seconds, mplayer has sent 1744896 bytes, vchiq callback has noted 1718480 retrieved, and there has been 1716704 bytes of silence called! mpg123 has similar sent/received for the same timespan, but no silence. This is with the code (I assume) sending a header to the vchiq instance saying to play 'count' bytes of silence (m.u.silence) but with the actual bulk_transmit part skipped. Changing the silence callback to do nothing (bcm2835-pcm.c:snd_bcm2835_pcm_silence) had no effect on the sound quality. Both mplayer and mpg123 write interleaved (writei) but have differing tactics to an xrun - mplayer calls snd_pcm_forward and mpg123 just has another go ("written = snd_pcm_writei(pcm, buf, frames);") Fixing mplayer at the expense of other thingsBy altering the callbacks for silence and copy, I have got a working mplayer (and other apps that use this mode) at the expense of the other apps... /* pointer callback */
static snd_pcm_uframes_t
snd_bcm2835_pcm_pointer(struct snd_pcm_substream *substream)
{
struct snd_pcm_runtime *runtime = substream->runtime;
bcm2835_alsa_stream_t *alsa_stream = runtime->private_data;
audio_info(" .. IN\n");
audio_debug("pcm_pointer... (%d) hwptr=%d appl=%d pos=%d\n", 0,
frames_to_bytes(runtime, runtime->status->hw_ptr),
frames_to_bytes(runtime, runtime->control->appl_ptr),
alsa_stream->pos);
audio_info(" .. OUT\n");
return bytes_to_frames(runtime, alsa_stream->pos);
}
static int snd_bcm2835_pcm_copy(struct snd_pcm_substream *substream,
int channel, snd_pcm_uframes_t pos, void *src,
snd_pcm_uframes_t count)
{
int ret;
struct snd_pcm_runtime *runtime = substream->runtime;
bcm2835_alsa_stream_t *alsa_stream = runtime->private_data;
audio_info(" .. IN\n");
audio_debug("copy.......... (%d) hwptr=%d appl=%d pos=%d\n",
frames_to_bytes(runtime, count), frames_to_bytes(runtime,
runtime->
status->
hw_ptr),
frames_to_bytes(runtime, runtime->control->appl_ptr),
alsa_stream->pos);
/*
.... -> you copy the given amount of
data (count) at the specified pointer (src) to the specified offset (pos)
on the hardware buffer.
When coded like memcpy-like way, the copy would be like:
my_memcpy(my_buffer + frames_to_bytes(runtime, pos), src,
frames_to_bytes(runtime, count));
*/
ret =
bcm2835_audio_write(alsa_stream, frames_to_bytes(runtime, count),
src);
// Update the "pointer"
alsa_stream->pos += frames_to_bytes(runtime, count);
alsa_stream->pos %= alsa_stream->buffer_size;
audio_info(" .. OUT\n");
return ret;
}
static int snd_bcm2835_pcm_silence(struct snd_pcm_substream *substream,
int channel, snd_pcm_uframes_t post,
snd_pcm_uframes_t count)
{
// int ret;
struct snd_pcm_runtime *runtime = substream->runtime;
bcm2835_alsa_stream_t *alsa_stream = runtime->private_data;
audio_info(" .. IN\n");
audio_debug("silence....... (%d) hwptr=%d appl=%d pos=%d\n",
frames_to_bytes(runtime, count), frames_to_bytes(runtime,
runtime->
status->
hw_ptr),
frames_to_bytes(runtime, runtime->control->appl_ptr),
alsa_stream->pos);
/*
The role of silence callback is to set the given amount (count) of silence
data at the specified offset (pos) on the hardware buffer. Suppose that the
data format is signed (that is, the silent-data is 0), and the
implementation using a memset-like function would be like:
my_memcpy(my_buffer + frames_to_bytes(runtime, pos), 0,
frames_to_bytes(runtime, count));
*/
//ret =
// bcm2835_audio_write(alsa_stream, frames_to_bytes(runtime, post),
// NULL);
alsa_stream->pos += frames_to_bytes(runtime, count);
alsa_stream->pos %= alsa_stream->buffer_size;
audio_info(" .. OUT\n");
return 0;
} |
Thanks for the investagation benosteen. Rather that fixing the calls to pcm_silence, I'd like to know why pcm_silence is getting called. I'm guessing if you run with a working ALSA driver pcm_silence does not get called. |
What I discovered last night was that snd_pcm_silence did not mean play silence or catch up as I had thought, but was a command to erase part of the fake ring buffer. Take a look at how sound/soc/blackfin/bf5xx-tdm-pcm.c does its silence and copy ops. From logging the data about what the offsets are (pos/post params passed to the bcm driver), the silence offset lags behind the alsa_stream->pos ptr and effectively ask for the buffer that was just played to be memset to zero. In this case, where there is no hw ring buffer to clear indirectly but a fifo queue, you can change the pcm_silence call to just return 0 and do nothing else. This still leaves the pointer callback - the blackfin driver above is also interesting in this respect too. |
Is there a way to tell if the silence problem is affecting a program? I'm trying to get pianobar working, but the sound is very choppy and I'm wondering if it might be related to this. mpd works fine for me |
There's been a firmware update that could help ALSA issues. Please update and test. |
@popcornmix Ah, cool. Do you know if that firmware update is included in the Debian wheeze image that's currently being tested, or will I have to figure out how to install manually from here? |
No, it's newer than the Wheezy image. You can update firmware with instructions here: |
pianobar no longer gets stuck during playback, and playback is significantly better. It's still very choppy, though, but that could be a different cause (pianobar goes to >90% CPU usage when playing... usually that's a sampling problem, but I'm not sure what it is in this case. Maybe unaccelerated AAC decode just takes that much juice? Also, I'm using tsocks to run over an SSH proxy, so maybe it's counting the encryption time as part of the process? I will poke some more.) |
Closing as I believe the original oops issue is fixed. |
Hello people! Can anyone tell me what to do to fix this issue in current raspbian (fresh 2013-02-09 with packages updated)? My mplayer fails to play anything exactly as described, while aplay does play wav files after modprobing. |
Oh, sorry to mislead, it fails only under a regular user (aplay fails as well), everything works fine under root. Anyone knows if it's the desired effect? |
Try the updated stable branch ( 3.6.xx ) via sudo rpi-update or the testing branch ( 3.8.xx ) via sudo BRANCH=next rpi-update as ALSA fixes have been integrated last week. Remember to move (backup) /etc/asound.conf out of /etc as it's not needed anymore. To get back to the stable tree run: sudo rm /boot/.firmware_revision and sudo rpi-update. |
Hello there and thanks for you answer! Just updated via
The output is different from yesterday's. |
What does groups report?
make sure it contains audio. |
Huh, finally! The PS. How difficult is it to remove the need to add users to |
On Ubuntu, new users added through the graphical tools are in the 'audio' group by default (Unless you specifically prevent them from being added). While this does provide simplicity (eg, things work), it does so at the cost of security. Conventional *nix practice is to only add new users to the group 'users' by default, then add them to the appropriate groups for their privileges manually (ie. you have to give the user the priveleges, as opposed to needing to deny them). |
Thanks for good explanation! |
kernel: net: bcmgenet: Workaround #2 for Pi4 Ethernet fail See: raspberrypi/linux#3108 kernel: drm/vc4: Add missing NULL check to vc4_crtc_consume_event kernel: vc4-kms commits backported See: raspberrypi/linux#3138 firmware: AWB: Allow user to set manual gains for the RPi algorithm firmware: arm_loader: Fix disabling of audio before reboot firmware: fixed up the custom CVT values to use correct group firmware: hdmi: Add independent table entries for hdmi_timings on each display See: #1218
hello_audio builds and runs fine (alarmingly loud ;)) but modprobing the new alsa driver in and trying to play a song using mplayer causes a kernel oops:
Init:
Then, on SSH terminal or normal terminal:
modprobe snd_bcm2835
mplayer foo.mp3
(mplayer fails to find any audio hw to playback on)
then attempting to ls the mp3 directory will kick off the error.
console log: https://gist.github.com/2400335
On primary tty on the RPi, the main kernel oops is shown:
From /var/log/syslog: https://gist.github.com/2400347
The text was updated successfully, but these errors were encountered: