Firmware transaction timeout - raspistill hang #1552
Comments
IDK why raspinfo hangs now (perhaps I never tried raspinfo after hitting this on 5.4.83 ??). |
On 5.4.83
|
Confirming I have the same issue using 5.4.83-v8+ and 5.10.17-v7l+, but no issue with 4.19.97-v7l+. |
Please try upgrading just the firmware and not the kernel. Almost all of the camera stack is on the firmware side, so I'm expecting the issue to be there rather than in the kernel. If you have a reliable way of reproducing this, then you can use |
Thanks, @6x9, for the tip. I will put both kernel and firmware from 5.4.81 (problem not seen) on my rpi4 & then will do the above to update just firmware to 5.4.83 stepwise using your link, above. Is there any helpful debug information I can collect? |
Ok, did It appears to be running ok (no FW hang yet), but I ran raspinfo.nohang-5_4_81-fa9d0f5299d132240c94fe6e7065b9c53897d725.txt |
Ran through these in order, letting them run for 1/2-1 hr:
The last one (453e49bdd87325369b462b40e809d5f3187df21d) failed in 11 minutes. Tried
|
Linking to the forum thread https://www.raspberrypi.org/forums/viewtopic.php?f=43&t=304989 Thank you for testing. 453e49bdd87325369b462b40e809d5f3187df21d does have an update to the IMX477 HQ camera driver, but that should only be for the high framerate mode to avoid image quality issues. It shouldn't affect stills captures at all, but at least it's something to investigate. |
That is not a camera module made by Raspberry Pi. It is a third party clone of the v1 module (Omnivision OV5647 5MPix). The assert logs that had been provided on the forum thread imply that the camera module isn't shutting down cleanly, and is locking up the camera subsystem (and possibly other parts) on the GPU. |
so, put I will try that, but I believe the vcdbg commands hang... |
Nice! This time the problem hit really fast. And the vcdbg seems to have worked: |
I don't see any of the Are the "dump_stack failed" expected? Or are these also caused by this problem? Anything else I can capture from the failure that would help debug? |
vcdbg passes back debug information from the VideoCore firmware, so all the debug output printed comes from that. And it's all closed source. |
I too have the same issue and am using the V2 camera from Amazon. |
OK. Please let me know if there is anything else I can provide for debug or if you'd like me to try a potential fix/debug code. |
I'm the original poster on the
I'm the original poster (PiPiParty) of that thread, lemme know if there is any testing I can do, I've got the HQ camera. |
So... I'm wondering: I see multiple forum discussions and git issues on very similar problems to this, and all of them just stop with no resolution. With the firmware being a closed component, there is no way to assist with debugging and fixing it. Is there any way to know if anyone is even working on this and/or whether there is any progress? I believe this is the right place to ask this - if not, can anyone here point me in the right direction? I'm asking because we have many rpi's running our application, which depends on networking as well as the camera. The wireless networking appears to be much more stable on the latest releases, so we want to be on the latest. However, we don't feel comfortable with a mismatch of kernel/firmware/everything else. |
I've been trying unsuccessfully to reproduce this for most of today. With a freshly run
Edit: Perhaps I ought to remove the sleep from the loop. And without the sleep, it has failed! |
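For anyone else trying to reproduce this, a back-to-back capture loop like the one described above could be sketched roughly as follows. This is only an illustration: the exact raspistill options used in this thread were not preserved, so `CAPTURE_CMD`, the output path, the 30 s timeout and the function names `capture_once` and `reproduce` are all assumptions. `-n`, `-t` and `-o` are standard raspistill options.

```python
import subprocess
import time

# Assumed capture command: the exact raspistill options used in this thread
# were not preserved. -n (no preview), -t (delay in ms) and -o (output file)
# are standard raspistill options; the path is a placeholder.
CAPTURE_CMD = ["raspistill", "-n", "-t", "1000", "-o", "/tmp/repro.jpg"]


def capture_once(cmd, timeout_s):
    """Run one capture; return 'ok', 'failed', 'timeout' or 'missing'."""
    try:
        proc = subprocess.Popen(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except FileNotFoundError:
        return "missing"
    try:
        return "ok" if proc.wait(timeout=timeout_s) == 0 else "failed"
    except subprocess.TimeoutExpired:
        # Best effort only: a process stuck inside the kernel may not die,
        # so we deliberately do not wait for it here.
        proc.kill()
        return "timeout"


def reproduce(cmd=CAPTURE_CMD, max_runs=1000, timeout_s=30.0, sleep_s=0.0):
    """Capture repeatedly until a run does not succeed.

    Returns (run_number, status). sleep_s defaults to 0 because the failure
    reported above only showed up once the sleep was removed from the loop.
    """
    for run in range(1, max_runs + 1):
        status = capture_once(cmd, timeout_s)
        if status != "ok":
            return run, status
        time.sleep(sleep_s)
    return max_runs, "ok"
```

Calling `reproduce()` on an affected system should eventually return something like a run number with the status `timeout`; on a machine without raspistill it returns immediately with `missing`.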
Last line of
I see no asserts or exceptions on the VC side. dmesg error stack:
|
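If it helps anyone checking their own logs, the kernel-side symptom can be picked out of dmesg with a small filter. This is a sketch under assumptions: the marker text is taken from this issue's title, the match is a plain case-insensitive substring search, and the helper names `find_timeouts` and `read_dmesg` are made up for illustration.

```python
import subprocess

# Marker text taken from this issue's title; matching is a simple
# case-insensitive substring search, not a parse of the kernel message format.
MARKER = "firmware transaction timeout"


def find_timeouts(log_text, marker=MARKER):
    """Return the kernel log lines that mention the timeout marker."""
    needle = marker.lower()
    return [line for line in log_text.splitlines() if needle in line.lower()]


def read_dmesg(timeout_s=10.0):
    """Return dmesg output, or '' if dmesg is missing, denied or too slow."""
    try:
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, timeout=timeout_s
        )
    except (OSError, subprocess.TimeoutExpired):
        return ""
    return result.stdout
```

`find_timeouts(read_dmesg())` then gives the matching lines, or an empty list if the timeout has not been logged. On systems where dmesg is restricted it may need to be run as root.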
What state is the VPU in at that point? Do vcgencmds work? What do "sudo vcdbg malloc" and "sudo vcdbg reloc" report? |
The VPU was in a locked state, but curiously only after I managed to do a vcdbg log msg. |
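Since some of these commands can hang once the VPU is locked up, it may be safer to collect them with a timeout on each one. The following is a rough sketch, not an official tool: the command list uses only commands named in this thread, while the 10 s timeout, the ordering and the helper names `run_bounded` and `collect` are assumptions. "malloc" and "reloc" are placed first because they reportedly do not rely on cooperation from the VPU.

```python
import subprocess

# Commands named in this thread. Order and timeout are assumptions:
# "malloc" and "reloc" go first because they reportedly do not need the
# VPU to cooperate, so they are the least likely to hang.
DIAG_CMDS = [
    ["sudo", "vcdbg", "malloc"],
    ["sudo", "vcdbg", "reloc"],
    ["vcgencmd", "version"],
    ["sudo", "vcdbg", "log", "assert"],
    ["sudo", "vcdbg", "log", "msg"],
]


def run_bounded(cmd, timeout_s=10.0):
    """Run cmd; return (status, output) with status in
    'ok', 'failed', 'timeout' or 'missing'."""
    try:
        proc = subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
    except OSError:
        return "missing", ""
    try:
        out, _ = proc.communicate(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Do not wait after kill(): a command blocked in the kernel may
        # never exit, and waiting would hang this script as well.
        proc.kill()
        return "timeout", ""
    return ("ok" if proc.returncode == 0 else "failed"), out


def collect(cmds=DIAG_CMDS, timeout_s=10.0):
    """Run each command in order and map its command line to its result."""
    return {" ".join(cmd): run_bounded(cmd, timeout_s) for cmd in cmds}
```

A `timeout` status for `vcgencmd version` alongside `ok` for `vcdbg malloc` would at least record which commands were still responsive at the time of the failure.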
I've now reproduced too on a CM4 with 477. Something in camplus_close is likely taking longer for some odd reason, and we're hitting the 2s component destroy timeout. |
"log msg". "malloc" and "reloc" don't rely on cooperation from the VPU. |
My whole system locked up at that point. Had to power cycle to get back. |
Reproduced with debugger. |
Nice work - surely we are but moments away from a fix... |
Hmm, don't hold your breath there. Internal issue raised. (vc4#46) |
Patch merged to the internal repo - it should be in the next rpi-update release. |
Way awesome guys! Assuming I can just rpi-update to grab it, I'll do that later today & exercise it a bit. Thanks!! |
Great news, thanks all!! |
Last night, I did the rpi-update and it picked up c0cf93a133dab106439c208f03d32155bc19e432 plus 5.10.25-v7l+... still running fine after 12 hours. Before, the failure usually occurred after just a few minutes. This IS great news! After a bit more run time, do I close this issue or is that done by someone else? |
Close it when you feel ready. |
When attempting to run "sudo rpi-update" on my Pi4+, I get this message:
Not sure if I can safely ignore this warning? The machine is located a half hour drive away from me. So I would rather not have to go there if it fails to boot :-) |
The warning is aimed at old Raspbian images with 56MB boot partitions. I'm sure with 255MB you'll be fine, but you can always check the free space with the command |
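For a remote machine, the same free-space check can also be scripted before starting the update. This is a minimal sketch: `/boot` is assumed to be the boot partition's mount point, and the 100 MB threshold and the function names `free_mb` and `enough_room` are illustrative assumptions, not values from rpi-update.

```python
import shutil


def free_mb(path="/boot"):
    """Return the free space, in MiB, of the filesystem containing path."""
    return shutil.disk_usage(path).free / (1024 * 1024)


def enough_room(path="/boot", needed_mb=100.0):
    """True if the filesystem has at least needed_mb MiB free.

    needed_mb is a hypothetical safety margin chosen for illustration;
    it is not a figure taken from rpi-update.
    """
    return free_mb(path) >= needed_mb
```

Running `print(free_mb())` on the Pi shows the figure to compare against the size of the update.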
I'm at 36 hours, looking good so far. |
Been running great for 3 days now. Thanks again for the great work! |
I did the firmware update and got 5.10.25-v8+. However, the camera still failed one day after reboot. The error I get is:
This might be a different issue from what was discussed here. The messages I get from "sudo vcdbg log assert" are:
|
is almost certainly that something else is using the CSI2 receiver. It's possible that a previous instance has got "stuck", but that shouldn't be the case with the fix created for this issue. |
@feacluster, if you had already hit this problem before you put the new firmware on, a regular reboot attempt (reboot/shutdown -r) seems to not really reboot the Pi. You may need to power cycle or use the reboot sequence in the link @6x9 just posted (#4047). That worked for me. "uname -r" will let you know if the new kernel was loaded - IDK how to tell the currently loaded firmware level. |
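On the question of which levels are actually loaded: `vcgencmd version`, which the issue template asks for, reports the running firmware build. A quick sketch that prints both the kernel and firmware side might look like this. The helper names `kernel_release` and `firmware_version` and the 5 s timeout are assumptions.

```python
import platform
import subprocess


def kernel_release():
    """Return the running kernel release (the same value as `uname -r`)."""
    return platform.release()


def firmware_version(timeout_s=5.0):
    """Return the raw output of `vcgencmd version`, or None if unavailable.

    Caveat: subprocess.run waits for the child after a timeout, so if
    vcgencmd is stuck in the kernel (as reported in this issue) this call
    can still block.
    """
    try:
        result = subprocess.run(
            ["vcgencmd", "version"],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() if result.returncode == 0 else None
```

Printing `kernel_release()` and `firmware_version()` after a power cycle should show whether both the new kernel and the new firmware were picked up.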
I'm still seeing this issue on a Pi 3B+ with the HQ camera. I'm running kernel 5.10.25-v7+ and used
Just wanted to check that this is correct and isn't due to an outdated firmware as the date it shows seems old. Running |
@simonyangme I am just confirming that I too still have this issue on one of my seven cameras despite upgrading to the latest kernel. I also tried doing a full power cycle by unplugging, as suggested earlier. I am using a Pi 4, not a 3B+. I don't believe it is a hardware issue because the camera works fine for several hours, often up to a full day. As a workaround I have a cron job to reboot every day at midnight. I suppose I can try replacing the camera, but it is in a somewhat inaccessible location so I would rather not :-) |
@simonyangme that firmware is too old. To force an rpi-update use:
although there's no need to specify a specific older firmware, all subsequent versions will contain this fix. |
@popcornmix That seemed to do the trick! Is there any reason this is required? Simply doing |
|
I'm now running:
and still see the same issue when I switch to full frame mode and capture an image. |
With a bit more testing, the failure seems to happen randomly on the first capture after boot on my system. If it happens, then I am forced to power cycle to recover. If it does not happen, then capturing 5+ images does not seem to trigger it. |
FWIW I've tried going back to dea7234943c604462e476a8afc13c587418e8709 kernel and firmware, and have not seen the issue since. I have not had a chance to further narrow down what might be contributing to the problem, but got the following from
Please let me know if I should open a new issue for this and if there's any other info I can provide to help track this down. |
Unfortunately the issue seems to be happening again. I have seven of these v2.1 cameras but two of them have started showing the same errors reported in this thread. See code snippets below: As a workaround I have a cron job that reboots the pi every day at midnight. The camera will work ok for 1-2 days but then fail with:
|
I seem to be having the same issue mentioned in these threads. I am running Model 4B 8GB with RPi HQ camera. Raspistill calls seem to be locking up my raspberry pi, the only solution is a hard reboot. I have tried
When running raspistill with the --verbose output, this gets printed in the feed at each camera call: vcdbg log output:
Any help would be greatly appreciated! |
The
I added start_debug=1 into config.txt, rebooted, and re-encountered the issue. See
|
I don't know what information is useful, so I'm going to try to be thorough and also attach |
Is this the right place for my bug report?
I'm pretty sure.
Describe the bug
raspistill hangs on Firmware transaction timeout. After that point Xorg also hangs up. On this latest recreate, raspinfo is also hung when running after the problem to collect data.
To reproduce
Run raspistill repeatedly until the error occurs.
Expected behaviour
The kernel & firmware should remain stable at all times. Running a camera capture should always create an image and return or gracefully fail leaving the system stable.
Actual behaviour
Running raspistill repeatedly will eventually get the firmware transaction timeout.
System
raspinfo is hung after displaying the following:
I will continue to try to get the raspinfo output or, failing that, will try to get the equivalent parts.
Which OS and version (cat /etc/rpi-issue)?
Which firmware version (vcgencmd version)? vcgencmd version hangs and cannot be interrupted by ^C.
Which kernel version (uname -a)? and all subsequent levels.
Logs
dmesg.txt
Additional context
This recreate is the first time raspinfo hung.
I am displaying to 2 HDMI monitors.
There are 2 USB to serial adapter cables connected but not in use during recreate.
This problem occurs in all levels following and including the Dec 14, 2020 commit.
I am running a C++ application which snaps 2 images every few seconds using these 2 commands for the 2 invocations (via system()) of raspistill:
When this problem occurs I have to either power cycle or run the following to reboot (sudo reboot seems to start rebooting, but never actually reboots):
No success with raspinfo - I'll recreate & see if I can collect & post the additional info.