Device freezes #304
Comments
I tried v22086, which produces the same issue. Only stable version for me is 17119 |
We have the same issue in our club glider. We powered the device with a separate battery (new) but it crashes after about 2 hours. 17119 works fine. |
We got the same issues in two gliders with SteFly OV. We haven't modified the DC/DC converter yet, but it seems to happen with the modified one as well (reported in the XCSoar forum). At first we thought it was a power issue because it seems to happen when the radio transmits or receives, but we power it completely from one battery, so this shouldn't be the issue. |
Do we know at what current the resettable fuse kicks in on the SteFly OV? |
Good question, don't know exactly if and where it is. We got a Stefly OV in spare, I can try to figure it out. |
I can leave it running tomorrow for 2 hours or longer. Which start menu do you mean, the OV menu? Mine is a DIY build; I'm not sure about the converter but will open it up and have a look if I can find a part number. My device is not yet built in. I have connected it to 2 different PSUs (Basetech and Delta). The device uses about 0.58A during operation with the current image. Once the device freezes, there is a spike to about 0.7A and it stays there. With image 17119 it uses 0.45A. |
I seem to have the same issue with my vanilla Stefly 57. Should I piggy back on this issue or do you want me to open an additional one? |
I think we should take a systematic approach... Maybe create a table where everyone can add the affected configuration, so we can maybe eliminate some things... Data we would need in the table: Hardware Variant (CH070, PQ070, etc). Advanced debug: |
Fair point, linuxianer99, so here we go. Data we would need in the table: Advanced debug: OV so far has been connected via /dev/ttyS1 to an XCVario (w/ connected Flarm). I have now changed this to WiFi with a USB WiFi dongle. Will test / have tested the following cases (may take some time, will update accordingly): |
I have created a little spreadsheet on google docs (not sure everyone here likes google - but it was the easiest for the moment) where people can document their freezes along the lines of @linuxianer99's request. Feel free to add yourself. Also: feel free to spread the word to those OV-users not reading this here. |
Please start at the minimum for testing. So no devices connected and menu, then work up from there. Also reenable the serial console and see the last messages. |
How do I enable serial console? |
It's always enabled ... just connect to Cubieboard port Serial 0 |
Preliminary results are in the sheet now. Seems my USB-serial adaptor is broken. Ordered a new one and will retest with console open. |
Before you trash your USB-serial adaptor note that only error messages show up on the serial after #265 |
Thanks for the hint @mihu-ov - but once I had it connected to OV I remembered that it wasn't working properly already a year ago, when I was debugging a pfsense router on a firebox. 115kBaud, 8N1 is the correct setting, though, isn't it? |
https://www.openvario.org/doku.php?id=projects:series_00:electrical_tests:serial_console_boot_log has some instructions that may be helpful. |
Do we have a prompt on ttyS0? if so you can login and do a |
Based on @mihu-ov's statement re: #265 it doesn't look like there's a login on that port...provided the necessary packages are built into the OV kernel and distro we could, however, possibly enable it and do as you suggested @lordfolken |
Still preliminary - however: it appears 22050 is stable as well as 22028 (freevario version). 22086 is crashing. So most likely something must have happened between 22050 and 22086. Not familiar enough with github to find the associated changes in between these two releases. Maybe someone better at mastering github than me could do that. |
22050 is day 50 in year 2022 or February 19th and 22086 is March 27th. https://github.com/Openvario/meta-openvario/commits/master shows a kernel update from 5.15.24 -> 5.15.27 and "sensord: use systemd socket activation" (but you have sensord disabled according to your spreadsheet). |
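As a minimal sketch of the numbering scheme described above (two-digit year followed by the day of the year), the conversion can be done with Python's standard library. The function name `build_number_to_date` is made up for this illustration, and the assumption that every image is from 20xx is mine, not something the build system documents.

```python
from datetime import date, timedelta


def build_number_to_date(build: str) -> date:
    """Decode a YYDDD image number (two-digit year, day of year) into a date."""
    if len(build) != 5 or not build.isdigit():
        raise ValueError(f"expected five digits (YYDDD), got {build!r}")
    year = 2000 + int(build[:2])  # assumption: all images are from 20xx
    day_of_year = int(build[2:])
    if day_of_year < 1:
        raise ValueError("day of year must be >= 1")
    # Day 1 is 1 January, so add (day_of_year - 1) days to it.
    result = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    if result.year != year:
        raise ValueError(f"day {day_of_year} does not exist in {year}")
    return result


# The image numbers mentioned in this thread.
for build in ("17119", "22050", "22086"):
    print(build, build_number_to_date(build).isoformat())
```

This makes it easier to match an image number against the commit dates on the meta-openvario master branch.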
So since sensord is unlikely to be the culprit this leaves us with the kernel update as a possible cause. How easy is it to revert the build system back to 5.15.24 but retain all other changes? If easy (and someone explains this to me in simple language) I could build an up-to-date image with the old kernel and run that as attempt for positive proof.... |
Ok, I figured the build process and built an image w/ kernel 5.15.24....let's see what happens... Turns out that my ov image 22028 (from the freevario git repo) was built using kernel 5.10.2. So chances are that kernel 5.15.24 may not be the right version for a stable image. |
I filled my information into the spreadsheet. I flew 6.5h at the weekend (and the device was powered ~8h in total), no freezes anymore after disabling sensord, variod and pulseaudio. The Openvario was also more responsive. I am not at the latest master branch commit with my image, so I could try that (after our competition) to see if the Kernel as implied has something to do with it. |
Added "pulseaudio" to the "running demons" section. |
Quick request (to whoever entered line 13 in my table): could you please check which kernel version your OV is using? Hook a keyboard to the OV, go to the OV menu, exit to the shell and type "uname -a" (without the quotes). The same goes for anybody else using 22050: be sure to check on the system itself (and not based on what should be in the image :-) ). I was using an image called ..22028.. which in reality was 21350 with a kernel version of 5.10.2 (a relatively old kernel). |
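For anyone comparing results in the spreadsheet, here is a small sketch of how kernel releases such as the ones named in this thread can be compared numerically. It assumes Python is available on the machine where you do the comparison (it need not be the OV itself), and `parse_kernel_release` is a hypothetical helper written for this example.

```python
import platform
import re


def parse_kernel_release(release: str) -> tuple:
    """Return the leading numeric part of a release, e.g. '5.15.24-foo' -> (5, 15, 24)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch level (e.g. '5.15') is treated as 0.
    return tuple(int(part) for part in match.groups(default="0"))


# Kernel versions mentioned in this thread, sorted oldest first.
mentioned = ["5.15.27", "5.10.2", "5.15.24"]
print(sorted(mentioned, key=parse_kernel_release))

# On Linux this is the same string that `uname -r` prints.
print("this machine:", platform.release())
```

Comparing tuples avoids the trap of plain string comparison, where "5.10.2" would sort after "5.15.24" or before it depending only on character order.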
@Scumi could you please check the exact version of your kernel on the OV where everything runs fine? |
Hi @mihu-ov, (1) I have not yet had a chance to measure power consumption. I will redo my panel in the weeks to come and check power consumption during this process. (2) The voltage increase in itself helps stability but doesn't make the OV completely rock solid. I have played with this and found that the best solution seems to be to fix the CPU frequency at a higher value, e.g. 720MHz. This poses the question: why increase the voltage then (I have written a bit about that in my commit message)? Simple answer: fixing (or changing) the CPU frequency can be done at run time through changes in the script (see PR #331), and this may create a less stable system (if min freq != max freq and the kernel starts changing CPU frequencies). It will be increasingly unstable if we allow voltages below 1.2V (especially on a system that cycles CPU frequency). In a way this is a belt and braces strategy to create the best possible stability. So long story short: is the 1.2V fix absolutely necessary (if the CPU frequency is fixed at 720MHz)? No! Is it better for cases where users want to "play" with changing CPU speed? Absolutely! Cheers |
I've been looking at temperature reported by
Not sure. However, there might be individuals where 1.2V isn't enough for CPU frequencies <= 720MHz. So we should be prepared.
No. The key is to limit the maximum frequency (to 720MHz) and make sure the voltage doesn't change (PR #330 ). I've been experimenting with no limits on the minimum frequency and that went well. |
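To illustrate the frequency side of this, here is a rough sketch of capping the CPU frequency through the standard Linux cpufreq sysfs files `scaling_max_freq` and `scaling_min_freq` (values in kHz). This is not the script from the pull requests; `limit_cpu_frequency` and its `cpu_root` parameter are hypothetical names for this example, and the voltage side is not covered here. The demo runs against a temporary directory so it can be tried without root and without touching a real system.

```python
import tempfile
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")  # standard Linux cpufreq sysfs location


def limit_cpu_frequency(max_khz, min_khz=None, cpu_root=CPU_ROOT):
    """Write scaling_max_freq (and optionally scaling_min_freq) for every CPU found.

    Returns the list of files written. Needs root when run against real sysfs.
    """
    written = []
    for cpufreq_dir in sorted(Path(cpu_root).glob("cpu[0-9]*/cpufreq")):
        targets = [("scaling_max_freq", max_khz)]
        if min_khz is not None:
            targets.append(("scaling_min_freq", min_khz))
        for name, value in targets:
            path = cpufreq_dir / name
            path.write_text(f"{value}\n")
            written.append(path)
    return written


# Dry run against a fake sysfs tree with two cores.
with tempfile.TemporaryDirectory() as tmp:
    for cpu in ("cpu0", "cpu1"):
        (Path(tmp) / cpu / "cpufreq").mkdir(parents=True)
    for path in limit_cpu_frequency(720_000, cpu_root=tmp):
        print(path.relative_to(tmp), "<-", path.read_text().strip())
```

Leaving `min_khz` unset matches the "limit only the maximum" variant described above; passing `min_khz=720_000` as well would pin the frequency instead.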
root@openvario-57-lvds:/sys/devices/system/cpu/cpu1# uptime
I would consider this to be stable enough for the cockpit. When do you guys intend to roll in the pull requests? |
I believe we can close this issue now as being fixed (at least for the time being). I'm only hearing positive feedback to these fixes... |
Hi there, |
Yeah, please do - also: did you download the image from one of my download locations? |
Hi all,
we had image 23050, 2 OVs per plane, running in our ASK 21 and our Duo Discus on Friday, Saturday, Sunday (Discus only) and Monday for more than 5 and up to 8 hours without any issue on both instruments. The images from Uwe Augustin were also running the whole time.
Our neighbor club has OVs with that image running in two LS 4 and in an ASK 21. In one LS 4 the battery is bad, and the OV crashed when the battery went down. Only the Flarm was still working; no OV, no radio. The other LS 4 has a good and stable battery, and the OV worked without a problem.
Their ASK 21 has a bad battery which loses capacity rather fast after a while. The behavior was a bit strange to me. At first everything works well, then after a while the OV is still working (vario sound, moving map, vario display) but it no longer accepts any input from the remote stick. The behavior is the same with an older image.
Did anyone notice such behavior? And how can it be repaired? I think it is not a problem of the image, since two different images show the same behavior. It might be a problem of the new USB connectors which clamp the cables directly into the connector.
Has anyone any idea?
Regards
eku60
|
See discussion in #312 |
Beware: the stick (as well as the rotary) controllers run on Arduino boards that convert whatever input the rotary and stick buttons/dials create into key strokes. Could it be that these Arduino boards are more sensitive to voltage drops than the OV? Or (given the discussion in #312) it could be the shielding of the boards/cables. |
Hello,
thanks for the info. I hadn't noticed that issue. Our planes have the old cables
and connectors and we have had no problems so far. The ones causing problems are
the new version from Stefan Langer.
As the batteries are not in a good state, they ordered new ones (LiFePo4).
We will then test again. If the problem still exists, new cabling will be
next.
The (older) rotaries have shielded cables, and short ones.
Regards
eku60
|
I've downloaded the image from this repo releases. |
The SteFly OVs, or at least some of them, are known to have a critical DC-DC converter. It can't cope with input voltages below 11 V (more or less). It just "browns out". A good battery and good (power) cables will help. Replacing the DC-DC converter is not a bad idea. I had a couple of OV instruments running for 20+ hours in flight this season and they both work just fine with the recent image. I fixed one issue with a SteFly OV: the 15-pin connector didn't connect reliably. I replaced the fastening with screws so that the connector stays firmly in place. That issue is easy to detect: just wiggle the connector and see if the OV keeps running. |
Thanks for the tips. We just purchased a LiFePo4 battery and will check our cable management. We will get back to you as soon as we have tested it. |
So it really seems that our batteries were too weak. We tested the LiFePo4 batteries this weekend and so far have not had a crash. What is still a mystery, though, is that we never had these problems with the old 17119 image. Thanks for all the hard work you guys put into this. |
Very simple: 17119 used a very different Linux kernel than the current images. All current kernels have this bug in the cpu_freq driver for Cubieboards, or to be more precise: the Cubieboards seem to have a bug when switching CPU frequency and the corresponding voltages, and the kernel doesn't work around this bug. Hence the crashes. The fix simply pins voltage and CPU frequency to work around this hardware oddity. For a detailed background of what happened, why it happened etc. see the earlier parts of this thread or read through the commit messages for the relevant patches... |
Yes, I followed all of this and was btw really impressed by the fixes. The strange thing, though, is that the newest image, including all the frequency and voltage fixes, crashed within about an hour with our old battery setup. But the old one doesn't: same day, same SD card, same battery. |
Hello Leon,
the old kernel (17119) was not so sensitive to voltage swings.
The new kernels are more sensitive, so Stefan Langer had to build a new generation of adapter boards.
With the old kernel and the old adapter board the OV kept running when cranking the Rotax 912 with a battery that was not fully charged. No chance with the new kernel.
eku60
|
When was this voltage driver update done? What is an old board (version)? I flew with 22304 the other day. Stefly OV 7" without the vario part. It crashed after 2.5 hrs. Black screen, no response to ctrl+alt+del. My history: I bought my OV in 2016. It ran flawlessly until 2018, when it started crashing. I suspected a PSU issue and replaced the 5V PSU with a UBEC. I later found out it was a faulty wifi module that killed it. While debugging I bought a 2nd device, which has not been used until now. That setup has served me well until the recent updates. Then, after the more recent Freevario updates (22097 and the one before), I began to see crashes, and thought that maybe I should swap to the unmodified hardware. I did, and after 2.5 hrs on 22304 I got the crash. No luck. Knowing that maybe the supply is not up to the task, I will revert to the hardware with the UBEC mod. I hope this fixes it, as a competition is coming up. I have run the devices for many days on the desk without crashes... But maybe flight conditions stress it more than just sitting on a desk. |
Hello,
we bought Stefly OVs and they did their work until about 2020. When cranking
the Rotax 912 (SF25) or the Wankel (ASH 26E) the OVs kept working.
Then the problem appeared that no airspace warnings were possible when
using the sensor board.
So pulseaudio and a new kernel were implemented. Since then the OVs have crashed
when cranking these engines. A separate 12V to 5V converter helped,
but the cheap ones were not reliable, so Stefan Langer created a new
adapter board (the latest version, I think). This solved that problem, but not
the sudden crashes that appeared then.
Some pilots like "Bomilkar", "D-2402", "August2111" and others did hard
work to find the kernel issue(s) responsible for these sudden crashes.
With an unofficial image 22190 and the new adapter board we had no more crashes,
not even in 8h flights on the ASH 26E, because voltage and frequency
are set to fixed values by a shell script.
The image on the freevario website (23016???) should work. August2111 also
has images which have been running smoothly for a month. A club in the
neighborhood had issues because of weak batteries. They replaced them and
the systems are running smoothly now (all images from August2111 on PQ54
Stefly systems).
I created one myself, 23050, only for CH-07, and D-2402 has a complete
set with the same number. Both, and the ones from August2111, are running
smoothly in ASK21 (2*2 OVs), ASH26E, Duo Discus (1*2 OVs) and 2 LS4.
They all use fixed voltage and fixed frequency settings and are from 2023.
Maybe this will cost some extra power, but that is not yet a problem for us, not
even in the two-seaters.
Try to contact D-2402 or August2111, they have the images for download. You
should try those; mine is only for CH-07 (Texim 7" display). One OV is still
running with the old adapter board plus an extra 12V-5V converter and image
23050.
Regards eku60
|
I'll put my hopes on the converting back to the modified device with the UBEC 5V converter, and pray it will work, together with 22304 |
Hi,
don't forget to fix the frequency and voltage of the CPU: 720 MHz and 1.1 or
1.2 V. This stopped image 22190 from crashing on an OV which was
crashing every 5-10 minutes before. Newer images have these fixes on board,
nothing to do by hand.
Good luck.
|
Hi I did 4 hours of flying today. With 22304 OV. No crash. I guess there is something with the other HW setup then. Most likely the power supply? |
Hi all, today I got to fly again: 4 hrs with the modified hardware. No crashes. I guess there is something power supply related in my unmodified Stefly hardware then. However, I had issues with Flarm reception, maybe also TX. This has been a challenge for me ever since I have owned this glider. But this season it worked well (new OV hardware). Today it didn't. Any ideas whether it could be related? Noise from the OV disturbing the Flarm? I have seen a post where the OV causes noise in the radio. That does not seem to be an issue for me though. |
What do you mean "had issues with Flarm reception"? |
There are 2 useful tools to analyze the range of the 868 MHz radio of the FLARM:
I'd like to help you, but this is the wrong place: this issue is closed and it covers a different topic. Therefore I suggest you open a new thread in the forum: https://forum.xcsoar.org/index.php |
Btw. both of our OVs are working well with the new LiFePo4 batteries. We got around 30 hours of competition time on both of them last week with no crashes. Only once, the OV in our Duo crashed before we started, but I think it overheated. After a restart and with the hood covered up, everything worked great for the next 6 hours. |
I have a 7" OpenVario with sensorboard. After between 5 seconds and 20 minutes of operation, the screen goes black (or white, or green, or blue...) and the whole device stops responding. I am running 21118 for the CH070 screen.
After installing 17119, the issue seems to be gone, so there must be something wrong on the software side. I can pull some logs if that helps, but I will need some guidance on where to find them on the device.