[BUG] "Heating Failed" after PID takes over #21661
Comments
I twice edited the above to include this second graph and log, but the template seems to be truncating it out: |
I have the same issue. I am using the 12 April bugfix build. Yesterday I tried to run PID tuning, but it fails and the nozzle does not reach the desired temperature; it always stays about 10 degrees below. I have an SKR 1.4 Turbo with TMC2130 drivers, and a second printer with SKR E3V2 and SKR mini |
@Thinkersbluff Have we been able to exclude #21374 being the cause? |
I'm using CR6Comm-CF6-Final-cr6-se-v4.5.3-mb-2021-03-27-15-53 and can confirm I too have this issue, I never saw the problem until I got a new 4.5.3 board and upgraded to CR6. The temp drops to 10C below what you set it to and hovers around there on the heating and homing screen until the yellow heating error screen appears shortly after. It is always approx 10C too low and I can consistently reproduce this issue. |
Same problem here... Board 4.5.3 and stock TFT... I use CR6Comm-CF6-Final-cr6-se-v4.5.3-mb-2021-03-27-15-53.zip ... The temp drops to 10C below what you set it to and hovers around there on the heating and homing screen until the yellow heating error screen appears shortly after. It is always approx 10C too low and I can consistently reproduce this issue, too. Any solutions? |
Nope. No evidence either way. |
It appears that the problem is related to using the parameters returned by the current PID routine. |
I just sent this to make temperature internal calculations using high precision again, maybe worth a test: #21678 |
There have been several temperature-related fixes merged within the last couple days. Please download |
Sorry, it has become less stable. Not only "heating failed" errors but thermal runaway as well. Need some time to collect additional data. edit: Reverted before the PR and at least there is no thermal runaway. |
Thank you for the outreach. When I was reviewing the closed Issues for possible matches to this problem, I found the description and discussion in issue 20463 was particularly similar and relevant to the situation I think we are having here. For that reason, I have experimented with two configurations in these tests:
The test procedures that I used and the results that I logged are documented in the .txt file under each plot. Here is a copy of the configuration files from the second series of tests. They differ from the first series only in the value of those two parameters. My limited math skills and lack of C++ literacy are slowing down my efforts to understand why these changes seem to improve both the hotend stabilization and the PID Autotune function, but the plots seem to confirm that they do. I would particularly like to understand the significance of editing these parameters, so that I don't recommend changes that only break something else. I wish I had some way of independently validating the PID factors being generated by Autotune...
|
I have a similar problem (temperature 5-10 degrees higher than setpoint) with SKR 1.4 (3 of them), while on SKR 1.3 (4 of them) I have never had problems. Same printers (Ender3, Sovol SV01). It seems to be caused by strange noise/peaks in the thermistor reading. I improved things a lot by increasing ADC_LOWPASS_K_VALUE from the default 2 to 6 or 8. Reading around and looking at the schematics, I see that the SKR 1.4 (and maybe other boards) has a different configuration than the SKR 1.3, including an ESD suppressor (CG0603MLC-3.3LE). Might this lead to issues with temperature reading, which need special attention? |
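To make the effect of a larger low-pass constant concrete, here is a minimal sketch of a first-order filter whose smoothing strength is set by a power-of-two constant k. This is an illustrative model only, not the LPC1768 HAL code; the function name and the assumption that ADC_LOWPASS_K_VALUE controls a filter of roughly this shape are mine.

```python
def lowpass(samples, k):
    """First-order IIR low-pass with a divisor of 2**k (hypothetical model).

    Each new sample moves the filtered value by 1/2**k of the difference,
    so a larger k means heavier smoothing and a slower response.
    """
    filtered = float(samples[0])
    out = []
    for s in samples:
        filtered += (s - filtered) / (1 << k)
        out.append(filtered)
    return out


# A steady 200 C reading with a single 4-degree spike at index 5.
raw = [200.0] * 5 + [204.0] + [200.0] * 5

peak_k2 = max(lowpass(raw, 2))  # spike passes through at 1/4 of its height
peak_k6 = max(lowpass(raw, 6))  # spike passes through at 1/64 of its height
print(peak_k2, peak_k6)
```

The trade-off is that the heavier filter also delays real temperature changes, so a very large k could in principle slow the control loop's view of the hotend.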
In my work life, we found that one of the easiest ways to explore complex technical issues is to exchange highly simplified pictorial models. If this model is obviously over-simplified in the mind of one or more participant, that suggests an important factor of which other participants may not be aware. By marking up this model to "correct" it, we facilitate a focused conversion of implicit to explicit knowledge. Through that exchange, we all better understand what matters. Open question to those who are interested in solving this problem: Does this model show all of the essential elements that need to be considered + all of the controls available to us to adjust our machine(s) back into a stable operating region? |
Possibly related #18642 by @bohbotjames |
Adding to my comment above. If I activate PID debug (#define PID_BED_DEBUG, then M303 D), I see crazy PID output values, especially on the derivative action, when temperature values are noisy and jump +- 1 or 2 degrees in a very short time. And the fact that the temperature hovers exactly 10 degrees above/below setpoint is not a coincidence. I did not try, but I bet that if you change the PID_FUNCTIONAL_RANGE value, your temperature will hover at a different value. |
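A rough sketch of why the hover point might track PID_FUNCTIONAL_RANGE: as I understand it, the heater is driven at full power while the temperature is more than the functional range below target, is switched off when it is more than the range above target, and is only handed to the PID terms in between. The function below is a simplified model under that assumption; the names and the pid_output argument are hypothetical and this is not the Marlin source.

```python
def heater_output(target, current, pid_output, functional_range=10, pid_max=255):
    """Return (power, mode) for a bang-bang-outside, PID-inside scheme."""
    error = target - current
    if error > functional_range:
        return pid_max, "full on"   # too cold: PID bypassed
    if error < -functional_range:
        return 0, "off"             # too hot: PID bypassed
    return max(0, min(pid_max, pid_output)), "pid"


# With a target of 235 C the hand-over to PID happens at 225 C by default.
print(heater_output(235, 224, 40))                       # still full power
print(heater_output(235, 226, 40))                       # PID takes over
print(heater_output(235, 224, 40, functional_range=16))  # PID already active
```

If the PID terms produce too little power just inside the boundary, the temperature would fall back until full power resumes, which would look like hovering about one functional range below target.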
I agree that looks like the same issue in July 2020. Closed by the author same day without explanation. |
@Thinkersbluff Did we determine if the problem happens on stock Creality boards too? We know the BTT SKR CR-6 board has the temperature noise issue - stock has not.
Kind regards,
Sebastiaan Dammann
________________________________
From: Thinkersbluff
Sent: Monday, April 26, 2021 1:28:50 PM
To: MarlinFirmware/Marlin
CC: Sebastiaan Dammann; Mention
Subject: Re: [MarlinFirmware/Marlin] Nozzle temperature sometimes cannot reach target - hovers instead about 10 degrees lower until "Heating Failed" timeout (#21661)
I did not try, but I bet that if you change PID_FUNCTIONAL_RANGE value, your temperature will hover at a different value.
This was also my expectation, so I did explore the impact of changing PID_FUNCTIONAL_RANGE.
The plots posted here show that if I reduce the parameter to 5, the oscillation is more likely to occur consistently.
If I increase the value to 16, it is more likely to achieve the target temperature.
Reducing PID_K1 smoothing factor from 0.95 to 0.55 also improved the system’s performance. Doing both seemed to remove the initial undershoot and ringing almost completely for the conditions tested.
The ADC_LOWPASS_K_VALUE parameter is unique to the HAL.h for LPC1768, so it is not an option for CR6 users on STM32F1 motherboards.
I am only looking at nozzle temperature at the moment but I think the bed and chamber and laser cooler all use the same PID function.
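For readers wondering what the PID_K1 change does, here is a minimal sketch of an exponentially smoothed derivative term of the form d = K1 * d_prev + (1 - K1) * Kd * delta. This is my simplified reading of how such a smoothing factor is typically applied; the function name, the Kd value and the sign convention are assumptions for illustration, not the Marlin implementation.

```python
def smoothed_derivative(temps, kd, k1):
    """Exponentially smoothed rate-of-change term (illustrative model)."""
    k2 = 1.0 - k1
    d = 0.0
    last = temps[0]
    out = []
    for t in temps:
        d = k1 * d + k2 * kd * (t - last)
        last = t
        out.append(d)
    return out


# A single 2-degree step between samples, then flat.
temps = [200.0, 200.0, 202.0, 202.0, 202.0]

slow = smoothed_derivative(temps, 100.0, 0.95)  # heavy smoothing
fast = smoothed_derivative(temps, 100.0, 0.55)  # light smoothing
print([round(x, 3) for x in slow])
print([round(x, 3) for x in fast])
```

With K1 = 0.95 the term reacts weakly but lingers for many samples after the change; with K1 = 0.55 it reacts more strongly and fades quickly. Which behaviour suits a given hotend presumably depends on how noisy its readings are.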
|
Dear developers, it seems that SKR 1.4 (my case, see also Talha909) and CR-6 (see Sebazzz) boards have temperature noise issues, apparently due to their electronics setup. Why not add a user-configurable filter for the temperature? Something like OVERSAMPLENR, but a bit more powerful and configurable from Configuration_adv.h. @Thinkersbluff, could you please reply to the above question from Sebazzz? That would be helpful. Thanks |
Which question have I not answered? These posts seem a little out of sequence but there are at least two stock 4.5.3 MB users above confirming they have this issue. |
I updated my Ender 3 with Marlin 2.0.8, yesterday (fantastic work by all, thank you!) In the first graph:
In this test, the PID process itself generated a bit of a messy curve, not the smooth cycling I am used to seeing. That may somehow correlate to the set of factors that Marlin computed. The +/- 5-6 degree cycling around the target value with the computed PID parameters persisted for the 10 minutes I left it running. In the second graph:
|
I am presently struggling valiantly through the PID "documentation" referenced in the Marlin Configuration.h. (https://reprap.org/wiki/PID_Tuning) That documentation refers to a parameter which appears to have disappeared from Configuration.h since the document was last updated: "The 'sum of errors*time' value is limited to the range +/-PID_INTEGRAL_DRIVE_MAX as set in Configuration.h." There are 49 Closed issues which include a reference to PID_INTEGRAL_DRIVE_MAX, so I parsed through them to find out what happened and why. Turns out #4881 removed it (see 3rd issue below). The referenced documentation is out of date. These 3 closed issues sound a lot like this current issue. The second issue was closed without any corrective action :
|
Following the advice from team members back in 2016, found on the above related issues, here is the same Ender 3 system as tested in the previous comment, running Marlin 2.0.8 with PID_FUNCTIONAL_RANGE set to 50 (test conditions and serial log are captured in the .txt file below the graph):
I believe this one parameter change could be optimized with experimentation, but anecdotal feedback from numerous issue reports seems to support a suggestion that the current default of 10 may be too low for some systems. This suggests to me that Marlin is capable of calculating usable parameters and of performing PID, but it needs a bit more "elbow room" to work, hence the positive impact of increasing the value of that functional range parameter. |
btt skr cr6 + cf6,1p3 // edit: I take it back. |
I just encountered the same problem: my bed temperature heats up fine but my nozzle does not reach the set temperature |
Modifying the PID_FUNCTIONAL_RANGE and PID_K1 factors as described above is at least a proven work-around. |
I would recommend against using bang-bang heating for the bed because it tends to produce visible artifacts in the print. |
If the values produced by It may be the case that power to your hotend and/or bed are being leached when both are turned on at the same time. So, when tuning your hotend PID it will help to turn on bed power, and when PID-tuning the bed, it may actually help to turn on the hotend. If you typically run your fan above a certain temperature with auto-fan, you should run the auto-fan during your PID tuning. All of these will ensure that the PID values are set according to the amount of power that is available during printing. If there is any doubt that this will help, give it a try and see if the PID values are very different from doing isolated PID tuning. |
NOTE: Be sure to include the |
@thinkyhead Do you know why only half of max power is used? You set it to 255 and you only ever see 127. I fixed it in my code so it doesn't divide the power until it's in PID functional range. My bed has an AC heater so the power supply is barely taxed at 50 watts for the hot end. My bed and temp reach temp at about the same time now. I didn't change the way the hot end heats as far as power goes. |
Are you saying this half power behavior is something that recent versions of Marlin PID do, but older versions of the PID code don't do? |
When I looked at the code some time ago, I determined that @127 is actually full power - I don't know why the PWM power only goes from 0-127 but I expect it's something to do with signed values being needed for the PID algorithm. |
It looks like Marlin/Marlin/src/module/temperature.cpp Line 1324 in 0c4085d
|
@ManuelMcLure, yep. I'm not sure why they do that. I took it out (the >> 1). My bed heats fine and stays at temp reliably. |
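To illustrate the point that 127 can already be full power, here is a small model of a 7-bit software PWM fed with an 8-bit PID output that has been shifted right by one. The counter length of 127 ticks and the comparison rule are assumptions made for this sketch, based on the comments above rather than on a line-by-line reading of temperature.cpp.

```python
def soft_pwm_duty(pid_output):
    """Duty cycle of a 7-bit soft PWM driven by an 8-bit PID output (model)."""
    amount = int(pid_output) >> 1  # 0..255 becomes 0..127
    period = 127                   # assumed number of ticks per PWM cycle
    on_ticks = sum(1 for count in range(period) if count < amount)
    return on_ticks / period


print(soft_pwm_duty(255))  # heater on for every tick of the cycle
print(soft_pwm_duty(128))  # roughly half duty
print(soft_pwm_duty(0))    # heater off
```

Under these assumptions the shift halves the number, not the delivered power: a reported value of 127 keeps the heater on for the whole cycle.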
This is very helpful advice. Thank you. Do you happen to have photos or a link that explains how to recognize this type of artifact? I often see folks asking what causes some types of print artifacts but I have never seen this particular response as a possibility. |
Well, the comments in
Note the "128 effective control positions" bit.
Marlin/Marlin/src/module/temperature.cpp Line 3019 in 0c4085d
SOFT_PWM_SCALE will be 0 and that means that pwm_mask will end up with a value of 1 << -1. C considers the result of << undefined if either operand is negative. EDIT: never mind - the - is outside the |
For what it is worth, the PID values determined by the Marlin autotune for my E3D Hemera are:
These values cause the temperature going up and down endlessly. However, the values below actually work and are stable:
In both cases |
@thinkyhead The command that saves to EEPROM that you gave fixes the issue -- However, I don't think it is the same bug that Creality and BTT SKR 1.4 Turbo users were experiencing at the beginning of the thread. I was experiencing the same issue with SKR 1.4 Turbo and Marlin 2.0.9.1, and I've tried fixes using the ADC_LOWPASS_K_VALUE, PID_K1 and PID_FUNCTIONAL_RANGE workaround settings. None of them worked. I ran M501 and the values for bed Kp Ki Kd were all zero, even though I defined them in Configuration.h; here's a sample output --
I think the bug here is that the values should be defaulting from the values defined in Configuration.h and not zeroed out. Also, it seems it is currently not possible to load existing BEDPID values; it is only possible to write them using M303. It could also be that the SKR board doesn't honor the code that is supposed to load the values. I had the same issue with:
I suspect this is what users keep filing when they realize they can't use values in Configuration.h when EEPROM is enabled -- #12468 I'm happy to open a ticket based on this. It seems all the comments starting on 2021 Jun 21 are referring to EEPROM read issue in 2.0.9.1 rather than PIDTEMP bug with Creality/SKR |
So, just to make sure we're on the same page, |
@ManuelMcLure Thank you. I'm moving from Marlin 1.1.9, where I only sparsely used EEPROM, and I didn't realize I needed to use M502 to load values from firmware. This seems to be a common misconception; maybe better wording in Configuration.h could alleviate the problem. Referring to Marlin 2.0.9.1, M502 mentions resetting to 'factory defaults' but doesn't mention that defaults also need to be loaded with M502
|
Yeah, it's a bit confusing. I always try to explain that there are three levels of configuration in Marlin - RAM, EEPROM, and firmware.
This is done because there's no easy way to detect whether a value was changed in the configuration files if the EEPROM layout didn't change. If you (for example) have updated your Z probe offset and stored it to EEPROM, and then load a new version of Marlin where you forgot to change the Z probe in the configuration, we don't want to override the EEPROM value and possibly cause a nozzle crash. |
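The three levels can be pictured with a toy model. The sketch below assumes the commonly documented meanings of the commands (M500 saves RAM to EEPROM, M501 loads EEPROM into RAM, M502 resets RAM to the compiled-in defaults); the class, its boot() shortcut and the bed_kp key are hypothetical, and real firmware also validates the EEPROM layout before loading it.

```python
class SettingsModel:
    """Toy model of firmware defaults, EEPROM and RAM (illustrative only)."""

    def __init__(self, firmware_defaults):
        self.firmware = dict(firmware_defaults)  # compiled-in values
        self.eeprom = {}                         # persistent storage
        self.ram = dict(firmware_defaults)       # values actually in use

    def boot(self):
        # Simplification: a non-empty EEPROM always wins at power-up.
        self.ram = dict(self.eeprom) if self.eeprom else dict(self.firmware)

    def m500(self):
        self.eeprom = dict(self.ram)    # save RAM to EEPROM

    def m501(self):
        self.ram = dict(self.eeprom)    # load EEPROM into RAM

    def m502(self):
        self.ram = dict(self.firmware)  # reset RAM to firmware defaults


# New firmware carries bed_kp = 10.0, but the EEPROM still holds an old 0.0.
printer = SettingsModel({"bed_kp": 10.0})
printer.eeprom = {"bed_kp": 0.0}
printer.boot()
print(printer.ram)  # the stale EEPROM value is what the printer uses
printer.m502()      # pull the compiled-in defaults into RAM
printer.m500()      # and persist them
print(printer.eeprom)
```

In this model, editing Configuration.h and reflashing changes only the firmware layer, which matches the observation that the new values did not take effect until M502 followed by M500 was run.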
This issue has had no activity in the last 60 days. Please add a reply if you want to keep this issue active, otherwise it will be automatically closed within 10 days. |
Hi |
I might have some capacity to look into this bug. Is there anyone who is actively watching this and could do some testing? @Thinkersbluff maybe? In the meantime, if anyone wants to experiment with an alternative to PID, I have submitted a PR for model predictive control and I would like to hear back from others how it works for them. #23751 |
Yes, I am monitoring this actively and yes, I can make time to run specific tests. I am an anal-retentive retired Engineer with just enough knowledge and experience to follow instructions rigorously, but I am not a programmer and I have no access to an electronics lab or to exotic test equipment like oscilloscopes. I have an Ender3 and a CR6-SE. I do use VSCode/Platformio to compile Marlin 2.x for the Ender, and I can do the same for the CR-6. The display only works if I use the Community Firmware on the CR-6, though, so some controlled experiments may only be possible in “headless mode” on that one. The Ender3 uses the original rotary knob/LCD controller. I have the stock Type1 thermistors and aftermarket Eigweit 40W heater elements on both printers. The CR-6 is currently fitted with a Trianglelabs DragonHF hotend, and I find the aftermarket heater is a marginal fit in that E3 V6 clone heater block. I had to crank the retention screw as tight as I could to hold the element firmly, and I cannot improve on that right now. I added thermal paste to improve thermal coupling between heater element and heat block on both printers. The Ender thermistor is a glass bead type, with no thermal paste. The CR6 thermistor is a cartridge type, with thermal paste and a grub screw. Both printers do seem to be working, with a small ripple on the extruder temperature but no evidence of this original issue on either machine, “unfortunately”. One part of any worthwhile experimentation may need to be figuring out how to destabilize the PID control again, before testing a “fix”? I do have a PT1000 Type 47 thermistor available for the CR6 but not for the Ender. The BTT ADC on both printers does seem particularly vulnerable to EMI, and I got frustrated with its inability to stabilize nozzle temperature with that PT1000 thermistor installed, so I rolled back the mod. How can I be of help? |
Hmm. If the error no longer occurs, perhaps there is nothing to do. Are you using a different firmware to the one that generated the graphs above? If so an easy test would be to return to that firmware and see if the problem returns. |
I understand. I originally posted my graphs and reports on the CR6 Community Firmware GitHub, when I saw several CR6 Facebook community members chatting about the issue and blaming it on the CF. We later concluded that the bug was here, upstream of the CF fork, and there certainly do seem to have been a series of bugs over the years, with similar sounding issues. I can see the temperature readings in the Octoprint Terminal "jumping" up and down by a couple of degrees at a time, from one sample to the next, which "dithering" I imagine is largely a matter of ADC noise overlaid on the "actual" thermal reading digitized with +/- one digit uncertainty at the ADC resolution. I do not understand enough of the system design to know how many significant figures the firmware really has to work with, but there must come a point beyond which it is futile to try to derive more accurate readings from the available data... If your PR is designed to improve the ability of Marlin to cope with "noisy" data, is there any value to a log of thermal readings from the terminal on my system? (i.e. do you have a simulation setup to compare the data at the output of various points in Marlin, with and without your PR?) |
This is a really fraught area with so many variables. Different sources of error can easily be conflated and what might be a software issue on one machine could look similar to a hardware error on another. Some of what you describe sounds like #22893. After a long conversation, that issue resulted in two PRs from me which have just been merged, #23871 and #23867. Both could have an effect on the apparent quantization of ADC values. One activates 12 bit ADC (which was supposed to be the norm for 32 bit ARM processors but owing to a subtlety was not) and the other allows 16 times oversampling for when 12 bit ADC is used. On the other hand, maybe the behaviour you mention has a quite different source! If I had someone who could replicate the 10 degree offset we might be able to work together to find its source. My MPC PR should do better than PID with noisy data. However I do need real systems to test it against because I have already established that it does well against simulated hotends. How does the CF work? Does it merge in upstream Marlin changes? Is it similar enough that upstream PRs could be merged easily? |
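As a back-of-the-envelope illustration of what ADC width and oversampling mean for quantization, the sketch below computes the largest raw sum and the smallest distinguishable step as a fraction of full scale. The function names are hypothetical and the figures are plain arithmetic, not measurements from any board.

```python
def max_raw_sum(bits, oversample=1):
    """Largest value obtained by summing `oversample` readings of a `bits`-wide ADC."""
    return ((1 << bits) - 1) * oversample


def step_fraction(bits):
    """Size of one ADC count as a fraction of the full-scale range."""
    return 1.0 / ((1 << bits) - 1)


print(max_raw_sum(10, 16))  # 10-bit ADC, 16 readings summed
print(max_raw_sum(12, 16))  # 12-bit ADC, 16 readings summed
print(step_fraction(10) / step_fraction(12))  # how much finer a 12-bit count is
```

A 12-bit count is roughly four times finer than a 10-bit count, so a reading that visibly jumps by one count at 10 bits would be expected to jump by a correspondingly smaller temperature step at 12 bits, noise permitting.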
I believe there is a PR being actively worked now, between @thinkyhead and @Sebazzz to merge the CF fork back into the mainstream. That was the goal of the CF project from the beginning. No idea how long it might take to complete that merge though… Although it may leave support for the stock TFT display on its own branch, most of the CF fork is still Marlin and there is an unreleased extui branch on the CF fork that @Sebazzz has been updating with Marlin PRs. There may be other CF users who can still reproduce the problem. I believe at least one of the original Facebook “gang” commented on that issue and may still be monitoring it for updates. Maybe you can find a useful partner by recruiting on Community Firmware GitHub issue#248 |
Indeed, I believe I also had that problem when I briefly experimented with 2.0.9.2. I did not have the problem described here, but I could not get Filament Load/Unload to work because the firmware kept waiting for the nozzle temperature to stabilize. In the end, I swapped the PT1000 out, and the problematic behaviour “disappeared”. |
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Did you test the latest bugfix-2.0.x code?
Yes, and the problem still exists.
Bug Description
This bug is present in the CR6-SE Community Firmware at Release 6, which incorporates Marlin bugfix2.0. I am running this particular version, compiled for my printer's hardware configuration: CF6.1-Pre2-btt-skr-cr6-with-stock-creality-tft-2021-04-18-22-12.zip
I reported this bug on the Community Firmware GitHub Issue#248, but the CF6 developer has asked me to report it upstream, here.
Description:
The nozzle temperature sometimes fails to reach the target value when heating.
When this is happening, the temperature may climb to within a few degrees of target, but then drops again, cycling around a center value approximately 10 degrees below target. Eventually, the system throws a "Heating Failed failed to achieve target temperature within the alloted timeframe" message on the screen and "kills" the job, forcing the user to cycle power to recover.
wtf.txt
Bug Timeline
Issues with PID not performing as well as in the past have been reported by several users since Release 6 of the Community Firmware.
Expected behavior
Nozzle should heat to the target temperature and stabilize. Particularly if the system has just had a PID run (M303 E0 Sxxx U1) at the same target temperature with no problem, yet now can not heat to that xxx temperature to print.
Actual behavior
When this problem occurs, the serial interface shows that the printer recognizes the correct target temperature, yet it stops short of heating to that value, cycling instead around a value about 10 deg C lower than the target.
NOTE: These users on the Creality CR6SE/MAX Official Facebook Group describe this same problem in other scenarios, so it is not specifically or uniquely an issue when heating to 235C or when running an esteps extrusion.
I have also been able to successfully achieve 230C when I could not achieve 235C and I have achieved 235C when the nozzle was already at 230C, so there are other parameters at play, here, that I have not yet isolated. The part cooling fan was off, the whole time.
In the final cycle of the second graph (see my comment below this post), the printer actually bumped-up from 230 to 235 just at the end, there. No idea why, as you can see it was settling at the lower value & I touched nothing.
Steps to Reproduce
I was able to reproduce this problem fairly consistently as follows:
Version of Marlin Firmware
Latest Bugfix2.0 merged into Community Firmware Release 6.1 Pre 2 on 18 April 2021
Printer model
Creality CR6-SE
Electronics
BTT SKR CR6 motherboard, stock hotend, stock cooling fans, stock TFT. Users with Creality 4.5.3 boards also report this issue.
Add-ons
None
Your Slicer
Cura
Host Software
OctoPrint