HB not heating during G29 #5698

Closed
lukeskymuh opened this issue Jan 14, 2017 · 44 comments

Comments

@lukeskymuh

I tried to perform a G29 with a 10x10 matrix with the HB at 70°C to measure the distortion of my HB over temperature. After some points I get an "ERR: MINTEMP BED". I assume that the HB thermal control is inactive during G29. Is this on purpose or a bug? (Is there a workaround?)

@lukeskymuh
Author

RC8 Bug fix 1.1.0

@ghost

ghost commented Jan 14, 2017

I would check your thermistor wiring. It is likely broken somewhere.
I'll bet, for the short term, that it doesn't throw that error at the same point every time.

@chorca

chorca commented Jan 14, 2017

I'm getting the same issue on my printer.
Randomly during printing I get MINTEMP errors. I set up my oscilloscope on the pin with a 2 GS/s capture rate, triggering at 4.65 V, and didn't see a blip during the MINTEMP error.
I didn't have this issue with RC6, and I even rewired my thermistor. It seems it may be in software.

@ghost

ghost commented Jan 15, 2017

Hmmm. Well, I have a Prusa i3, which has a RAMPS 1.4 on an Arduino Mega 2560. I've had that issue a few times, and it was a broken thermistor wire in both cases. I have run RC6, RC7, RC8 and RCBugFix. No issues with the firmware for the board I have.

So...IDK.

@Blue-Marlin
Contributor

Blue-Marlin commented Jan 15, 2017

ERR: MINTEMP BED is thrown in the temperature interrupt. A connection to G29 is unlikely, except that G29 moves the bed and the wires connected to it. Even if the bed were not heated during G29, it is unlikely that it could suddenly cool below MINTEMP, unless you are printing in an environment far below MINTEMP, in which case you probably could not start heating the bed at all.
What kind of leveling do you use, and how many points do you measure? If the firmware runs out of RAM, any error is possible.
ERR: MINTEMP BED is meant to detect broken thermistor wires. I'd take it seriously.

@ghost

ghost commented Jan 15, 2017

Swap the hotend and bed thermistor plugs at the board. See if you get a MINTEMP instead of MINTEMP BED error.

@lukeskymuh
Author

lukeskymuh commented Jan 15, 2017

My thermistor is working correctly and the temperature is appearing on the display. I also don't see the heat LED lighting up when performing G29. So I really think that when performing auto-level the heat bed control is stopped. Another user has just posted the same issue: #5702

@lukeskymuh
Author

@Blue-Marlin I used 10x10 points, so it takes a long while, to get a profile like the one attached.
Flatness of the aluminium HB; for more details see: https://3dprint.wiki/reprap/anet/a8/improvement/autobedleveling#sensor_and_sensor_support

@lukeskymuh
Author

If I disable the HB before the G29, everything works, but the HB drops from 75°C to below 50°C.

@ghost

ghost commented Jan 15, 2017

I normally don't heat anything before a G29 and everything works fine. My printer sits in a room that drops the temps as cold as 4°C and it still works.

But, I'll heat my bed to 50°C (my normal running temp for PLA) and see what happens.

@lukeskymuh
Author

I checked it again with the current BugFix (downloaded 22.1.2017).
I have to correct myself: the HB does switch on, so it is heating. But when I set the HB to 60°C and start G29 (3x3), I get the following error after the 3rd point:

00:22:25.417 : G29 Auto Bed Leveling
00:22:36.284 : Error:MINTEMP triggered, system stopped! Heater_ID: bed
00:22:36.292 : Error:Printer halted. kill() called!

It should be easy to reproduce. Can anyone test it on another printer?

@lukeskymuh
Author

So far I know that the error is triggered because current_temperature_bed_raw (28137, 31846) is above bed_minttemp_raw (16079).

@lukeskymuh
Author

I deactivated the triggered alarm (see zip file). Now it is working. I see that there are random temperature spikes (see screenshot). For some reason "current_temperature_bed_raw" sometimes has random values.
A workaround would be to extend MAX_CONSECUTIVE_LOW_TEMPERATURE_ERROR_ALLOWED to the HB temperature. But it would be nice to solve the problem. Thanks in advance.
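
A minimal sketch of that workaround idea, assuming the bed check were given its own "consecutive errors" counter. The struct, the limit of 3, and all names here are hypothetical and not taken from Marlin; the comparison direction follows the values reported in this thread, where a raw reading above the minimum raw value counts as too cold.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical limit, playing the role that MAX_CONSECUTIVE_LOW_TEMPERATURE_ERROR_ALLOWED
// plays in the suggestion above. The value 3 is an arbitrary example.
constexpr std::uint8_t kMaxConsecutiveBedErrors = 3;

// Counts out-of-range bed readings that arrive back to back, so that a single
// spike does not halt the printer but a persistent fault still does.
struct BedMinTempFilter {
  std::uint8_t consecutive_errors = 0;

  // Returns true when the alarm should fire.
  // Assumption: a higher raw value means a lower temperature, as in this thread.
  bool update(int raw, int min_raw, bool bed_targeted) {
    const bool out_of_range = bed_targeted && raw >= min_raw;
    if (!out_of_range) {
      consecutive_errors = 0;  // any good reading resets the count
      return false;
    }
    if (consecutive_errors < kMaxConsecutiveBedErrors) ++consecutive_errors;
    return consecutive_errors >= kMaxConsecutiveBedErrors;
  }
};
```

With this sketch, an isolated spike such as 28137 against a minimum of 16079 is ignored, while three such readings in a row still trigger the alarm. It only masks the symptom; it does not explain where the spikes come from.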

The main question is: is it my hardware (Anet A8) or the software? If someone could reproduce the error or show that it works on another printer, that would be great.

The attached temperature.cpp file with deactivated alarms is for testing and debugging only. I do not recommend deactivating the temperature alarms.

temperature.zip

screenovertemp

@chorca

chorca commented Jan 22, 2017

My issue was similar, though while I was printing instead of while I was running a G29. I performed analysis of the pin and while I could not see any voltage change with my oscilloscope, the A/D value jumped up which seemed to indicate the temperature dropping very low suddenly. I haven't been able to resolve it and have switched to Smoothie for the time being.

@JohnOCFII

If folks are getting around this by disabling the thermal protections -- I wonder if the thermal protection logic is faulty, or if that many of us are running faulty printers. I noticed that the older RC3 version I was running had the thermal protection commented out. Mine is dying either during G29 or after the print starts. I did a few prints without G29 on RC8 (non-bugfix) that worked fine. I'm not sure what my issue is. Mine looks like extruder 1 (my active extruder) having the issue:
READ: Error: Heating falied, system stopped! Heater_ID: 1
READ: Error: Printer halted. kill() called!

@schustercp

schustercp commented Jan 28, 2017

I have recently upgraded 2 printers to RC8, and both printers are exhibiting the same issue. To reproduce, I use Pronterface to set the bed temp to 110 and then click the home button while the bed is heating up. The AirWolf 3D homes all axes, then displays the error and stops everything. The Prusa i3 homes X and Y, then moves the Z axis to deploy the Z probe using the servo; then the MINTEMP error is shown and everything stops. If I let the bed heat to temp first, the error does not appear on either printer. If I roll back to RC6, it works correctly again. I have not updated my Delta due to this issue.

@ghost

ghost commented Jan 28, 2017

> So far I know that the error is triggered because current_temperature_bed_raw (28137, 31846) is above bed_minttemp_raw (16079)

The bed temp is supposed to be above the mintemp; if not, the mintemp error will stop the printer.

Has anyone considered faulty thermistors or faulty thermistor wiring?

@chorca

chorca commented Jan 28, 2017

It would be surprising if suddenly all these thermistors started having issues after the update, but not impossible. I hooked an oscilloscope to mine but couldn't see any change in the voltage when the MINTEMP was hit.

@lukeskymuh
Author

Anything is possible; the question is what is likely:

  • Faulty thermistors or wrong wiring would normally lead to a constant offset or noise, not random spikes. Maybe there is some complex semiconductor effect, but see the argument from chorca -> unlikely.
  • A bad contact could be possible. But I find it strange that sometimes I get max (low resistance -> short) and sometimes min (high resistance -> bad connection) -> unlikely.
  • Damaged board -> possible, but there are many people with this problem.
  • Software -> possible; likely if many people with other printers but the same software see the same error, otherwise unlikely.
  • Electromagnetic effect on the wiring -> I am not an expert, but I would guess it is unlikely because of the low frequency.
  • Electromagnetic effect on the board -> I have no idea, but I would consider it unlikely because of the board heritage.

Any other suggestions or input?

@ghost

ghost commented Jan 28, 2017

> I have recently upgraded 2 printers to RC8, and both printers are exhibiting the same issue. To reproduce, I use Pronterface to set the bed temp to 110 and then click the home button while the bed is heating up. The AirWolf 3D homes all axes, then displays the error and stops everything. The Prusa i3 homes X and Y, then moves the Z axis to deploy the Z probe using the servo; then the MINTEMP error is shown and everything stops. If I let the bed heat to temp first, the error does not appear on either printer. If I roll back to RC6, it works correctly again. I have not updated my Delta due to this issue.

I cannot reproduce this issue on my Prusa i3 running the latest RCBugFix plus some other unrelated PRs.

@ghost

ghost commented Jan 28, 2017

Although I do not use ABL anymore, I did try heating the bed to 110C and while it was heating, initiated a G29. Points were accepted with no issues. I did not have any mintemp issues. All is ok for me.

With that, how can anyone say that this issue is a firmware issue?

What I CAN say is a firmware issue is this: when I was on my 9th probing point, I issued an M84 and then a G28. The steppers turned off, then everything homed. But my LCD was still on the screen for probing the 9th point. I had to reset the printer to get it out of the MBL mode.

@ghost

ghost commented Jan 29, 2017

Anyone using ABL....try the devel-ubl branch and see if it does it with that version.

@ghost

ghost commented Jan 29, 2017

> It would be surprising if suddenly all these thermistors started having issues after the update, but not impossible. I hooked an oscilloscope to mine but couldn't see any change in the voltage when the MINTEMP was hit.

All 5 thermistors started having issues after the update?
What about the HUNDREDS running the latest update that don't?

BTW, I am using a type 6 for the hotend and type 1 for the bed.
Is it possible that the wrong thermistor is being set?

@chorca

chorca commented Jan 29, 2017

> All 5 thermistors started having issues after the update?
> What about the HUNDREDS running the latest update that don't?

I'm just saying it's possible that it's my board. I did everything I could to determine that it wasn't my wiring: I hooked a 2 GS/s oscilloscope up to it and triggered on a voltage change, and even when the error occurred I validated that that line on the IC was still receiving the correct voltage. Unless my ATmega chip just decided to up and die or become intermittent when I loaded RC8, I'm not sure where else to look.

The fact that there's more than one or two people having this issue seems to indicate we have some common issue.

@ghost

ghost commented Jan 29, 2017

@chorca: I have seen weird stuff that cannot be explained, in my days. Change the thermistor and see if that changes things.

@schustercp

schustercp commented Jan 29, 2017

I put

      SERIAL_ECHO("min: ");
      SERIAL_ECHO(bed_minttemp_raw);
      SERIAL_ECHO(" Raw: ");
      SERIAL_ECHOLN(current_temperature_bed_raw);

above the line for the min temp test

      if (bed_minttemp_raw GEBED current_temperature_bed_raw && target_temperature_bed > 0.0f) min_temp_error(-1);

In the Temperature::isr

I know that serial prints here change the timing, but I believe the home function is either calling the ISR or causing the interrupt to fire too often.

In the result below you can see where the home function causes the glitch. If the bed is heating when this glitch happens too many ADC results get accumulated in the "current_temperature_bed_raw" and then the min test traps the error. (In the test below the bed is not heating so this is the readout of ambient temp and when the glitch happens the bed is not moving.)

min: 16063 Raw: 15600
min: 16063 Raw: 15601
min: 16063 Raw: 15599
min: 16063 Raw: 15601
min: 16063 Raw: 15600
min: 16063 Raw: 15597
min: 16063 Raw: 15601
min: 16063 Raw: 15599
min: 16063 Raw: 15601
min: 16063 Raw: 15601
min: 16063 Raw: 15601
min: 16063 Raw: 15601
echo:busy: processing
min: 16063 Raw: 15601
min: 16063 Raw: 15602
min: 16063 Raw: 15595
min: 16063 Raw: 15600
min: 16063 Raw: 15599
min: 16063 Raw: 15600
min: 16063 Raw: 15600
min: 16063 Raw: 15599
min: 16063 Raw: 15599
min: 16063 Raw: 15599
min: 16063 Raw: 15605
echo:busy: processing
min: 16063 Raw: 15601
min: 16063 Raw: min: 16063 Raw: min: min: 16063 Raw: min: 16063 Raw: 27304
27304
16063 Raw: 27304
27304
27304
min: 16063 Raw: 15598
min: 16063 Raw: 15598
min: 16063 Raw: 15600
echo:busy: processing
min: 16063 Raw: 15600
min: 16063 Raw: 15599
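
One way to read the numbers in this log, as a rough sketch only: if the raw value is a sum of several ADC samples, the glitch value can be compared with a normal reading to estimate how many samples ended up in the sum. The oversampling count of 16 is an assumption (it is not stated in this thread), and the function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Assumed number of ADC samples summed into one normal raw reading.
// This value is an assumption for illustration; the thread does not state it.
constexpr int kAssumedOversample = 16;

// Estimates how many samples were summed into `glitch_raw`, given a normal
// accumulated reading `normal_raw` that consists of `oversample` samples.
inline int estimate_samples(int normal_raw, int glitch_raw,
                            int oversample = kAssumedOversample) {
  const double per_sample = static_cast<double>(normal_raw) / oversample;
  return static_cast<int>(std::lround(glitch_raw / per_sample));
}
```

Under that assumption, a normal reading of 15600 corresponds to 975 per sample, and the glitch value 27304 corresponds to about 28 samples instead of 16. That would fit the idea that extra ADC results were accumulated before the sum was evaluated, but it is only arithmetic on the posted values, not a confirmed diagnosis.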

@Blue-Marlin
Contributor

Looks as if the temperature ISR is biting its tail. Try

 ISR(TIMER0_COMPB_vect) { Temperature::isr(); }
 
 void Temperature::isr() {
   //Allow UART and stepper ISRs
-  CBI(TIMSK0, OCIE0B); //Disable Temperature ISR
-  sei();
+  //CBI(TIMSK0, OCIE0B); //Disable Temperature ISR
+  //sei();
 
   static uint8_t temp_count = 0;
   static TempState temp_state = StartupDelay;
   static uint8_t pwm_count = _BV(SOFT_PWM_SCALE);

(Serial output during an interrupt alters the interrupt timing, so the result we see here may be caused by the debug output you added.
On the other hand, I have warned about re-entering an interrupt before and asked for extra protection against it.)

@Blue-Marlin
Contributor

Alternatively you could try to deactivate ADVANCE or LIN_ADVANCE.

In the advance_isr_scheduler()

    // Restore original ISR settings
    cli();
    SBI(TIMSK0, OCIE0B);
    ENABLE_STEPPER_DRIVER_INTERRUPT();

when executed while the temperature interrupt is running, does NOT restore the original ISR settings, but alters them.

@Blue-Marlin
Contributor

To fool this

@@ -1486,11 +1486,15 @@ void Temperature::set_current_temp_raw() {
  *  - Check new temperature values for MIN/MAX errors
  *  - Step the babysteps value for each axis towards 0
  */
 ISR(TIMER0_COMPB_vect) { Temperature::isr(); }
 
+bool in_temp_isr = false;
+
 void Temperature::isr() {
+  if (in_temp_isr) return;
+  in_temp_isr = true;
   //Allow UART and stepper ISRs
   CBI(TIMSK0, OCIE0B); //Disable Temperature ISR
   sei();
 
   static uint8_t temp_count = 0;
@@ -1942,7 +1946,9 @@ void Temperature::isr() {
       endstop_monitor_count &= 0x7F;
       if (!endstop_monitor_count) endstop_monitor();  // report changes in endstop status
     }
   #endif
 
+  in_temp_isr = false;
   SBI(TIMSK0, OCIE0B); //re-enable Temperature ISR
+
 }

should be a bit more difficult.

@lukeskymuh
Author

Wow, thank you schustercp and Blue-Marlin. I didn't understand everything, but as far as I understood, this is caused by interrupt interference. Is this a workaround or a solution for the next RC?

@Blue-Marlin
Contributor

1 and 2 are things to experiment with. If that works, 3 is possibly one of several solutions to your problem.
But as said before, @schustercp's test code itself may be the reason for the failure it detects.

@FHeilmann
Contributor

Just chiming in: I had a MINTEMP halt today on a bed that never had any issues before. I occasionally see spikes in temperature readings which are otherwise rock-solid. I use bilinear leveling and compiled LIN_ADVANCE in during my last flash; however, I did not use its "effects" (K=0).

@brainscan

I've also had mintemp errors on hardware that's always been fine. It seems to happen if the bed is at or below 17°C when I hit print: the LCD briefly shows the temp dropping below 5°C, which triggers the mintemp error. I need to load my previous version to see if it still happens.

@ghost

ghost commented Feb 2, 2017

I've started prints @ 3 degs and no issue. I dropped my mintemp to -10 because I didn't want to have the mintemp error if I started it too cold.
ATM, my printer is 15 degs. I know I can start it up with no issues.

@ghost

ghost commented Feb 2, 2017

> LCD briefly shows the temp drop to below 5° which triggers the mintemp

This tells me that you have a broken thermistor wire or a bad/weak connection.

@FHeilmann
Contributor

So I recompiled the RCBugFix branch I had and did nothing but switch off LIN_ADVANCE, and the print that previously failed consistently with MINTEMP errors went through just fine.

Seems like @Blue-Marlin is on to something

@thinkyhead
Member

thinkyhead commented Feb 12, 2017

Could it be that LIN_ADVANCE is eating up too much CPU and preventing the temperature ISR from getting readings?

CC: @Sebastianv650

@Sebastianv650
Contributor

I read through the issue, and I think we should separate two possible causes. In more than the first half of the issue there is no mention of an active advance, so I guess it wasn't enabled by the people reporting there. If I'm wrong, please raise a hand.

Therefore I see maybe two issues:

  • The temperature routine returning a wrong raw value in some cases. I agree with @Blue-Marlin that this might have something to do with my ISR changes, independent of whether LIN_ADVANCE is enabled.
    But I still don't believe it's biting its tail:

    void Temperature::isr() {
      //Allow UART and stepper ISRs
      CBI(TIMSK0, OCIE0B); //Disable Temperature ISR
      sei();

How could it be re-executed with this combination? The temperature ISR is disabled, and only afterwards are global interrupts enabled again. Nevertheless, it should be easy for someone who has this problem to prove it by inserting the change Blue-Marlin provided above, checking whether in_temp_isr is true, and sending an error message over serial if so.

But I can think of another scenario where the raw value might get garbled. The temperature ISR might be delayed, either by serial events inside the stepper ISR or by stepper and serial events inside the temperature ISR itself. If the delay is large enough, the time between the first (delayed) loop, where an ADC conversion is started, and the next (maybe not delayed) loop, where the result is captured, gets too short. Does somebody know how much time "safety" Marlin currently has between two ADC conversions versus the time one conversion needs? At first glance, Marlin isn't checking whether the ADC conversion has finished when grabbing the raw value.
Would that be reasonable?
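
A rough way to estimate that margin, as a sketch under stated assumptions: a 16 MHz AVR, an ADC prescaler of 128, 13 ADC clock cycles per normal conversion, and a temperature ISR driven by Timer0 with a prescaler of 64 and 256 counts per period. None of these values is confirmed in this thread, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Time for one ADC conversion in microseconds.
// Assumption: a normal conversion takes 13 ADC clock cycles.
constexpr double adc_conversion_us(double f_cpu_hz, int adc_prescaler,
                                   int adc_clock_cycles = 13) {
  return 1e6 * adc_clock_cycles * adc_prescaler / f_cpu_hz;
}

// Period of a timer-driven ISR in microseconds.
constexpr double timer_period_us(double f_cpu_hz, int timer_prescaler,
                                 int timer_counts) {
  return 1e6 * timer_prescaler * timer_counts / f_cpu_hz;
}
```

With the assumed values, `adc_conversion_us(16e6, 128)` gives about 104 µs and `timer_period_us(16e6, 64, 256)` gives about 1024 µs. If a conversion is started in one ISR call and read in the next, the second call would have to arrive roughly 900 µs early for the result to be read before the conversion finishes.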

  • LIN_ADVANCE is eating up quite a few cycles, that's for sure. @thinkyhead, whether it's too much, I can't say. I have not had a single problem on my printer with the recent LIN_ADVANCE code base, but I also don't use a lot of features (no bed leveling, no kinematics). I also don't think there is much I could do to make it faster, so the "solution" is quite simple: if we reach a point where the printer becomes slow and it's not due to a bug, then LIN_ADVANCE can't be enabled any longer on that specific configuration with the current hardware.

@Sebastianv650
Contributor

I just claimed I've been printing without errors for weeks. Guess what happened to my most recent print: a MINTEMP error.

And I have to say there is in fact a way to re-enter the temperature ISR. I didn't see it before.
While Marlin is inside the temperature ISR, the stepper ISR is enabled. If a stepper event happens now, Marlin proceeds with the stepper ISR. At the end of the stepper ISR, the temperature ISR gets enabled again. While Marlin processes the rest of the temperature ISR, it is now vulnerable to a second ISR call.
The opposite direction, re-entry of the stepper ISR, shouldn't be possible, as we are not re-enabling this ISR inside another ISR.
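
The sequence can be modelled on a desktop machine with plain booleans standing in for the interrupt-enable bits. This is a hypothetical host-side model, not Marlin code: `temp_isr_enabled` stands in for OCIE0B in TIMSK0, `in_temp_isr` for the guard flag from the patch earlier in this thread, and the `guarded` switch illustrates a stepper ISR that leaves the temperature ISR disabled while it is still running.

```cpp
#include <cassert>

// Host-side model only: booleans replace the real AVR interrupt-enable bits.
struct IsrModel {
  bool temp_isr_enabled = true;  // stands in for OCIE0B in TIMSK0
  bool in_temp_isr = false;      // guard flag, as in the patch above
  int nested_entries = 0;        // times the temperature ISR could have nested

  // End of the stepper ISR. Unguarded: always re-enables the temperature ISR.
  // Guarded: re-enables it only if the temperature ISR is not currently running.
  void stepper_isr(bool guarded) {
    if (!guarded || !in_temp_isr) temp_isr_enabled = true;
  }

  // Temperature ISR with one stepper event arriving while it runs.
  void temp_isr(bool guarded_stepper) {
    temp_isr_enabled = false;      // disable own interrupt, then allow others
    in_temp_isr = true;
    stepper_isr(guarded_stepper);  // stepper event interrupts this ISR
    if (temp_isr_enabled) ++nested_entries;  // a timer compare could nest here
    in_temp_isr = false;
    temp_isr_enabled = true;       // normal re-enable at the end
  }
};
```

In the unguarded case the model records one possible nested entry; in the guarded case it records none. The model ignores timing entirely and only shows which enable state each variant leaves behind.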

I applied the changes @Blue-Marlin wrote above to my local Marlin copy and am printing the part a second time now. I'm sorry; if that's the cause, all the temperature errors, and maybe also the freezing issues, are my fault!

@Blue-Marlin
Contributor

@Sebastianv650
If we know whether we are in the temperature interrupt (bool in_temp_isr), we can restore the correct state in the stepper interrupt.
AnHardt#74

@Sebastianv650
Contributor

Sebastianv650 commented Feb 12, 2017

I see you already have an even better version. We only have to keep an eye on all the sections that re-enable ISRs. There is one inside the ISR handler with advance enabled, but also 2 or 3 inside the stepper ISR if advance is disabled. And I would like to add another cli() at the end of the temperature ISR, before re-enabling the temperature ISR. Do you want to update your PR, or should I implement your improvements in the one I just created?

Edit: I just realized it's @AnHardt's PR. I will update my PR just to have the changes in one place, but feel free to close it and use AnHardt's if you want.

@Blue-Marlin
Contributor

Blue-Marlin commented Feb 12, 2017

Got some calls keeping me busy at least the next 12 hours. So please do it yourself - you got the idea.

@thinkyhead
Member

This may be fixed by #5829

@github-actions

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 24, 2022
9 participants