Speed limiting to reduce load #4931
Would a sensible thing to do then be a sanity check on startup?
@gruvin @AndyZuerich @Rerouter
I pretty much agree with everything else you said, but please let me nit-pick on this one point. I'm wondering if we can relax some constraints. I honestly don't believe delaying heater readings by 4 seconds is dangerous. Let's take a worst case: suppose I'm printing ABS at 240°C, and because of something like a very high extrusion rate (where lots of power is needed to keep up) the last few readings make the PID feel like it's falling behind, so it puts out a full-power request. Even with the PID at maximum heat, 4 seconds is only going to add a few degrees, and who knows, maybe it overshoots a few more after the PID backs off. I'm skeptical that this would cause real harm. (I'm just trying to help you understand my motivation for the next part.)

This might be a good time to ask: when running a PID algorithm, do we need a uniform time difference between the readings? It would seem we don't. Mathematically, we can just take the time since the last reading and use it when we update our results. Specifically: the P term is whatever it is when we take the reading; it doesn't change with the time period. The D term may be bigger if our time period stretches because we were busy doing other things, but that can be compensated by dividing by the longer time period. Similarly, the I term is going to need to be bigger because we are integrating over a longer period, so we multiply by the longer time period. Admittedly, we lose a small amount of resolution because we are not tracking the temperature curve quite as accurately, but my guess is it will hardly matter. Is it worth letting the PID update function run whenever it can run, and making it smart enough to scale each incremental update by how much time has passed?

Or, what I'm really asking is this: can't we just run the PID algorithm when we have time to run it? Do we need to be a slave to it and make sure it runs at uniform time increments? If the printer is busy doing other important things, can't we wait 500 ms to run the PID algorithm and just compensate for the delay? My gut hunch is the answer is "Yes!"
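The dt-scaled PID idea described above can be sketched in a few lines. This is illustrative only (none of these names come from Marlin's actual thermal code):

```cpp
#include <cassert>
#include <cmath>

// Sketch of a PID update that scales each term by the actual elapsed
// time instead of assuming a fixed sampling period. Illustrative only.
struct PidState {
  float kp, ki, kd;       // tuning constants
  float integral = 0.0f;  // accumulated error * time
  float prev_error = 0.0f;
};

// error: setpoint minus measured temperature
// dt:    seconds since the previous update (may vary from call to call)
float pid_update(PidState &s, float error, float dt) {
  s.integral += error * dt;                        // I: scales up with a longer period
  float derivative = (error - s.prev_error) / dt;  // D: divide by the longer period
  s.prev_error = error;
  return s.kp * error + s.ki * s.integral + s.kd * derivative; // P is unaffected by dt
}
```

Whether the small loss of resolution matters in practice would need testing on real hardware, as the comment suggests.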
I've not had a chance to dig too deep into the stepper library, but it seems there is a massive amount going on in the ISR, which is very different from what I'm used to (get in, do its thing, get out as soon as possible).

I have been looking at the PID and thinking of making it use a discrete-time implementation, so that being delayed by a block of moments causes fewer issues than the current approach, which assumes it's fired off exactly on time, every time. (It already calculates the time scaling each loop, because you can change the variables.) In other words, the code is already in place; I'll draw up another pull request if you guys are happy with how that sounds.

As far as the temperature goes, we would need some filtering in any case, but we could likely get away with fewer than the 325 readings per second currently required. I'm having a dig through the datasheet to see what options we have available.
To see what I mean about the PID time, the variable used is `PID_dT`, currently defined as
Of course, shaving cycles throughout the planner and stepper ISR is going to help. But that still leaves open the possibility of allowing configured speeds to go beyond rated limits. The most obvious case is upgrading to 32x micro-stepping drivers, immediately doubling the load. The old speed limits may then be too high, and understandably nobody wants to lower their feed rates after an upgrade. But 32x only gives you quieter steppers, and is only worth it if you can afford the extra CPU overhead. It would be nice if we could use So it seems to make the most sense to dynamically throttle the stepper ISR speed.
It would more be a case of adding a debug variable to count how much time the ISR consumes at a given feedrate and steps per mm. We can then assume the worst case of all 5 axes moving at once and add it to the sanity checking, relaxing it later when we improve things in other areas.
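That debug-variable idea could look roughly like this. A sketch only: the names and the "reserve some headroom" parameter are my assumptions, not anything in Marlin:

```cpp
#include <cassert>

// Sketch of the "debug counter" idea: track the worst-case time the
// stepper ISR consumes, then sanity-check it against the cycle budget
// implied by the step rate. All names and values are illustrative.
struct IsrLoadStats {
  unsigned long worst_cycles = 0;  // worst observed per-axis ISR cost

  void record(unsigned long cycles) {
    if (cycles > worst_cycles) worst_cycles = cycles;
  }

  // True if the worst case fits the budget for `axes` axes stepping at
  // `step_hz` on a CPU running at `f_cpu` Hz, with `headroom` (0..1)
  // reserved for everything else (planner, temperature ISR, UI, ...).
  bool fits(double f_cpu, double step_hz, int axes, double headroom) const {
    double budget = (f_cpu / step_hz) * (1.0 - headroom);
    return worst_cycles * (double)axes <= budget;
  }
};
```

The check could run at boot (or whenever feedrates change) and warn or cap speeds, as suggested later in the thread.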
Shouldn't be too difficult to maintain a minimum time between the stepper interrupts.
This depends highly on the hardware setup. For example, with my heated bed not much will happen within 4 seconds. But my hotend is quick, very quick: within 4 seconds at full power I will see a temperature overshoot of 30 degrees or more, which would be unacceptable.

Maybe I did not understand the problem in total: the 32x microstepping at high speeds consumes most of the computation power, but the thermal manager is not the limiting process, i.e. the thermal manager is in general fast enough. The only problem is that the thermal manager does not get called at all when the Arduino is driven at its limit. Concerning PID with e.g. a 4-second (or variable) delay: well, if it is known that the sampling time is variable, this could be taken into account in the thermal manager algorithms. Currently, I only see the following problems:
The default frequency for this maximum value limitation is (as far as I remember) 490 Hz. Therefore, the first approach for fixing this would be to use pure hardware PWM for all heaters. I am doing this anyway, since I control my heaters with a 31 kHz buck converter and hardware PWM.
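For reference, both frequencies mentioned here fall out of the AVR timer arithmetic. A quick sanity check, assuming a 16 MHz clock and 8-bit phase-correct PWM (which counts 0..255..0, i.e. 510 timer ticks per cycle):

```cpp
#include <cassert>
#include <cmath>

// PWM frequency for 8-bit phase-correct mode on an AVR timer.
// f_cpu: CPU clock in Hz; prescaler: timer prescaler divisor.
double phase_correct_pwm_hz(double f_cpu, int prescaler) {
  return f_cpu / (prescaler * 510.0);
}
```

With the default Arduino prescaler of 64 this gives about 490 Hz, matching the figure above; with a prescaler of 1 it gives about 31.4 kHz, close to the 31 kHz buck converter setting.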
Did I understand this right: by default, the temperature readings are updated 325 times per second? Why don't we just update the temperature readings once, each time just before the thermal manager is called? All the other temperature readings are thrown away anyway. To give a short summary of my text above:
@thinkyhead Would it hurt a lot for the stepper action to call the temperature reading just before the thermal manager, and then call the thermal manager with a fixed sampling period? Or is reading the ADC values very time-consuming, or does it include another delay time that we have to consider? I have just recently upgraded my delta printer to the DRV8825. My printer is not in operation at the moment (only the heaters work). Additionally, I have upgraded to a 128x64 display. Therefore, I expect that I will run into the same issue when I finish my hardware within the next weeks.
It's an ADC thing: you need to start the ADC, wait a short period, then fetch the result, and only one conversion may be active at a time. So we start a conversion on one pass, then read the result on the next pass. As there may be up to 6 different sensors, it takes 12 passes before all sensors are ready. We could drop the extra passes for sensors that don't exist; on most setups this would drop from 12 passes down to 4, so the sensor values could be read at 976 Hz instead of 325 Hz.
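The pass arithmetic above can be checked directly. In this sketch, the 62.5 kHz ISR rate and the oversampling count of 16 (`OVERSAMPLENR`) come from the issue description:

```cpp
#include <cassert>
#include <cmath>

// Averaged-reading update rate given the temperature ISR frequency,
// the number of start/read passes per full sensor scan, and the
// oversampling count (OVERSAMPLENR is 16 in Marlin).
double updates_per_second(double isr_hz, int passes_per_scan, int oversample) {
  return isr_hz / (passes_per_scan * oversample);
}
```

Six sensors need two passes each (start, then read), hence 12 passes; two sensors would need only 4, yielding the 976 Hz figure.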
I thought that what @Blue-Marlin described was basically in place already. The code that sets the rate also limits it. Take a look at `calc_timer`:

```cpp
static FORCE_INLINE unsigned short calc_timer(unsigned short step_rate) {
  unsigned short timer;

  NOMORE(step_rate, MAX_STEP_FREQUENCY);

  if (step_rate > 20000) { // If steprate > 20kHz >> step 4 times
    step_rate >>= 2;
    step_loops = 4;
  }
  else if (step_rate > 10000) { // If steprate > 10kHz >> step 2 times
    step_rate >>= 1;
    step_loops = 2;
  }
  else {
    step_loops = 1;
  }

  NOLESS(step_rate, F_CPU / 500000);
  step_rate -= F_CPU / 500000; // Correct for minimal speed

  if (step_rate >= (8 * 256)) { // higher step rate
    unsigned short table_address = (unsigned short)&speed_lookuptable_fast[(unsigned char)(step_rate >> 8)][0];
    unsigned char tmp_step_rate = (step_rate & 0x00ff);
    unsigned short gain = (unsigned short)pgm_read_word_near(table_address + 2);
    MultiU16X8toH16(timer, tmp_step_rate, gain);
    timer = (unsigned short)pgm_read_word_near(table_address) - timer;
  }
  else { // lower step rates
    unsigned short table_address = (unsigned short)&speed_lookuptable_slow[0][0];
    table_address += ((step_rate) >> 1) & 0xfffc;
    timer = (unsigned short)pgm_read_word_near(table_address);
    timer -= (((unsigned short)pgm_read_word_near(table_address + 2) * (unsigned char)(step_rate & 0x0007)) >> 3);
  }

  if (timer < 100) { // (20kHz - this should never happen)
    timer = 100;
    MYSERIAL.print(MSG_STEPPER_TOO_HIGH);
    MYSERIAL.println(step_rate);
  }
  return timer;
}
```
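For readers unfamiliar with the lookup tables: the stepper timer ticks at F_CPU/8 (2 MHz on a 16 MHz AVR), so `calc_timer()` is essentially approximating `2,000,000 / step_rate` without a slow runtime division. A plain-arithmetic sketch of the same shape (illustrative; not the real code, which also applies a minimal-speed correction):

```cpp
#include <cassert>

// Simplified model of calc_timer(): choose a step_loops multiplier so
// the ISR itself never runs above ~20 kHz, then compute the timer
// compare value as ticks of a 2 MHz timer between interrupts.
unsigned short calc_timer_sketch(unsigned short step_rate, unsigned char &step_loops) {
  if (step_rate > 20000)      { step_rate >>= 2; step_loops = 4; } // >20 kHz: step 4x per ISR
  else if (step_rate > 10000) { step_rate >>= 1; step_loops = 2; } // >10 kHz: step 2x per ISR
  else                        { step_loops = 1; }
  if (step_rate == 0) step_rate = 1;           // avoid divide-by-zero in this sketch
  unsigned long timer = 2000000UL / step_rate; // timer ticks between interrupts
  if (timer < 100) timer = 100;                // clamp: ISR never faster than 20 kHz
  return (unsigned short)timer;
}
```

The real code replaces the division with `speed_lookuptable_fast`/`speed_lookuptable_slow` plus interpolation, which is much cheaper on an AVR.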
Synopsis:

After the timer is computed, `NOLESS(OCR1A, TCNT1 + 16);` limits the timer compare so the stepper ISR won't run again immediately, or at least not within 128 CPU cycles. This correction will only apply at the highest step rates, which have already been capped to 20 kHz (more likely to 10 kHz). When a stepper is using 32x micro-stepping and 160 steps per mm, even with quad stepping this seems to make 250 mm/s the upper speed limit. How long does the stepper ISR need to be before this cap is invoked? By my calculation, only 784 CPU cycles when running at 20 kHz, or 1584 cycles at 10 kHz. Increasing this to (All given a 500 mm/s inherent speed limit with a common belt-driven axis at 80 steps/mm, or a 250 mm/s inherent limit if you go to 32x and 160 steps/mm.)
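The figures in this synopsis are easy to re-derive. In the sketch below, the 16 MHz clock and the 40 kHz step-frequency cap are my assumptions (40 kHz is the value that reproduces the 500/250 mm/s numbers):

```cpp
#include <cassert>

// CPU cycles available between stepper interrupts at a given ISR rate.
double cycles_per_interrupt(double f_cpu, double isr_hz) {
  return f_cpu / isr_hz;
}

// Feedrate ceiling implied by a maximum step frequency.
double max_feedrate_mm_s(double max_step_hz, double steps_per_mm) {
  return max_step_hz / steps_per_mm;
}
```

At a 20 kHz interrupt rate a 16 MHz CPU has only 800 cycles between interrupts, which puts the 784-cycle figure above in context: there is almost no slack.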
I agree with @emartinez167: some sort of check at boot that either warns or just caps the speed would be ideal, so that bad prints can be avoided.
Don't know the code at all, but: at least the ISR should be able to run completely in that window once. Or can you swear you did all the push/pop stack savings right? If all the other issues mentioned are new in v2, I'm 90% sure it's an ISR problem. The thing is, if you have multiple interrupts at once, you have to save registers and state at the interrupt's beginning and restore them at the end; otherwise you get that unexpected behaviour. Of course it worked for a long time, but every time those values are not saved correctly in an interrupt you get a more or less visible failure in some values. Or it's just a timer issue, if some timers stop working while executing interrupts; you can find out about that in the Arduino reference. I didn't really get all those calculations, but I wonder if 20 kHz is really enough for over 100 mm/s. Practically, I also think it's not acceptable to check temps less often than every 0.5 s, or PID bouncing will surely ruin prints at 300°C PC. :-/

PS: really great work so far! That HALL thing is great and would be a huge effort for other projects. @Rerouter, you seem to think in the same direction with interrupts as I said above. I believe it's either: Other workarounds would be really nasty.
@thinkyhead is this one still relevant, or has it served its purpose?
Another stale one, and we could as well use Discord: Marlin Discord server. Join link: https://discord.gg/n5NJ59y
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
As more end-users upgrade to 32x micro-stepping with drivers like the DRV8825, we're seeing more issues where innocuous activities like homing the Z axis cause the CPU to become so overtaxed that the watchdog timer isn't reset. The end result is that, during a move too taxing for the CPU, the machine may simply reboot.
This comes principally from tying the watchdog to the period of temperature reading. But of course that's the point of the watchdog: to ensure that temperature readings are not so out of date as to become meaningless and throw the PID tuner into a tailspin. The watchdog is a necessary thing; delaying heater readings for more than 4 seconds is dangerous. The other readings (filament width, etc.) are also time-sensitive and useless once they become stale.
But the watchdog wasn't necessarily designed with Stepper ISR overdrive in mind. It's not intended that stepping should eat up so much time that sensors cannot be read.
The cause of reset is that `updateTemperaturesFromRawValues` isn't getting called often enough, because `temp_meas_ready` isn't set for a long period of time… because `set_current_temp_raw` isn't called nearly often enough… because `temp_count` hasn't counted up to `OVERSAMPLENR` (which is 16). And `temp_count` itself only increases once per 12 invocations of `Temperature::isr`, after all the ADC sensors have been read. The temperature ISR runs at 62.5 kHz, so normally the temperature readings are updated 325 times per second. The temperature ISR would need to be blocked pretty heavily to be delayed by over 1300 invocations.

The temperature ISR also manages heater PWM, so there's another good reason it shouldn't get blocked or slowed significantly.
Is it possible to apply a speed limit to the Stepper ISR, perhaps predicting the maximum load and capping movement speed? We need to make sure the ISR finishes quickly and that there's a minimum period of time between invocations, to leave enough time for other activities, prevent the steppers from stuttering, etc.
Even with improvements in performance, it's still easy to overload the CPU by trying to move too fast. So we need to place a cap on movement speed based on some assessment of load. If the planner senses that the stepper ISR is getting too close to the limit, it could apply a speed limit to the feed rate on blocks.
If the planner takes care of the speed limiting, then the Stepper ISR only needs to count up its total number of cycles. The planner can divide by `millis()` to get a sense of the average number of cycles per millisecond, and the speed will just be capped to something like 12 million cycles per second (75% of the available CPU).

In the documentation it should be mentioned that every time you double your micro-stepping or your gearing, it also doubles the CPU load. I'd like to evaluate how close we are to performance limits at this point and come up with a good margin, so we know what's supportable.
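The proposal could be sketched like this. All names are illustrative, and the 12,000 cycles-per-millisecond budget is simply 75% of a 16 MHz CPU:

```cpp
#include <cassert>
#include <cmath>

// Sketch of the proposal: the stepper ISR accumulates its cycle count;
// the planner periodically computes average load and scales the
// requested feedrate down whenever load exceeds the budget.
struct LoadGovernor {
  unsigned long isr_cycles = 0;  // accumulated by the stepper ISR
  double budget_cycles_per_ms;   // e.g. 12000.0 = 75% of 16 MHz

  // Returns a factor (0..1] to multiply into the requested feedrate.
  double feedrate_scale(unsigned long elapsed_ms) const {
    if (elapsed_ms == 0) return 1.0;
    double load = (double)isr_cycles / (double)elapsed_ms;
    if (load <= budget_cycles_per_ms) return 1.0; // under budget: no limit
    return budget_cycles_per_ms / load;           // over budget: throttle
  }
};
```

Scaling proportionally (rather than rejecting moves) keeps motion smooth while guaranteeing the temperature ISR and watchdog still get their share of the CPU.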