[BUG] Flow randomly changed from 100% to 10% during a print #1228

Closed
digant73 opened this issue Nov 3, 2020 · 177 comments
Labels
bug Something isn't working

Comments

@digant73
Contributor

digant73 commented Nov 3, 2020

For several PRs now, a lot of people have been complaining about the flow randomly changing from 100% to 10% during a print.
Below is a description of the typical error observed:

I get all these echo: F0, echo: (blank), etc. POPUP notifications within like 30 minutes (10%) of printing, then the Flow changes from 100% to 10%, which ruins every print. The same thing happens with the latest FW from master.

As you can understand, this is a major bug that should be verified and fixed. I will also try to have a look. However, I ask the people who worked on the printing feature to review the code. The problem seems to have been present since the end of September or the beginning of October.

Steps to reproduce

  1. start a print
  2. after some random time the flow is changed from 100% to 10%

Expected behavior
The flow is not changed

Actual behavior
The flow is randomly changed from 100% to 10%

@digant73 digant73 added the bug Something isn't working label Nov 3, 2020
@oldman4U
Contributor

oldman4U commented Nov 3, 2020

This is the first time I have heard about this issue, and I haven't seen this behavior so far. Which reports do you mean?

@baselsw

baselsw commented Nov 3, 2020

I am having this issue at the moment as well. It is definitely random but is mostly apparent when I cold start the printer (turn on the printer from the power switch). After running a print, which eventually fails (because of the changing flow from 100% to 10%), all I do is press the reset button on the TFT and re-print the same g-code file (which this time succeeds). I don't get any notifications when this happens or rather I have not seen any when I have noticed the problem. In summary: pressing the reset button after power-cycling the printer has made it possible to print without problems.

@digant73
Contributor Author

digant73 commented Nov 3, 2020

@oldman4U
It was verified by me and confirmed by other people using the current BTT FW. It seems the problem has been present since major changes were made to FanController.c and all over the project, due to the introduction of the following parameters in config.ini:

Fan Controller Count

Options: 0 to 2

fan_ctrl_count:0
...
...

Fan Maximum PWM speed (0 to 255)

format [fan_max: F0: F1: F2: F3: F4: F5: CtL: CtI:]

fan_max:F0:255 F1:255 F2:255 F3:255 F4:255 F5:255 CtL:255 CtI:255

On my printer fan_ctrl_count is set to 0.

@baselsw. So is the reset always a permanent solution for you? I mean, you start the printer, reset the printer and then the problem is not present for sure?

@baselsw

baselsw commented Nov 3, 2020

@digant73 It has worked for me three or four times so far. Whether this actually solves the problem I cannot say with 100% certainty, because the problem appears randomly. One addition, however: once I have had a successful print, the problem does not appear again until I power-cycle the printer. So for me, it only happens ONCE per power cycle. As long as the printer is kept on after a successful print, I can run as many prints as I like without seeing the problem.

@oldman4U
Contributor

oldman4U commented Nov 3, 2020

It is great that you have a feeling for where the problem could come from. I will add this to the known bug list.

The question is why it happens on some machines but not on others. Wouldn't it be good to share the configurations used?

I assume you are printing from Touchscreen Mode, but does it also happen in Marlin mode? Approximately how many times out of 10 prints does it happen?

SKR E3 DIP 2208UART
Marlin 2.0.7.2
Serial Port 2
No second serial port defined
115200
TFT35 E3 v3
Firmware 20.10.2020
Cura 4.6.1

@digant73
Contributor Author

digant73 commented Nov 3, 2020

@oldman4U
Attached are my config.ini for the TFT FW and the config files for Marlin 2.0.6.1.
config.zip
Marlin.zip

Yes, touch screen is used for the print (no Marlin mode is available on the printer). Marlin 2.0.6.1 is used.
Basically, in a long print you are sure to hit the problem.
There was no problem at all before the mentioned PR introducing the Fan Controller Count.

Recently I have spent more time developing than printing. However, many people using the FW on the same printer as mine have reported the same bug to me.

@oldman4U
Contributor

oldman4U commented Nov 3, 2020

Maybe you can try 2.0.7.2.

My son had 2 successful prints last weekend almost 12h and 8h long.

@Guilouz
Contributor

Guilouz commented Nov 3, 2020

I compiled the firmware for a friend's TFT70 and it reports the same issue.

@digant73
Contributor Author

digant73 commented Nov 3, 2020

@Guilouz same bug on flow from 100 to 10?

@Guilouz
Contributor

Guilouz commented Nov 3, 2020

> @Guilouz same bug on flow from 100 to 10?

Yes, he found his printer with the flow at 10% (first print with his new screen) and, obviously, a failed print...

I have not noticed it myself because I print through OctoPrint.

@Mactastic1-5

Mactastic1-5 commented Nov 4, 2020

I compiled Marlin 2.0.7.2 based on @digant73 Marlin 2.0.6.1 firmware and it's the same result. The problem isn't with the Marlin firmware, but with the TFT firmware. #1209 has a CtL fan control bug - maybe the bug comes from the same origin?

@baselsw

baselsw commented Nov 4, 2020

I have seen the same problem using Marlin 2.0.7.2 compiled by @digant73, stock Sidewinder X1 firmware and the waggster mod firmware. The problem is in the TFT and not the main board.

@sebsx

sebsx commented Nov 4, 2020

I have the same bug and I cannot see a pattern unfortunately, sometimes works for 2-3 prints then fails. Also, sometimes I get the "echo" on the display and a short beep and the flow stays at 100% instead of dropping to 10% like it usually does.

My Marlin is built and updated about once a week from the bugfix-2.0.x branch if that's of any help.

Another thing I've noticed is that after the bug occurs, the steppers (TMC2100 drivers) start to make a grinding sound and I need to power-cycle the printer.

@oldman4U
Contributor

oldman4U commented Nov 4, 2020 via email

@digant73
Contributor Author

digant73 commented Nov 4, 2020

Another PR that was merged into the firmware around the time the bug appeared is TMC Hybrid Threshold Speed.

@oldman4U
Contributor

oldman4U commented Nov 4, 2020 via email

@sebsx

sebsx commented Nov 4, 2020

Not using hybrid mode either with my TMC2100. Plan to though once I upgrade to TMC2209.

@digant73
Contributor Author

digant73 commented Nov 4, 2020

One test that can be done: during a print, move to a menu other than the Printing menu (the menu reporting all the stats, such as fan, speed, layer, and the pause and babystep buttons). This is just to verify whether the bug is due to some query executed by that menu.

@sebsx

sebsx commented Nov 4, 2020

In my case the bug happens without using any other menu, i.e. I power on the printer, select the file then print. Later on the bug happens and the print is ruined (early in the process if I'm lucky).

Next what I usually do is power cycle and try again. If the bug shows its ugly head again I generally try to re-slice the file and attempt the print again. Sometimes it works, sometimes it doesn't.

It's a really weird bug.

@digant73
Contributor Author

digant73 commented Nov 4, 2020

@sebsx if you can, next time try to move to the "More" menu when you start a print, stay there, and verify whether the bug happens again. It's just to see if the bug is related to some queries made by the "Printing" menu (this menu queries Marlin to update all the stats displayed in the menu).

@sebsx

sebsx commented Nov 4, 2020

Will do and report back.

@J2J2

J2J2 commented Nov 4, 2020

Hi, maybe my bug is linked, maybe not :/
On my printer (config similar to digant73's, but attached below in case it is needed) the bug is on speed (not on flow).
I did some tests on short prints (<1h): the bug does not occur during the print, but often at the beginning.
When it occurs, most of the time the speed gets stuck at 10%.
I just caught a video of it going to 10% and then automatically going back to 100%: https://youtu.be/dx6haolZQ4I O_o First time I have seen that.
It seems to occur just after the "M420 S1" command from my start G-code configured in config.ini.
See 0:18 in the video, just after the echo fades out.

tft_config.zip
Marlin.zip

@digant73
Contributor Author

digant73 commented Nov 4, 2020

@J2J2
Although it should not be a problem, remove the space before M420 in
start_gcode:G28 XY R20\n M420 S1\n

so that you have
start_gcode:G28 XY R20\nM420 S1\n

@J2J2

J2J2 commented Nov 4, 2020

Thanks for the advice.

The flow bug has now occurred for me for the first time too :D ... (sad and happy at the same time: I also have the flow bug)
In the middle of a 2h print.

I'm going back to the previous firmware (I will miss mesh bed level editing :( ).

@Mactastic1-5

Mactastic1-5 commented Nov 4, 2020

> @sebsx if you can, next time try to move and stay to the "More" menu when you start a print and verify if the bug happens again. It's just to see if the bug is related to some queries made by the "Printing" menu (this menu queries Marlin to update all the stats displayed in the menu)

@digant73 I can confirm that no POPUP notifications suggesting an error appear, and the Flow doesn't change from 100% to 10% on its own, while the More menu is on screen for the duration of the print. I printed the same model that I've printed previously, which failed every time at around 10%, which is equivalent to 30 minutes.

@oldman4U
Contributor

oldman4U commented Nov 4, 2020 via email

@sebsx

sebsx commented Nov 4, 2020

Hey @oldman4U, one of the issues is that sometimes the same gcode file works on the second try.

I suspect there's a buffer misbehaving somewhere in the gcode feed and response parsing that sometimes corrupts gcode and sends the wrong flow rate.

I'm not familiar with this codebase at all but I intend to make some time this weekend to poke around.

@oldman4U
Contributor

oldman4U commented Nov 4, 2020 via email

@radek8
Contributor

radek8 commented Nov 21, 2020

The problem described in #1267, where responses to the M503 command are cut short, could be related to this problem.

@oldman4U
Contributor

oldman4U commented Nov 21, 2020 via email

@kisslorand
Contributor

> The problem described in #1267 when shortening responses to the M503 command could be related to this problem.

It can't be, M220/M221 responses are very-very short compared to M503

> But I can not see a PR for it. Do you know how this will be fixed?

I managed to make a PR: #1276

@oldman4U
Contributor

oldman4U commented Nov 21, 2020 via email

@radek8
Contributor

radek8 commented Nov 21, 2020

> > The problem described in #1267 when shortening responses to the M503 command could be related to this problem.
>
> It can't be, M220/M221 responses are very-very short compared to M503

The response to M220/M221 is short, but the overall traffic between Marlin and the TFT is much larger, and messages are truncated randomly depending on memory usage. Therefore, the problem occurs randomly, depending on where a message is truncated.
It's just my guess, but it would be worth trying your fix #1276 to see if it solves the problem.
It solved my truncated-message issue. Thank you.

@kisslorand
Contributor

I beg to differ. Marlin sends a package of multiple messages. Each message is terminated with "\n", so the whole package sent by Marlin must end with "\n". If it does not, the whole package is dropped and none of its messages are parsed. Packages from Marlin are received in DMA-L1. Once the package has been sent (detected by silence from Marlin), the messages are parsed into DMA-L2 one by one. Each message within the package from Marlin must end with "\n". From DMA-L2, each message is interpreted and actions are taken accordingly.
To find the bug behind the flow drop to 10%, one must approach it step by step. When does the error occur: when parsing the package into individual messages, or when sending FR back to Marlin?

As a side note, in the issue #1276 the beginning of the message was cut, not the end.
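
The package handling described in this comment can be sketched roughly as follows. This is only an illustration of the rule "a package must end with \n, otherwise all of it is dropped"; the function name split_package and the sample strings in the usage note are hypothetical and are not taken from the firmware.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch, not the firmware's code.
 * Splits a received package into its '\n'-terminated messages, in place.
 * Each '\n' is replaced by '\0' and a pointer to each message is stored
 * in out[]. Returns the number of messages found, or 0 if the package is
 * empty or does not end with '\n' (the whole package is dropped untouched).
 */
size_t split_package(char *pkg, size_t len, char *out[], size_t max_out)
{
    if (len == 0 || pkg[len - 1] != '\n')
        return 0;                       /* incomplete package: drop everything */

    size_t count = 0;
    char *start = pkg;
    char *end = pkg + len;

    while (start < end && count < max_out) {
        /* Never NULL here: the last byte of the package is '\n'. */
        char *nl = memchr(start, '\n', (size_t)(end - start));
        *nl = '\0';
        out[count++] = start;
        start = nl + 1;
    }
    return count;
}
```

With a complete package such as "FR:100%\nE0 Flow: 100%\n" this yields two messages; if the same data arrives without its final "\n", nothing is parsed at all.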

@radek8
Contributor

radek8 commented Nov 21, 2020

kisslorand, Thank you for the explanation

@kisslorand
Contributor

You should check PR #1283, it should eliminate the effect of the bug causing the drop of the Flow from 100% to 10%.

@digant73
Contributor Author

digant73 commented Nov 22, 2020

I will try the effect of 1276 first (although it should not fix the problem).
I had a look at your 1283, but it seems you should also provide a speedGetCurrentPercent function and use it in the PrintingMenu (to display the current value returned by Marlin instead of the target value returned by speedGetPercent).
Also, in parseAck you continue to use speedSetPercent, which is the cause of the problem (loopSpeed, periodically invoked by loopBackEnd, will send an M220/M221 to Marlin to set the new target value).
The same applies to Fan (same handling as speed/flow).
I would also improve the Speed.c menu by providing, as in other menus, an update function in the main loop that sends the M220/M221 periodically (e.g. every 2 seconds). Currently the menu continuously sends M220/M221 at every loop. During a print, moving to that menu can affect the quality of the print.

@kisslorand
Contributor

kisslorand commented Nov 22, 2020

OK, I made some adjustments: speedGetPercent() is out of the way when receiving a response from Marlin, and speedSetRcvPercent() now handles the received response. The Speed and Flow are updated as they are received from Marlin, and no M220/M221 is sent, as was already the case in my previous commit. The same applies to Fan speed.
You got it wrong where you say "the menu continuously sends M220/M221 at every loop". speedQuery() is called only in StatusScreen.c and PrintingMenu.c, and it is called every 2 seconds, when (OS_GetTimeMs() > nextTime), where nextTime = OS_GetTimeMs() + update_time and update_time = 2000 (in milliseconds).
speedQuery() sends M220/M221 only when the current and the target speed are not equal, and they are made equal in speedSetRcvPercent().

Later edit: My bad, I mixed speedQuery() with loopSpeed(). loopSpeed() is indeed called every loop, however it is not executed while current and target speed are equal. Same goes for loopFan().
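
The 2-second gating mentioned above can be sketched like this. It is a simplified illustration, not the firmware's code: query_due is a hypothetical helper, the time is passed in as now_ms instead of calling OS_GetTimeMs(), and timer wrap-around is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define UPDATE_TIME_MS 2000u   /* update_time from the comment above, in milliseconds */

/*
 * Hypothetical sketch of the polling gate.
 * now_ms stands in for the value returned by OS_GetTimeMs().
 * Returns true when a query is due and schedules the next one;
 * otherwise returns false and leaves *next_time unchanged.
 */
bool query_due(uint32_t now_ms, uint32_t *next_time)
{
    if (now_ms > *next_time) {
        *next_time = now_ms + UPDATE_TIME_MS;
        return true;
    }
    return false;
}
```

Called from a loop that runs far more often than every 2 seconds, a gate like this lets a query through only once per interval.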

Recap:

  • Flowrate, Feedrate and Fanspeed are checked every 2 seconds
  • received values are not sent back to Marlin, they are only displayed on the TFT
  • if current and target are different, a command is sent immediately (within one loop) but only if target was set on the TFT, otherwise target and current are both set to the value received from Marlin
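
The recap can be illustrated with a small state sketch. The struct and the function names below (percent_state, on_received, on_user_set, loop_step) are hypothetical and only mirror the behavior described in the list; they are not the functions in the PR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state for one value (flow, feed rate or fan speed). */
typedef struct {
    uint16_t current;   /* last value known to be active on the printer */
    uint16_t target;    /* value requested on the touchscreen */
} percent_state;

/* A value reported by the printer is only stored for display.
 * Current and target are both set, so nothing is sent back. */
void on_received(percent_state *s, uint16_t value)
{
    s->current = value;
    s->target = value;
}

/* The user changed the value on the touchscreen: only the target moves. */
void on_user_set(percent_state *s, uint16_t value)
{
    s->target = value;
}

/* Called on every loop. Returns true and stores the value to send in
 * *to_send only when the target differs from the current value. */
bool loop_step(percent_state *s, uint16_t *to_send)
{
    if (s->current == s->target)
        return false;
    *to_send = s->target;
    s->current = s->target;
    return true;
}
```

Under this scheme, a wrong value received from the printer would show up on the display but would not be sent back as a new setting, because receiving a value never leaves current and target different.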

@digant73
Contributor Author

I was referring to speed.c, but it seems it was already fixed, so forget it (I was looking at the old code).
Yes, now the logic seems correct. However, I won't remove "setIgnoreEcho" from parseAck as you did.

@kisslorand
Contributor

kisslorand commented Nov 22, 2020

> I won't remove "setIgnoreEcho" from parseAck as you did.

It does nothing, just check it.

There's a new commit in the PR, some stupid bugs sorted out and some subtle changes.

@sebsx

sebsx commented Nov 24, 2020

Edit: disregard, failed pull.

@sebsx

sebsx commented Nov 27, 2020

@kisslorand it looks alright now over here, I compiled a version with both #1276 and #1283 changes in and seems fine so far. I'm using an MKS TFT 1.4 and I actually left in the 4k DMA buffer instead of 3k and seems ok so far. My RAM usage is about 40%.

@kisslorand
Contributor

kisslorand commented Nov 27, 2020

There are 3 boards that have the DMA buffer set to 3072. That is because their "TERMINAL_MAX_CHAR" has to be set lower than on the other boards, otherwise you'll get a RAM overflow. I saw no point in allocating a 4096-byte DMA buffer for those 3 boards; I am not sure that during runtime, if needed, they will have enough RAM to fill all the allocated bytes. Even if they do, the data will not fit into the terminal buffer and will overflow, so you have just wasted some RAM.

@sebsx

sebsx commented Nov 27, 2020

Thanks, I'm going to check that setting as well. The combination I have (MKS TFT32 and MKS GEN L) cannot do the Marlin display simulator (missing I/O pins), so I wonder if that could yield some RAM savings.

I'm moving to an MKS SGEN L 2.0 32bit board soon so I guess I'll have to see how it goes for that one as well.

@digant73
Contributor Author

@sebsx so with PR 1276 and 1283, did you get no error messages at all, or did you get the error messages but without the flow/speed being set to 10% (as PR 1283 should guarantee)? If you get no error message at all, it means that PR 1276 fixes the truncated strings received from Marlin.

@sebsx

sebsx commented Nov 27, 2020

@digant73 I didn't see any popups at all so it is possible it may have been fixed by that PR. I'll keep printing for a few more hours today and keep an eye on it.

@sebsx

sebsx commented Nov 27, 2020

> There are 3 boards that have the DMA buffer set to 3072. That is because their "TERMINAL_MAX_CHAR" has to be set lower than the other boards, otherwise you'll get RAM overflow. I saw no point in allocating 4096 DMA buffer for those 3 boards, I am not sure that during runtime, if needed, they will have enough RAM to fill all the allocated bytes. Even if they do it will not fit into the terminal buffer, it will overflow so you just wasted some RAM.

This is how I have it setup for my TFT target (MKS_32_V1_4)

#define TERMINAL_MAX_CHAR ((LCD_WIDTH / BYTE_WIDTH) * (LCD_HEIGHT / BYTE_HEIGHT) * 8)

I end up with this and haven't run into any issues so far. But I'll keep an eye on it since I don't know what happens at runtime.

RAM:   [====      ]  39.8% (used 26108 bytes from 65536 bytes)
Flash: [======    ]  62.4% (used 163692 bytes from 262144 bytes)

If I do the math for this board I get (320/8 * 240/16) * 8 which yields 4800. How does the DMA buffer play into this?
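
For reference, the arithmetic above can be reproduced by evaluating the macro with the values used in that calculation. The four LCD and font constants below are assumptions taken from the numbers in this comment (320, 8, 240, 16), not read from the firmware headers, and terminal_max_char is a hypothetical helper.

```c
/* Assumed values, taken from the arithmetic in the comment above. */
#define LCD_WIDTH   320
#define LCD_HEIGHT  240
#define BYTE_WIDTH  8
#define BYTE_HEIGHT 16

/* Same expression as in the comment: characters per screen times 8. */
#define TERMINAL_MAX_CHAR ((LCD_WIDTH / BYTE_WIDTH) * (LCD_HEIGHT / BYTE_HEIGHT) * 8)

/* Hypothetical helper that exposes the macro value:
 * (320 / 8) * (240 / 16) * 8 = 40 * 15 * 8. */
int terminal_max_char(void)
{
    return TERMINAL_MAX_CHAR;
}
```

With these inputs the function returns 4800, matching the hand calculation.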

@kisslorand
Contributor

I have the same display as you but with SKR 1.4Turbo and using the same config as you.

> How does the DMA buffer play into this?

Sorry, I do not understand the question.

@sebsx

sebsx commented Nov 30, 2020

@kisslorand Sorry, I should have been clearer. What's the relationship between, for example, a 3k or 4k DMA buffer and TERMINAL_MAX_CHAR set to 4800? Should they match (or be multiples of each other), or does it not matter?

@sebsx

sebsx commented Nov 30, 2020

@digant73 and @oldman4U Ok so another check-in 6 hours of printing later, no problems to report, it seems #1276 and #1283 fixed the problem for good.

@kisslorand I've been running my MKS 1.4 with 4k DMA buffer and #define TERMINAL_MAX_CHAR ((LCD_WIDTH / BYTE_WIDTH) * (LCD_HEIGHT / BYTE_HEIGHT) * 8) with no memory issues so far. Thank you.

@kisslorand
Contributor

@sebsx The DMA buffer receives 4k bytes. The terminal buffer is filled with chars. Usually a char is represented by 1 byte, but some of them need more than 1 byte. So the estimate is that 1200 bytes for chars are enough for 1024 bytes of ASCII received. Hence the relation between 4k and 4800, and between 3k and 3600.
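
The 1200-per-1024 estimate can be written out as a small calculation. The helper name terminal_bytes_for_dma is hypothetical; only the ratio and the two buffer sizes come from this thread.

```c
#include <stddef.h>

/*
 * Hypothetical helper: terminal buffer size budgeted for a given DMA
 * buffer size, using the estimate of 1200 terminal bytes per 1024
 * bytes received.
 */
size_t terminal_bytes_for_dma(size_t dma_bytes)
{
    return dma_bytes * 1200u / 1024u;
}
```

For a 4k buffer (4096 bytes) this gives 4800, and for a 3k buffer (3072 bytes) it gives 3600, which are the two pairs mentioned above.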

@oldman4U
Contributor

The problem is that the last round of PRs also introduced some new bugs. Hopefully I will have some more information tomorrow.

@oldman4U
Contributor

oldman4U commented Dec 2, 2020

Hi.

So all the new stuff and most of the bug fixes are available in the latest master firmware. Hopefully the last fix from @kisslorand, which solves a problem accessing the onboard SD card, will be merged soon, so that all the newly introduced bugs are fixed.

What about the issue this ticket is about?

@sebsx

sebsx commented Dec 3, 2020

@oldman4U that's awesome, thanks. I personally didn't have any more issues after I applied @kisslorand 's changes so I'm happy now and this issue can be closed from my point of view.

@digant73 digant73 closed this as completed Dec 3, 2020

github-actions bot commented Apr 3, 2024

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 3, 2024