[BUG] (MAX31865 no longer works after recent changes) #23439
Comments
After some more diagnosis, can confirm that the firmware is trying to access the software SPI bus (turned on debug and added a short delay before the print statements so I could connect over serial before the board errors out). |
First of all, uncomment line 28 in temperature.cpp (#define IGNORE_THERMOCOUPLE_ERRORS), which will allow you to debug the issue properly. Please keep in mind that this disables basic thermal runaway protection, so don't try to heat up your printer with this setting. Also enable the MAX31865 debugging option (line 44 in MAX31865.h). Which is the last Marlin release that worked for you? Is the SPI you're trying to use a hardware SPI? If yes, you may also try #define TEMP_SENSOR_FORCE_HW_SPI to see if it works for you. I haven't had much luck using it with software SPI myself, but I didn't really look into it further, since I'm using HW SPI on my setup. Please provide some feedback on the questions above and we'll see what we can try next. |
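For reference, the two switches named in the comment above would look roughly like this once enabled. This is only a config sketch: the line number comes from the comment and may differ between Marlin versions, and where the second define lives in your configuration is not stated in the thread.

```cpp
// temperature.cpp (line 28 at the time of this thread): debugging aid only.
// This disables the thermocouple error checks, so do NOT heat the printer
// while it is enabled.
#define IGNORE_THERMOCOUPLE_ERRORS

// Only useful when the sensor is wired to the board's hardware SPI pins.
#define TEMP_SENSOR_FORCE_HW_SPI
```

Remember to comment IGNORE_THERMOCOUPLE_ERRORS out again once the diagnosis is done.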
I will do that and be careful about it. The Marlin release 2.0.9.1 is working (downloaded in late September 2021). I have enabled MAX31865 debugging. I'm using software SPI, and I used the MAX31865 debugging to confirm this as well. I can also consider what hardware SPI would look like for my board. I'm using a screen that I believe also uses that bus though. My only reason to upgrade is that I get occasional interference where the temp will spike around 30C. I've seen this go away on my pellet extruder printers by using bang-bang heating instead of PID. The new options implemented in 2.0.9.3 look very intriguing, since I could use PID again with my smaller systems without any random spikes. I will try out all of these options tomorrow. Thank you! |
There are also some important changes regarding software SPI in #22682, which are not included in 2.0.9.1. Try a commit before #23215 first and before #22682 after that, to see which causes the problem. I'll try to help you pinpoint the issue and fix it. Regarding hw SPI, you can share an SPI with other devices (like the screen you're mentioning, that's what I'm doing), just use a different CS pin. |
Good info. This gives me a bunch of things to try. I will have some results to share later today. |
If you're thinking of trying hw SPI btw, take a look at this hack: GadgetAngel/Adafruit-MAX31865-V1.1.0-Mod-M#1 (comment) |
I flashed with everything prior to #23215 (Fix and improve MAX31865 by Zeleps). I get somewhat normal temps (room temp + 5C) for a few seconds and then Temp Measurement Errors. If I flash back to 2.0.9.1 (before #22682), then I get normal room temps and there are no errors. It looks like there was also some renaming done between #22682 and #23215. My working version is after #22660, so the problem is somewhere between then and now (likely #22682 or the renaming). This is still on the SW SPI bus. Next I will look at ordering parts for the hack Zeleps mentioned (unless I have other reasonable access on the board) so that it can be an option for me to try for now. |
That's good testing @kurtis-potier-geofabrica! Now that you mention it, the few-seconds-normal pattern has occurred in my tests as well, but I didn't test it further with sw SPI. I'll try some tests myself and see what I can find. |
That would be great! Thank you |
I switched to sw SPI, and just like @kurtis-potier-geofabrica, I am receiving normal values at first (for a few reads):
This is still being read after setup completes.
Most obvious explanation is that some process affects SPI. I'll try to pin it down and return with more info. |
Just confirmed that disabling SD support eliminates the issue. @kurtis-potier-geofabrica please verify this. I'll delve into this later. |
@zeleps I tried this, but it did not fix the problem for me. (I tried on both the non-working versions I've been experimenting with.) My output from the SPI debug is a little different. I get the attached output with ignore thermocouple errors on and MAX31865 debug on. I will look into the code more as well, see if I can find anything, and try things out. I'm not the strongest with C++ but not totally illiterate either. |
The last bit of the LSB is an error flag, and in all the values you've posted the bit is set. From the messages I understand that this is the version prior to my changes, so could you please share the output you're getting with #23215? I took a look at your board's pinout and I see that you're using the stepper driver SPI pins. Are you using the SPI for controlling the stepper drivers as well? |
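To make the error-flag remark concrete, here is a minimal sketch of how the two RTD data bytes split into a 15-bit ADC code and the fault bit. The struct and function names are made up for illustration and are not Marlin's; the register layout (MSB at 01h, LSB at 02h, fault flag in bit 0 of the LSB) is as I read it in the MAX31865 datasheet.

```cpp
#include <cstdint>

// Decoded view of the 16-bit RTD data register pair.
struct RtdReading {
  uint16_t adc_code;  // upper 15 bits: ratiometric ADC code
  bool fault;         // bit 0 of the LSB: set when the chip flags a fault
};

// Hypothetical helper: combine the two register bytes and split out the flag.
inline RtdReading decode_rtd(uint8_t msb, uint8_t lsb) {
  const uint16_t raw = static_cast<uint16_t>((msb << 8) | lsb);
  RtdReading r;
  r.adc_code = static_cast<uint16_t>(raw >> 1);
  r.fault = (raw & 0x01) != 0;
  return r;
}
```

With this layout, any odd raw value has the fault bit set. A bus that returns 0xFF for every byte decodes as a full-scale code with the fault flag set, which is more likely a read problem than a real measurement.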
I did test both, but didn't log output. Here is the output after #23215. For my board, I am not using the spi bus for steppers. My drivers are external drivers. Edit: The extruder was at room temp when I booted. |
The line Obviously your case is different from my case. It would have some value if you could connect to the serial as soon as the screen is cleared on reboot, in order to catch the early initialization messages. This would show what the first few reads look like, and when faulty reads occur. It would also be a good indication for a possible issue with SoftwareSPI.cpp if you could hook up the sensor on the hw SPI (MISO=PB14, SCK=PB13, MOSI=PB15, any free pin will do for CS) and see if it works. There is also this text by @GadgetAngel, which is specific to your board, and it might give you some insight: bigtreetech/BIGTREETECH-SKR-PRO-V1.1#179 (comment) |
Ok, now T0 reads only 255 from the SPI (obvious read error) while T1 is working correctly. This is quite different than the screenshots you sent before, where both sensors read the room temperature with some fault indication. What was the exact setup you used this time? (meaning pin assignments) |
I did the SPI pins as you suggested above and left my chip selects as PD4 and PC12. |
You must omit the definitions for sck/miso/mosi though, in order to enable hw SPI (or uncomment |
Yeah, I had #define TEMP_SENSOR_FORCE_HW_SPI uncommented. |
I'll get a second sensor on Monday and do some tests with two sensors. Since one of them works, this could have something to do with the SPI initialization being done more than once. I'll get back to you on this. Meanwhile, you could unplug one of the sensors and see if the other works. |
@kurtis-potier-geofabrica have you tried 1-shot mode? (commenting out |
Yeah, I've tried both modes to see if that mattered. Didn't change anything for me unfortunately. |
I got a second sensor and ran some tests. It seems that the second sensor does not initialize correctly in auto mode (it reads the correct temperature, but returns the fault bit set, and the fault register returns 64 - low threshold), but it does work after a while in 1-shot mode. I tried a hack that seems to work on my system (run a fault detection cycle during initialization), and now both sensors work ok in any mode under hw / sw SPI (sw SPI still conflicts with SD support). @kurtis-potier-geofabrica could you try this: zeleps@5d745e4 and see if it works for your setup? |
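As a reading aid for the "fault register returns 64" remark, the sketch below decodes a fault status byte into the conditions it flags. The function name is hypothetical and the bit meanings are my reading of the MAX31865 datasheet, so check them against the datasheet before relying on them.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: list the conditions flagged in the fault status byte.
inline std::vector<std::string> describe_faults(uint8_t status) {
  std::vector<std::string> out;
  if (status & 0x80) out.push_back("RTD high threshold");
  if (status & 0x40) out.push_back("RTD low threshold");
  if (status & 0x20) out.push_back("REFIN- > 0.85 x VBIAS");
  if (status & 0x10) out.push_back("REFIN- < 0.85 x VBIAS (FORCE- open)");
  if (status & 0x08) out.push_back("RTDIN- < 0.85 x VBIAS (FORCE- open)");
  if (status & 0x04) out.push_back("overvoltage/undervoltage");
  return out;
}
```

A value of 64 is 0x40, so only the low-threshold bit is set, which matches the description in the comment above.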
I remember seeing this problem before. If I am reading this correctly, the current problem is that with SPI and the MAX31865 you get an error bit set when doing TEMP readings. I believe the problem is a startup issue between the two modules talking to each other (the MAX31865 module and the printer's motherboard). The TEMP readings are correct, but for some reason the MAX31865 module sets the error bit. I believe this is a buffering problem on the side of the MAX31865 module. Since we cannot fix the firmware on the MAX31865 board, I experimented with the idea of doing a short temperature reading dump (5 TEMP readings) after the first TEMP read and throwing them away if they are determined to be obvious errors generated by the MAX31865 module. I tried to watch the TEMP readings and set an acceptable range that I would expect the TEMP values to fall in. So if a TEMP reading comes in and the next TEMP value is 10 degrees higher, this is an obvious temperature error, because temperature does not change instantaneously when you are reading values milliseconds apart. I determined later that Marlin already does a check of temperature readings, and I could not get my approach working. But I honestly believe the MAX31865 module is setting the error bit and it should not be doing this. The error bit could also be set due to noise on the SPI lines. When I did my testing I bought two logic analyzers so I could look at the actual data bits coming from the MAX31865 modules: Logic Pro 8 (Red) - Saleae 8-Channel Logic Analyzer and DreamSourceLab DSLogic Plus USB-Based Logic Analyzer with 400MHz Sampling Rate. Try dumping or ignoring the first 10 readings coming from the MAX31865 board (ignore the error bits) and then look at the error bit from temperature sample 11 onward. This might fix the issue. |
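The "ignore the first 10 readings" idea can be sketched as a small counter that the read loop consults before trusting a sample's fault bit. This is a standalone illustration under my own naming (StartupFilter is hypothetical), not code from Marlin or from the branch discussed in this thread.

```cpp
#include <cstddef>

// Hypothetical helper: swallow the first `skip` samples after power-on.
class StartupFilter {
 public:
  explicit StartupFilter(std::size_t skip) : remaining_(skip) {}

  // Call once per sample. Returns false while the sample should be discarded
  // (fault bit included), true once readings should be taken at face value.
  bool accept() {
    if (remaining_ > 0) {
      --remaining_;
      return false;
    }
    return true;
  }

 private:
  std::size_t remaining_;  // samples still to be discarded
};
```

Constructed with 10, the filter rejects samples 1 through 10 and accepts from sample 11 on. It adds no range or jump check, so any later fault handling still has to happen elsewhere.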
@GadgetAngel, this really clears things up, thanks! I found out all sorts of random things happening on power on, so I'm now running an automatic fault detection cycle and manually set threshold registers (I found out that sometimes they initialize to wrong values, e.g. I had low threshold = 0xffff after power on). This seems to tackle all issues so far and the cost is minimal (less than 1ms), so I'm not making it optional. I've also implemented the hack you mentioned - just in case - so now you can @kurtis-potier-geofabrica please try this: https://github.com/zeleps/Marlin/tree/bugfix-2.0.x-FixMAX31865 |
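As an illustration of what manually setting the threshold registers involves at the SPI level, the sketch below builds the bytes for a threshold write. It is not the code from the linked branch. The register addresses (high threshold MSB at 03h, low threshold MSB at 05h, write address = read address | 0x80), the left-justified 15-bit format and the multi-byte write are my reading of the MAX31865 datasheet, and all names are hypothetical.

```cpp
#include <array>
#include <cstdint>

// Register addresses in read form (assumed from the datasheet).
constexpr uint8_t kHighThresholdMsb = 0x03;
constexpr uint8_t kLowThresholdMsb  = 0x05;
constexpr uint8_t kWriteFlag        = 0x80;  // OR into the address for writes

// Build the 3-byte frame {write address, MSB, LSB} for a 15-bit threshold code.
inline std::array<uint8_t, 3> threshold_frame(uint8_t msb_register, uint16_t code15) {
  // The code is left-justified; bit 0 of the LSB register is unused.
  const uint16_t raw = static_cast<uint16_t>((code15 & 0x7FFF) << 1);
  return std::array<uint8_t, 3>{{
      static_cast<uint8_t>(msb_register | kWriteFlag),
      static_cast<uint8_t>(raw >> 8),
      static_cast<uint8_t>(raw & 0xFF)}};
}
```

For example, threshold_frame(kLowThresholdMsb, 0) gives the bytes to clock out (with CS held low) to reset the low threshold to zero, which would clear a stray power-on value like the 0xffff mentioned above.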
I will give this a shot. These solutions look like they have a good chance of fixing the problems. My cables are overall pretty solid (running shielded cables the whole way, properly grounded). Even with the current working version, I still see some random spikes and get the occasional error. I don't see this when reading through a Raspberry Pi, so it definitely has something to do with how the MAX31865 and the board talk to each other. |
Judging from your screenshots, I think that resetting the threshold registers will do the trick for you. Please remember to uncomment MAX31865.h line 44 ( |
Can you please try and capture serial output right after boot? It starts emitting as soon as the display goes blank (before boot screen appears) and there's some relevant data logged at that time that would be useful... |
Yep, I will do this. I have a print running for a couple hours right now with the output going to pronterface and auto report of temps on (and MAX debugging) so I can look and see how the performance is over a print. After, I will convert back and try to log the output again. Will probably have to throw the delay in again on startup since the connection doesn't happen too fast. |
What I want to see is a line like I'd also suggest you use AUTO_MODE. There's no sense in 1-shot mode in marlin, just waste of time. |
Thank you so much for all the trouble you went through! Unfortunately, the numbers don't clear up things, in fact it seems that the temperatures read are off the charts at first (1st sensor), then they settle towards room temperature. I added some more debug info, in case of fault. If you could do one last test (I promise I won't ask for more after this) with sw SPI and AUTO_MODE with my latest commit, that would be great (unfortunately, I cannot reproduce the issue you're having with sw SPI on my board). |
Also, a |
This does not make sense, at least not with what can be made out of the sensor's datasheet. It appears that the sensor is in an invalid state. My only guess is that the sw SPI does not handle signaling as it should in your case, but that's not something I can help with, at least not without a way to reproduce the error, which probably means I need to get the same board as you have. So, since hw SPI is performing well now, I'll leave the sw SPI problem to someone else to tackle. It would still be worthwhile to look at your pin states, you can enable M43 by uncommenting Thanks again for the tests, at least we managed to find and correct some of the issues. |
Got that turned on and went back to the software spi quickly. Here is the output, it is very lengthy: |
The only thing that sticks out is that you're using the same pin for E2_CS and TEMP1_CS:
This shouldn't be a problem, since you're not actually using E2 or TMC driver control, but I guess you could use any other unused pin instead and see if it helps. Did you ever try using only one of the sensors (and completely unhook the other)? Anyway, since you're moving to hw SPI and I've run out of ideas, let's leave it at that. Unless someone else has something to add / try. |
Did not get around to do only the one sensor. Also, agreed, no driver is hooked up to those pins, so not shared. |
Hey @kurtis-potier-geofabrica, I just finished refactoring the sw SPI code in MAX31865. Now it works fine on my board, even with SD support enabled, with one or two sensors hooked up. I fixed a lot of stuff that had to do with timing, trying to be as close to the MAX31865 datasheet as possible. If you're still eager to help out, please check it out here. |
@zeleps I have a BTT GTR board I can send to you (if you live in the USA) so you can debug the issue. Where do you live? @zeleps Do you really live in Greece? I can ship the GTR board to you via USPS but I will need your address and it will take about a week to get to you (I think). How do we get in touch with each other? Are you on the Voron Discord server? If you are you can direct message me at "GadgetAngel#8701" that is my discord handle @zeleps Here is an invite to the Voron Discord channel |
Mine's zeleps#8169, I'll dm you later when I'm somewhere more comfortable. |
I can definitely try this out. My wiring is a little more permanently switched to HW SPI now, but converting back for a test shouldn't be too bad. |
Technically, you don't have to change your wiring, just declare the hw SPI sck / miso / mosi pins in your config as TEMP_0_SCK_PIN etc. This will start the SPI in sw mode (obviously, you need to comment out FORCE_HW_SPI). If this doesn't work with your current version but works with the latest commit, this is good news. |
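A config sketch of that suggestion, using the hw SPI pins listed earlier in this thread for the GTR (SCK=PB13, MISO=PB14, MOSI=PB15). Only TEMP_0_SCK_PIN is named in the comment above; the MISO and MOSI macro names are assumed by analogy, so verify them against your pins file.

```cpp
// Run the MAX31865 in software SPI mode over the hardware SPI wiring.
// TEMP_SENSOR_FORCE_HW_SPI must stay commented out for this test.
//#define TEMP_SENSOR_FORCE_HW_SPI

#define TEMP_0_SCK_PIN   PB13
#define TEMP_0_MISO_PIN  PB14  // macro name assumed, not confirmed in the thread
#define TEMP_0_MOSI_PIN  PB15  // macro name assumed, not confirmed in the thread
```

The chip select pin definitions stay as they are.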
Of course, if it works either way, it would be helpful to try the old wiring with the new code. |
Had some other things to work on with the printer, but I did get around to testing this today with SW SPI. It did not work. I did convert back to the old physical setup for this as well. Since this works on your board, it could be something unique to my setup. HW SPI is still working well though. |
Ok, I've ordered a BTT GTR 1.0 to see what's going on, this has intrigued me a lot. It's shipping from China (couldn't find one anywhere else), so it'll take a couple of weeks. I'll get back with results when I get it. |
Hey all, just got the GTR, tested with @kurtis-potier-geofabrica pin setup (sw SPI @ PD4, PB3, PB6, PG15), works fine with one sensor (TEMP_0) on the first try. This is a bare minimum setup, nothing else on the board but the sensor and a touch TFT (not even stepper drivers). Will try with two sensors later, but I'm not expecting anything less. |
@kurtis-potier-geofabrica I did a test with your config / pins files unchanged (as you posted them here), but with one sensor connected. The connected sensor worked fine (the other obviously did not). You don't have any jumpers on the stepper UART/SPI selection headers, do you? This is how I wired the board, do you think there's something I missed or should also try? |
Update: 2 sensors also work fine. |
@zeleps I have the SPI jumpers on and then connect where you would normally connect a stepper driver. I don't believe this is any different from your setup. That being said, I think it's fair to push the changes to main. Since switching to hardware SPI on your branch, I really can't say I've had any errors or issues. |
I am so glad that this is finally working. |
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Did you test the latest bugfix-2.0.x code?
Yes, and the problem still exists.
Bug Description
Marlin now reports 989 as the temperature for both E0 and E1 after the recent changes to the MAX31865.cpp and associated files. With the older versions of Marlin, we were able to read temperatures successfully with the same hardware. Currently running on a Bigtreetech GTR V1.0 board using SPI pins as follows:
These pins are different from the defaults in the firmware, but did work on previous versions of Marlin. Also have confirmed that SPI bus can be read by a Raspberry pi directly before entering the board. I can revert to the older firmware and get readings.
Bug Timeline
new
Expected behavior
Read temperatures correctly.
Actual behavior
Temperatures read 989 on boot. Err is then displayed.
Steps to Reproduce
No response
Version of Marlin Firmware
2.0.9.3
Printer model
No response
Electronics
BTT GTR V1.0 STM32F4
Add-ons
M5 extender
Bed Leveling
No response
Your Slicer
No response
Host Software
No response
Additional information & file uploads
MAX31865_issue.zip