ESP32-S3 I2C instability v ESP32 when using i2c_master_xxx() API (IDFGH-10122) #11397
Hello @RobMeades, the issue you are encountering sounds very similar to this one: #9777. In fact, we found a bug in the I2C HAL layer that also affected the S3; a fix was pushed on In the meanwhile, you can cherry-pick it and see if it solves your issue:
Hi, and thanks for the swift response. I had discounted that particular issue since there was no error on the I2C bus preceding these errors. However, I can try just doing the same thing on headrev of your |
Unfortunately, moving to the current |
In case it helps, find below a high resolution trace (with your Here is the "sending to device" case, i.e. zooming into here: ...and here is the "receiving from device" case, i.e. zooming into here: Corresponding Saleae logic traces also attached: esp32s3_i2c_error_send_detail.zip and esp32s3_i2c_error_receive_detail.zip. Looks pretty clean, electrically speaking, to me. |
@RobMeades Thank you for all the details and screenshots. I tried on my side with your I suspect it's a timing configuration issue because, as you said, electrically speaking it is good, the signals seem clean to me.
The data sample time seems too big, maybe because you retrieved the values with the older invalid commit. For comparison, and to make sure we are using the same values, here are the detailed timings I am using:
To get them, here is the diff to apply: Check the timings you get and make sure we have the same configuration. If they are the same and you still get the same problem, try to isolate the issue with a small ESP-IDF example that uses the I2C API wrappers we provide, something like:
As a side note, your |
Yes, the timing values I posted in the "More Information" section above were retrieved on the 5.0.1 release, done by issuing As you can see from the first trace above, the error occurs about ~20 times in what must be several thousand transactions of various types over a 6 minute period, the real scenario; without knowing the exact circumstances that cause the problem, I suspect that using a simple small example, even repeated many times, won't reproduce it (though thank you for trying!). Is there any instrumentation I can add to the code I am running which would print out stuff of interest to you when the problem occurs? State information, that kind of thing? I will also look into your EDIT: seems that I am using the same values as you when on
If I repeat the calls to
...so the data sample time parameter has certainly halved since the 5.0.1 release, but the change has not helped with my problem I'm afraid. |
Hi, @o-marshmallow just to mention that in this forum entry: https://www.esp32.com/viewtopic.php?f=2&t=33583 |
Seems I was wrong. I tried your example, using Here is the code:
...here is what it printed:
...and here is what the logic analyzer shows: Zooming-in on the first, successful case: ...that looks good, and zooming-in on the second, fail case: ...it seems to have just stopped clocking after sending the device address for the read part; the Saleae probe believes there has been an ack. I tried removing my GPIO stuff, in case toggling pin 12 was somehow having an adverse effect, but that made no difference to the outcome. On other runs I also saw it stop after sending the device address for the write: Unh? Just to be completely clear, if I switch back to my original Find attached a ZIP file containing EDIT: if I keep retrying, pressing the reset button on the board (I've done it maybe 100 times now), very occasionally (e.g. one in 20 times) it goes around the loop 3 or 4 times, once it has gone around the loop 64 times and once, just now, I saw it go around the loop 140 times, but the other 93% of the time it only went around the loop one or two times; the error is 263 in all cases. |
Hi @RobMeades , I am not sure why using Anyway, I am still looking at the timing configuration, checking each value. In the Technical Reference Manual, the ESP32-S3 I2C hardware component expects that I2C timing specification states that Another thing that I have noticed is that Overall, here is the diff to apply these changes (including the I tested this on my end, with the new SCL pin you gave, and with more receive data, and it still works on my end: Here is the output on init:
If all of that doesn't work, the problem may not be in the configuration. How many devices have you connected to the I2C bus? Do you have any external pull-up resistors on your board? If not, can you try adding 4.7 kOhm pull-up resistors on both SCL and SDA? |
Hold on a sec': since we've never tried using your Let me (a) try your most recent patch above on ESP32-S3 but with the 6 minute test (that uses the underlying I2C functions rather than the wrapper) and, if that doesn't work, (b) write a simple example using the underlying I2C functions that I know works on ESP32 (non-S3) and try that out again on ESP32-S3. EDIT: (a) shows the same results as before I'm afraid, setting up (b) now... |
Oh, and @o-marshmallow, don't forget @AndreasKohn's question which might have been lost between my ramblings above. |
Sorted things out now: the I2C device needed to be reset and allowed to settle, something our core code does and the simple example needed to be made to do. Below is the simple example code but, of course, it runs fine, doesn't show the problem: when the GNSS chip is "just ticking over" like this there are only a few hundred bytes of data to be read every second; there is a lot more variability/loading in the 6 minutes of regression tests that we run routinely. It seems quite likely to me that, since the error we are seeing is a timeout, and the maximum Is there any way to make
@AndreasKohn Sorry for the late reply, yes you should create a new issue (I see that you did that already, thank you!) @RobMeades How did you calculate the 0.387 microseconds timeout? The value set as a timeout is a power of two of SCLK clock cycles. In other words, if the timeout is set to As stated in the technical reference manual, the maximum value for timeout is |
Oh! I had read the ESP-IDF documentation as saying that the value is in 80 MHz APB clock cycles. If it is in a power of two of 40 MHz that makes rather a large difference :-). I have been changing a lot of things at once, and in different scenarios, over the last two days; in particular realising that the device I am talking to over I2C is a lot slower to respond in the first few seconds after power on: it is a complicated beast and I guess is booting etc. rather busily to begin with; it is around here that most of the timeouts are occurring (we power the device up and down many times during the 6 minute test). That, and I have been tweaking many variables on the ESP32S3 side. Since I had thought the timeout value was so small as to be useless, I had left it at the default of 12. Last night I ran the 6 minute test with the timeout set to what I perceived to be the maximum value of 0x1F again, having checked i2c_reg.h for ESP32S3 (versus i2c_reg.h for ESP32); the Summary: I think we're nearly there, thank you for your expert attention, I would just like to make sure I have references to the bits of ESP-IDF documentation where the value set using |
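To make the size of that difference concrete, here is a minimal host-side sketch in plain C (not ESP-IDF code) that evaluates both readings of the timeout value discussed above: a plain count of 80 MHz APB cycles, and a power of two of 40 MHz SCLK cycles. The function names are hypothetical, and the 80 MHz and 40 MHz figures are simply the ones quoted in this thread; the sketch does not check whether the hardware accepts every 5-bit value.

```c
#include <stdint.h>

/* Reading A (the misreading): the register value is a plain count of
 * clock cycles. Returns the timeout in nanoseconds, rounded down.
 * Hypothetical helper, not part of ESP-IDF. */
uint64_t i2c_timeout_linear_ns(uint32_t value, uint32_t clk_hz)
{
    return ((uint64_t)value * 1000000000ULL) / clk_hz;
}

/* Reading B: the register value is an exponent, so the timeout lasts
 * 2^exponent SCLK cycles. Returns the timeout in microseconds, rounded
 * down. Intended for 5-bit exponents (0..31); larger exponents can
 * overflow the 64-bit intermediate product.
 * Hypothetical helper, not part of ESP-IDF. */
uint64_t i2c_timeout_pow2_us(uint32_t exponent, uint32_t sclk_hz)
{
    return ((1ULL << exponent) * 1000000ULL) / sclk_hz;
}
```

Under these assumptions, a value of 0x1F gives 387 ns under reading A (matching the 0.387 microseconds figure above) but roughly 53.7 seconds under reading B, and the default of 12 gives about 102 microseconds under reading B.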
I understand the confusion about the timeout. On the ESP32, the I2C was indeed clocked by APB, which is 80MHz: But on the ESP32-S3, the I2C SCL_FSM (responsible for the SCL signal) is clocked by the SCLK clock: That SCLK clock can be sourced either from the XTAL clock (40MHz) or from the RTC Fast Clock (17.5MHz). The resulting frequency is calculated as follows: In the previous examples, the divider was 1, so the SCLK frequency was the same as the source one, which was 40MHz XTAL. The divider was also calculated in I am starting to wonder whether it could be your device that does clock stretching (pulls the SCL line low) in order to tell the ESP32-S3 to slow down on the transfer because it (the device) is busy processing the previous command. If you are seeing this issue only when stressing the bus, I think it could make sense. |
That's excellent stuff: the other critical thing here, I think, is whether the value is a multiplier of the clock or a power-of-two of the clock: I've been reading the ESP32 technical document and the ESP32S3 technical document, plus the ESP-IDF description of the
Correct me if my link is wrong but I think I just found the integration manual for your device: https://content.u-blox.com/sites/default/files/MAX-M10S_IntegrationManual_UBX-20053088.pdf It is clearly stated that the device will perform clock stretching when the inner CPU is busy servicing an interrupt: When you perform a lot of transactions one after the other, the probability that the CPU will be servicing an interrupt while an I2C transfer occurs is very high, mainly at 100 kHz (because the transfers last longer). This is why you see clock stretching and thus get a timeout. |
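If the timeout value is an exponent of SCLK cycles, as described earlier in the thread, then tolerating a given amount of clock stretching comes down to picking a large enough exponent. The following is a minimal host-side sketch in plain C, not ESP-IDF code: the function name is hypothetical, and the largest exponent the hardware accepts is passed in as a parameter because the maximum is not visible in this thread. How long the device actually stretches the clock is also an assumption the caller has to supply.

```c
#include <stdint.h>

/* Returns the smallest exponent e in 0..max_exponent such that
 * 2^e SCLK cycles last at least stretch_us microseconds, or -1 if
 * no exponent in that range is large enough.
 * max_exponent must be below 64.
 * Hypothetical helper, not part of ESP-IDF. */
int i2c_min_timeout_exponent(uint64_t stretch_us, uint32_t sclk_hz,
                             int max_exponent)
{
    /* Number of SCLK cycles the stretch lasts, rounded up. */
    uint64_t cycles_needed =
        (stretch_us * (uint64_t)sclk_hz + 999999ULL) / 1000000ULL;

    for (int e = 0; e <= max_exponent; e++) {
        if ((1ULL << e) >= cycles_needed) {
            return e;
        }
    }
    return -1;
}
```

For example, covering a 1 ms stretch needs an exponent of 16 with the 40 MHz XTAL source and 15 with the 17.5 MHz RTC Fast Clock, since the same number of cycles lasts longer on the slower clock.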
Indeed, so what it comes down to is that, based on my experience with ESP32, I had taken the ESP-IDF documentation literally: On the face of it, there is no difference for ESP32S3, except that The key missing phrase, I think, is "power of two"! |
@RobMeades Good catch! This is indeed a mistake in the ESP32-S3 TRM: not only is the clock source wrong, but the power of two is also missing. The TRM talks about the power of two but for the FSM timeout ( The function
Else you are right, a 5-bit value wouldn't make sense as a timeout, the maximum would have been way too small 😄 |
Perfect: last question then - is there a way to determine, at ESP-IDF level, the current I2C clock? EDIT: I see the Guessing that an RC network is the RTC clock source and hence, if you set |
@RobMeades You guessed right, the clock is selected based on the The flag that corresponds to the RC clock is |
Got it, thanks again, closing this issue. |
Answers checklist.
IDF version.
v5.0.1 (also occurs on master as at 54576b7)
Operating System used.
Windows
How did you build your project?
Command line with idf.py
If you are using Windows, please specify command line type.
CMD
Development Kit.
ESP32-S3-DevKitC-1 v1.0
Power Supply used.
USB
What is the expected behavior?
Code written for ESP-IDF I2C i2c_master_xxx() API and runs without error on ESP32 should also run without error when compiled for ESP32-S3.
What is the actual behavior?
[A few of your customers are seeing this problem, discussed here: https://www.esp32.com/viewtopic.php?f=2&t=33583 but we need attention from an Espressif person, hence moving it here.]
We have a GNSS device connected over I2C to an ESP32 (I2C HW block 0, 100 kHz clock, no clock stretching, SDA pin 18, SCL pin 19). Our automated test system checks that our code works perfectly all the time; the tests run for about 6 minutes. This works flawlessly.
However, when we build for and run on ESP32-S3 in exactly the same situation, we see instability: i2c_master_cmd_begin() sometimes returns error 263 (timeout) or occasionally -1 (no ack from device); it never passes the set of tests.
To illustrate the problem, see below a Saleae Logic trace of an entire test run where the red line indicates that i2c_master_cmd_begin() is running, the orange and yellow blips indicate that a timeout error has been returned by i2c_master_cmd_begin() (orange for when the ESP32-S3 is doing a "send" to the device, yellow when the ESP32-S3 is doing a "receive" from the device) and the green blips indicate that "no ack from device" has been returned by i2c_master_cmd_begin():
Zooming in on the first of these errors (.sal file also attached below):
...you can see that a write has been set up to address 0x42 and the device has acked it (at least as far as the Saleae probe can tell) but then there is no clocking for the rest of the write, which I guess results in the timeout. All of the timeout errors on send, the yellow blips, look like this.
Zooming in on the second of these errors (.sal file also attached below):
...again, the Saleae probe believes that a read has been set up to address 0x42 and the first byte (which is all zeroes) has been received and acked but then the clock stops again and i2c_master_cmd_begin() returns timeout. Curiously, 16 ms later, and not while i2c_master_cmd_begin() is running, the clock returns high, which the slave probably sees as a signal to get on with things and pulls the data bus low, but then the clock stays high (I guess this is some sort of error recovery thing, rather than a clock edge). In some cases this confuses the devil out of the bus and results in a nack the next time. All of the timeout errors on receive, the orange blips, look like this.
The failures occur randomly, there is no discernible pattern to them.
For reference, here is a zoom into the same Saleae trace of a success case for send:
...and a success case for receive:
What could be causing this behaviour? Do we need to set up something differently to use the i2c_master_xxx() API on ESP32-S3? We've tried setting i2c_set_timeout() to the max value of 0x1f (see post here asking why the register has shrunk in size so much for ESP32-S3 versus ESP32) but, as expected, adding such a very short delay makes no difference to the outcome.
Steps to reproduce.
The entire code can be found here: https://github.com/u-blox/ubxlib/tree/debug_esp32s3_i2c, where this is the function that opens the I2C bus, this is the function that does a send, and this is the function that does a receive.
Also find attached segments of the Saleae probe trace for the send and receive cases (including the analogue waveform, which I didn't bother including above).
esp32s3_i2c_error_send.zip
esp32s3_i2c_error_receive.zip
Debug Logs.
See Saleae logic probe snapshots and actual .sal trace segments above. The entire Saleae logic trace is also available but is too big to attach here (261 Mbytes): let me know if you want it.
More Information.
Just out of interest, I added some code to read the configuration values (the ones we don't change, assuming the defaults are good, plus the timeout value which we set on ESP32 but don't bother setting on ESP32-S3 because its range is too small to make any difference). Not sure if any of these matter - should we be changing them on ESP32-S3 for I2C to operate correctly?