Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I2C driver bug (IDFGH-2966) #4999

Closed
fpgamaster opened this issue Mar 26, 2020 · 40 comments
Closed

I2C driver bug (IDFGH-2966) #4999

fpgamaster opened this issue Mar 26, 2020 · 40 comments
Labels
Resolution: NA Issue resolution is unavailable Status: Done Issue is done internally

Comments

@fpgamaster
Copy link

  • IDF version: v4.2-dev-792-g6330b3345

components/driver/i2c.c:

esp_err_t i2c_master_cmd_begin(i2c_port_t i2c_num, i2c_cmd_handle_t cmd_handle, TickType_t ticks_to_wait)
{
....
TickType_t wait_time = xTaskGetTickCount();
if (wait_time - ticks_start > ticks_to_wait) { // out of time
wait_time = I2C_CMD_ALIVE_INTERVAL_TICK;
} else {
wait_time = ticks_to_wait - (wait_time - ticks_start);
if (wait_time < I2C_CMD_ALIVE_INTERVAL_TICK) {
wait_time = I2C_CMD_ALIVE_INTERVAL_TICK;
}
}
....
}
This code results in wait_time = I2C_CMD_ALIVE_INTERVAL_TICK.
For situations when a timeout is needed it defaults to 1s (I2C_CMD_ALIVE_INTERVAL_TICK)!

A simple workaround is to replace this part of code with:
wait_time = ticks_to_wait;

@github-actions github-actions bot changed the title I2C driver bug I2C driver bug (IDFGH-2966) Mar 26, 2020
@koobest
Copy link
Contributor

koobest commented Mar 27, 2020

Hi, @fpgamaster
Thanks reporting this. but this timeout is not for the whole transmission, but just to ensure that the FSM is not stuck.

thanks !!

@fpgamaster
Copy link
Author

Hi @koobest ,
Yes, but if a device will respond with a delay you need to set a timeout and to poll the device again.
In this case no matter what timeout you set below I2C_CMD_ALIVE_INTERVAL_TICK it defaults to I2C_CMD_ALIVE_INTERVAL_TICK.
I hit this issue playing with a SHT20 sensor.

Best Regards

@koobest
Copy link
Contributor

koobest commented Apr 2, 2020

Hi,
I refer to its data sheet. When using Hold Master mode, the sensor will pull down the SCL (SCL stretch), but due to the hardware limitation, the stretching time cannot exceed 1048576 * Cycle_apb (13ms), otherwise it will cause timeout. This timeout is not the timeout set by i2c_master_cmd_begin. it is seted by i2c_set_timeout.
I also see that there is a No Hold Master mode, is it possible to use this mode?

thanks

@fpgamaster
Copy link
Author

Hi @koobest,
I forgot to mention that all test were in 'No hold' mode.
Best regards

@DavidJRobertson
Copy link

DavidJRobertson commented Dec 27, 2020

I am encountering this problem as well @koobest

The device I am using is the LTC2499 ADC. With this part, a conversion is initiated by addressing the device for a write (and optionally writing config data) - the stop bit of the write cycle initiates the conversion. Then, at a later time, a read can be performed to retrieve the result. However, while the conversion is in progress, the device does not respond to reads at all (i.e. it does not acknowledge the address with read flag).

If a sufficient amount of time is allowed before attempting the read, then everything works. But, if you try to poll the device, you will run into this bug. No matter what value you give for ticks_to_wait when calling i2c_master_cmd_begin, it will block the task for a minimum of one second.

Probing the bus with a logic analyzer, I see the following:
i2c-bug

Oddly enough, though, it does not appear to happen when the i2c address byte has the r/w bit set to write, rather than read.

@tcbennun
Copy link

tcbennun commented Jan 8, 2021

I'm suffering from this bug as well.

I have a system where multiple devices sit on the same I2C bus, and I poll each on a regular basis, at well over 100 Hz. The timing here needs to be precise.

Occasionally, one of the devices has an unavoidable failure and will not respond on the I2C bus for some time.

I don't understand why the timeout specified in I2C_CMD_ALIVE_INTERVAL_TICK is not configurable. It is unacceptable that I cannot use the I2C bus for 1 second if a device failure occurs. I need to communicate with every other device on the bus several times during that interval.

The I2C_CMD_ALIVE_INTERVAL_TICK parameter, in my case, should be no more than 10 ms in order to manage these timeouts acceptably.

@andycarle
Copy link
Contributor

To add a vote here: the Moddable SDK team would also like to see a way to customize I2C_CMD_ALIVE_INTERVAL_TICK.

We have a software example for developers that automatically scans the I2C bus for connected devices. Because we attempt a read from every possible I2C peripheral address, it encounters many expected device failures during normal operation. On other platforms we can scan every address in less than a second, but due to this issue it takes more than a minute to do the same on ESP32.

Because individual developers are building with our tools and their own ESP-IDF clones, we cannot easily simply modify i2c.c. It would be very helpful if there were a way to set I2C_CMD_ALIVE_INTERVAL_TICK in the sdkconfig or if i2c_master_cmd_begin respected the ticks_to_wait argument even if it was shorter than I2C_CMD_ALIVE_INTERVAL_TICK.

@JesusFreke
Copy link

I just ran into this as well. The effect of this value is that it's impossible to use a timeout less than a full 1 second in i2c_master_cmd_begin, because anything less than this value will get replaced here.

@RFgermany
Copy link

Hello,
I also do have this problem and can not understand, why to put a 1sec delay in a place instead of fixing this FSM glitches? In i2c.c there are multiple places where there seem to be workarounds for the FSM problems...
Every couple hundred transactions there are missing stop bits! and that results in always 1 second delay, and messages get corrupted!

@dhalbert
Copy link

dhalbert commented Feb 3, 2022

We are looking at this in adafruit/circuitpython#5908.

I went back in time, and when I2C_CMD_ALIVE_INTERVAL_TICK was first introduced, it was a ceiling, not a floor:
https://github.com/espressif/esp-idf/blame/9075b507b54a99ce732fc2819ba17274d9a69ab5/components/driver/i2c.c#L1142

        TickType_t wait_time = (ticks_to_wait < (I2C_CMD_ALIVE_INTERVAL_TICK) ? ticks_to_wait : (I2C_CMD_ALIVE_INTERVAL_TICK));

@simondddd
Copy link

simondddd commented Feb 11, 2022

I thing this commit is where it gets wrong
d913fff#diff-380dc90b7523e5ecbb9c3fb9799a4c3db81ef22a126b3ffac560c7750ab72258L1142-R1151

@me-no-dev
Copy link
Member

I can confirm this is an issue as well. Now that Arduino uses IDF's I2C driver, we are getting reports and I was able to reproduce it. I2C_CMD_ALIVE_INTERVAL_TICK should be configurable, though in reality the code should not wait longer than the provided ticks_to_wait. Please fix and backport this issue.

KlausPopp added a commit to ci4rail/esp-idf that referenced this issue May 18, 2022
KlausPopp added a commit to ci4rail/esp-idf that referenced this issue May 18, 2022
Fix problem that i2c_master_cmd_begin() waits longer (at least 1 second) than specified in ticks_to_wait, in case the bus is stuck.
@dhalbert
Copy link

dhalbert commented Jun 1, 2022

@KlausPopp Would you be interested in proposing this as a PR to https://github.com/espressif/esp-idf ?

@theotherjenkutler
Copy link

Hi Is there any update on this? I am having this same issue.

@fpgamaster
Copy link
Author

It is amazing how long this issue can persist :)

@theotherjenkutler
Copy link

@fpgamaster yes its been a hopeful few years for reliable i2c communication.

@chuyec
Copy link

chuyec commented Dec 15, 2022

I have this problem too. Hope for a speedy solution

@fpgamaster
Copy link
Author

Hi @chuyec

Try my ugly workaround. I use it for a while :)
The diff against esp-idf v4.4.3:

diff --git a/components/driver/i2c.c b/components/driver/i2c.c
index f4beb3bcf1..d2c9da4448 100644
--- a/components/driver/i2c.c
+++ b/components/driver/i2c.c
@@ -1477,6 +1477,7 @@ esp_err_t i2c_master_cmd_begin(i2c_port_t i2c_num, i2c_cmd_handle_t cmd_handle,
     i2c_cmd_evt_t evt;
     while (1) {
         TickType_t wait_time = xTaskGetTickCount();
+#if 0
         if (wait_time - ticks_start > ticks_to_wait) { // out of time
             wait_time = I2C_CMD_ALIVE_INTERVAL_TICK;
         } else {
@@ -1485,6 +1486,9 @@ esp_err_t i2c_master_cmd_begin(i2c_port_t i2c_num, i2c_cmd_handle_t cmd_handle,
                 wait_time = I2C_CMD_ALIVE_INTERVAL_TICK;
             }
         }
+#else
+       wait_time = ticks_to_wait;
+#endif
         // In master mode, since we don't have an interrupt to detective bus error or FSM state, what we do here is to make
         // sure the interrupt mechanism for master mode is still working.
         // If the command sending is not finished and there is no interrupt any more, the bus is probably dead caused by external noise.

Best regards

@chuyec
Copy link

chuyec commented Dec 16, 2022

@fpgamaster Thanks a lot for your suggestion!

I have already applied this fix from @KlausPopp: https://github.com/ci4rail/esp-idf/pull/5/files
It seems to have helped.

@chuyec
Copy link

chuyec commented Dec 16, 2022

@KlaussPopp Would you be interested in proposing this as a PR to https://github.com/espressif/esp-idf ?

Hi @dhalbert!

You made a mistake in mentioning the author. Of course, if he hasn't changed his nickname.🤔 He may not have seen your question.
The correct mention is @KlausPopp
🙂

Best regards

@AxelLin
Copy link
Contributor

AxelLin commented Dec 17, 2022

Hi @chuyec

Try my ugly workaround. I use it for a while :) The diff against esp-idf v4.4.3:

diff --git a/components/driver/i2c.c b/components/driver/i2c.c
index f4beb3bcf1..d2c9da4448 100644
--- a/components/driver/i2c.c
+++ b/components/driver/i2c.c
@@ -1477,6 +1477,7 @@ esp_err_t i2c_master_cmd_begin(i2c_port_t i2c_num, i2c_cmd_handle_t cmd_handle,
     i2c_cmd_evt_t evt;
     while (1) {
         TickType_t wait_time = xTaskGetTickCount();
+#if 0
         if (wait_time - ticks_start > ticks_to_wait) { // out of time
             wait_time = I2C_CMD_ALIVE_INTERVAL_TICK;
         } else {
@@ -1485,6 +1486,9 @@ esp_err_t i2c_master_cmd_begin(i2c_port_t i2c_num, i2c_cmd_handle_t cmd_handle,
                 wait_time = I2C_CMD_ALIVE_INTERVAL_TICK;
             }
         }
+#else
+       wait_time = ticks_to_wait;
+#endif
         // In master mode, since we don't have an interrupt to detective bus error or FSM state, what we do here is to make
         // sure the interrupt mechanism for master mode is still working.
         // If the command sending is not finished and there is no interrupt any more, the bus is probably dead caused by external noise.

Best regards

@o-marshmallow @mythbuster5
Seems this issue was reported by multiple users but there is no response from espressif.
Can you take a look at this issue? Or redirect this to someone who is working on I2C driver?

hennejg added a commit to pulsatrix-emobility/esp-idf that referenced this issue Dec 29, 2022
@tracestep
Copy link

I am having the same issue... the thing is I am using an environment in which I can't simply patch the idf library. Espressif, please fix this.

@not-my-real-name
Copy link

This is still an issue in 4.4.5. A quick glance at the master branch makes me think it will still be an issue there too.

I2C_CMD_ALIVE_INTERVAL_TICK is internal to i2c.c and is only used for the modbus master in one place. It looks like all it is doing is making sure the i2c event queue doesn't wait less than a minimum amount of time. It looks like this has been done because there is a generic event in the interrupt saying some sort of master activity has occured. If it doesn't have some sort of activity it does a low level reset on the i2c state machine. I don't see why it has to be so long though? If you know your I2C is slow, or going to clock stretch a long time, you would have to make your total timeout for that command longer than the maximum clock stretch period you can expect anyway? If you are going fast you could quite reasonably consider the i2c hardware dead enough to need a reset after much less than 1 second.

I might be missing something, but it looks reducing I2C_CMD_ALIVE_INTERVAL_TICK to a lower value would work fine? Even 50ms seems generous? I have only done about 5 minutes testing with 50ms but it seems to work ok so far?

@AxelLin
Copy link
Contributor

AxelLin commented Nov 9, 2023

@tracestep it’s been years that I’ve been waiting for i2c to work reliably with anything @espressif makes.

@igrr This issue was reported by 2020 and people keep reporting similar issue. Any one can take a look at the issue?

Hi @igrr
The issue status becomes "Reviewing" since Aug 30 but I don't find anyone assigned to this issue.
Is it possible to post the fix here for testing?

@AxelLin
Copy link
Contributor

AxelLin commented Nov 10, 2023

The same issue: #12079 #8718

@AxelLin
Copy link
Contributor

AxelLin commented Nov 16, 2023

#11605

@konwektor
Copy link

konwektor commented Nov 18, 2023

go this same bug. Transmission over i2c between esp32 and arduino leonardo for bluetooth controllers support. During gameplay stops reacting fo 1 sec. i dont have this problem with xbox one s/x series x/s gamnepads but all others supported by blueretro, from time to time doesnt react 1 sec.

#ogx360_i2c.c

@qqqlab
Copy link

qqqlab commented Dec 4, 2023

I ran into the same problem: IMU sensor hanging for 1 second a couple times per minute...

As a workaround I've created a fast bit-bang I2C lib, maybe it is of use for somebody else: https://github.com/qqqlab/ESP32_SoftWire

@dhalbert
Copy link

dhalbert commented Dec 4, 2023

We upgraded to ESP-IDF 5.1 and I2C issues that we had been seeing with clock-stretching in 4.4.x went away (e.g. with LC709203F). We are still seeing problems with BNO055 and related chips, but these peripherals do not obey the I2C protocol spec and don't work well.on several other chip families as well.

Note also that the upcoming 5.2 has redone the I2C driver, at least from the API point of view:
https://docs.espressif.com/projects/esp-idf/en/latest/esp32/migration-guides/release-5.x/5.2/peripherals.html

@not-my-real-name
Copy link

I had a look at the new API a little while ago and was disappointed to see it doesn't look like you can do partial transactions like the old one lets you do.

We have an I2C slave where the request changes part way through depending on the reply of the slave. The read length is not known/fixed at compile time.

E.g. Slave replies with a number of bytes to follow. On the current cmd list system I send two separate parts. One from start to the length byte, and then another to read the payload and send the stop etc.

In our case we could do a repeat start i guess at the expense of a little speed, but the new API, while simple, is more abstracted than the old one. Actually, looking at it I don't think you can do that either. You can only do a repeat start on a single transmit/receive operation...

The old API is a bit long winded, but it does give a lot more control.

I'm having good luck so far with just using I2C_CMD_ALIVE_INTERVAL_TICK at 50ms by the way.

@konwektor
Copy link

go this same bug. Transmission over i2c between esp32 and arduino leonardo for bluetooth controllers support. During gameplay stops reacting fo 1 sec. i dont have this problem with xbox one s/x series x/s gamnepads but all others supported by blueretro, from time to time doesnt react 1 sec.

#ogx360_i2c.c

Small update; I think problem is triggered when Bluetooth classic is used in the same time with I2c.
Mine Esp32 is working as a master <=> level (5V-3,3V)shifter<=> arduino pro micro (leonardo version 5V 16Mhz).
Gamepad (BLE, or classic) <=> Esp32 master I2C <=> Leonardo I2C Slave <=> USB host (Xbox)
In my case I dont have 1sec. timeout when im using Gamepads from Xbox One S/X or Xbox Series X/S.
All those are working after software update some time ago only in BLE mode..

Mine PS4 Gamepad, and Wii U are BT classic only. When I using those, timeout is triggered.

@AxelLin
Copy link
Contributor

AxelLin commented Dec 16, 2023

@tracestep it’s been years that I’ve been waiting for i2c to work reliably with anything @espressif makes.

@igrr This issue was reported by 2020 and people keep reporting similar issue. Any one can take a look at the issue?

Hi @igrr The issue status becomes "Reviewing" since Aug 30 but I don't find anyone assigned to this issue.

Now I'm wondering if this will be fixed in 2023?

@timoxd7
Copy link
Contributor

timoxd7 commented Feb 22, 2024

@tracestep it’s been years that I’ve been waiting for i2c to work reliably with anything @espressif makes.

@igrr This issue was reported by 2020 and people keep reporting similar issue. Any one can take a look at the issue?

Hi @igrr The issue status becomes "Reviewing" since Aug 30 but I don't find anyone assigned to this issue.

Now I'm wondering if this will be fixed in 2023?

@AxelLin Now its 2024, still waiting. Need the patch too... #12079

holdersn pushed a commit to pulsatrix-emobility/esp-idf that referenced this issue Mar 6, 2024
@AxelLin
Copy link
Contributor

AxelLin commented Mar 11, 2024

@tracestep it’s been years that I’ve been waiting for i2c to work reliably with anything @espressif makes.

@igrr This issue was reported by 2020 and people keep reporting similar issue. Any one can take a look at the issue?

Hi @igrr The issue status becomes "Reviewing" since Aug 30 but I don't find anyone assigned to this issue.

Now I'm wondering if this will be fixed in 2023?

@AxelLin Now its 2024, still waiting. Need the patch too... #12079

I'm pretty sure the espressif's developers know current behavior and this issue.
Just no idea why there is no response from espressif.
@igrr

@simondddd
Copy link

I thing this commit is where it gets wrong d913fff#diff-380dc90b7523e5ecbb9c3fb9799a4c3db81ef22a126b3ffac560c7750ab72258L1142-R1151

I insist, I think in line 1148 should be >, not <

@dhalbert
Copy link

@simonddd that was my conclusion above as well

@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: NA Issue resolution is unavailable and removed Status: Reviewing Issue is being reviewed labels Mar 15, 2024
holdersn pushed a commit to pulsatrix-emobility/esp-idf that referenced this issue Mar 28, 2024
espressif-bot pushed a commit that referenced this issue Apr 17, 2024
Fixes #4999

Former usage of I2C_CMD_ALIVE_INTERVAL_TICK macro overrode the ticks_to_wait
parameter when the latter was too big
@AxelLin
Copy link
Contributor

AxelLin commented Apr 21, 2024

@o-marshmallow
still needs to backport the fix for v4.4, v5.0 and v5.2.

espressif-bot pushed a commit that referenced this issue Apr 26, 2024
Fixes #4999

Former usage of I2C_CMD_ALIVE_INTERVAL_TICK macro overrode the ticks_to_wait
parameter when the latter was too big
espressif-bot pushed a commit that referenced this issue Apr 26, 2024
Fixes #4999

Former usage of I2C_CMD_ALIVE_INTERVAL_TICK macro overrode the ticks_to_wait
parameter when the latter was too big
espressif-bot pushed a commit that referenced this issue Jul 9, 2024
Fixes #4999

Former usage of I2C_CMD_ALIVE_INTERVAL_TICK macro overrode the ticks_to_wait
parameter when the latter was too big
andreichalapco pushed a commit to pulsatrix-emobility/esp-idf that referenced this issue Aug 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: NA Issue resolution is unavailable Status: Done Issue is done internally
Projects
None yet
Development

No branches or pull requests