Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I2C reads from touchscreen fail (clock stretch issues, gets SCL stuck high on client?) #727

Closed
spapadim opened this issue Aug 28, 2015 · 4 comments

Comments

@spapadim
Copy link

spapadim commented Aug 28, 2015

I have a Digole LCD + touchscreen module (this one, protocol description here), using it in I2C mode (7-bit address of 0x29). I2C writes work fine, but reads get the bus stuck. Everything works fine on an AVR. Display draws ~60mA (according to my bench supply), so power should not be an issue (my ESP is a Sparkfun Thing -- also tried an Adafruit breakout very briefly).

Everything here is from the Sparkfun Thing (for ESP) and RedBoard (for AVR). Gist with logic analyzer captures (readable with the OLS client) is: https://gist.github.com/6f17e71df82975ee9fdf

The example I'm going with here is very basic, to illustrate issue: an infinite loop that does a non-blocking read of an X,Y touch coordinate. This is done as follows: master writes the string RPNXYI, then issues two 2-byte reads, which correspond to 16bit ints for X and Y, respectively (MSB-first). If no finger is touching, then at least one coordinate returned will be out-of-screen-bounds.

A couple of issues:

1 - Starting from the AVR, which always works (running the exact same Arduino code), I found out (after some quick exchanges with Digole's support) that the display stretches the clock while waiting for a touchscreen ADC conversion. This can take up to 2ms:
touch-avr-nodelay-zoom
Didn't take long to realize that the software I2C implementation can't handle such a long stretch. However, the 100us limit is well below the WDT timeout -- and even then, if the bus is never released, the device is hosed anyway, so it might as well reboot. But that's a secondary point.

2 - Luckily, it turns out that it's the RPNXYI write that triggers the ADC conversion, so simple solution: delay(5) between writing that string and issuing the I2C read request. Here is what it looks like on the AVR:
touch-avr-delay5-zoom
Problem solved, I thought.

3 - Taking this code with delay(5) and recompiling for ESP, I get this instead:
touch-esp-delay5-zoom
As far as I can tell, after ACKing the read request address byte (0x4f), the display module never pulls SCL down ever again. Just gets stuck. So I always read back 0xff 0xff from the first read, and then it's NACKs forever. [Notes: (1) in case you notice clock "period": I set twi_dcount to 56, for ~50KHz reduced clock, as part of earlier debugging; (2) if you see the raw OLS captures, pls ignore initial commands to clear screen (CL), set display orientation (SD3), and set color (ESCxxx), -- these are part of setup(), and here I had to start capturing from beginning, since bus gets stuck almost immediately. ]

For the life of me, I can't see what the difference is between these two captures (then again, I've been staring at these and more for more than a day, so... ;) ).

FWIW, I also tried upping #define TWI_CLOCK_STRETCH to 6400000 [sic] from 800, that didn't do anything for the stretch. In fact, the clock stretch seems to be completely ignored by the I2C implementation on the ESP (it doesn't even try to delay for 100us or something, just goes ahead? -- unless my eyes are really crossed, which is likely ;) ). In fact, if I try to run the code without delay on the ESP, the capture looks pretty much the same as the third case (the ESP master pulls up SCL as if nothing happened).

PS. On the stretch issue, the same Digole module also implements a blocking read for X, Y (RPNXYW). Didn't pursue this as I don't like blocking (I need to update UI based on touch-down, drag, and touch-up events, so I wrote code for that, which works on the AVR -- this example is stripped from there), so not sure how it works, but I suspect that it blocks by stretching the clock... that might be a real issue with the WDT on the ESP, but I would probably argue it's poor API design on the display device.

Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.

@spapadim
Copy link
Author

Forgot to say that I noticed #698. Haven't tried that yet (will at some point), but I don't think it's related since SCL is never pulled down by client after the first read request -- so problem happens earlier than the NACK (but where?..).

@spapadim
Copy link
Author

For ease of reference, here are the relevant I2C messages (from 2nd and 3rd capture above) -- also in the gist.

Same code, running on AVR (works fine):
touch-avr-delay5-i2c

and on ESP (gets bus stuck):
touch-esp-delay5-i2c

I just realized that I wasn't touching my finger, so even when the bus works correctly, the reads return high values (eg > 0xff00), which isn't great for illustration. However, on ESP slave always returns exactly 0xff 0xff (ie SDA always high) on the first read, then gets stuck.

@spapadim spapadim changed the title I2C reads from touchscreen fail (clock stretch issues, SCL stuck high on client?) I2C reads from touchscreen fail (clock stretch issues, gets SCL stuck high on client?) Aug 28, 2015
@spapadim
Copy link
Author

In the interest of sanity, I'll just switch the display over to UART mode and give up. Before doing that, a final comment to "archive" all the observations I made.

TL;DR: The master can pull SCL low, but the slave cannot (current best guess of what's happening). Just changing SCL pin makes (small) difference. If anyone can think of causes, I'd be happy to revisit this issue.

I may be going bonkers, but upon closer inspection I noticed that there are still differences in the clock between the AVR and the ESP: (i) there is still a 2-3us stretch on the first bit read on the AVR; (ii) not sure if relevant, but on the ESP the last bit read/written the SCL HIGH is -2us longer.

After giving up on twi_read_bit() and clock stretch there (seems fine!), I thought that maybe the slave can't electrically pull down SCL. So, I swapped pins: the Thing uses SDA=2 and SCL=14. Moved to SDA=0 and SCL=4. Well, now I got a few reads complete successfully, but still the bus got stuck eventually. Don't know if it matters, but 14 is also the HW SPI CLK and my program links in the SPI library (but never calls it). Just to make sure it wasn't static initialization issues, I wrote a standalone program that only uses Wire.

One other difference: when SCL=14, whenever I try to flash the ESP, espcomm_sync fails unless I pull the display module's ground. When I move to SCL==4, I can succesfully flash the ESP without having to disconnect the display. So something is being shorted somewhere/somehow?

For the record, my wire setup is: a self-crimped daisy-chain with ~2in long sections (total length ~4.5in). ESP and module connected to ends of daisy chain, and various pullups on a breadboard connected through the middle daisychain socket and short wires. Currently using 4k7 external pullups, but tried various values from 10k down to 2k2. For the record, 2k2 pullups make the AVR have occasional bus-errors, not sure why (2k2 should be within Rmin/Rmax specs).

Edit: BTW, just for completeness (not sure if relevant, not an expert, but at this point not sure what might be), the board looks like this throughout (apologies for poor quality, this is iPad over a loupe :):
digole-pcb
I've never seen this kind of solder "fuzz" (then I'm no electronics expert, learning as necessary, not even sure what it's called :). Is it normal? Is it possible it's causing any of this weirdness? And if so, why only on the ESP, and not the AVR (5V)?

For the record, UART does not work either. It does on the AVR (as usual). Which is weird, because I've been using the UART all week just fine, for debugging printf()s, uploading code, etc. So, I'm totally ditching this display (at least for now).

@devyte
Copy link
Collaborator

devyte commented Oct 20, 2017

Thanks for the detailed analysis, I expect it will help somebody.
There is now a method for clock stretching in latest git.
Closing per #3655 .

@devyte devyte closed this as completed Oct 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants