AWS_IOT_ERROR ('Error Connection to AWS IoT: ', MQTTException('Repeated connect failures',)) #23

Closed
mlnrt opened this issue May 5, 2023 · 31 comments · Fixed by adafruit/Adafruit_CircuitPython_MiniMQTT#218

Comments

@mlnrt

mlnrt commented May 5, 2023

Hello,
Five months ago, I built a demo based on a modified version of the code from the PyPortal IoT Plant Monitor with AWS IoT and CircuitPython.
Everything was working fine.
I am now trying to recreate this demo, and the Adafruit PyPortal can no longer connect to AWS IoT. Even when I run the unmodified PyPortal IoT Plant Monitor with AWS IoT and CircuitPython demo instead of my modified one, it fails with the same error:

File "adafruit_aws_iot.py", line 145, in connect
AWS_IOT_ERROR: ('Error connecting to AWS IoT: ', MMQTTException('Repeated connect failures',))

Could it be because of this issue: AWS IoT SDK Python V2 Support?

Thank you in advance

@jersu11
Contributor

jersu11 commented May 7, 2024

For what it is worth, I'm having the exact same problem, exactly one year after this issue was created. I'm following the same demo code and getting the same 'Repeated connect failures' error, which is how I found this issue. Did you have any luck fixing this?

@justmobilize
Collaborator

I would suggest passing in a logger and seeing if you get any more helpful errors
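A minimal sketch of what passing in a logger could look like, assuming the MiniMQTT client's `enable_logger` method and the `adafruit_logging` package. The helper name `attach_debug_logger` is hypothetical, and the fallback to the standard `logging` module is only there so the snippet can be tried off-device.

```python
try:
    import adafruit_logging as logging  # CircuitPython
except ImportError:
    import logging  # desktop Python fallback, only for trying the sketch


def attach_debug_logger(client):
    """Turn on DEBUG-level logging for a MiniMQTT-style client.

    `client` is assumed to expose enable_logger(log_pkg, log_level),
    as the MiniMQTT client does.
    """
    client.enable_logger(logging, logging.DEBUG)
    return client
```

Call it on the MiniMQTT client before handing the client to the AWS IoT wrapper, so the connect attempt itself is logged.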

@justmobilize
Collaborator

@mlnrt and @jersu11 would you be willing to try this version of MiniMQTT? It's possible that you are getting auth errors that weren't being passed down correctly

@mlnrt
Author

mlnrt commented May 15, 2024

@justmobilize I should be able to test in a few days

@justmobilize
Collaborator

> @justmobilize I should be able to test in a few days

Awesome!

@jersu11
Contributor

jersu11 commented May 17, 2024

Hi, I've had a chance to try the new library tonight and I'm seeing what looks like the same error. This is with the PyPortal (M4 + ESP32). I mention that only because I've tried to see whether connecting to WiFi via the connection manager behaves differently when the ESP32 is a coprocessor. I don't think that's the issue. Here's the log output I get after boiling the code down to about the bare minimum:

code.py output:
Connecting to WiFi...
Connected!
Attempting to connect to asxxxxxxoy.iot.us-west-2.amazonaws.com
676.379: DEBUG - Attempting to connect to MQTT broker (attempt #0)
676.381: DEBUG - Attempting to establish MQTT connection...
676.620: DEBUG - Sending CONNECT to broker...
676.622: DEBUG - Fixed Header: bytearray(b'\x10(')
676.625: DEBUG - Variable Header: bytearray(b'\x00\x04MQTT\x04\x02\x00<')
676.864: DEBUG - Receiving CONNACK packet from broker
686.886: INFO - MMQT error: No data received from broker for 10 seconds.
686.888: DEBUG - Reconnect timeout computed to 2.00
686.890: DEBUG - adding jitter 0.50 to 2.00 seconds
686.893: DEBUG - Attempting to connect to MQTT broker (attempt #1)
686.894: DEBUG - Attempting to establish MQTT connection...
686.897: DEBUG - Sleeping for 2.5 seconds due to connect back-off
689.402: WARNING - Socket error when connecting: Socket already connected to mqtt://asxxxxxxxxoy.iot.us-west-2.amazonaws.com:8443
689.404: DEBUG - Resetting reconnect backoff
689.407: DEBUG - Attempting to connect to MQTT broker (attempt #0)
689.408: DEBUG - Attempting to establish MQTT connection...
689.607: WARNING - Socket error when connecting: Socket already connected to mqtt://asxxxxxxxxxoy.iot.us-west-2.amazonaws.com:8443
689.609: DEBUG - Resetting reconnect backoff
689.612: DEBUG - Attempting to connect to MQTT broker (attempt #0)
689.613: DEBUG - Attempting to establish MQTT connection...
689.617: WARNING - Socket error when connecting: Socket already connected to mqtt://asxxxxxxxxxxoy.iot.us-west-2.amazonaws.com:8443
689.822: DEBUG - Resetting reconnect backoff
689.825: DEBUG - Attempting to connect to MQTT broker (attempt #0)
689.827: DEBUG - Attempting to establish MQTT connection...
689.831: WARNING - Socket error when connecting: Socket already connected to mqtt://asxxxxxxxxxxoy.iot.us-west-2.amazonaws.com:8443
Traceback (most recent call last):
  File "code.py", line 160, in <module>
  File "adafruit_aws_iot.py", line 145, in connect
AWS_IOT_ERROR: ('Error connecting to AWS IoT: ', MMQTTException('Repeated connect failures',))
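For anyone reading the log, the two header lines can be decoded by hand. The sketch below does this for the exact bytes logged above, following the MQTT 3.1.1 CONNECT layout. The function name `decode_connect_headers` is made up for illustration, and it assumes the remaining length fits in a single byte (true here, since it is below 128).

```python
def decode_connect_headers(fixed_header, variable_header):
    """Decode the fixed and variable headers of an MQTT 3.1.1 CONNECT packet."""
    packet_type = fixed_header[0] >> 4      # high nibble: 1 means CONNECT
    remaining_length = fixed_header[1]      # single-byte form, valid below 128

    name_len = int.from_bytes(variable_header[0:2], "big")
    pos = 2
    protocol_name = bytes(variable_header[pos:pos + name_len]).decode("ascii")
    pos += name_len
    protocol_level = variable_header[pos]   # 4 means MQTT 3.1.1
    connect_flags = variable_header[pos + 1]
    keep_alive = int.from_bytes(variable_header[pos + 2:pos + 4], "big")

    return {
        "packet_type": packet_type,
        "remaining_length": remaining_length,
        "protocol_name": protocol_name,
        "protocol_level": protocol_level,
        "clean_session": bool(connect_flags & 0x02),
        "keep_alive": keep_alive,
    }


decoded = decode_connect_headers(
    bytearray(b"\x10("),
    bytearray(b"\x00\x04MQTT\x04\x02\x00<"),
)
print(decoded)
```

So the client sends a well-formed CONNECT (MQTT 3.1.1, clean session, 60-second keep-alive), which suggests the problem lies in what happens after the packet leaves the device rather than in the packet itself.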

@justmobilize
Collaborator

@jersu11 looks like after a timeout, it doesn't fully close out the socket. I will figure out how to reproduce this...
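To illustrate the failure mode, here is a toy model (not the library code) of a socket pool that allows one socket per host and port. `ToySocketPool` and `connect_with_retries` are hypothetical names. If a failed handshake does not release its socket, every later attempt fails with "Socket already connected", which matches the pattern in the log above.

```python
class ToySocketPool:
    """Stand-in for a connection manager that tracks one socket per (host, port)."""

    def __init__(self):
        self._open = set()

    def get_socket(self, host, port):
        key = (host, port)
        if key in self._open:
            raise RuntimeError("Socket already connected to {}:{}".format(host, port))
        self._open.add(key)
        return key

    def close_socket(self, key):
        self._open.discard(key)


def connect_with_retries(pool, host, port, handshake, attempts=3, release_on_failure=True):
    """Try to connect a few times; return (socket_or_None, list_of_error_messages)."""
    errors = []
    for _ in range(attempts):
        try:
            sock = pool.get_socket(host, port)
        except RuntimeError as exc:
            errors.append(str(exc))
            continue
        try:
            handshake(sock)
            return sock, errors
        except OSError as exc:
            errors.append(str(exc))
            if release_on_failure:
                pool.close_socket(sock)  # free the slot so the next attempt can reuse it
    return None, errors
```

With `release_on_failure=False`, one timed-out handshake is enough to make all remaining attempts fail; with `release_on_failure=True`, the second attempt gets a fresh socket.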

@jersu11
Contributor

jersu11 commented May 17, 2024

Thanks @justmobilize - let me know if I can help. I'm happy to share my code or potentially the IoT thing certs for debugging. I did spend some time digging around documentation and hacking away at the libraries. Because it seems the CONNACK message never comes back, I double checked that the cert/key added via esp.set_certificate(DEVICE_CERT) and esp.set_private_key(DEVICE_KEY) in the code.py file were correct. In the AWS console, I checked that I had attached a permissive AWS Policy to the cert and finally checked again that it was all attached to the thing and that everything was in an active state. I had been playing with the port as well - I just noticed that the log output above is for 8443, which is the HTTPS Publish only port. But, switching back to the proper MQTT Pub/Sub port of 8883 still has the exact same failures.

@justmobilize
Collaborator

@jersu11 a few other small things to try:

  1. Have you updated the firmware on the ESP yet?
  2. When calling MQTT.MQTT.connect, what happens if you pass in a larger value (say 60) to keep_alive?
  3. Have you tried this on a more powerful chip, like an ESP32S3? Or even on your laptop?

@jersu11
Contributor

jersu11 commented May 17, 2024

@justmobilize, good suggestions

For 1: I believe I have. When I print bytes(esp.firmware_version).decode("utf-8") it returns 1.7.7. As far as versions go, I'm also running Adafruit CircuitPython 9.0.4 on 2024-04-16; Adafruit PyPortal with samd51j20.
For 3: Yes, I have been able to connect from my laptop with an MQTT client using the AWS cert/key pair.

I've learned a bit more. I started my project with the example code, such as '/examples/aws_iot_simpletest.py' in this library, which is also used in the Plant Monitor code:

# Set up a new MiniMQTT Client
client = MQTT.MQTT(
    broker=secrets["broker"],
    client_id=secrets["client_id"],
    socket_pool=pool,
    ssl_context=ssl_context,
)

# Initialize AWS IoT MQTT API Client
aws_iot = MQTT_CLIENT(client)

When I ran this code initially, I would get an error WARNING - Socket error when connecting: Timed out waiting for SPI char. I found that by adding port=8883 in the MQTT.MQTT client init, it fixed that error but then returned the time out errors from above.

I had been a little confused about why I needed to be explicit about the port, since within the MQTT library, MQTT_TLS_PORT should have already been set when 'is_ssl' is True. It occurred to me today that I should set is_ssl=True during the client init, and good news, that one change got it to work! At least for a moment ..
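One way to picture the three outcomes described above is the simplified model below. It is an assumption about the behaviour, not the library's actual code: `resolve_connection` is a hypothetical helper, and the constants are the standard MQTT ports (1883 for plain TCP, 8883 for TLS). In this model `is_ssl` drives both the default port and whether the socket is wrapped in TLS, so setting only the port leaves a plain TCP connection pointed at a TLS port.

```python
MQTT_TCP_PORT = 1883  # standard unencrypted MQTT port
MQTT_TLS_PORT = 8883  # standard MQTT-over-TLS port


def resolve_connection(port=None, is_ssl=False):
    """Return (port, use_tls) under the simplified model described above."""
    if port is None:
        port = MQTT_TLS_PORT if is_ssl else MQTT_TCP_PORT
    return port, is_ssl


# Example code as written: no port, no is_ssl -> plain TCP on 1883.
print(resolve_connection())
# Adding only port=8883 -> still no TLS, but aimed at the TLS port.
print(resolve_connection(port=8883))
# Adding is_ssl=True -> TLS on 8883, the combination that connected.
print(resolve_connection(is_ssl=True))
```

Under this reading, a broker that only speaks TLS on 8883 would never answer a plain-text CONNECT, which would explain the missing CONNACK when only the port was set.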

It feels like we're a lot closer to finding a solution. However, the code still lands on the same MMQTTException of 'Repeated connect failures'. I've run this with both the latest release of the adafruit_minimqtt library and a second time with your recent changes/commit to adafruit_minimqtt.py. Same result for both.

code.py output:
nina-fw version: 1.7.7
Connecting to WiFi...
Connected!
Attempting to connect to asxxxxxxxxxhoy-ats.iot.us-west-2.amazonaws.com
5945.189: DEBUG - Attempting to connect to MQTT broker (attempt #0)
5945.191: DEBUG - Attempting to establish MQTT connection...
5952.613: DEBUG - Sending CONNECT to broker...
5952.615: DEBUG - Fixed Header: bytearray(b'\x10(')
5952.619: DEBUG - Variable Header: bytearray(b'\x00\x04MQTT\x04\x02\x00<')
5952.857: DEBUG - Receiving CONNACK packet from broker
5953.078: DEBUG - Got message type: 0x20 pkt: 0x20
Connected to MQTT Broker!
Flags: 0
 RC: 0
Subscribing to topic circuitpython/aws
5953.084: DEBUG - Sending SUBSCRIBE to broker...
5953.088: DEBUG - Fixed Header: bytearray(b'\x82\x16')
5953.297: DEBUG - Variable Header: b'\x00\x01'
5953.504: DEBUG - SUBSCRIBING to topic circuitpython/aws with QoS 1
5953.508: DEBUG - payload: b'\x00\x11circuitpython/aws\x01'
5953.721: DEBUG - Got message type: 0x90 pkt: 0x90
Subscribed to circuitpython/aws with QOS level 1
5953.932: DEBUG - Sending PUBLISH
Topic: circuitpython/aws
Msg: b'{"message": "Hello from AWS IoT CircuitPython"}'                            
QoS: 1
Retain? False
5954.164: WARNING - Socket error when connecting: pystack exhausted
5954.166: DEBUG - Resetting reconnect backoff
5954.168: DEBUG - Attempting to connect to MQTT broker (attempt #0)
5954.170: DEBUG - Attempting to establish MQTT connection...
5954.174: WARNING - Socket error when connecting: Socket already connected to mqtt://asxxxxxxxxxhoy-ats.iot.us-west-2.amazonaws.com:8883
5954.176: DEBUG - Resetting reconnect backoff
5954.178: DEBUG - Attempting to connect to MQTT broker (attempt #0)
5954.377: DEBUG - Attempting to establish MQTT connection...
5954.381: WARNING - Socket error when connecting: Socket already connected to mqtt://asxxxxxxxxxhoy-ats.iot.us-west-2.amazonaws.com:8883
5954.383: DEBUG - Resetting reconnect backoff
5954.385: DEBUG - Attempting to connect to MQTT broker (attempt #0)
5954.387: DEBUG - Attempting to establish MQTT connection...
5954.391: WARNING - Socket error when connecting: Socket already connected to mqtt://asxxxxxxxxxhoy-ats.iot.us-west-2.amazonaws.com:8883
5954.590: DEBUG - Resetting reconnect backoff
5954.592: DEBUG - Attempting to connect to MQTT broker (attempt #0)
5954.594: DEBUG - Attempting to establish MQTT connection...
5954.598: WARNING - Socket error when connecting: Socket already connected to mqtt://asxxxxxxxxxhoy-ats.iot.us-west-2.amazonaws.com:8883
Traceback (most recent call last):
  File "code.py", line 147, in <module>
  File "adafruit_aws_iot.py", line 145, in connect
AWS_IOT_ERROR: ('Error connecting to AWS IoT: ', MMQTTException('Repeated connect failures',))

@justmobilize
Collaborator

Oh awesome. And with that error, we can fix that.

For this: 5954.164: WARNING - Socket error when connecting: pystack exhausted, check out this setting. Set:

CIRCUITPY_PYSTACK_SIZE=3072

in your settings.toml, and you should be good!
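To confirm that the entry was picked up, a small check along these lines may help. It is a sketch that assumes CircuitPython's `os.getenv`, which reads values from settings.toml; `pystack_size_setting` is a hypothetical helper name. It only reports what is configured, not how much stack is actually in use.

```python
import os


def pystack_size_setting(default=None):
    """Return CIRCUITPY_PYSTACK_SIZE as an int, or `default` if it is not set."""
    raw = os.getenv("CIRCUITPY_PYSTACK_SIZE")
    if raw is None:
        return default
    return int(raw)  # works whether the value comes back as an int or a string


print("CIRCUITPY_PYSTACK_SIZE:", pystack_size_setting())
```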

And I'll go look at the docs. Given how MiniMQTT and SSL have evolved, some of the documentation needs help...

@jersu11
Contributor

jersu11 commented May 17, 2024

Woohoo! That pystack setting did the trick. Thanks for your commitment to this issue the last few days. I really appreciate that.

I noticed a couple other things, now that it's executing code in this library (Adafruit_CircuitPython_AWS_IOT). There's probably been a little drift between this lib and the MQTT lib. This library calls a client.loop_forever() method which doesn't exist in MQTT, and the call to client.loop() inherits the default MQTT timeout value of zero, which causes an error.

I've made the changes/fixes in my local copy, both of which were trivial. I'm happy to make a PR, which could also include the addition of the 'is_ssl=True' to the example code.
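As a rough idea of the kind of change involved, the sketch below replaces a `loop_forever()` call with an explicit loop that always passes a non-zero timeout to `client.loop()`. `run_loop` and `max_iterations` are hypothetical names (the latter exists only so the loop can be stopped in a test), and the one-second timeout is an assumed value, not one taken from the library.

```python
def run_loop(client, timeout=1.0, max_iterations=None):
    """Repeatedly call client.loop(timeout); run forever if max_iterations is None."""
    count = 0
    while max_iterations is None or count < max_iterations:
        client.loop(timeout)  # explicit timeout instead of relying on a zero default
        count += 1
    return count
```

On a device this would be called as `run_loop(client)` once the connection and subscriptions are set up.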

@justmobilize
Collaborator

@jersu11 please do, although once you help, you'll become like me and want to help more... ;)

Both this library and the Azure one can need pystack changes. That might also be a good addition to the documentation.

@justmobilize
Collaborator

@jersu11 if you have time, would you be willing to open up 2 issues in MiniMQTT?

  1. pystack error keeps retrying
  2. Not setting is_ssl fails with a timeout and a non-descriptive error

This way I can focus on each and get PRs into the main library.

@dhalbert
Contributor

@tannewt Should we consider raising the default PYSTACK?

@justmobilize
Collaborator

@dhalbert do you know why you would get this exception on some devices but not all?

@dhalbert
Contributor

dhalbert commented May 18, 2024

Which devices have you tested that are fine and which aren't? Different boards have different PYSTACK limits, based on RAM.

@justmobilize
Collaborator

I will let you know. I'm setting stuff up this weekend to test the PR that's being put together.

I'll test both this and AzureIoT to see which have errors and which don't.

I have pretty much all the common boards.

@dhalbert
Contributor

The default CIRCUITPY_PYSTACK_SIZE is 1536 for all boards except one specialized board, and MICROPY_ENABLE_PYSTACK is enabled on all CircuitPython boards, it appears. So I'm surprised it's only failing on some boards. For Espressif boards, see if there's any difference between no PSRAM and some PSRAM. Or maybe this is ESP32SPI vs native wifi.

@justmobilize
Collaborator

Do you know which common ones have no PSRAM?

@dhalbert
Contributor

dhalbert commented May 18, 2024

There are Feather ESP32-S3 boards with 8MB flash and no PSRAM, and others with 4MB flash and 2MB PSRAM.

@justmobilize
Collaborator

Dang it. The one type of board I don't think I have...

@justmobilize
Collaborator

@jersu11 when you have time, can you please share your code? I can't actually get the pystack error on my pyportal (or any device)

@dhalbert
Contributor

Those of you who are having trouble: what version of NINA-FW are you running on the PyPortal ESP32? Try updating it to the latest. New root certificates have been added. This may be a cert issue, but with poor error recovery mixed in.

@jersu11
Contributor

jersu11 commented May 19, 2024

@justmobilize, here's the code that hits the pystack error for me. It's a slightly modified version of examples/aws_iot_simpletest.py. I've tested it a few times: it works when CIRCUITPY_PYSTACK_SIZE=3072 is set in my settings.toml and fails with the pystack exhausted message when that variable is not set. I'm not sure whether the PyPortal hardware has been upgraded over time. My unit is about 5 years old.

Adafruit CircuitPython 9.0.4 on 2024-04-16; Adafruit PyPortal with samd51j20

code.py.txt

@mlnrt
Author

mlnrt commented May 22, 2024

My apologies for not being able to follow up and test; I ended up not having any time at all. Thank you for the follow-up. I'll try the fix.

@mlnrt
Author

mlnrt commented May 27, 2024

It was quite some work to update everything in my old demo, and I am not fully done yet, but after updating everything on my Adafruit PyPortal I can subscribe successfully to my MQTT topic, even without the CIRCUITPY_PYSTACK_SIZE=3072 parameter in the settings.toml file. I still have to update my PyPortal application code to make it fully work, but I think that is a separate problem. Thank you for the help.

@justmobilize
Collaborator

Awesome!

@Shek20

Shek20 commented Jul 5, 2024

Excuse me, I am having a similar issue and I am struggling to follow what the solution is here. Can you please explain what I need to do to solve it? @justmobilize @mlnrt

@mlnrt
Author

mlnrt commented Jul 8, 2024

@Shek20 I just updated to CircuitPython 9 and updated the ESP32 and nina firmware to the latest versions.

@Shek20

Shek20 commented Jul 12, 2024

Thank you! Got it. I was having another issue, related to the SSL configuration.
