CoreConnectTimeoutError: Could not connect with the Tribler Core within 120 seconds: ConnectionRefusedError (code 1) #7137
Comments
Looking at the last core output, it seems the core is running. Looking at the error code
For the first case, it can happen if the retry_port is enabled and the core starts on another port within the retry limit (which is 10 by default). Why the core is not able to bind to the allocated port is something to be investigated. To confirm this suspicion, we could enrich the sentry report to include:
Further, as a temporary, ugly fix on the GUI side, the GUI could try to connect to the core on the next 10 ports if it fails to connect within the first 2 minutes. If that also fails, it would fail just as it does now. |
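A minimal sketch of that fallback, assuming the core listens on TCP on localhost. The function name `find_next_core_port` and its parameters are hypothetical and not part of Tribler; a successful TCP connect only shows that something is listening on the port, not that it is the core.

```python
import socket


def find_next_core_port(host, base_port, retries=10, timeout=1.0):
    """Probe base_port+1 .. base_port+retries and return the first port
    that accepts a TCP connection, or None if none of them does."""
    for port in range(base_port + 1, base_port + retries + 1):
        try:
            # create_connection raises OSError (e.g. ConnectionRefusedError
            # or a timeout) when nothing is accepting on the port.
            with socket.create_connection((host, port), timeout=timeout):
                return port
        except OSError:
            continue
    return None
```

The GUI would call this only after the original timeout has expired, and would still raise the existing error when the result is `None`.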
It seems possible to get the connection ports of a process with a given pid, so the actual core port can be obtained. This port can then be compared with the port set by the GUI to check for a difference, and the request manager can be updated accordingly. This should likely fix the issue. |
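One way this could look, assuming the third-party `psutil` package is available. The helper names `listening_ports` and `resolve_core_port` are hypothetical, and picking the lowest listening port above the expected one is an assumption based on the retry behaviour described earlier, not Tribler's actual logic.

```python
def listening_ports(pid):
    """Return the sorted TCP ports on which process `pid` is listening."""
    import psutil  # third-party; imported lazily so the helper below works without it

    proc = psutil.Process(pid)
    # psutil 6.0 renamed Process.connections() to Process.net_connections().
    getter = getattr(proc, "net_connections", None) or proc.connections
    return sorted(
        {c.laddr.port for c in getter(kind="inet") if c.status == psutil.CONN_LISTEN}
    )


def resolve_core_port(expected_port, ports, retries=10):
    """Pick the port the GUI should use, given the ports the core listens on.

    Returns expected_port if the core listens on it, otherwise the lowest
    listening port within the retry window above it, otherwise None.
    """
    if expected_port in ports:
        return expected_port
    candidates = [p for p in ports if expected_port < p <= expected_port + retries]
    return min(candidates) if candidates else None
```

The GUI could call `resolve_core_port(gui_port, listening_ports(core_pid))` and, if the result differs from `gui_port`, point the request manager at the returned port.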
Related to #7065 |
This |
The issue is still present: Sentry Issue #1352, occurring in the most recent 7.13.1. |
We can start using a "shift-right testing" approach by adding a new checkbox to the error report dialog, "next time, gather detailed information about the error". This way, the user can allow Tribler to gather and send an extended error report to Sentry with debugging information enabled. This is an opt-in approach when a user explicitly agrees to send detailed information to developers, and some users who experience the bug can be motivated enough to enable sending detailed information about the error. |
Investigating this issue further, there are no new reports of it on 7.13.1 for Windows. However, there are reports on macOS and Linux (both the Debian and Flatpak versions). I suspect the filelock mechanism used to check for an existing process. |
Investigating the logs further, I found multiple core processes in the running state (no exit code or finished timestamp). I was able to reproduce the issue in the Flatpak environment. There was a previous attempt at fixing the CoreConnectTimeoutError using FileLock. This works well in the normal scenario, preventing two instances of Tribler from running at the same time. However, if the core process is terminated for any reason without a clean exit, its entry in the process database is left in an inconsistent state. This causes a new GUI instance to wrongly select the previously terminated core process and try to connect to the core port of that process. Eventually, the GUI fails to connect after the timeout period. Under the assumption that the GUI process runs first and spawns the core, the entries in the process database should be sequential: the GUI rowid first, then the core rowid. If the GUI process, when looking up the core process, also checks that the core process rowid is higher than the GUI rowid, the correct core process will be returned and the connection will succeed. This is proposed in PR #7915. |
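The rowid check can be illustrated with a small SQLite sketch. The `processes` table, its columns, and the function names are hypothetical stand-ins for illustration; they are not the schema or code from PR #7915.

```python
import sqlite3


def create_demo_db():
    """Create an in-memory stand-in for the process database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE processes ("
        "kind TEXT NOT NULL, pid INTEGER NOT NULL, "
        "api_port INTEGER, finished_at INTEGER)"
    )
    return conn


def add_process(conn, kind, pid, api_port=None):
    """Insert a process entry and return its rowid."""
    cur = conn.execute(
        "INSERT INTO processes (kind, pid, api_port) VALUES (?, ?, ?)",
        (kind, pid, api_port),
    )
    return cur.lastrowid


def get_core_for_gui(conn, gui_rowid):
    """Return (rowid, pid, api_port) of the first unfinished core entry
    created after the given GUI entry, or None if there is none."""
    return conn.execute(
        "SELECT rowid, pid, api_port FROM processes "
        "WHERE kind = 'core' AND finished_at IS NULL AND rowid > ? "
        "ORDER BY rowid LIMIT 1",
        (gui_rowid,),
    ).fetchone()
```

With a stale core entry left behind by an unclean exit, followed by a new GUI entry and the core it spawned, the lookup skips the stale entry because its rowid is lower than the GUI's.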
[7.14.0] EVENTS: 5 Sentry issue: TRIBLER-1PM |
The core connection mechanism has changed on the new GUI, therefore this issue is no longer relevant. Closing for now. |
Sentry Issue: TRIBLER-J3
Last Core output:,
Related: #7032