Skip to content
This repository has been archived by the owner on Apr 5, 2024. It is now read-only.

Fix panics in the mtcp-client #39

Merged
merged 1 commit into from
May 21, 2021
Merged

Fix panics in the mtcp-client #39

merged 1 commit into from
May 21, 2021

Conversation

CryptoCopter
Copy link
Member

If the write to reportChan is deferred, and reportChang has already been
closed, then the resulting panic won't be recovered by the deferred
recover-function.

Copy link
Member

@oxzi oxzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks a lot!

However, after applying your suggest changes, an error will not be sent for recovered panics. But especially those errors might be very important.
Thus, I would suggest some kind of multiple stacked defers.

  1. defer client.mutex.Unlock(), as before
  2. recover and assign errors to err
  3. if err != nil {client.reportChan <- cla.NewConvergencePeerDisappeared(client, client.GetPeerEndpointID()) }
  4. another recover to catch panics due to an already closed channel.

Due to its stack nature, the defers needs to be applied in reverse.
I sketched this one here: https://play.golang.org/p/JWkgNRg84s7

Of course, this might not be the most beautiful solution but should cover all possibilities.

pkg/cla/mtcp/client.go Outdated Show resolved Hide resolved
@CryptoCopter
Copy link
Member Author

come to think of it, I believe that I originally included the recoveronly because of the occasional panics due to sending a message to an already closed reportChan - so so I believe we only need three layers:

  1. client.mutex.Unlock()
  2. if err != nil {client.reportChan <- cla.NewConvergencePeerDisappeared(client, client.GetPeerEndpointID()) }
  3. recover in case reportChan is already closed

@CryptoCopter CryptoCopter requested a review from oxzi May 21, 2021 16:19
If the write to reportChan is deferred, and reportChang has already been
closed, then the resulting panic won't be recovered by the deferred
recover-function.
@CryptoCopter CryptoCopter self-assigned this May 21, 2021
Copy link
Member

@oxzi oxzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's start with some history. The first deferred recover was introduced in early 2019 by fa14812 to mitigate a panic within go-codec, I guess. However, in the meantime the whole functional body of this method was changed completely.

I think you are totally right that one recover is enough.
The point about two recovers was that a panic from within the method's body could still be sent into the channel. However, I don't see any "panicable" code there anymore.

LGTM, thanks again!

@oxzi oxzi merged commit fca8607 into dtn7:master May 21, 2021
@CryptoCopter CryptoCopter deleted the mtcp-panic-fix branch May 22, 2021 18:24
adur1990 pushed a commit that referenced this pull request Aug 16, 2021
As can be seen in recent issues and PRs (#39, #40, #41, #42, #43, #45), there
is an issue in dtn7-go that results in a deadlock after some time, making it
impossible to sent bundles. To be able to test and debug this issue, some more
extensive tests are required. Thus, we developed a test infrastructure to
simulate multiple virtual nodes using the
[CORE emulator](https://github.com/coreemu/core).
This tests environment is located in the
[dtn7-playground](https://github.com/dtn7/dtn7-playground) repository. To be
able to trigger these tests, this PR adds a workflow file, that uses the
[repository_dispatch event](https://docs.github.com/en/actions/reference/events-that-trigger-workflows#repository_dispatch)
sent to the dtn7-playground repo.
oxzi pushed a commit that referenced this pull request Aug 16, 2021
* Workflow for extended tests

As can be seen in recent issues and PRs (#39, #40, #41, #42, #43, #45), there
is an issue in dtn7-go that results in a deadlock after some time, making it
impossible to sent bundles. To be able to test and debug this issue, some more
extensive tests are required. Thus, we developed a test infrastructure to
simulate multiple virtual nodes using the
[CORE emulator](https://github.com/coreemu/core).
This tests environment is located in the
[dtn7-playground](https://github.com/dtn7/dtn7-playground) repository. To be
able to trigger these tests, this PR adds a workflow file, that uses the
[repository_dispatch event](https://docs.github.com/en/actions/reference/events-that-trigger-workflows#repository_dispatch)
sent to the dtn7-playground repo.

* Fix copyright header

Co-authored-by: Artur Sterz <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants