Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Audit all commands for goroutine leaks #4414

Closed
2 of 12 tasks
Stebalien opened this issue Nov 21, 2017 · 12 comments
Closed
2 of 12 tasks

Audit all commands for goroutine leaks #4414

Stebalien opened this issue Nov 21, 2017 · 12 comments
Labels
exp/novice Someone with a little familiarity can pick up help wanted Seeking public contribution on this issue

Comments

@Stebalien
Copy link
Member

Stebalien commented Nov 21, 2017

We're leaking goroutines on canceled commands all over the place.

To do this audit, I recommend looking at all channel writes and I mean all.

A quick grep indicates that the following files may have issues (should, at least, be looked at):

Basically, anytime we write to a channel that may block (i.e., doesn't have a buffer at least as large what we're writing), we should select on the channel write and <-req.Context().Done().

@Stebalien Stebalien added exp/novice Someone with a little familiarity can pick up help wanted Seeking public contribution on this issue P0 Critical: Tackled by core team ASAP and removed P0 Critical: Tackled by core team ASAP labels Nov 21, 2017
@keks
Copy link
Contributor

keks commented Nov 22, 2017

@Stebalien Regarding your question in #4413:

Should we be calling res.SetError(req.Context().Err(), cmdkit.ErrNormal) when the context is canceled or can we just bail (what I'm doing here).

When we cancel, the error should either be correctly handled or the command should return an error indicating the command was cancelled. Maybe even specify "go-ipfs-cmds".Canceled of type cmdkit.Error, analogue to "context".Canceled. If this doesn't happen, we should add something like

select {
case <-ctx.Done():
  res.SetError(ctx.Err(), cmdkit.ErrNormal)
case default:
}

at the end of the respective commands. Does that sound reasonable to you?

@Stebalien
Copy link
Member Author

Does that sound reasonable to you?

It's reasonable, just verbose and painful; in some cases, we realize that the context has been canceled in multiple places concurrently (thread safety issues). If the context has been canceled/expired, I assume the client isn't waiting for a response anyways. If they are, can the commands package not handle this?

@Stebalien
Copy link
Member Author

Stebalien commented Jan 17, 2018

Note: We should not be closing the channel on context cancel. Otherwise, the other side might think that we're done sending the response. We need to fix the PR #4413.

lgarron added a commit to lgarron/go-ipfs that referenced this issue Jan 18, 2018
…imed out).

Previously, the following could happen:
- Producer creates channel.
- Consumer reads from channel.
- Producer observes that the context has errored, and closes the channel.
- Consumer sees the closed channel before seeing the context error, and assumes
  that command completed without a context error.

This change ensures the following:
- Producer creates channel.
- Consumer reads from channel.
- Producer observes that the context has errored, and DOES NOT close the
  channel.
- Consumer sees that the context has errored, and handles the channel
  appropriately.

Note that per the definition of context [1], a context is "done" iff it has not
"errored" (i.e. ctx.Err() is non-nil).

[1] https://godoc.org/context#Context

Updates call sites touched by ipfs#4413 and addresses ipfs#4414.
lgarron added a commit to lgarron/go-ipfs that referenced this issue Jan 18, 2018
…imed out).

Previously, the following could happen:
- Producer creates channel.
- Consumer reads from channel.
- Producer observes that the context has errored, and closes the channel.
- Consumer sees the closed channel before seeing the context error, and assumes
  that command completed without a context error.

This change ensures the following:
- Producer creates channel.
- Consumer reads from channel.
- Producer observes that the context has errored, and DOES NOT close the
  channel.
- Consumer sees that the context has errored, and handles the channel
  appropriately.

Note that per the definition of context [1], a context is "done" iff it has not
"errored" (i.e. ctx.Err() is non-nil).

[1] https://godoc.org/context#Context

Updates call sites touched by ipfs#4413 and addresses ipfs#4414.
lgarron added a commit to lgarron/go-ipfs that referenced this issue Jan 18, 2018
…imed out).

Previously, the following could happen:
- Producer creates channel.
- Consumer reads from channel.
- Producer observes that the context has errored, and closes the channel.
- Consumer sees the closed channel before seeing the context error, and assumes
  that command completed without a context error.

This change ensures the following:
- Producer creates channel.
- Consumer reads from channel.
- Producer observes that the context has errored, and DOES NOT close the
  channel.
- Consumer sees that the context has errored, and handles the channel
  appropriately.

Note that per the definition of context [1], a context is "done" iff it has not
"errored" (i.e. ctx.Err() is non-nil).

[1] https://godoc.org/context#Context

Updates call sites touched by ipfs#4413 and addresses ipfs#4414.

License: MIT
Signed-off-by: Lucas Garron <[email protected]>
lgarron added a commit to lgarron/go-ipfs that referenced this issue Jan 18, 2018
…imed out).

Previously, the following could happen:
- Producer creates channel.
- Consumer reads from channel.
- Producer observes that the context has errored, and closes the channel.
- Consumer sees the closed channel before seeing the context error, and assumes
  that command completed without a context error.

This change ensures the following:
- Producer creates channel.
- Consumer reads from channel.
- Producer observes that the context has errored, and DOES NOT close the
  channel.
- Consumer sees that the context has errored, and handles the channel
  appropriately.

Note that per the definition of context [1], a context is "done" iff it has not
"errored" (i.e. ctx.Err() is non-nil).

[1] https://godoc.org/context#Context

Updates call sites touched by ipfs#4413 and addresses ipfs#4414.

License: MIT
Signed-off-by: Lucas Garron <[email protected]>
lgarron added a commit to lgarron/go-ipfs that referenced this issue Jan 18, 2018
…imed out).

Previously, the following could happen:
- Producer creates channel.
- Consumer reads from channel.
- Producer observes that the context has errored, and closes the channel.
- Consumer sees the closed channel before seeing the context error, and assumes
  that command completed without a context error.

This change ensures the following:
- Producer creates channel.
- Consumer reads from channel.
- Producer observes that the context has errored, and DOES NOT close the
  channel.
- Consumer sees that the context has errored, and handles the channel
  appropriately.

Note that per the definition of context [1], a context is "done" iff it has not
"errored" (i.e. ctx.Err() is non-nil).

[1] https://godoc.org/context#Context

Updates call sites touched by ipfs#4413 and addresses ipfs#4414.

License: MIT
Signed-off-by: Lucas Garron <[email protected]>
lgarron added a commit to lgarron/go-ipfs that referenced this issue Jan 18, 2018
…imed out).

Previously, the following could happen:
- Producer creates channel.
- Consumer reads from channel.
- Producer observes that the context has errored, and closes the channel.
- Consumer sees the closed channel before seeing the context error, and assumes
  that command completed without a context error.

This change ensures the following:
- Producer creates channel.
- Consumer reads from channel.
- Producer observes that the context has errored, and DOES NOT close the
  channel.
- Consumer sees that the context has errored, and handles the channel
  appropriately.

Note that per the definition of context [1], a context is "done" iff it has not
"errored" (i.e. ctx.Err() is non-nil).

[1] https://godoc.org/context#Context

Updates call sites touched by ipfs#4413 and addresses ipfs#4414.

License: MIT
Signed-off-by: Lucas Garron <[email protected]>
lgarron added a commit to lgarron/go-ipfs that referenced this issue Jan 18, 2018
…imed out).

Previously, the following could happen:
- Producer creates channel.
- Consumer reads from channel.
- Producer observes that the context has errored, and closes the channel.
- Consumer sees the closed channel before seeing the context error, and assumes
  that command completed without a context error.

This change ensures the following:
- Producer creates channel.
- Consumer reads from channel.
- Producer observes that the context has errored, and DOES NOT close the
  channel.
- Consumer sees that the context has errored, and handles the channel
  appropriately.

Note that per the definition of context [1], a context is "done" iff it has not
"errored" (i.e. ctx.Err() is non-nil).

[1] https://godoc.org/context#Context

Updates call sites touched by ipfs#4413 and addresses ipfs#4414.

License: MIT
Signed-off-by: Lucas Garron <[email protected]>
lgarron added a commit to lgarron/go-ipfs that referenced this issue Jan 18, 2018
…imed out).

Previously, the following could happen:
- Sender creates channel.
- Receiver reads from channel.
- Sender observes that the context has errored, and closes the channel.
- Receiver sees the closed channel before seeing the context error, and assumes
  that command completed without a context error.

This change ensures the following:
- Sender creates channel.
- Receiver reads from channel.
- Sender observes that the context has errored, and DOES NOT close the
  channel.
- Receiver sees that the context has errored, and handles the channel
  appropriately.

Note that per the definition of context [1], a context is "done" iff it has not
"errored" (i.e. ctx.Err() is non-nil).

[1] https://godoc.org/context#Context

Updates call sites touched by ipfs#4413 and addresses ipfs#4414.

License: MIT
Signed-off-by: Lucas Garron <[email protected]>
lgarron added a commit to lgarron/go-ipfs that referenced this issue Jan 18, 2018
…imed out).

Previously, the following could happen:
- Sender creates channel.
- Receiver reads from channel.
- Sender observes that the context has errored, and closes the channel.
- Receiver sees the closed channel before seeing the context error, and assumes
  that command completed without a context error.

This change ensures the following:
- Sender creates channel.
- Receiver reads from channel.
- Sender observes that the context has errored, and DOES NOT close the
  channel.
- Receiver sees that the context has errored, and handles the channel
  appropriately.

Note that per the definition of context [1], a context is "done" iff it has not
"errored" (i.e. ctx.Err() is non-nil).

[1] https://godoc.org/context#Context

Updates call sites touched by ipfs#4413 and addresses ipfs#4414.

License: MIT
Signed-off-by: Lucas Garron <[email protected]>
lgarron added a commit to lgarron/go-ipfs that referenced this issue Jan 18, 2018
…imed out).

Previously, the following could happen:
- Sender creates channel.
- Receiver reads from channel.
- Sender observes that the context has errored, and closes the channel.
- Receiver sees the closed channel before seeing the context error, and assumes
  that command completed without a context error.

This change ensures the following:
- Sender creates channel.
- Receiver reads from channel.
- Sender observes that the context has errored, and DOES NOT close the
  channel.
- Receiver sees that the context has errored, and handles the channel
  appropriately.

Note that per the definition of context [1], a context is "done" iff it has not
"errored" (i.e. ctx.Err() is non-nil).

[1] https://godoc.org/context#Context

Updates call sites touched by ipfs#4413 and addresses ipfs#4414.

License: MIT
Signed-off-by: Lucas Garron <[email protected]>
lgarron added a commit to lgarron/go-ipfs that referenced this issue Jan 18, 2018
…imed out).

Previously, the following could happen:
- Sender creates channel.
- Receiver reads from channel.
- Sender observes that the context has errored, and closes the channel.
- Receiver sees the closed channel before seeing the context error, and assumes
  that command completed without a context error.

This change ensures the following:
- Sender creates channel.
- Receiver reads from channel.
- Sender observes that the context has errored, and DOES NOT close the
  channel.
- Receiver sees that the context has errored, and handles the channel
  appropriately.

Note that per the definition of context [1], a context is "done" iff it has not
"errored" (i.e. ctx.Err() is non-nil).

[1] https://godoc.org/context#Context

Updates call sites touched by ipfs#4413 and addresses ipfs#4414.

License: MIT
Signed-off-by: Lucas Garron <[email protected]>
lgarron added a commit to lgarron/go-ipfs that referenced this issue Jan 18, 2018
…imed out).

Previously, the following could happen:
- Sender creates channel.
- Receiver reads from channel.
- Sender observes that the context has errored, and closes the channel.
- Receiver sees the closed channel before seeing the context error, and assumes
  that command completed without a context error.

This change ensures the following:
- Sender creates channel.
- Receiver reads from channel.
- Sender observes that the context has errored, and DOES NOT close the
  channel.
- Receiver sees that the context has errored, and handles the channel
  appropriately.

Note that per the definition of context [1], a context is "done" iff it has not
"errored" (i.e. ctx.Err() is non-nil).

[1] https://godoc.org/context#Context

Updates call sites touched by ipfs#4413 and addresses ipfs#4414.

License: MIT
Signed-off-by: Lucas Garron <[email protected]>
Stebalien pushed a commit to lgarron/go-ipfs that referenced this issue Jan 26, 2018
…imed out).

Previously, the following could happen:
- Sender creates channel.
- Receiver reads from channel.
- Sender observes that the context has errored, and closes the channel.
- Receiver sees the closed channel before seeing the context error, and assumes
  that command completed without a context error.

This change ensures the following:
- Sender creates channel.
- Receiver reads from channel.
- Sender observes that the context has errored, and DOES NOT close the
  channel.
- Receiver sees that the context has errored, and handles the channel
  appropriately.

Note that per the definition of context [1], a context is "done" iff it has not
"errored" (i.e. ctx.Err() is non-nil).

[1] https://godoc.org/context#Context

Updates call sites touched by ipfs#4413 and addresses ipfs#4414.

License: MIT
Signed-off-by: Lucas Garron <[email protected]>
@forstmeier
Copy link
Contributor

forstmeier commented May 6, 2018

Hi, all! Is this issue still active? I'd like to get involved with some entry-level work on the project to get my hands dirty and this issue seemed like it still needed some work done.

@alecbrick
Copy link
Contributor

@forstmeier I'm interested in getting into IPFS as well, and I'm also interested in working on this issue. It doesn't seem super active, but if you're interested, we might be able to jointly look into this. Some of the files seem clean, but I'm not super experienced at Go, and I'm not great at intuitively knowing when a goroutine might hang.

@forstmeier
Copy link
Contributor

forstmeier commented May 14, 2018

Hey, @alecbrick, I saw that you've submitted fixes for dag.go and pin.go - do you want me to look through refs.go? Also, I agree that some of these files listed in the issue appear to be clean (as far as I can tell).

djdv pushed a commit that referenced this issue Jun 27, 2018
Description:
This addresses one of the listed problem files in #4414. I chose to
keep the return statement outside of the select statement on line
132 since that behavior was already there following the write to
out.

License: MIT
Signed-off-by: John Forstmeier <[email protected]>
@kjzz
Copy link
Contributor

kjzz commented Sep 5, 2018

@Stebalien there are one goroutine leaks in filestore.go.So please help me review pr #5427 ,Thx a lot

@tarekbadrsh
Copy link
Contributor

Hi @Stebalien,
I tried to contribute in this issue, found four file still remaining.

the code in those file is the same.
As you can see no context need to check if it's done.
The enhacemnet we can do is the code is duplicate, we can make it once in "commands.go".


so please tell me if you need more help in this issue or any other, I am looking for contribute with you.

@Stebalien
Copy link
Member Author

The unwrapOutput cases are actually fine. Those are only called on the client and we close the channel when the request finishes or is canceled.

I wouldn't bother deduplicating them either. Instead, we should focus on switching to the new Commands API that doesn't need that helper (you'll notice that some commands don't use SetOutput or Output, those are the ones using the new API).


That last add one looks fine. We're waiting on the context at the bottom. Am I missing something?

@tarekbadrsh
Copy link
Contributor

Hi @Stebalien,

  • No you don't miss any point
  • if you need help in this point tell me more, I need to help.

we should focus on switching to the new Commands API that doesn't need that helper (you'll notice that some commands don't use SetOutput or Output, those are the ones using the new API).

@Stebalien
Copy link
Member Author

It looks like there are some cases in core/commands/pin.go and one in core/commands/refs.go (WriteEdge "checks" the context but the check is racy).

@Stebalien
Copy link
Member Author

This was fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exp/novice Someone with a little familiarity can pick up help wanted Seeking public contribution on this issue
Projects
None yet
Development

No branches or pull requests

6 participants