Blocking XGroupCreateMkStream does not interrupt on context cancellation #2276

Open
jgirtakovskis opened this issue Nov 5, 2022 · 18 comments · May be fixed by #2433

Comments

@jgirtakovskis

When XGroupCreateMkStream is called in blocking mode (Block = 0), the call does not get interrupted by cancelling the context.

Expected Behavior

The blocking call is interrupted when the context is cancelled.

Current Behavior

The call continues to block after the context is cancelled.

Possible Solution

Unsure yet

Steps to Reproduce

package main

import (
	"context"
	"fmt"
	"sync"
	"time"

	"github.com/go-redis/redis/v9"
	"github.com/google/uuid"
)

func main() {
	rdb := redis.NewUniversalClient(&redis.UniversalOptions{
		Addrs:    []string{"localhost:6379"},
		Password: "", // no password set
		DB:       0,  // use default DB
	})

	defer rdb.Close()

	ctx, cancelFn := context.WithCancel(context.Background())

	go func() {
		for idx := 0; idx < 5; idx++ {
			fmt.Printf("Waiting %v...\n", idx)
			time.Sleep(time.Second)
		}
		cancelFn()
		fmt.Printf("Cancelled context and now expect blocking XGroupCreateMkStream to be interrupted...\n")
	}()

	name := "blag"
	streamName := name
	groupName := name + "-blah"

	_, err := rdb.XGroupCreateMkStream(ctx, streamName, groupName, "0").Result()
	fmt.Printf("%v\n", err)

	var wg sync.WaitGroup

	wg.Add(1)
	go func() {
		defer wg.Done()
		objs, err := rdb.XReadGroup(ctx, &redis.XReadGroupArgs{
			Group:    groupName,
			Consumer: uuid.NewString(),
			Streams:  []string{streamName, ">"},
			Count:    100,
			Block:    0,
		}).Result()
		fmt.Printf("%v, %v\n", err, objs)
	}()

	wg.Wait()
	fmt.Printf("Done.\n")
}

Context (Environment)

I have two goroutines concurrently performing XREADGROUP and XADD in blocking mode. XADD is triggered by external events and is not guaranteed to add items to the stream at any particular cadence or pattern. Shutting down the reading goroutine is not possible because the blocking call does not get interrupted by context cancellation.
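
For context, the producer side described here might look roughly like the sketch below (it reuses the imports from the reproduction above). It is illustrative only: the produce function, the events channel, and the "payload" field are my names, not part of the original report.

// produce pushes an entry onto the stream each time an external event arrives,
// mirroring the XADD goroutine described above (illustrative sketch).
func produce(ctx context.Context, rdb redis.UniversalClient, stream string, events <-chan string) {
	for {
		select {
		case <-ctx.Done():
			return
		case ev := <-events:
			if err := rdb.XAdd(ctx, &redis.XAddArgs{
				Stream: stream,
				Values: map[string]interface{}{"payload": ev},
			}).Err(); err != nil {
				fmt.Printf("XADD failed: %v\n", err)
			}
		}
	}
}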

Detailed Description

A blocking call should be interrupted when its context is cancelled, and the connection closed.

Possible Implementation

N/A

@berndverst

Dapr maintainer here. I also need this fixed.

@Alger7w

Alger7w commented Dec 7, 2022

I have the same problem. Version: v9.0.0-rc.2

@latolukasz

Same problem here; it's quite a blocker for me :(

@brettmorien

brettmorien commented Feb 9, 2023

We've built a workaround for this using samber/lo, but I agree that this should really be handled at the library level.

❗❗ Turns out this leaks a lot of goroutines and shouldn't be used. ❗❗

var cmd *redis.XStreamSliceCmd

select {
case cmd = <-lo.Async(func() *redis.XStreamSliceCmd {
	return c.Client.XReadGroup(ctx, &redis.XReadGroupArgs{
		Group:    "group",
		Consumer: "consumerID",
		Streams:  []string{"stream", ">"},
		Count:    1,
	})
}):
case <-ctx.Done():
	return ctx.Err()
}

streams, err := cmd.Result()

@armsnyder
Contributor

Note that the workaround above will leak goroutines if you run the code repeatedly, so it's really only viable for handling app shutdown.
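
To make the leak concrete, here is a minimal, self-contained version of the same pattern (readOnce and its parameters are my names, not code from the workaround above). Each call that gives up via ctx leaves one goroutine permanently parked inside XReadGroup:

// readOnce wraps a blocking XReadGroup so the caller can stop waiting when
// ctx is cancelled. The wrapped goroutine, however, stays blocked until the
// server eventually replies, which is where the goroutines leak.
func readOnce(ctx context.Context, rdb redis.UniversalClient, group, consumer, stream string) ([]redis.XStream, error) {
	ch := make(chan *redis.XStreamSliceCmd, 1) // buffered so the late send does not block forever
	go func() {
		ch <- rdb.XReadGroup(context.Background(), &redis.XReadGroupArgs{
			Group:    group,
			Consumer: consumer,
			Streams:  []string{stream, ">"},
			Block:    0, // block indefinitely
		})
	}()
	select {
	case cmd := <-ch:
		return cmd.Result()
	case <-ctx.Done():
		// The goroutine above is still parked in XReadGroup; calling readOnce
		// in a loop therefore accumulates goroutines (visible via
		// runtime.NumGoroutine()), which is why this is only viable at shutdown.
		return nil, ctx.Err()
	}
}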

@armsnyder
Contributor

I was able to contribute test cases in #2432. However, I'm less confident about providing a fix. The code would need to safely dispose of the connection if the context is canceled, for example by removing it from the connection pool. It would also need to avoid interfering with the expected behavior of ContextTimeoutEnabled (#2243).

@monkey92t
Collaborator

This feels complicated: net.Conn is hard to control with a ctx, because net.Conn uses deadlines rather than a context...

var conn net.Conn // go-redis pool.Conn
ctx := context.Background()

processBlockCmd := func() <-chan *redis.XStreamSliceCmd {
	ch := make(chan *redis.XStreamSliceCmd)
	go func() {
		cmd := &redis.XStreamSliceCmd{}
		// write...
		if _, err := conn.Read(nil); err != nil {
			// check conn timeout?
			if err.Error() == "i/o timeout" && errors.Is(ctx.Err(), context.Canceled) {
				cmd.SetErr(err)
			}
		}
		ch <- cmd
		close(ch)
	}()
	return ch
}


ch := processBlockCmd()
select {
case cmd := <-ch:
	return cmd
case <-ctx.Done():
	conn.SetDeadline(time.Now()) // unblock the pending Read so the goroutine can finish
	return <-ch
}

@armsnyder linked a pull request on Feb 10, 2023 that will close this issue
@armsnyder
Contributor

armsnyder commented Feb 10, 2023

@monkey92t Right, deadlines on net.Conn are best used when you know the timeout ahead of time. The code you shared makes sense, and it is very similar to the change I just proposed in #2433 at a library level. I believe this belongs in the library since I would prefer the redis client to manage the connection for me. If users would prefer not to cancel redis commands with a context, then they can pass context.Background() to redis commands.

EDIT: One difference between your code and mine is that you used SetDeadline whereas I closed the connection. Is there a meaningful difference there?
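
For what it's worth, the observable difference seems small: a past deadline makes the pending Read fail with an i/o timeout while the descriptor stays open (and the deadline could later be cleared), whereas Close fails the Read with net.ErrClosed and discards the descriptor entirely. Either way the in-flight reply is lost, so the pooled connection has to be thrown away. A rough sketch of the two variants (unblockOnCancel and useDeadline are illustrative names, not library code):

// unblockOnCancel interrupts a pending Read on conn when ctx is cancelled,
// using either a past deadline or Close. In both cases the reply to the
// in-flight blocking command is unusable and the connection should be discarded.
func unblockOnCancel(ctx context.Context, conn net.Conn, useDeadline bool) {
	go func() {
		<-ctx.Done()
		if useDeadline {
			// The blocked Read returns an i/o timeout error; the descriptor stays
			// open and the deadline could be cleared again with SetDeadline(time.Time{}).
			_ = conn.SetDeadline(time.Now())
		} else {
			// The blocked Read returns net.ErrClosed; the connection is gone for good.
			_ = conn.Close()
		}
	}()
}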

@monkey92t
Collaborator

@armsnyder We still need to think about this more. Spawning a goroutine every time a command is executed would hurt performance. I haven't thought of a good solution yet.

@armsnyder
Contributor

Here's a benchstat comparing master with my PR.

https://gist.github.com/armsnyder/40aca6ea480bf53434d1e41c663e1550

We could optimize by running a goroutine per connection rather than per command. The connection goroutine would handle all I/O, with the command communicating to it over a channel.
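
A very rough sketch of that per-connection idea, under my own assumptions rather than go-redis internals (connReader, readReq, and the raw byte handling are all illustrative; a real implementation would parse RESP frames and integrate with the pool):

// connReader owns all reads on one connection. Commands submit requests over
// a channel instead of spawning a goroutine per command.
type readResult struct {
	data []byte
	err  error
}

type readReq struct {
	reply chan readResult
}

type connReader struct {
	conn net.Conn
	reqs chan readReq
}

// loop is the single long-lived goroutine per connection.
func (r *connReader) loop() {
	buf := make([]byte, 64*1024)
	for req := range r.reqs {
		n, err := r.conn.Read(buf) // simplification: a real reader would parse a full RESP reply
		req.reply <- readResult{data: append([]byte(nil), buf[:n]...), err: err}
	}
}

// read issues one read and waits for either the reply or ctx cancellation.
// On cancellation the blocked Read is woken with a past deadline, and the
// caller must then discard the connection since its reply stream is broken.
func (r *connReader) read(ctx context.Context) ([]byte, error) {
	req := readReq{reply: make(chan readResult, 1)}
	r.reqs <- req
	select {
	case res := <-req.reply:
		return res.data, res.err
	case <-ctx.Done():
		_ = r.conn.SetDeadline(time.Now()) // wake the reader goroutine
		<-req.reply                        // collect the failed read so the reader isn't orphaned
		return nil, ctx.Err()
	}
}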

@brettmorien

Hi folks (@monkey92t)!

I'm checking in to see if this is still on the radar. Not being able to cancel out of a blocking call is a deal breaker for gracefully shutting down our apps. We are currently operating on a July 2022 beta of this library until we can upgrade to a properly working release version.

@monkey92t
Collaborator

> Hi folks (@monkey92t)!
>
> I'm checking in to see if this is still on the radar. Not being able to cancel out of a blocking call is a deal breaker for gracefully shutting down our apps. We are currently operating on a July 2022 beta of this library until we can upgrade to a properly working release version.

Thank you for your attention. I've done related tests and it's a hard choice:

  1. With the chan+goroutine approach (such as @armsnyder's example), we pay a significant cost for every command executed.
  2. On <-ctx.Done() we would have to close the network connection, because we can no longer trust its state, and that can cause a chain reaction ("Constantly Reestablishing Connections to AWS ElastiCache Redis in Cluster Mode (Continued)" #2046).

No matter how we do it, listening to the context causes a lot of side effects, and I haven't found a better solution. A similar approach is used in the net package (*netFD).

I'm trying more solutions and benchmark tests, such as letting users choose whether to pay the cost of listening to ctx, as in #2243.

@berndverst

> Thank you for your attention. I've done related tests and it's a hard choice:
>
>   1. With the chan+goroutine approach (such as @armsnyder's example), we pay a significant cost for every command executed.
>   2. On <-ctx.Done() we would have to close the network connection, because we can no longer trust its state, and that can cause a chain reaction (#2046).

With respect to (1), I am concerned about goroutines being leaked, or about GC overhead. This would be a deal breaker for our use in Dapr (Distributed Application Runtime, github.com/dapr/dapr). We are very performance conscious, as our project runs on a variety of targets, including embedded systems.

I can't speak in favor of (2), but my vote is against (1).

@monkey92t
Collaborator

@berndverst What do you think of #2455?

@berndverst

> @berndverst What do you think of #2455?

Let me loop in one of my co-maintainers. @ItalyPaleAle, any thoughts on #2455 for addressing the issue discussed here?

@brettmorien

Hi folks. Any movement on this issue?

@wk8

wk8 commented Apr 27, 2023

This is a blocker for us; could you please look at #2455? Thank you!

@kkkbird
Contributor

kkkbird commented May 18, 2023

We really need this fix. Any update?
