Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support graceful shutdown on sigint/sigterm #385

Merged
merged 15 commits into from
Sep 12, 2022
61 changes: 55 additions & 6 deletions internal/app/server/open_saves_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ import (
"reflect"
"runtime"
"strings"
"syscall"
"testing"
"time"

Expand Down Expand Up @@ -62,19 +63,31 @@ const (
blobKind = "blob"
)

func TestOpenSaves_HealthCheck(t *testing.T) {
var (
serviceConfig *config.ServiceConfig
)

func getServiceConfig() (*config.ServiceConfig, error) {
if serviceConfig != nil {
return serviceConfig, nil
}
configPath := cmd.GetEnvVarString("OPEN_SAVES_CONFIG", "../../../configs/")
viper.Set(config.OpenSavesBucket, testBucket)
viper.Set(config.OpenSavesProject, testProject)
viper.Set(config.OpenSavesPort, testPort)
cfg, err := config.Load(configPath)
sc, err := config.Load(configPath)
serviceConfig = sc
return sc, err
}

func TestOpenSaves_HealthCheck(t *testing.T) {
serviceConfig, err := getServiceConfig()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t.Parallel()?

if err != nil {
t.Fatalf("got err loading config: %v", err)
t.Fatalf("getServiceConfig err: %v", err)
}

ctx := context.Background()
ctx, cancel := context.WithCancel(context.Background())
go func() {
if err := Run(ctx, "tcp", cfg); err != nil {
if err := Run(ctx, "tcp", serviceConfig); err != nil {
log.Errorf("got err calling server.Run: %v", err)
}
}()
Expand All @@ -99,6 +112,42 @@ func TestOpenSaves_HealthCheck(t *testing.T) {
if want := healthgrpc.HealthCheckResponse_SERVING; got.Status != want {
t.Errorf("hc.Check got: %v, want: %v", got.Status, want)
}
// Cancel here to stop the server. This frees up the port for later tests.
cancel()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to cancel the context? Do we want to test the behavior of the server after cancel()?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment here to address this. I also added tests after this to test NOT_SERVING.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

ctx2 := context.Background()
got, err = hc.Check(ctx2, &healthgrpc.HealthCheckRequest{
Service: serviceName,
})
if err != nil {
t.Errorf("healthClient.Check err: %v", err)
}
if want := healthgrpc.HealthCheckResponse_NOT_SERVING; got.Status != want {
t.Errorf("hc.Check got: %v, want: %v", got.Status, want)
}
}

func TestOpenSaves_RunServer(t *testing.T) {
serviceConfig, err := getServiceConfig()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t.Parallel()?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to run these tests in parallel since they use the same port. The timeouts are short enough that there shouldn't be much delay here anyway. Same response for the remaining tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yeah that makes sense, thanks

if err != nil {
t.Fatalf("getServiceConfig err: %v", err)
}
ctx := context.Background()
t.Run("cancel_context", func(t *testing.T) {
ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t.Parallel()?

defer cancel()
if err := Run(ctx, "tcp", serviceConfig); err != nil {
t.Errorf("got err calling server.Run: %v", err)
}
})
t.Run("sigint", func(t *testing.T) {
go func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t.Parallel()?

time.Sleep(2 * time.Second)
syscall.Kill(syscall.Getpid(), syscall.SIGINT)
}()
if err := Run(ctx, "tcp", serviceConfig); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's feasible to test the status change from SERVING to NOT_SERVING? Looks like GracefulStop doesn't return until all RPC calls are served, so timing wise it's a bit tricky.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could, but I added the test to TestOpenSaves_HealthCheck

t.Errorf("got err calling server.Run: %v", err)
}
})
}

func getOpenSavesServer(ctx context.Context, t *testing.T, cloud string) (*openSavesServer, *bufconn.Listener) {
Expand Down
19 changes: 19 additions & 0 deletions internal/app/server/server.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,9 @@ package server
import (
"context"
"net"
"os"
"os/signal"
"syscall"

"github.com/googleforgames/open-saves/internal/pkg/config"

Expand Down Expand Up @@ -46,6 +49,9 @@ func Run(ctx context.Context, network string, cfg *config.ServiceConfig) error {
}
}()

sigs := make(chan os.Signal, 1)
signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)

s := grpc.NewServer()
healthcheck := health.NewServer()
healthcheck.SetServingStatus(serviceName, healthgrpc.HealthCheckResponse_SERVING)
Expand All @@ -57,5 +63,18 @@ func Run(ctx context.Context, network string, cfg *config.ServiceConfig) error {
pb.RegisterOpenSavesServer(s, server)
reflection.Register(s)

go func() {
select {
case s := <-sigs:
log.Infof("received signal: %v\n", s)
case <-ctx.Done():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear when you need to cancel the context. Currently it's only in the tests, but do you have any plan to support cancellation?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm supporting this here since we pass in a ctx to Run. There's no immediate plan to support this. If you prefer, I can remove it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

log.Infoln("context cancelled")
}
log.Infoln("stopping open saves server gracefully")
healthcheck.SetServingStatus(serviceName, healthgrpc.HealthCheckResponse_NOT_SERVING)
s.GracefulStop()
log.Infoln("stopped open saves server gracefully")
}()

return s.Serve(lis)
}