Gitpod OIDC Identity Provider #16482

Merged
merged 15 commits into from
Mar 3, 2023

Conversation

csweichel
Contributor

@csweichel csweichel commented Feb 20, 2023

Note: This is a PoC, and not necessarily a feature that will land soon (or ever).

Description

This PR implements an OIDC identity provider, and adds an IDToken function to the API surface. This function is served from public-api-server and delegates authZ/authN to server by calling a no-op GetIDToken function.

Private keys used for signing the JWT are held only in memory and are ephemeral. A Redis-backed public key cache ensures that the JSON Web Key Set contains all public keys currently in circulation. The issued ID tokens have a TTL of one hour, as do the public keys, whose TTL is refreshed whenever they're used.

The IDToken API call is available through gp idp, which is currently hidden because this is not a released feature. To this end, the workspace token gained the GetIDToken function scope.

The new code has test coverage above 70%.

Release Notes

NONE

Documentation

Build Options:

  • /werft with-github-actions
    Experimental feature to run the build with GitHub Actions (and not in Werft).
  • leeway-no-cache
    leeway-target=components:all
  • /werft no-test
    Run Leeway with --dont-test
Publish Options
  • /werft publish-to-npm
  • /werft publish-to-jb-marketplace
Installer Options
  • with-ee-license
  • with-slow-database
  • with-dedicated-emulation
  • with-ws-manager-mk2
  • workspace-feature-flags
    Add desired feature flags to the end of the line above, space separated

Preview Environment Options:

  • /werft with-local-preview
    If enabled this will build install/preview
  • /werft with-preview
  • /werft with-large-vm
  • /werft with-gce-vm
    If enabled this will create the environment on GCE infra
  • /werft with-integration-tests=all
    Valid options are all, workspace, webapp, ide, jetbrains, vscode, ssh

@csweichel
Contributor Author

csweichel commented Mar 1, 2023

/werft with-preview

👎 unknown command: with-preview
Use /werft help to list the available commands

@csweichel
Contributor Author

csweichel commented Mar 1, 2023

/werft run with-preview

👍 started the job as gitpod-build-cw-idp.13
(with .werft/ from main)

@csweichel csweichel force-pushed the cw/idp branch 2 times, most recently from bd01a4e to 1f0c626 Compare March 3, 2023 00:23
@@ -0,0 +1,19 @@
syntax = "proto3";
Member

Nit, I'd name the file in full - identity-provider. Just because it avoids an extra mental lookup


service IDPService {
// GetIDToken produces a new ID token
rpc GetIDToken(GetIDTokenRequest) returns (GetIDTokenResponse) {};
Member

Suggested change
- rpc GetIDToken(GetIDTokenRequest) returns (GetIDTokenResponse) {};
+ rpc GetToken(GetTokenRequest) returns (GetTokenResponse) {};

It's misleading that it says ID, when the request/response objects don't refer to an id property

Contributor Author

The returned token is an IDToken in the OIDC sense. I added a link to the spec in the comment.

@@ -26,5 +29,11 @@ type Configuration struct {
// Path to directory containing database configuration files
DatabaseConfigPath string `json:"databaseConfigPath"`

// RedisAddress configures the redis connection of this component
RedisAddress string `json:"redisAddress"`
Member

Let's make this a struct, which has an address as a property. I can imagine we may want to pass additional info (like creds) and having these sit under a single redis property makes it easier.

Redis struct {
	Address string `json:...`
}
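The suggestion above could look roughly like this once spelled out. It is a sketch under assumptions: the review elides the JSON tag, so the "redis" and "address" tag names are guesses, and the Username field is a hypothetical example of the "additional info (like creds)" the reviewer has in mind.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RedisConfiguration groups all Redis settings under a single key.
// Only Address comes from the review; Username is a hypothetical later addition.
type RedisConfiguration struct {
	Address  string `json:"address"`
	Username string `json:"username,omitempty"`
}

// Configuration is trimmed down to the fields relevant to this sketch.
type Configuration struct {
	DatabaseConfigPath string             `json:"databaseConfigPath"`
	Redis              RedisConfiguration `json:"redis"`
}

// parseConfig decodes a JSON document into a Configuration.
func parseConfig(raw []byte) (Configuration, error) {
	var cfg Configuration
	err := json.Unmarshal(raw, &cfg)
	return cfg, err
}

func main() {
	raw := []byte(`{"databaseConfigPath":"/db","redis":{"address":"redis:6379"}}`)
	cfg, err := parseConfig(raw)
	fmt.Println(cfg.Redis.Address, err)
}
```

With the nested struct, adding credentials later only extends the "redis" object and does not add more top-level keys.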

Server *baseserver.Configuration `json:"server,omitempty"`
}

type RedisConfiguration struct {
Member

Unused?

Contributor Author

not anymore :)

}

if len(req.Msg.Audience) < 1 {
return nil, connect.NewError(connect.CodeInvalidArgument, fmt.Errorf("must have at least one audience entry"))
Member

Suggested change
- return nil, connect.NewError(connect.CodeInvalidArgument, fmt.Errorf("must have at least one audience entry"))
+ return nil, connect.NewError(connect.CodeInvalidArgument, fmt.Errorf("Must have at least one audience entry"))

For consistency with other APIs here

})
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
err := redisClient.Ping(ctx).Err()
Member

I haven't dug into this, but is the Redis client reconnecting? If so, we shouldn't need to do a ping here

Contributor Author

Just creating the client doesn't check if the connection works. I'd rather the component fails to start than fails to function if Redis is unavailable.

Member

Sure, but this only checks at connection time (server startup time), not subsequently so the value we get from this remains limited

@@ -114,13 +131,26 @@ func Start(logger *logrus.Entry, version string, cfg *config.Configuration) erro

oidcService := oidc.NewService(cfg.SessionServiceAddress, dbConn, cipherSet, stateJWT)

var idpCache idp.KeyCache
if redisClient == nil {
log.Info("No Redis cache configured - caching IDP public keys in memory.")
Member

Do we actually want to allow this mode? Given we run multiple replicas of pub-api, this is bound to cause issues. I'd rather we say we have a dependency on Redis and we do not try to cache in memory

Contributor Author

removed it

rootHandler.Mount(idpRoute, idpServiceHandler)

// IDP well-known and keys routes
rootHandler.Mount("/idp", deps.idpService.Router())
Member

Out of curiosity, what stops us serving these on the idpServiceHandler rather than having to have another endpoint?

Contributor Author

Added a comment

@@ -1507,6 +1507,7 @@ export class WorkspaceStarter {
"function:getTeams",
"function:trackEvent",
"function:getSupportedWorkspaceClasses",
"function:getIDToken",
Member

That's quite opaque, can we be more specific? Like mentioning IDP?

Contributor Author

Added a comment

var _ v1connect.IDPServiceHandler = ((*IDPService)(nil))

// GetIDToken implements v1connect.IDPServiceHandler
func (srv *IDPService) GetIDToken(ctx context.Context, req *connect.Request[v1.GetIDTokenRequest]) (*connect.Response[v1.GetIDTokenResponse], error) {
Member

Would be nice to have a unit test (like the other API packages) for this just to avoid setting the precedent

Contributor Author

done

@csweichel csweichel force-pushed the cw/idp branch 3 times, most recently from f6bb261 to 0f47870 Compare March 3, 2023 11:16
@csweichel csweichel marked this pull request as ready for review March 3, 2023 11:28
@csweichel csweichel requested review from a team March 3, 2023 11:28
@github-actions github-actions bot added team: webapp Issue belongs to the WebApp team team: IDE labels Mar 3, 2023
@csweichel
Contributor Author

/hold

@csweichel
Contributor Author

/hold cancel

(verified that the Redis readiness probe works and public-api-server starts)

Use: "aws",
Short: "Login to AWS",
RunE: func(cmd *cobra.Command, args []string) error {
cmd.SilenceUsage = true
Member

Check if the aws binary exists to avoid confusing errors?

Contributor Author

This is the behaviour if the AWS binary is not available:
(screenshot not preserved)

What should the error look like instead?

Member

@aledbf aledbf Mar 3, 2023

I was thinking of requesting to install the CLI, but that is just fine.

Contributor Author

We could link to the AWS Docs. Arguably though, if someone's using this feature they're deep enough into it that they'll have looked at our (yet to be written) docs which will need to state that the AWS CLI needs to be installed.
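For reference, the pre-flight check floated in this thread could be as small as the sketch below. It was not adopted in the PR; requireBinary and the hint text are hypothetical, and exec.LookPath from the standard library does the actual lookup.

```go
package main

import (
	"fmt"
	"os/exec"
)

// requireBinary returns a descriptive error when name cannot be found on PATH.
// The hint is appended so the message can point users at install instructions.
func requireBinary(name, hint string) error {
	if _, err := exec.LookPath(name); err != nil {
		return fmt.Errorf("%q not found in PATH: %s", name, hint)
	}
	return nil
}

func main() {
	if err := requireBinary("aws", "install the AWS CLI first"); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("aws CLI found")
}
```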

@csweichel
Contributor Author

/hold cancel

Redis isn't a stateful set yet, which means that when Redis gets restarted we lose the keys. That's likely to happen around deployments - and expected.

@csweichel
Contributor Author

/hold

Updating the key cache mechanism to make it more resilient against Redis failures

@csweichel
Contributor Author

/hold cancel

// vault write auth/jwt/login role=demo jwt=$TKN -format=json
out, err := exec.Command("vault", "write", "-format=json", "auth/jwt/login", "role="+idpLoginVaultOpts.Role, "jwt="+tkn).CombinedOutput()
if err != nil {
return fmt.Errorf("%w: %s", err, string(out))
Contributor

Suggested change
- return fmt.Errorf("%w: %s", err, string(out))
+ return xerrors.Errorf("%w: %s", err, string(out))

RunE: func(cmd *cobra.Command, args []string) error {
cmd.SilenceUsage = true
if idpLoginAwsOpts.RoleARN == "" {
return fmt.Errorf("missing --role-arn or IDP_AWS_ROLE_ARN env var")
Contributor

Suggested change
- return fmt.Errorf("missing --role-arn or IDP_AWS_ROLE_ARN env var")
+ return xerrors.Errorf("missing --role-arn or IDP_AWS_ROLE_ARN env var")

awsCmd := exec.Command("aws", "sts", "assume-role-with-web-identity", "--role-arn", idpLoginAwsOpts.RoleARN, "--role-session-name", fmt.Sprintf("gitpod-%d", time.Now().Unix()), "--web-identity-token", tkn)
out, err := awsCmd.CombinedOutput()
if err != nil {
return fmt.Errorf("%w: %s", err, string(out))
Contributor

Suggested change
- return fmt.Errorf("%w: %s", err, string(out))
+ return xerrors.Errorf("%w: %s", err, string(out))

@@ -114,13 +129,22 @@ func Start(logger *logrus.Entry, version string, cfg *config.Configuration) erro

oidcService := oidc.NewService(cfg.SessionServiceAddress, dbConn, cipherSet, stateJWT)

if redisClient == nil {
Contributor

Why do we need this check here? We already call err = redisClient.Ping(ctx).Err() above; if redisClient were nil, I think that call would fail or panic.

Contributor Author

RedisClient is never nil and without the Ping call we'll fail at runtime if redis isn't available. This way the service doesn't start rather than fail strangely when running.

@roboquat roboquat merged commit da4cafd into main Mar 3, 2023
@roboquat roboquat deleted the cw/idp branch March 3, 2023 16:11
@iQQBot
Contributor

iQQBot commented Mar 3, 2023

Ah, I forgot to add a hold, but it's a small thing

@roboquat roboquat added deployed: webapp Meta team change is running in production deployed: IDE IDE change is running in production deployed Change is completely running in production labels Mar 6, 2023
Labels
deployed: IDE IDE change is running in production deployed: webapp Meta team change is running in production deployed Change is completely running in production release-note-none size/XXL team: IDE team: webapp Issue belongs to the WebApp team
5 participants