Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql,server: fetch privileges for virtual tables on startup #93444

Closed
wants to merge 3 commits into from

Conversation

rafiss
Copy link
Collaborator

@rafiss rafiss commented Dec 12, 2022

This is a reworking of #93437
fixes #93182

This commit attempts to alleviate the pain caused by synthetic virtual table privileges introduced in 22.2. We need to fetch privileges for virtual tables to determine whether the user has access. This is done lazily. However, when a user attempts to read a virtual table like pg_class or run SHOW TABLES it will force the privileges to be determined for each virtual table (of which there are 290 at the time of writing). This sequential process can be somewhat slow in a single region cluster and will be very slow in an MR cluster.

This patch attempts to somewhat alleviate this pain by scanning the table eagerly during server startup.

Release note (performance improvement): In 22.2 we introduced privileges on virtual tables (system catalogs like pg_catalog, information_schema, and crdb_internal). A problem with this new feature is that we now must fetch those privileges into a cache before we can use those tables or determine their visibility in other system catalogs. This process used to occur on-demand, when the privilege was needed. Now we'll fetch these privileges eagerly during startup to mitigate the latency when accessing pg_catalog right after the server boots up.

@rafiss rafiss requested a review from a team as a code owner December 12, 2022 18:32
@rafiss rafiss requested a review from a team December 12, 2022 18:32
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is much better, thanks for doing it! Just small stuff from me

Comment on lines 965 to 968
start := timeutil.Now()
var totalErr error
if totalErr != nil {
log.Warningf(ctx, "failed to warm privileges for virtual tables: %v", totalErr)
} else {
log.Infof(ctx, "warmed privileges for virtual tables in %v", timeutil.Since(start))
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this part needs to be reworked to achieve the intended behavior with regards to logging.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops i meant to use defer

`SELECT path, username, privileges, grant_options FROM system.%s WHERE path LIKE '/%s/%%'`,
syntheticprivilege.VirtualTablePathPrefix,
)
totalErr = ief.DescsTxnWithExecutor(ctx, db, nil /* sessionData */, func(ctx context.Context, txn *kv.Txn, descsCol *descs.Collection, ie sqlutil.InternalExecutor) (retErr error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can just call this err now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

user := tree.MustBeDString(it.Cur()[1])
privArr := tree.MustBeDArray(it.Cur()[2])
grantOptionArr := tree.MustBeDArray(it.Cur()[3])
privileges, err := synthesizePrivilegeDescriptorFromSystemPrivilegesRow(privilege.VirtualTable, user, privArr, grantOptionArr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like there's a row in the system.privileges table per user, but then this overwrites the privilege descriptor for each row. Am I missing something, or do we need to collect the rows for all of the users for a given table together?

grant select on crdb_internal.tables to root;
grant select on crdb_internal.tables to foo; 
select * from system.privileges;                                                             
  username |             path             | privileges | grant_options
-----------+------------------------------+------------+----------------
  foo      | /vtable/crdb_internal/tables | {SELECT}   | {}
  root     | /vtable/crdb_internal/tables | {SELECT}   | {}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i see. i missed that in the existing usage of synthesizePrivilegeDescriptorFromSystemPrivileges -- it makes a *PrivilegeDescriptor and accumulates everything into that.

catconstants.SystemPrivilegeTableName,
syntheticprivilege.VirtualTablePathPrefix,
)
err = ief.DescsTxnWithExecutor(ctx, db, nil /* sessionData */, func(ctx context.Context, txn *kv.Txn, descsCol *descs.Collection, ie sqlutil.InternalExecutor) (retErr error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: wrap

}
if systemPrivDesc.IsUncommittedVersion() {
// This shouldn't ever happen, but if it does somehow, then we can't pre-warm the cache.
return errors.Newf("%s is at an uncommitted version", syntheticprivilege.SystemPrivilegesTableName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

report or panic?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -32,9 +32,12 @@ const VirtualTablePrivilegeType = "VirtualTable"

var _ Object = &VirtualTablePrivilege{}

// VirtualTablePathPrefix is the prefix used for virtual table privileges in system.privileges.
var VirtualTablePathPrefix = "vtable"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: const

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -1519,6 +1519,11 @@ func (s *SQLServer) preStart(
)

scheduledlogging.Start(ctx, stopper, s.execCfg.DB, s.execCfg.Settings, s.internalExecutor, s.execCfg.CaptureIndexUsageStatsKnobs)
sql.WarmSyntheticPrivilegeCacheForVirtualTables(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we make this async? Adding a synchronous round-trip is sort of lame. I suppose the counter-argument is that if we don't make it async then we'll have a hazard that a concurrent query which needs the privileges will go and fetch them. Think there's some way we can make this synchronize with the single-flight or something?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, well i could add a WaitGroup on the cache to signal that warming is still happening. let me give that a shot

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this wouldn't be particularly hard if the cache were a struct and we had some channel or something for the initial population. I think the raw cacheutil.Cache is making things seem more complex than they are.

return privDesc, err
}

func accumulatePrivilegeDescriptorFromSystemPrivilegesRow(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how would you feel about making a little struct and making a method to add a new row to it? There's something a little offputting about the way we keep passing the same object here but then we also stick it into the map. If we had some sort of accumulator struct, it'd feel a little cleaner to me.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that seems good

var tableVersions []descpb.DescriptorVersion
vtablePathToPrivilegeDescs := make(map[string]*catpb.PrivilegeDescriptor)
query := fmt.Sprintf(
`SELECT path, username, privileges, grant_options FROM system.%s WHERE path LIKE '/%s/%%'`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized that we don't have an index on path. That seems bad in general -- it means we always do a full table scan.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filed #93525

Comment on lines 1016 to 965
path := tree.MustBeDString(it.Cur()[0])
user := tree.MustBeDString(it.Cur()[1])
privArr := tree.MustBeDArray(it.Cur()[2])
grantOptionArr := tree.MustBeDArray(it.Cur()[3])
privDesc, ok := vtablePathToPrivilegeDescs[string(path)]
if !ok {
privDesc = &catpb.PrivilegeDescriptor{}
}
if err := accumulatePrivilegeDescriptorFromSystemPrivilegesRow(privilege.VirtualTable, privDesc, user, privArr, grantOptionArr); err != nil {
return err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my vision on this is:

type syntheticPrivilegeAccumulator struct {
   desc *catpb.PrivilegeDescriptor
   objectType privilege.ObjectType
}

func (s *syntheticPrivilegeAccumulator) addRow(row tree.Datums) error {
   if len(row) != 4 {
      // assertion failure
   }
   path := tree.MustBeDString(row[0])
	 user := tree.MustBeDString(row[1])
	 privArr := tree.MustBeDArray(row[2])
	 grantOptionArr := tree.MustBeDArray(row[3])

   //... to modify the privilege descriptor
}

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe forgot to push?

@@ -1519,6 +1519,11 @@ func (s *SQLServer) preStart(
)

scheduledlogging.Start(ctx, stopper, s.execCfg.DB, s.execCfg.Settings, s.internalExecutor, s.execCfg.CaptureIndexUsageStatsKnobs)
sql.WarmSyntheticPrivilegeCacheForVirtualTables(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this wouldn't be particularly hard if the cache were a struct and we had some channel or something for the initial population. I think the raw cacheutil.Cache is making things seem more complex than they are.

fixes cockroachdb#93182

This commit attempts to alleviate the pain caused by synthetic virtual table
privileges introduced in 22.2. We need to fetch privileges for virtual tables
to determine whether the user has access. This is done lazily. However, when
a user attempts to read a virtual table like pg_class or run `SHOW TABLES` it
will force the privileges to be determined for each virtual table (of which
there are 290 at the time of writing). This sequential process can be somewhat
slow in a single region cluster and will be *very* slow in an MR cluster.

This patch attempts to somewhat alleviate this pain by scanning the
table eagerly during server startup.

Release note (performance improvement): In 22.2 we introduced privileges on
virtual tables (system catalogs like pg_catalog, information_schema, and
crdb_internal). A problem with this new feature is that we now must fetch those
privileges into a cache before we can use those tables or determine their
visibility in other system catalogs. This process used to occur on-demand, when
the privilege was needed. Now we'll fetch these privileges eagerly during
startup to mitigate the latency when accessing pg_catalog right after the
server boots up.
@rafiss
Copy link
Collaborator Author

rafiss commented Dec 13, 2022

It's pushed now. I added the WaitGroup to the cache, but if you think I should, I can make it so there are more functions on the cache to help with using it.

@rafiss rafiss requested a review from ajwerner December 13, 2022 19:36
@@ -28,6 +29,7 @@ import (
// during user authentication and session initialization.
type Cache struct {
syncutil.Mutex
WarmingGroup *sync.WaitGroup
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with WaitGroup is that it doesn't compose with context cancellation. It also feels a little weird to me to attach it to cacheutil.Cache when this is the only one

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my thought was that any cache should be able to implement cache warming logic

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair enough, but I still think there's lots of value in wrapping the generic mechanism even if it's a thin wrapper.

@@ -1519,6 +1519,9 @@ func (s *SQLServer) preStart(
)

scheduledlogging.Start(ctx, stopper, s.execCfg.DB, s.execCfg.Settings, s.internalExecutor, s.execCfg.CaptureIndexUsageStatsKnobs)
if err := stopper.RunAsyncTask(ctx, "warm-synthetic-privilege-cache", s.execCfg.WarmSyntheticPrivilegeCacheForVirtualTables); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is necessarily safe because you're incrementing the waitgroup counter in a goroutine. Doesn't the waitgroup not like it when you add after calls to Wait?

},
)
if err != nil {
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't think naked returns aid in readability here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wasn't intentional

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i misread which line this was on. this function doesn't have a return value. how would you suggest i short-circuit here? do you think i should add an error return value?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops my bad

Callers can use the new Warm function to warm the cache. Warming will be
cancelled if the stopper is stopped.

Release note: None
@rafiss
Copy link
Collaborator Author

rafiss commented Dec 13, 2022

I still think there's lots of value in wrapping the generic mechanism even if it's a thin wrapper.

there's a thin wrapper now. i've also fixed it so it respects context cancellation. I'm not very happy with this WaitForWarm thing... technically we don't even need it and we could just not wait for warming to finish.

@rafiss rafiss requested a review from ajwerner December 13, 2022 22:27
@ajwerner
Copy link
Contributor

I don't think I was clear enough about whose context cancellation I was worried about. I was worried about the concurrent queries more than I was worried about the warming routine. It's the Wait call which doesn't compose with cancellation.

@ajwerner
Copy link
Contributor

I was having a hard time exactly verbalizing the refactor I was looking for in a way other than showing. Can you take a look at #93557 and see if any part of that refactor offends your sensibilities?

@rafiss
Copy link
Collaborator Author

rafiss commented Dec 13, 2022

eh.. then maybe we just get rid of the Wait calls. all it means is a small chance of redundant work during the first 100ms of server startup (server startup, not session startup), only if the user uses a virtual table during that window.

@rafiss
Copy link
Collaborator Author

rafiss commented Dec 14, 2022

replaced by #93557

@rafiss rafiss closed this Dec 14, 2022
@rafiss rafiss deleted the warm-vtable-privs branch December 14, 2022 16:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: leasing of virtual table privileges adds to latency on first use
3 participants