sql: introduce crdb_internal.index_usage_stats virtual table
This commit introduces the crdb_internal.index_usage_statistics virtual
table, which is backed by the new clusterindexusagestats package. The
new package implements a variant of the indexusagestats interface
and serves the data by issuing a cluster-wide RPC fanout.

Addresses cockroachdb#64740

Followup to cockroachdb#66639

Release note (sql change): introduce the crdb_internal.index_usage_statistics
virtual table to surface index usage statistics. The
sql.metrics.index_usage_stats.enabled cluster setting can be used to
turn the subsystem on or off. It defaults to true.
Azhng committed Aug 4, 2021
1 parent 875b969 commit 4e36b23
Showing 19 changed files with 1,717 additions and 1,582 deletions.
1 change: 1 addition & 0 deletions pkg/cli/testdata/zip/partial1
@@ -29,6 +29,7 @@ debug zip --concurrency=1 --cpu-profile-duration=0s /dev/null
[cluster] retrieving SQL data for crdb_internal.partitions... writing output: debug/crdb_internal.partitions.txt... done
[cluster] retrieving SQL data for crdb_internal.zones... writing output: debug/crdb_internal.zones.txt... done
[cluster] retrieving SQL data for crdb_internal.invalid_objects... writing output: debug/crdb_internal.invalid_objects.txt... done
[cluster] retrieving SQL data for crdb_internal.index_usage_statistics... writing output: debug/crdb_internal.index_usage_statistics.txt... done
[cluster] requesting nodes... received response... converting to JSON... writing binary output: debug/nodes.json... done
[cluster] requesting liveness... received response... converting to JSON... writing binary output: debug/liveness.json... done
[node 1] node status... converting to JSON... writing binary output: debug/nodes/1/status.json... done
1 change: 1 addition & 0 deletions pkg/cli/testdata/zip/partial1_excluded
@@ -29,6 +29,7 @@ debug zip /dev/null --concurrency=1 --exclude-nodes=2 --cpu-profile-duration=0
[cluster] retrieving SQL data for crdb_internal.partitions... writing output: debug/crdb_internal.partitions.txt... done
[cluster] retrieving SQL data for crdb_internal.zones... writing output: debug/crdb_internal.zones.txt... done
[cluster] retrieving SQL data for crdb_internal.invalid_objects... writing output: debug/crdb_internal.invalid_objects.txt... done
[cluster] retrieving SQL data for crdb_internal.index_usage_statistics... writing output: debug/crdb_internal.index_usage_statistics.txt... done
[cluster] requesting nodes... received response... converting to JSON... writing binary output: debug/nodes.json... done
[cluster] requesting liveness... received response... converting to JSON... writing binary output: debug/liveness.json... done
[node 1] node status... converting to JSON... writing binary output: debug/nodes/1/status.json... done
1 change: 1 addition & 0 deletions pkg/cli/testdata/zip/partial2
@@ -29,6 +29,7 @@ debug zip --concurrency=1 --cpu-profile-duration=0 /dev/null
[cluster] retrieving SQL data for crdb_internal.partitions... writing output: debug/crdb_internal.partitions.txt... done
[cluster] retrieving SQL data for crdb_internal.zones... writing output: debug/crdb_internal.zones.txt... done
[cluster] retrieving SQL data for crdb_internal.invalid_objects... writing output: debug/crdb_internal.invalid_objects.txt... done
[cluster] retrieving SQL data for crdb_internal.index_usage_statistics... writing output: debug/crdb_internal.index_usage_statistics.txt... done
[cluster] requesting nodes... received response... converting to JSON... writing binary output: debug/nodes.json... done
[cluster] requesting liveness... received response... converting to JSON... writing binary output: debug/liveness.json... done
[node 1] node status... converting to JSON... writing binary output: debug/nodes/1/status.json... done
1 change: 1 addition & 0 deletions pkg/cli/testdata/zip/testzip
@@ -29,6 +29,7 @@ debug zip --concurrency=1 --cpu-profile-duration=1s /dev/null
[cluster] retrieving SQL data for crdb_internal.partitions... writing output: debug/crdb_internal.partitions.txt... done
[cluster] retrieving SQL data for crdb_internal.zones... writing output: debug/crdb_internal.zones.txt... done
[cluster] retrieving SQL data for crdb_internal.invalid_objects... writing output: debug/crdb_internal.invalid_objects.txt... done
[cluster] retrieving SQL data for crdb_internal.index_usage_statistics... writing output: debug/crdb_internal.index_usage_statistics.txt... done
[cluster] requesting nodes... received response... converting to JSON... writing binary output: debug/nodes.json... done
[cluster] requesting liveness... received response... converting to JSON... writing binary output: debug/liveness.json... done
[cluster] requesting CPU profiles
3 changes: 3 additions & 0 deletions pkg/cli/testdata/zip/testzip_concurrent
@@ -261,6 +261,9 @@ zip
[cluster] retrieving SQL data for crdb_internal.cluster_transactions...
[cluster] retrieving SQL data for crdb_internal.cluster_transactions: done
[cluster] retrieving SQL data for crdb_internal.cluster_transactions: writing output: debug/crdb_internal.cluster_transactions.txt...
[cluster] retrieving SQL data for crdb_internal.index_usage_statistics...
[cluster] retrieving SQL data for crdb_internal.index_usage_statistics: done
[cluster] retrieving SQL data for crdb_internal.index_usage_statistics: writing output: debug/crdb_internal.index_usage_statistics.txt...
[cluster] retrieving SQL data for crdb_internal.invalid_objects...
[cluster] retrieving SQL data for crdb_internal.invalid_objects: done
[cluster] retrieving SQL data for crdb_internal.invalid_objects: writing output: debug/crdb_internal.invalid_objects.txt...
1 change: 1 addition & 0 deletions pkg/cli/zip_cluster_wide.go
@@ -95,6 +95,7 @@ var debugZipTablesPerCluster = []string{
"crdb_internal.partitions",
"crdb_internal.zones",
"crdb_internal.invalid_objects",
"crdb_internal.index_usage_statistics",
}

// collectClusterData runs the data collection that only needs to
12 changes: 6 additions & 6 deletions pkg/server/index_usage_stats_test.go
@@ -220,24 +220,24 @@ func TestStatusAPIIndexUsage(t *testing.T) {
}, /* expectedKeys */ 4 /* expectedEventCnt*/, 5*time.Second /* timeout */)

// First node should have nothing.
stats := firstLocalStatsReader.Get(indexKeyA)
stats := firstLocalStatsReader.Get(indexKeyA.TableID, indexKeyA.IndexID)
require.Equal(t, roachpb.IndexUsageStatistics{}, stats, "expecting empty stats on node 1, but found %v", stats)

stats = firstLocalStatsReader.Get(indexKeyB)
stats = firstLocalStatsReader.Get(indexKeyB.TableID, indexKeyB.IndexID)
require.Equal(t, roachpb.IndexUsageStatistics{}, stats, "expecting empty stats on node 1, but found %v", stats)

// Third node should have nothing.
stats = thirdLocalStatsReader.Get(indexKeyA)
stats = thirdLocalStatsReader.Get(indexKeyA.TableID, indexKeyA.IndexID)
require.Equal(t, roachpb.IndexUsageStatistics{}, stats, "expecting empty stats on node 3, but found %v", stats)

stats = thirdLocalStatsReader.Get(indexKeyB)
stats = thirdLocalStatsReader.Get(indexKeyB.TableID, indexKeyB.IndexID)
require.Equal(t, roachpb.IndexUsageStatistics{}, stats, "expecting empty stats on node 3, but found %v", stats)

// Second server should have nonempty local storage.
stats = secondLocalStatsReader.Get(indexKeyA)
stats = secondLocalStatsReader.Get(indexKeyA.TableID, indexKeyA.IndexID)
compareStatsHelper(t, expectedStatsIndexA, stats, time.Minute)

stats = secondLocalStatsReader.Get(indexKeyB)
stats = secondLocalStatsReader.Get(indexKeyB.TableID, indexKeyB.IndexID)
compareStatsHelper(t, expectedStatsIndexB, stats, time.Minute)

// Test cluster-wide RPC.
1 change: 1 addition & 0 deletions pkg/sql/catalog/catconstants/constants.go
@@ -56,6 +56,7 @@ const (
CrdbInternalGossipLivenessTableID
CrdbInternalGossipNetworkTableID
CrdbInternalIndexColumnsTableID
CrdbInternalIndexUsageStatisticsTableID
CrdbInternalInflightTraceSpanTableID
CrdbInternalJobsTableID
CrdbInternalKVNodeStatusTableID
57 changes: 57 additions & 0 deletions pkg/sql/crdb_internal.go
@@ -46,6 +46,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/schemaexpr"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/tabledesc"
"github.com/cockroachdb/cockroach/pkg/sql/idxusage"
"github.com/cockroachdb/cockroach/pkg/sql/pgwire/pgcode"
"github.com/cockroachdb/cockroach/pkg/sql/pgwire/pgerror"
"github.com/cockroachdb/cockroach/pkg/sql/privilege"
@@ -110,6 +111,7 @@ var crdbInternal = virtualSchema{
catconstants.CrdbInternalGossipLivenessTableID: crdbInternalGossipLivenessTable,
catconstants.CrdbInternalGossipNetworkTableID: crdbInternalGossipNetworkTable,
catconstants.CrdbInternalIndexColumnsTableID: crdbInternalIndexColumnsTable,
catconstants.CrdbInternalIndexUsageStatisticsTableID: crdbInternalIndexUsageStatistics,
catconstants.CrdbInternalInflightTraceSpanTableID: crdbInternalInflightTraceSpanTable,
catconstants.CrdbInternalJobsTableID: crdbInternalJobsTable,
catconstants.CrdbInternalKVNodeStatusTableID: crdbInternalKVNodeStatusTable,
@@ -4838,3 +4840,58 @@ CREATE TABLE crdb_internal.lost_descriptors_with_data (
return nil
},
}

var crdbInternalIndexUsageStatistics = virtualSchemaTable{
comment: `cluster-wide index usage statistics (in-memory, not durable). ` +
`Querying this table is an expensive operation since it creates a ` +
`cluster-wide RPC fanout.`,
schema: `
CREATE TABLE crdb_internal.index_usage_statistics (
table_id INT NOT NULL,
index_id INT NOT NULL,
total_reads INT NOT NULL,
last_read TIMESTAMPTZ
)
`,
generator: func(ctx context.Context, p *planner, dbContext catalog.DatabaseDescriptor, stopper *stop.Stopper) (virtualTableGenerator, cleanupFunc, error) {
// Perform RPC Fanout.
stats, err :=
p.extendedEvalCtx.SQLStatusServer.IndexUsageStatistics(ctx, &serverpb.IndexUsageStatisticsRequest{})
if err != nil {
return nil, nil, err
}
indexStats := idxusage.NewLocalIndexUsageStatsFromExistingStats(&idxusage.Config{}, stats.Statistics)

row := make(tree.Datums, 4 /* number of columns for this virtual table */)
worker := func(pusher rowPusher) error {
return forEachTableDescAll(ctx, p, dbContext, hideVirtual,
func(db catalog.DatabaseDescriptor, _ string, table catalog.TableDescriptor) error {
tableID := table.GetID()
return catalog.ForEachIndex(table, catalog.IndexOpts{}, func(idx catalog.Index) error {
indexID := idx.GetID()
stats := indexStats.Get(roachpb.TableID(tableID), roachpb.IndexID(indexID))

lastScanTs := tree.DNull
if !stats.LastRead.IsZero() {
lastScanTs, err = tree.MakeDTimestampTZ(stats.LastRead, time.Nanosecond)
if err != nil {
return err
}
}

row = row[:0]

row = append(row,
tree.NewDInt(tree.DInt(tableID)), // table_id
tree.NewDInt(tree.DInt(indexID)), // index_id
tree.NewDInt(tree.DInt(stats.TotalReadCount)), // total_reads
lastScanTs, // last_read
)

return pusher.pushRow(row...)
})
})
}
return setupGenerator(ctx, worker, stopper)
},
}
85 changes: 64 additions & 21 deletions pkg/sql/idxusage/local_idx_usage_stats.go
@@ -130,6 +130,20 @@ func NewLocalIndexUsageStats(cfg *Config) *LocalIndexUsageStats {
return is
}

// NewLocalIndexUsageStatsFromExistingStats returns a new instance of
// LocalIndexUsageStats that is populated using the given
// []roachpb.CollectedIndexUsageStatistics. This constructor can be used to
// quickly aggregate the index usage statistics received from the RPC fanout.
// It is more efficient than the regular insert path because it performs the
// inserts without taking the RWMutex lock.
func NewLocalIndexUsageStatsFromExistingStats(
cfg *Config, stats []roachpb.CollectedIndexUsageStatistics,
) *LocalIndexUsageStats {
s := NewLocalIndexUsageStats(cfg)
s.batchInsertUnsafe(stats)
return s
}

// Start starts the background goroutine that is responsible for collecting
// index usage statistics.
func (s *LocalIndexUsageStats) Start(ctx context.Context, stopper *stop.Stopper) {
@@ -159,11 +173,13 @@ func (s *LocalIndexUsageStats) record(ctx context.Context, payload indexUse) {
}

// Get returns the index usage statistics for a given key.
func (s *LocalIndexUsageStats) Get(key roachpb.IndexUsageKey) roachpb.IndexUsageStatistics {
func (s *LocalIndexUsageStats) Get(
tableID roachpb.TableID, indexID roachpb.IndexID,
) roachpb.IndexUsageStatistics {
s.mu.RLock()
defer s.mu.RUnlock()

table, ok := s.mu.usageStats[key.TableID]
table, ok := s.mu.usageStats[tableID]
if !ok {
// We return a copy of the empty stats.
emptyStats := emptyIndexUsageStats
@@ -173,7 +189,7 @@ func (s *LocalIndexUsageStats) Get(key roachpb.IndexUsageKey) roachpb.IndexUsage
table.RLock()
defer table.RUnlock()

indexStats, ok := table.stats[key.IndexID]
indexStats, ok := table.stats[indexID]
if !ok {
emptyStats := emptyIndexUsageStats
return emptyStats
@@ -209,7 +225,7 @@ func (s *LocalIndexUsageStats) ForEach(options IteratorOptions, visitor StatsVis
s.mu.RUnlock()

for _, tableID := range tableIDLists {
tableIdxStats := s.getStatsForTableID(tableID, false /* createIfNotExists */)
tableIdxStats := s.getStatsForTableID(tableID, false /* createIfNotExists */, false /* unsafe */)

// This means the data is being cleared before we can fetch it. It's not an
// error, so we simply just skip over it.
@@ -231,6 +247,20 @@ func (s *LocalIndexUsageStats) ForEach(options IteratorOptions, visitor StatsVis
return nil
}

// batchInsertUnsafe inserts otherStats into s without taking the write lock.
// This should only be called during initialization, when we can be sure there
// are no other users of s. This avoids the locking overhead when it's not
// necessary.
func (s *LocalIndexUsageStats) batchInsertUnsafe(
otherStats []roachpb.CollectedIndexUsageStatistics,
) {
for _, newStats := range otherStats {
tableIndexStats := s.getStatsForTableID(newStats.Key.TableID, true /* createIfNotExists */, true /* unsafe */)
stats := tableIndexStats.getStatsForIndexID(newStats.Key.IndexID, true /* createIfNotExists */, true /* unsafe */)
stats.Add(&newStats.Stats)
}
}

func (s *LocalIndexUsageStats) clear() {
s.mu.Lock()
defer s.mu.Unlock()
@@ -241,8 +271,8 @@ func (s *LocalIndexUsageStats) clear() {
}

func (s *LocalIndexUsageStats) insertIndexUsage(idxUse *indexUse) {
tableStats := s.getStatsForTableID(idxUse.key.TableID, true /* createIfNotExists */)
indexStats := tableStats.getStatsForIndexID(idxUse.key.IndexID, true /* createIfNotExists */)
tableStats := s.getStatsForTableID(idxUse.key.TableID, true /* createIfNotExists */, false /* unsafe */)
indexStats := tableStats.getStatsForIndexID(idxUse.key.IndexID, true /* createIfNotExists */, false /* unsafe */)
indexStats.Lock()
defer indexStats.Unlock()
switch idxUse.usageTyp {
@@ -259,15 +289,21 @@ func (s *LocalIndexUsageStats) insertIndexUsage(idxUse *indexUse) {
}
}

// getStatsForTableID returns the tableIndexStats for the given roachpb.TableID.
// If unsafe is set to true, then the lookup is performed without acquiring the
// internal RWMutex. This can be used when LocalIndexUsageStats is not being
// concurrently accessed.
func (s *LocalIndexUsageStats) getStatsForTableID(
id roachpb.TableID, createIfNotExists bool,
id roachpb.TableID, createIfNotExists bool, unsafe bool,
) *tableIndexStats {
if createIfNotExists {
s.mu.Lock()
defer s.mu.Unlock()
} else {
s.mu.RLock()
defer s.mu.RUnlock()
if !unsafe {
if createIfNotExists {
s.mu.Lock()
defer s.mu.Unlock()
} else {
s.mu.RLock()
defer s.mu.RUnlock()
}
}

if tableIndexStats, ok := s.mu.usageStats[id]; ok {
@@ -286,15 +322,22 @@ func (s *LocalIndexUsageStats) getStatsForTableID(
return nil
}

// getStatsForIndexID returns the indexStats for the given roachpb.IndexID.
// If unsafe is set to true, then the lookup is performed without acquiring the
// internal RWMutex. This can be used when tableIndexStats is not being
// concurrently accessed.
func (t *tableIndexStats) getStatsForIndexID(
id roachpb.IndexID, createIfNotExists bool,
id roachpb.IndexID, createIfNotExists bool, unsafe bool,
) *indexStats {
if createIfNotExists {
t.Lock()
defer t.Unlock()
} else {
t.RLock()
defer t.RUnlock()
if !unsafe {
if createIfNotExists {
t.Lock()
defer t.Unlock()
} else {
t.RLock()
defer t.RUnlock()
}

}

if stats, ok := t.stats[id]; ok {
@@ -329,7 +372,7 @@ func (t *tableIndexStats) iterateIndexStats(
}

for _, indexID := range indexIDs {
indexStats := t.getStatsForIndexID(indexID, false /* createIfNotExists */)
indexStats := t.getStatsForIndexID(indexID, false /* createIfNotExists */, false /* unsafe */)

// This means the data is being cleared before we can fetch it. It's not an
// error, so we simply just skip over it.
2 changes: 1 addition & 1 deletion pkg/sql/idxusage/local_index_usage_stats_test.go
@@ -149,7 +149,7 @@ func TestIndexUsageStatisticsSubsystem(t *testing.T) {
t.Run("point lookup", func(t *testing.T) {
actualEntryCount := 0
for _, index := range indices {
stats := localIndexUsage.Get(index)
stats := localIndexUsage.Get(index.TableID, index.IndexID)
require.NotNil(t, stats)

actualEntryCount++
1 change: 1 addition & 0 deletions pkg/sql/logictest/testdata/logic_test/crdb_internal
@@ -38,6 +38,7 @@ crdb_internal gossip_liveness table NULL NULL NULL
crdb_internal gossip_network table NULL NULL NULL
crdb_internal gossip_nodes table NULL NULL NULL
crdb_internal index_columns table NULL NULL NULL
crdb_internal index_usage_statistics table NULL NULL NULL
crdb_internal interleaved table NULL NULL NULL
crdb_internal invalid_objects table NULL NULL NULL
crdb_internal jobs table NULL NULL NULL
1 change: 1 addition & 0 deletions pkg/sql/logictest/testdata/logic_test/crdb_internal_tenant
@@ -51,6 +51,7 @@ crdb_internal gossip_liveness table NULL NULL NULL
crdb_internal gossip_network table NULL NULL NULL
crdb_internal gossip_nodes table NULL NULL NULL
crdb_internal index_columns table NULL NULL NULL
crdb_internal index_usage_statistics table NULL NULL NULL
crdb_internal interleaved table NULL NULL NULL
crdb_internal invalid_objects table NULL NULL NULL
crdb_internal jobs table NULL NULL NULL
11 changes: 11 additions & 0 deletions pkg/sql/logictest/testdata/logic_test/create_statements
@@ -412,6 +412,17 @@ CREATE TABLE crdb_internal.index_columns (
column_direction STRING NULL,
implicit BOOL NULL
) {} {}
CREATE TABLE crdb_internal.index_usage_statistics (
table_id INT8 NOT NULL,
index_id INT8 NOT NULL,
total_reads INT8 NOT NULL,
last_read TIMESTAMPTZ NULL
) CREATE TABLE crdb_internal.index_usage_statistics (
table_id INT8 NOT NULL,
index_id INT8 NOT NULL,
total_reads INT8 NOT NULL,
last_read TIMESTAMPTZ NULL
) {} {}
CREATE TABLE crdb_internal.interleaved (
database_name STRING NOT NULL,
schema_name STRING NOT NULL,
1 change: 1 addition & 0 deletions pkg/sql/logictest/testdata/logic_test/grant_table
@@ -50,6 +50,7 @@ test crdb_internal gossip_liveness public
test crdb_internal gossip_network public SELECT
test crdb_internal gossip_nodes public SELECT
test crdb_internal index_columns public SELECT
test crdb_internal index_usage_statistics public SELECT
test crdb_internal interleaved public SELECT
test crdb_internal invalid_objects public SELECT
test crdb_internal jobs public SELECT