Skip to content

Commit

Permalink
Browse files Browse the repository at this point in the history
75126: roachprod: move default env. vars into config.go and remove the duplicates r=srosenberg a=srosenberg

Both, COCKROACH_ENABLE_RPC_COMPRESSION=false and
COCKROACH_UI_RELEASE_NOTES_SIGNUP_DISMISSED=true are passed into ENV_VARS of
the systemd startup script (start.sh) by default.
The same behavior is preserved _modulo_ duplication.
Since roachprod can now be initialized either via CLI or API,
default env. vars have been consolidated into config.go (used by both).

Release note: None

78333: sql: record index usage stats in index joins r=maryliag,Azhng a=xinhaoz

Fixes #76173

Previously, the construction of index joins skipped the
recording of the  primary key for index usage stats.
This commit records the use of the primary key if the
statement is not of the EXPLAIN variety.

Release note (bug fix): Index usage stats are now properly
captured for index joins.

----------------------------------
Example: We should expect to see a read from the pkey after an index join.
```
root@:26257/defaultdb> CREATE TABLE test (c1 INT, c2 INT, c3 INT, INDEX(c1, c2));
CREATE TABLE


Time: 120ms total (execution 120ms / network 0ms)

root@:26257/defaultdb> EXPLAIN select c3 from TEST WHERE c2 > 1 AND c1 = 3;
                                              info
------------------------------------------------------------------------------------------------
  distribution: local
  vectorized: true

  • index join
  │ estimated row count: 1
  │ table: test@test_pkey
  │
  └── • scan
        estimated row count: 1 (100% of the table; stats collected 49 seconds ago)
        table: test@test_c1_c2_idx
        spans: [/3/2 - /3]

  index recommendations: 1
  1. type: index replacement
     SQL commands: CREATE INDEX ON test (c1, c2) STORING (c3); DROP INDEX test@test_c1_c2_idx;
(15 rows)


Time: 1ms total (execution 1ms / network 0ms)

root@:26257/defaultdb> select c3 from TEST WHERE c2 > 1 AND c1 = 3;
  c3
------
(0 rows)


Time: 1ms total (execution 1ms / network 0ms)

root@:26257/defaultdb> select index_name, index_type, total_reads from crdb_internal.index_usage_statistics AS us JOIN crdb_internal.table_indexes ti ON us.index_id = ti.index_id AND us.table_id = ti.descriptor_id ORDER BY total_reads asc;
    index_name   | index_type | total_reads
-----------------+------------+--------------
  test_pkey      | primary    |           1
  test_c1_c2_idx | secondary  |           1
(2 rows)


Time: 7ms total (execution 6ms / network 0ms)
```

78446: sql: new SQL Stats cluster settings to improve write traffic r=Azhng a=Azhng

Resolves #78339

Previously, SQL Stats flushed to system table as soon as the in-memory
buffer is full. This means  the size of the system tables that back SQL
Stats could grow faster than the cleanup job. Additionally, when the
SQL Stats flush is disabled, the SQL Stats is unable to collect any more
new statement / transaction statistics when the in-memory store is full.

This commit introduces two non-public cluster settings:
* `sql.stats.flush.minimum_interval`: this setting limits minimum interval
  between each flush operation. If a flush operation is triggered sooner
  than what is allowed by the minimum interval, (e.g. when the in-memory
  SQL Stats store is full), the flush operation is aborted.
* `sql.stats.flush.force_cleanup.enabled`: which allows the
  in-memory SQL Stats to be cleared at the interval specified by
  `sql.stats.flush.interval`, even if the SQL Stats flush is disabled.

This commit also updated the stmt_grouping_in_explicit_txn data driven
test to ensure the output order is deterministic.

Release note: None

78526: ci: add scripts for bazel-based `ui test`, `ui lint` ci jobs r=rail,jsbarag a=rickystewart

These scripts just do the same logic that `dev` does for the same
functions.

Release note: None

Co-authored-by: Stan Rosenberg <[email protected]>
Co-authored-by: Xin Hao Zhang <[email protected]>
Co-authored-by: Azhng <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
  • Loading branch information
5 people committed Mar 29, 2022
5 parents b9ea426 + f5c22b0 + a57aa7c + f996b30 + c345b54 commit 7f8bdf2
Show file tree
Hide file tree
Showing 20 changed files with 378 additions and 68 deletions.
2 changes: 1 addition & 1 deletion build/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -144,7 +144,7 @@ The `bazelbuilder` image is used exclusively for performing builds using Bazel.
docker manifest push cockroachdb/bazel:$TAG
```
- Then, update `build/teamcity-bazel-support.sh` with the new tag and commit all your changes.
- Ensure the "GitHub CI (Optional)" job passes on your PR before merging.
- Ensure the "Bazel CI" job passes on your PR before merging.

# Dependencies

Expand Down
12 changes: 12 additions & 0 deletions build/bazelbuilder/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,18 @@ RUN curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key a
&& DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
google-cloud-sdk

# chrome is needed for UI tests. Unfortunately it is only distributed for x86_64
# and not for ARM. Chrome shouldn't need to be installed when we migrate away
# from Karma for UI testing.
RUN case ${TARGETPLATFORM} in \
"linux/amd64") \
curl -fsSL https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
&& echo "deb [arch=amd64] https://dl.google.com/linux/chrome/deb/ stable main" | tee /etc/apt/sources.list.d/google.list \
&& apt-get update \
&& DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
google-chrome-stable ;; \
esac

RUN apt-get purge -y \
apt-transport-https \
flex \
Expand Down
2 changes: 1 addition & 1 deletion build/teamcity-bazel-support.sh
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# FYI: You can run `./dev builder` to run this Docker image. :)
# `dev` depends on this variable! Don't change the name or format unless you
# also update `dev` accordingly.
BAZEL_IMAGE=cockroachdb/bazel:20220121-121551
BAZEL_IMAGE=cockroachdb/bazel:20220328-163955

# Call `run_bazel $NAME_OF_SCRIPT` to start an appropriately-configured Docker
# container with the `cockroachdb/bazel` image running the given script.
Expand Down
12 changes: 12 additions & 0 deletions build/teamcity/cockroach/ci/tests/ui_lint.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
#!/usr/bin/env bash

set -euo pipefail

dir="$(dirname $(dirname $(dirname $(dirname $(dirname "${0}")))))"

source "$dir/teamcity-support.sh" # For $root
source "$dir/teamcity-bazel-support.sh" # For run_bazel

tc_start_block "Run UI tests"
run_bazel build/teamcity/cockroach/ci/tests/ui_lint_impl.sh
tc_end_block "Run UI tests"
6 changes: 6 additions & 0 deletions build/teamcity/cockroach/ci/tests/ui_lint_impl.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
#!/usr/bin/env bash

set -xeuo pipefail

bazel build //pkg/cmd/bazci --config=ci
$(bazel info bazel-bin --config=ci)/pkg/cmd/bazci/bazci_/bazci test --config=ci //pkg/ui:lint
12 changes: 12 additions & 0 deletions build/teamcity/cockroach/ci/tests/ui_test.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
#!/usr/bin/env bash

set -euo pipefail

dir="$(dirname $(dirname $(dirname $(dirname $(dirname "${0}")))))"

source "$dir/teamcity-support.sh" # For $root
source "$dir/teamcity-bazel-support.sh" # For run_bazel

tc_start_block "Run UI tests"
run_bazel build/teamcity/cockroach/ci/tests/ui_test_impl.sh
tc_end_block "Run UI tests"
6 changes: 6 additions & 0 deletions build/teamcity/cockroach/ci/tests/ui_test_impl.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
#!/usr/bin/env bash

set -xeuo pipefail

bazel build //pkg/cmd/bazci --config=ci
$(bazel info bazel-bin --config=ci)/pkg/cmd/bazci/bazci_/bazci test --config=ci //pkg/ui/workspaces/db-console:karma //pkg/ui/workspaces/cluster-ui:jest
6 changes: 4 additions & 2 deletions pkg/ccl/serverccl/statusccl/tenant_status_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -439,6 +439,7 @@ VALUES (1, 10, 100), (2, 20, 200), (3, 30, 300)
Exec(t, "SELECT * FROM test")

// Record scan on secondary index.
// Note that this is an index join and will also read from the primary index.
cluster.tenantConn(randomServer).
Exec(t, "SELECT * FROM test@test_a_idx")
testTableIDStr := cluster.tenantConn(randomServer).
Expand All @@ -463,7 +464,7 @@ WHERE
table_id = ` + testTableIDStr
// Assert index usage data was inserted.
expected := [][]string{
{testTableIDStr, "1", "1", "true"},
{testTableIDStr, "1", "2", "true"}, // Primary index
{testTableIDStr, "2", "1", "true"},
}
cluster.tenantConn(randomServer).CheckQueryResults(t, query, expected)
Expand Down Expand Up @@ -771,6 +772,7 @@ VALUES (1, 10, 100), (2, 20, 200), (3, 30, 300)
testingCluster.tenantConn(0).Exec(t, "SELECT * FROM idx_test.test")

// Record scan on secondary index.
// Note that this is an index join and will also read from the primary index.
testingCluster.tenantConn(1).Exec(t, "SELECT * FROM idx_test.test@test_a_idx")
testTableIDStr := testingCluster.tenantConn(2).QueryStr(t, "SELECT 'idx_test.test'::regclass::oid")[0][0]
testTableID, err := strconv.Atoi(testTableIDStr)
Expand All @@ -789,7 +791,7 @@ WHERE
`
actual := testingCluster.tenantConn(2).QueryStr(t, query, testTableID)
expected := [][]string{
{testTableIDStr, "1", "1", "true"},
{testTableIDStr, "1", "2", "true"},
{testTableIDStr, "2", "1", "true"},
}

Expand Down
2 changes: 2 additions & 0 deletions pkg/cmd/bazci/bazci.go
Original file line number Diff line number Diff line change
Expand Up @@ -229,6 +229,8 @@ func getBuildInfo(args parsedArgs) (buildInfo, error) {
// to replace (it's the output directory for the configuration).
componentsTestlogs[len(componentsTestlogs)-2] = componentsBinLocation[len(componentsTestlogs)-2]
ret.transitionTests[fullTarget] = strings.Join(componentsTestlogs, "/")
case "nodejs_test":
ret.tests = append(ret.tests, fullTarget)
case "test_suite":
// Expand the list of tests from the test suite with another query.
allTests, err := runBazelReturningStdout("query", "tests("+fullTarget+")")
Expand Down
52 changes: 21 additions & 31 deletions pkg/cmd/roachprod/flags.go
Original file line number Diff line number Diff line change
Expand Up @@ -45,36 +45,26 @@ var (
listPattern string
secure = false
extraSSHOptions = ""
nodeEnv = []string{
// NOTE: The defaults are also copied in roachtest's invocation of roachprod
// (which overrides the default). On changes, consider updating that one
// too.

// RPC compressions costs around 5% on kv95, so we disable it. It might help
// when moving snapshots around, though.
"COCKROACH_ENABLE_RPC_COMPRESSION=false",
// Get rid of an annoying popup in the UI.
"COCKROACH_UI_RELEASE_NOTES_SIGNUP_DISMISSED=true",
}
tag string
external = false
pgurlCertsDir string
adminurlOpen = false
adminurlPath = ""
adminurlIPs = false
useTreeDist = true
sig = 9
waitFlag = false
createVMOpts = vm.DefaultCreateOpts()
startOpts = roachprod.DefaultStartOpts()
stageOS string
stageDir string
logsDir string
logsFilter string
logsProgramFilter string
logsFrom time.Time
logsTo time.Time
logsInterval time.Duration
nodeEnv []string
tag string
external = false
pgurlCertsDir string
adminurlOpen = false
adminurlPath = ""
adminurlIPs = false
useTreeDist = true
sig = 9
waitFlag = false
createVMOpts = vm.DefaultCreateOpts()
startOpts = roachprod.DefaultStartOpts()
stageOS string
stageDir string
logsDir string
logsFilter string
logsProgramFilter string
logsFrom time.Time
logsTo time.Time
logsInterval time.Duration

monitorOpts install.MonitorOpts
cachedHostsCluster string
Expand Down Expand Up @@ -188,7 +178,7 @@ func initFlags() {
startCmd.Flags().StringArrayVarP(&startOpts.ExtraArgs,
"args", "a", nil, "node arguments")
startCmd.Flags().StringArrayVarP(&nodeEnv,
"env", "e", nodeEnv, "node environment variables")
"env", "e", config.DefaultEnvVars(), "node environment variables")
startCmd.Flags().BoolVar(&startOpts.EncryptedStores,
"encrypt", startOpts.EncryptedStores, "start nodes with encryption at rest turned on")
startCmd.Flags().BoolVar(&startOpts.SkipInit,
Expand Down
13 changes: 0 additions & 13 deletions pkg/cmd/roachtest/cluster.go
Original file line number Diff line number Diff line change
Expand Up @@ -1795,19 +1795,6 @@ func (c *clusterImpl) StartE(

startOpts.RoachprodOpts.EncryptedStores = c.encAtRest

// Set some env vars. The first two also the default for `roachprod start`,
// but we have to add them so that the third one doesn't wipe them out.
if !envExists(settings.Env, "COCKROACH_ENABLE_RPC_COMPRESSION") {
// RPC compressions costs around 5% on kv95, so we disable it. It might help
// when moving snapshots around, though.
settings.Env = append(settings.Env, "COCKROACH_ENABLE_RPC_COMPRESSION=false")
}

if !envExists(settings.Env, "COCKROACH_UI_RELEASE_NOTES_SIGNUP_DISMISSED") {
// Get rid of an annoying popup in the UI.
settings.Env = append(settings.Env, "COCKROACH_UI_RELEASE_NOTES_SIGNUP_DISMISSED=true")
}

if !envExists(settings.Env, "COCKROACH_CRASH_ON_SPAN_USE_AFTER_FINISH") {
// Panic on span use-after-Finish, so we catch such bugs.
settings.Env = append(settings.Env, "COCKROACH_CRASH_ON_SPAN_USE_AFTER_FINISH=true")
Expand Down
14 changes: 14 additions & 0 deletions pkg/roachprod/config/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,20 @@ const (
DefaultNumFilesLimit = 65 << 13
)

// DefaultEnvVars returns default environment variables used in conjunction with CLI and MakeClusterSettings.
// These can be overriden by specifying different values (last one wins).
// See 'generateStartCmd' which sets 'ENV_VARS' for the systemd startup script (start.sh).
func DefaultEnvVars() []string {
return []string{
// RPC compressions costs around 5% on kv95, so we disable it. It might help
// when moving snapshots around, though.
// (For other perf. related knobs, see https://github.com/cockroachdb/cockroach/issues/17165)
"COCKROACH_ENABLE_RPC_COMPRESSION=false",
// Get rid of an annoying popup in the UI.
"COCKROACH_UI_RELEASE_NOTES_SIGNUP_DISMISSED=true",
}
}

// IsLocalClusterName returns true if the given name is a valid name for a local
// cluster.
//
Expand Down
7 changes: 2 additions & 5 deletions pkg/roachprod/install/cluster_settings.go
Original file line number Diff line number Diff line change
Expand Up @@ -100,11 +100,8 @@ func MakeClusterSettings(opts ...ClusterSettingOption) ClusterSettings {
PGUrlCertsDir: "./certs",
Secure: false,
UseTreeDist: true,
Env: []string{
"COCKROACH_ENABLE_RPC_COMPRESSION=false",
"COCKROACH_UI_RELEASE_NOTES_SIGNUP_DISMISSED=true",
},
NumRacks: 0,
Env: config.DefaultEnvVars(),
NumRacks: 0,
}
// Override default values using the passed options (if any).
for _, opt := range opts {
Expand Down
29 changes: 25 additions & 4 deletions pkg/server/index_usage_stats_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,11 @@ func TestStatusAPIIndexUsage(t *testing.T) {
LastRead: timeutil.Now(),
}

expectedStatsIndexPrimary := roachpb.IndexUsageStatistics{
TotalReadCount: 1,
LastRead: timeutil.Now(),
}

firstPgURL, firstServerConnCleanup := sqlutils.PGUrl(
t, firstServer.ServingSQLAddr(), "CreateConnections" /* prefix */, url.User(security.RootUser))
defer firstServerConnCleanup()
Expand Down Expand Up @@ -111,6 +116,11 @@ func TestStatusAPIIndexUsage(t *testing.T) {
require.NoError(t, err)
require.False(t, rows.Next())

indexKeyPrimary := roachpb.IndexUsageKey{
TableID: roachpb.TableID(tableID),
IndexID: 1, // t@t_pkey
}

indexKeyA := roachpb.IndexUsageKey{
TableID: roachpb.TableID(tableID),
IndexID: 2, // t@t_a_idx
Expand Down Expand Up @@ -150,7 +160,7 @@ func TestStatusAPIIndexUsage(t *testing.T) {
_, err = secondServerSQLConn.Exec("SELECT k FROM t WHERE a = 10 AND b = 200")
require.NoError(t, err)

// Record a full scan over t_b_idx.
// Record an index join and full scan of t_b_idx.
_, err = secondServerSQLConn.Exec("SELECT * FROM t@t_b_idx")
require.NoError(t, err)

Expand All @@ -164,20 +174,29 @@ func TestStatusAPIIndexUsage(t *testing.T) {
thirdLocalStatsReader := thirdServer.SQLServer().(*sql.Server).GetLocalIndexStatistics()

// First node should have nothing.
stats := firstLocalStatsReader.Get(indexKeyA.TableID, indexKeyA.IndexID)
stats := firstLocalStatsReader.Get(indexKeyPrimary.TableID, indexKeyPrimary.IndexID)
require.Equal(t, roachpb.IndexUsageStatistics{}, stats, "expecting empty stats on node 1, but found %v", stats)

stats = firstLocalStatsReader.Get(indexKeyA.TableID, indexKeyA.IndexID)
require.Equal(t, roachpb.IndexUsageStatistics{}, stats, "expecting empty stats on node 1, but found %v", stats)

stats = firstLocalStatsReader.Get(indexKeyB.TableID, indexKeyB.IndexID)
require.Equal(t, roachpb.IndexUsageStatistics{}, stats, "expecting empty stats on node 1, but found %v", stats)

// Third node should have nothing.
stats = firstLocalStatsReader.Get(indexKeyPrimary.TableID, indexKeyPrimary.IndexID)
require.Equal(t, roachpb.IndexUsageStatistics{}, stats, "expecting empty stats on node 3, but found %v", stats)

stats = thirdLocalStatsReader.Get(indexKeyA.TableID, indexKeyA.IndexID)
require.Equal(t, roachpb.IndexUsageStatistics{}, stats, "expecting empty stats on node 3, but found %v", stats)

stats = thirdLocalStatsReader.Get(indexKeyB.TableID, indexKeyB.IndexID)
require.Equal(t, roachpb.IndexUsageStatistics{}, stats, "expecting empty stats on node 1, but found %v", stats)

// Second server should have nonempty local storage.
stats = secondLocalStatsReader.Get(indexKeyPrimary.TableID, indexKeyPrimary.IndexID)
compareStatsHelper(t, expectedStatsIndexPrimary, stats, time.Minute)

stats = secondLocalStatsReader.Get(indexKeyA.TableID, indexKeyA.IndexID)
compareStatsHelper(t, expectedStatsIndexA, stats, time.Minute)

Expand All @@ -197,14 +216,16 @@ func TestStatusAPIIndexUsage(t *testing.T) {
}
statsEntries++
switch stats.Key.IndexID {
case indexKeyPrimary.IndexID: // t@t_pkey
compareStatsHelper(t, expectedStatsIndexPrimary, stats.Stats, time.Minute)
case indexKeyA.IndexID: // t@t_a_idx
compareStatsHelper(t, expectedStatsIndexA, stats.Stats, time.Minute)
case indexKeyB.IndexID: // t@t_b_idx
compareStatsHelper(t, expectedStatsIndexB, stats.Stats, time.Minute)
}
}

require.True(t, statsEntries == 2, "expect to find two stats entries in RPC response, but found %d", statsEntries)
require.Equal(t, 3, statsEntries, "expect to find 3 stats entries in RPC response, but found %d", statsEntries)

// Test disabling subsystem.
_, err = secondServerSQLConn.Exec("SET CLUSTER SETTING sql.metrics.index_usage_stats.enabled = false")
Expand Down Expand Up @@ -232,7 +253,7 @@ func TestStatusAPIIndexUsage(t *testing.T) {
compareStatsHelper(t, expectedStatsIndexB, stats.Stats, time.Minute)
}
}
require.True(t, statsEntries == 2, "expect to find two stats entries in RPC response, but found %d", statsEntries)
require.Equal(t, 3, statsEntries, "expect to find 3 stats entries in RPC response, but found %d", statsEntries)
}

func TestGetTableID(t *testing.T) {
Expand Down
11 changes: 10 additions & 1 deletion pkg/sql/opt_exec_factory.go
Original file line number Diff line number Diff line change
Expand Up @@ -610,9 +610,18 @@ func (ef *execFactory) ConstructIndexJoin(
return nil, err
}

tableScan.index = tabDesc.GetPrimaryIndex()
idx := tabDesc.GetPrimaryIndex()
tableScan.index = idx
tableScan.disableBatchLimit()

if !ef.isExplain {
idxUsageKey := roachpb.IndexUsageKey{
TableID: roachpb.TableID(tabDesc.GetID()),
IndexID: roachpb.IndexID(idx.GetID()),
}
ef.planner.extendedEvalCtx.indexUsageStats.RecordRead(idxUsageKey)
}

n := &indexJoinNode{
input: input.(planNode),
table: tableScan,
Expand Down
26 changes: 26 additions & 0 deletions pkg/sql/sqlstats/persistedsqlstats/cluster_settings.go
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,32 @@ var SQLStatsFlushInterval = settings.RegisterDurationSetting(
settings.NonNegativeDurationWithMaximum(time.Hour*24),
).WithPublic()

// MinimumInterval is the cluster setting that controls the minimum interval
// between each flush operation. If flush operations get triggered faster
// than what is allowed by this setting, (e.g. when too many fingerprints are
// generated in a short span of time, which in turn cause memory pressure), the
// flush operation will be aborted.
var MinimumInterval = settings.RegisterDurationSetting(
settings.TenantWritable,
"sql.stats.flush.minimum_interval",
"the minimum interval that SQL stats can be flushes to disk. If a "+
"flush operation starts within less than the minimum interval, the flush "+
"operation will be aborted",
0,
settings.NonNegativeDuration,
)

// DiscardInMemoryStatsWhenFlushDisabled is the cluster setting that allows the
// older in-memory SQL stats to be discarded when flushing to persisted tables
// is disabled.
var DiscardInMemoryStatsWhenFlushDisabled = settings.RegisterBoolSetting(
settings.TenantWritable,
"sql.stats.flush.force_cleanup.enabled",
"if set, older SQL stats are discarded periodically when flushing to "+
"persisted tables is disabled",
false,
)

// SQLStatsFlushEnabled is the cluster setting that controls if the sqlstats
// subsystem persists the statistics into system table.
var SQLStatsFlushEnabled = settings.RegisterBoolSetting(
Expand Down
Loading

0 comments on commit 7f8bdf2

Please sign in to comment.