Skip to content

Commit

Permalink
Merge #69293
Browse files Browse the repository at this point in the history
69293: sql: implement a fast compressed logical plan mechanism r=rtaft,RaduBerinde a=cucaroach


    Implement a plan "gist" serializer piggy backing on the exec gen/explain
    factory infrastructure so that we can always know what the logical plan
    was and can do historical and statistical tracking. Logically its like
    an explain (SHAPE) but is even more stripped down. A gist is a sequence
    of bytes representing the flattened tree of operators and various
    operator specific metadata.

    The goal is to record every logical plan we use for every query to have
    historical data on which plans are used possibly linked up to statistics
    so we know which stats go with which logical plan.

    Also implement a decoder to turn the serialized plan back into a tree of
    explain.Node's that can be displayed using existing explain code.

    Currently this functionality is only exposed via a new EXPLAIN mode and
    via a crdb_internal "decoder" SRF.   EXPLAIN (GIST) takes a query and
    returns a single string which is the encoded gist.
    crdb_internal.decode_plan_gist() takes an encoded gist string and writes
    out the logical plan one row per line.

    For performance numbers of the ExecBuild comparing a StubFactory to a
    PlanGistFactory wrapped around a StubFactory see the PR.

    Release note (sql change): Record compressed plan gist for all queries.
    For example, a query like this:

    SELECT * FROM abc UNION SELECT * FROM abc ORDER BY b,a

    Produces the following plan according to EXPLAIN (SHAPE)

    • distinct
    │ distinct on: a
    │
    └── • union all
        │
        ├── • sort
        │   │ order: +b,+a
        │   │
        │   └── • scan
        │         missing stats
        │         table: abc@primary
        │         spans: FULL SCAN
        │
        └── • sort
            │ order: +b,+a
            │
            └── • scan
                  missing stats
                  table: abc@primary
                  spans: FULL SCAN

    produces the following "gist":

    AgFuAgAHAAAAEQFuAgAHAAAAERANAAYGAA==

    The "gist" can be turned back into the following plan:

    • distinct
    │ distinct on
    │
    └── • union all
        │
        ├── • sort
        │   │ order
        │   │
        │   └── • scan
        │         table: abc@primary
        │         spans: FULL SCAN
        │
        └── • sort
            │ order
            │
            └── • scan
                  table: abc@primary
                  spans: FULL SCAN

    Fixes: #63885


Co-authored-by: Tommy Reilly <[email protected]>
  • Loading branch information
craig[bot] and cucaroach committed Oct 27, 2021
2 parents fa5c146 + 3c7855b commit 55514ee
Show file tree
Hide file tree
Showing 51 changed files with 3,301 additions and 252 deletions.
6 changes: 5 additions & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -891,7 +891,8 @@ OPTGEN_TARGETS = \
pkg/sql/opt/rule_name.og.go \
pkg/sql/opt/rule_name_string.go \
pkg/sql/opt/exec/factory.og.go \
pkg/sql/opt/exec/explain/explain_factory.og.go
pkg/sql/opt/exec/explain/explain_factory.og.go \
pkg/sql/opt/exec/explain/plan_gist_factory.og.go \

# removed-files is a list of files that used to exist in the
# repository that need to be explicitly cleaned up to prevent build
Expand Down Expand Up @@ -1654,6 +1655,9 @@ pkg/sql/opt/exec/factory.og.go: $(optgen-defs) $(optgen-exec-defs) bin/optgen
pkg/sql/opt/exec/explain/explain_factory.og.go: $(optgen-defs) $(optgen-exec-defs) bin/optgen
optgen -out $@ execexplain $(optgen-exec-defs)

pkg/sql/opt/exec/explain/plan_gist_factory.og.go: $(optgen-defs) $(optgen-exec-defs) bin/optgen
optgen -out $@ execplangist $(optgen-exec-defs)

.PHONY: clean-c-deps
clean-c-deps:
rm -rf $(JEMALLOC_DIR)
Expand Down
2 changes: 2 additions & 0 deletions docs/generated/sql/functions.md
Original file line number Diff line number Diff line change
Expand Up @@ -2533,6 +2533,8 @@ The swap_ordinate_string parameter is a 2-character string naming the ordinates
</span></td></tr>
<tr><td><a name="convert_to"></a><code>convert_to(str: <a href="string.html">string</a>, enc: <a href="string.html">string</a>) &rarr; <a href="bytes.html">bytes</a></code></td><td><span class="funcdesc"><p>Encode the string <code>str</code> as a byte array using encoding <code>enc</code>. Supports encodings ‘UTF8’ and ‘LATIN1’.</p>
</span></td></tr>
<tr><td><a name="crdb_internal.decode_plan_gist"></a><code>crdb_internal.decode_plan_gist(gist: <a href="string.html">string</a>) &rarr; <a href="string.html">string</a></code></td><td><span class="funcdesc"><p>Returns rows of output similar to EXPLAIN from a gist created by EXPLAIN (GIST)</p>
</span></td></tr>
<tr><td><a name="crdb_internal.show_create_all_schemas"></a><code>crdb_internal.show_create_all_schemas(database_name: <a href="string.html">string</a>) &rarr; <a href="string.html">string</a></code></td><td><span class="funcdesc"><p>Returns rows of CREATE schema statements.
The output can be used to recreate a database.’</p>
</span></td></tr>
Expand Down
3 changes: 1 addition & 2 deletions pkg/sql/distsql_spec_exec_factory.go
Original file line number Diff line number Diff line change
Expand Up @@ -806,8 +806,7 @@ func (e *distSQLSpecExecFactory) ConstructExplain(
// We cannot create the explained plan in the same PlanInfrastructure with the
// "outer" plan. Create a separate factory.
newFactory := newDistSQLSpecExecFactory(e.planner, e.planningMode)
explainFactory := explain.NewFactory(newFactory)
plan, err := buildFn(explainFactory)
plan, err := buildFn(newFactory)
// Release the resources acquired during the physical planning right away.
newFactory.(*distSQLSpecExecFactory).planCtx.getCleanupFunc()()
if err != nil {
Expand Down
4 changes: 4 additions & 0 deletions pkg/sql/exec_util.go
Original file line number Diff line number Diff line change
Expand Up @@ -2712,6 +2712,10 @@ func (m *sessionDataMutator) SetSynchronousCommit(val bool) {
m.data.SynchronousCommit = val
}

func (m *sessionDataMutator) SetDisablePlanGists(val bool) {
m.data.DisablePlanGists = val
}

func (m *sessionDataMutator) SetDistSQLMode(val sessiondatapb.DistSQLExecMode) {
m.data.DistSQLMode = val
}
Expand Down
146 changes: 75 additions & 71 deletions pkg/sql/explain_plan.go
Original file line number Diff line number Diff line change
Expand Up @@ -50,86 +50,90 @@ func (e *explainPlanNode) startExec(params runParams) error {
ob := explain.NewOutputBuilder(e.flags)
plan := e.plan.WrappedPlan.(*planComponents)

// Determine the "distribution" and "vectorized" values, which we will emit as
// special rows.

// Note that we delay adding the annotation about the distribution until
// after the plan is finalized (when the physical plan is successfully
// created).
distribution := getPlanDistribution(
params.ctx, params.p, params.extendedEvalCtx.ExecCfg.NodeID,
params.extendedEvalCtx.SessionData().DistSQLMode, plan.main,
)

outerSubqueries := params.p.curPlan.subqueryPlans
distSQLPlanner := params.extendedEvalCtx.DistSQLPlanner
planCtx := newPlanningCtxForExplainPurposes(distSQLPlanner, params, plan.subqueryPlans, distribution)
defer func() {
planCtx.planner.curPlan.subqueryPlans = outerSubqueries
}()
physicalPlan, err := newPhysPlanForExplainPurposes(planCtx, distSQLPlanner, plan.main)
var diagramURL url.URL
var diagramJSON string
if err != nil {
if e.options.Mode == tree.ExplainDistSQL {
if len(plan.subqueryPlans) > 0 {
return errors.New("running EXPLAIN (DISTSQL) on this query is " +
"unsupported because of the presence of subqueries")
}
return err
}
ob.AddDistribution(distribution.String())
// For regular EXPLAIN, simply skip emitting the "vectorized" information.
var rows []string
if e.options.Mode == tree.ExplainGist {
rows = []string{e.plan.Gist.String()}
} else {
// There might be an issue making the physical plan, but that should not
// cause an error or panic, so swallow the error. See #40677 for example.
distSQLPlanner.finalizePlanWithRowCount(planCtx, physicalPlan, plan.mainRowCount)
ob.AddDistribution(physicalPlan.Distribution.String())
flows := physicalPlan.GenerateFlowSpecs()
flowCtx := newFlowCtxForExplainPurposes(planCtx, params.p)

ctxSessionData := flowCtx.EvalCtx.SessionData()
var willVectorize bool
if ctxSessionData.VectorizeMode == sessiondatapb.VectorizeOff {
willVectorize = false
// Determine the "distribution" and "vectorized" values, which we will emit as
// special rows.

// Note that we delay adding the annotation about the distribution until
// after the plan is finalized (when the physical plan is successfully
// created).
distribution := getPlanDistribution(
params.ctx, params.p, params.extendedEvalCtx.ExecCfg.NodeID,
params.extendedEvalCtx.SessionData().DistSQLMode, plan.main,
)

outerSubqueries := params.p.curPlan.subqueryPlans
distSQLPlanner := params.extendedEvalCtx.DistSQLPlanner
planCtx := newPlanningCtxForExplainPurposes(distSQLPlanner, params, plan.subqueryPlans, distribution)
defer func() {
planCtx.planner.curPlan.subqueryPlans = outerSubqueries
}()
physicalPlan, err := newPhysPlanForExplainPurposes(planCtx, distSQLPlanner, plan.main)
var diagramURL url.URL
var diagramJSON string
if err != nil {
if e.options.Mode == tree.ExplainDistSQL {
if len(plan.subqueryPlans) > 0 {
return errors.New("running EXPLAIN (DISTSQL) on this query is " +
"unsupported because of the presence of subqueries")
}
return err
}
ob.AddDistribution(distribution.String())
// For regular EXPLAIN, simply skip emitting the "vectorized" information.
} else {
willVectorize = true
for _, flow := range flows {
if err := colflow.IsSupported(ctxSessionData.VectorizeMode, flow); err != nil {
willVectorize = false
break
// There might be an issue making the physical plan, but that should not
// cause an error or panic, so swallow the error. See #40677 for example.
distSQLPlanner.finalizePlanWithRowCount(planCtx, physicalPlan, plan.mainRowCount)
ob.AddDistribution(physicalPlan.Distribution.String())
flows := physicalPlan.GenerateFlowSpecs()
flowCtx := newFlowCtxForExplainPurposes(planCtx, params.p)

ctxSessionData := flowCtx.EvalCtx.SessionData()
var willVectorize bool
if ctxSessionData.VectorizeMode == sessiondatapb.VectorizeOff {
willVectorize = false
} else {
willVectorize = true
for _, flow := range flows {
if err := colflow.IsSupported(ctxSessionData.VectorizeMode, flow); err != nil {
willVectorize = false
break
}
}
}
}
ob.AddVectorized(willVectorize)
ob.AddVectorized(willVectorize)

if e.options.Mode == tree.ExplainDistSQL {
flags := execinfrapb.DiagramFlags{
ShowInputTypes: e.options.Flags[tree.ExplainFlagTypes],
}
diagram, err := execinfrapb.GeneratePlanDiagram(params.p.stmt.String(), flows, flags)
if err != nil {
return err
}
if e.options.Mode == tree.ExplainDistSQL {
flags := execinfrapb.DiagramFlags{
ShowInputTypes: e.options.Flags[tree.ExplainFlagTypes],
}
diagram, err := execinfrapb.GeneratePlanDiagram(params.p.stmt.String(), flows, flags)
if err != nil {
return err
}

diagramJSON, diagramURL, err = diagram.ToURL()
if err != nil {
return err
diagramJSON, diagramURL, err = diagram.ToURL()
if err != nil {
return err
}
}
}
}

var rows []string
if e.options.Flags[tree.ExplainFlagJSON] {
// For the JSON flag, we only want to emit the diagram JSON.
rows = []string{diagramJSON}
} else {
if err := emitExplain(ob, params.EvalContext(), params.p.ExecCfg().Codec, e.plan); err != nil {
return err
}
rows = ob.BuildStringRows()
if e.options.Mode == tree.ExplainDistSQL {
rows = append(rows, "", fmt.Sprintf("Diagram: %s", diagramURL.String()))
if e.options.Flags[tree.ExplainFlagJSON] {
// For the JSON flag, we only want to emit the diagram JSON.
rows = []string{diagramJSON}
} else {
if err := emitExplain(ob, params.EvalContext(), params.p.ExecCfg().Codec, e.plan); err != nil {
return err
}
rows = ob.BuildStringRows()
if e.options.Mode == tree.ExplainDistSQL {
rows = append(rows, "", fmt.Sprintf("Diagram: %s", diagramURL.String()))
}
}
}
v := params.p.newContainerValuesNode(colinfo.ExplainPlanColumns, 0)
Expand Down
5 changes: 5 additions & 0 deletions pkg/sql/faketreeeval/evalctx.go
Original file line number Diff line number Diff line change
Expand Up @@ -232,6 +232,11 @@ func (*DummyEvalPlanner) ExternalWriteFile(ctx context.Context, uri string, cont
return errors.WithStack(errEvalPlanner)
}

// DecodeGist is part of the EvalPlanner interface.
func (*DummyEvalPlanner) DecodeGist(gist string) ([]string, error) {
return nil, errors.WithStack(errEvalPlanner)
}

var _ tree.EvalPlanner = &DummyEvalPlanner{}

var errEvalPlanner = pgerror.New(pgcode.ScalarOperationCannotRunWithoutFullSessionContext,
Expand Down
4 changes: 4 additions & 0 deletions pkg/sql/instrumentation.go
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,10 @@ type instrumentationHelper struct {

// regions used only on EXPLAIN ANALYZE to be displayed as top-level stat.
regions []string

// planGist is a compressed version of plan that can be converted (lossily)
// back into a logical plan or be used to get a plan hash.
planGist explain.PlanGist
}

// outputMode indicates how the statement output needs to be populated (for
Expand Down
1 change: 1 addition & 0 deletions pkg/sql/logictest/testdata/logic_test/information_schema
Original file line number Diff line number Diff line change
Expand Up @@ -4625,6 +4625,7 @@ default_transaction_priority normal
default_transaction_read_only off
default_transaction_use_follower_reads off
disable_partially_distributed_plans off
disable_plan_gists off
disallow_full_table_scans off
distsql_workmem 64 MiB
enable_copying_partitioning_when_deinterleaving_table off
Expand Down
3 changes: 3 additions & 0 deletions pkg/sql/logictest/testdata/logic_test/pg_catalog
Original file line number Diff line number Diff line change
Expand Up @@ -3994,6 +3994,7 @@ default_transaction_priority normal NULL
default_transaction_read_only off NULL NULL NULL string
default_transaction_use_follower_reads off NULL NULL NULL string
disable_partially_distributed_plans off NULL NULL NULL string
disable_plan_gists off NULL NULL NULL string
disallow_full_table_scans off NULL NULL NULL string
distsql off NULL NULL NULL string
distsql_workmem 64 MiB NULL NULL NULL string
Expand Down Expand Up @@ -4091,6 +4092,7 @@ default_transaction_priority normal NULL
default_transaction_read_only off NULL user NULL off off
default_transaction_use_follower_reads off NULL user NULL off off
disable_partially_distributed_plans off NULL user NULL off off
disable_plan_gists off NULL user NULL off off
disallow_full_table_scans off NULL user NULL off off
distsql off NULL user NULL off off
distsql_workmem 64 MiB NULL user NULL 64 MiB 64 MiB
Expand Down Expand Up @@ -4184,6 +4186,7 @@ default_transaction_priority NULL NULL NULL
default_transaction_read_only NULL NULL NULL NULL NULL
default_transaction_use_follower_reads NULL NULL NULL NULL NULL
disable_partially_distributed_plans NULL NULL NULL NULL NULL
disable_plan_gists NULL NULL NULL NULL NULL
disallow_full_table_scans NULL NULL NULL NULL NULL
distsql NULL NULL NULL NULL NULL
distsql_workmem NULL NULL NULL NULL NULL
Expand Down
6 changes: 6 additions & 0 deletions pkg/sql/logictest/testdata/logic_test/set
Original file line number Diff line number Diff line change
Expand Up @@ -560,3 +560,9 @@ SET cluster setting sql.trace.txn.enable_threshold='1s'

statement ok
SET cluster setting sql.trace.txn.enable_threshold='0s'

statement ok
SET disable_plan_gists = 'true'

statement ok
SET disable_plan_gists = 'false'
1 change: 1 addition & 0 deletions pkg/sql/logictest/testdata/logic_test/show_source
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@ default_transaction_priority normal
default_transaction_read_only off
default_transaction_use_follower_reads off
disable_partially_distributed_plans off
disable_plan_gists off
disallow_full_table_scans off
distsql off
distsql_workmem 64 MiB
Expand Down
1 change: 1 addition & 0 deletions pkg/sql/opt/bench/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ go_test(
"//pkg/sql/catalog/schemaexpr",
"//pkg/sql/opt/exec",
"//pkg/sql/opt/exec/execbuilder",
"//pkg/sql/opt/exec/explain",
"//pkg/sql/opt/memo",
"//pkg/sql/opt/optbuilder",
"//pkg/sql/opt/testutils/testcat",
Expand Down
5 changes: 3 additions & 2 deletions pkg/sql/opt/bench/bench_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/sql/catalog/schemaexpr"
"github.com/cockroachdb/cockroach/pkg/sql/opt/exec"
"github.com/cockroachdb/cockroach/pkg/sql/opt/exec/execbuilder"
"github.com/cockroachdb/cockroach/pkg/sql/opt/exec/explain"
"github.com/cockroachdb/cockroach/pkg/sql/opt/memo"
"github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder"
"github.com/cockroachdb/cockroach/pkg/sql/opt/testutils/testcat"
Expand Down Expand Up @@ -479,7 +480,7 @@ func (h *harness) runSimple(tb testing.TB, query benchQuery, phase Phase) {

root := execMemo.RootExpr()
eb := execbuilder.New(
exec.StubFactory{},
explain.NewPlanGistFactory(exec.StubFactory{}),
&h.optimizer,
execMemo,
nil, /* catalog */
Expand Down Expand Up @@ -532,7 +533,7 @@ func (h *harness) runPrepared(tb testing.TB, phase Phase) {

root := execMemo.RootExpr()
eb := execbuilder.New(
exec.StubFactory{},
explain.NewPlanGistFactory(exec.StubFactory{}),
&h.optimizer,
execMemo,
nil, /* catalog */
Expand Down
1 change: 1 addition & 0 deletions pkg/sql/opt/exec/execbuilder/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ go_library(
"//pkg/sql/opt/cat",
"//pkg/sql/opt/constraint",
"//pkg/sql/opt/exec",
"//pkg/sql/opt/exec/explain",
"//pkg/sql/opt/memo",
"//pkg/sql/opt/norm",
"//pkg/sql/opt/ordering",
Expand Down
34 changes: 24 additions & 10 deletions pkg/sql/opt/exec/execbuilder/statement.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/sql/opt"
"github.com/cockroachdb/cockroach/pkg/sql/opt/cat"
"github.com/cockroachdb/cockroach/pkg/sql/opt/exec"
"github.com/cockroachdb/cockroach/pkg/sql/opt/exec/explain"
"github.com/cockroachdb/cockroach/pkg/sql/opt/memo"
"github.com/cockroachdb/cockroach/pkg/sql/opt/xform"
"github.com/cockroachdb/cockroach/pkg/sql/sem/tree"
Expand Down Expand Up @@ -119,27 +120,40 @@ func (b *Builder) buildExplainOpt(explain *memo.ExplainExpr) (execPlan, error) {
return planWithColumns(node, explain.ColList), nil
}

func (b *Builder) buildExplain(explain *memo.ExplainExpr) (execPlan, error) {
if explain.Options.Mode == tree.ExplainOpt {
return b.buildExplainOpt(explain)
func (b *Builder) buildExplain(explainExpr *memo.ExplainExpr) (execPlan, error) {
if explainExpr.Options.Mode == tree.ExplainOpt {
return b.buildExplainOpt(explainExpr)
}

node, err := b.factory.ConstructExplain(
&explain.Options,
explain.StmtType,
func(ef exec.ExplainFactory) (exec.Plan, error) {
// Create a separate builder for the explain query.
&explainExpr.Options,
explainExpr.StmtType,
func(f exec.Factory) (exec.Plan, error) {
// Create a separate builder for the explain query. buildRelational
// annotates nodes with extra information when the factory is an
// exec.ExplainFactory so it must be the outer factory and the gist
// factory must be the inner factory.
gf := explain.NewPlanGistFactory(f)
ef := explain.NewFactory(gf)

explainBld := New(
ef, b.optimizer, b.mem, b.catalog, explain.Input, b.evalCtx, b.initialAllowAutoCommit,
ef, b.optimizer, b.mem, b.catalog, explainExpr.Input, b.evalCtx, b.initialAllowAutoCommit,
)
explainBld.disableTelemetry = true
return explainBld.Build()
plan, err := explainBld.Build()
if err != nil {
return nil, err
}
explainPlan := plan.(*explain.Plan)
explainPlan.Gist = gf.PlanGist()
return plan, nil
},
)
if err != nil {
return execPlan{}, err
}
return planWithColumns(node, explain.ColList), nil

return planWithColumns(node, explainExpr.ColList), nil
}

func (b *Builder) buildShowTrace(show *memo.ShowTraceForSessionExpr) (execPlan, error) {
Expand Down
Loading

0 comments on commit 55514ee

Please sign in to comment.