Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jobs: provide a metric to determine non-idle running jobs #70538

Closed
darinpp opened this issue Sep 21, 2021 · 2 comments · Fixed by #76384
Closed

jobs: provide a metric to determine non-idle running jobs #70538

darinpp opened this issue Sep 21, 2021 · 2 comments · Fixed by #76384
Assignees
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

@darinpp
Copy link
Contributor

darinpp commented Sep 21, 2021

We are currently adding a new job AUTO SPAN CONFIG RECONCILIATION that will be continuously running on each tenant pod. This is a problem as the number of currently running jobs is used as an indication that a tenant pod is idle.

We have to add a metric that can provide the number of currently running non-idle jobs.

And alternative would be to hard code on the CC side but that isn't optimal as each new such jobs will require a massive maintenance effort. Alternatively - such continuous running jobs can be considered terminable at all times and reported through a metric. This is also non-optimal as we want to wait for jobs doing useful stuff to finish before terminating the pods.

https://cockroachlabs.slack.com/archives/C020NSBH63F/p1632262147058800

@darinpp darinpp added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Sep 21, 2021
@darinpp darinpp assigned darinpp, irfansharif and ajwerner and unassigned darinpp Sep 21, 2021
@ajwerner ajwerner changed the title Provide a metric to determine non-idle running jobs jobs: provide a metric to determine non-idle running jobs Sep 22, 2021
@ajwerner
Copy link
Contributor

Note that this job will not exist until 22.1.

@irfansharif
Copy link
Contributor

Just going to cargo cult whatever we do for #74747.

irfansharif added a commit to irfansharif/cockroach that referenced this issue Feb 10, 2022
Fixes cockroachdb#70538.

We have a forever running background AUTO SPAN CONFIG RECONCILIATION job
on tenant pods. To know when it's safe to wind down pods, we use the
number of currently running jobs as an indicator. Given the job is
forever running, we need an indicator to suggest that despite the job's
presence, it's safe to wind down.

In cockroachdb#74747 we added a thin API to the jobs subsystem to do just that,
with the intent of using it for idle changefeed jobs. We just cargo-cult
that same approach here to mark the reconciliation job as always idle.

Release note: None
craig bot pushed a commit that referenced this issue Feb 10, 2022
75737: cdc: support ALTER CHANGEFEED statement for adding/dropping targets r=sherman-grewal a=sherman-grewal

This PR introduces a new SQL statement called ALTER
CHANGEFEED, which allows users to add/drop targets
for an existing changefeed. Note that the changefeed
job must be paused in order for the alterations to
be applied to the changefeed. The syntax of the
statement is as follows:

ALTER CHANGEFEED <job_id> {{ADD|DROP} \<targets\>}...

Where there can be an arbitrary number of ADD/
DROP commands in any order. Once a user
executes this statement, the targets will be
added/dropped, and the statement will return the
job id and the new job description of the changefeed
job. It is also important to note that a user cannot
drop all targets in a changefeed.

Example:
ALTER CHANGEFEED 123 ADD foo,bar DROP baz;

Release note (enterprise change): Added support
for a new SQL statement called ALTER CHANGEFEED,
which allows users to add/drop targets for an
existing changefeed. The syntax of the statement
is as follows:

ALTER CHANGEFEED <job_id> {{ADD|DROP} \<targets\>}...

Where there can be an abritrary number of ADD/DROP
commands in any order. The following is an example
of this statement:

ALTER CHANGEFEED 123 ADD foo,bar DROP baz;

With this statement, users can avoid going through
the process of altering a changefeed on their own,
and rely on CRDB to carry out this task.

76277: sql: clean up comments when swapping primary keys r=fqazi a=fqazi

Fixes: #71984

Previously, when swapping out primary keys comments
on old indexes would be leftover. This was inadequate because
entries would be leftover in the system.comments table
that could prevent SHOW CREATE  to fail when it attempts
to map comments for dropped indexes. To address this, this patch
will clean up comments on primary key swaps.

Release note (bug fix): Comments are not cleaned up when the table
primary keys are swapped, which can cause SHOW TABLE to fail.

76354: tree: extract new packages to break dependency of execgen on tree r=yuzefovich a=yuzefovich

**execgen: move BinaryOverloadHelper into colexecbase**

This is so that it's easier to break the dependency on `tree` from
`execgen`. Additionally, this commit makes some of the comments
template-time only.

Release note: None

**treecmp: extract ComparisonOperator into a separate package**

This commit extracts things that `execgen` depends on (like some
constants for the comparison operators as well as some other stuff) into
a new package `treecmp`. The goal is to break the dependency of
`execgen` on `tree`.

It was mostly a simple move of some code from `tree/expr.go` into
`treecmp/comparison_operator.go`. I had to export a few things. Also,
I had to make `ComparisonOperatorSymbol.Inverse` take the receiver as an
argument and left the function in `tree` (in order to not mess with
`FoldComparisonExpr`).

Release note: None

**treebin: extract BinaryOperator into a separate package**

This commit extracts things that `execgen` depends on (like some
constants for the binary operators as well as some other stuff) into
a new package `treebin`. The goal is to break the dependency of
`execgen` on `tree`.

It was a simple move of some code from `tree/expr.go` into
`treebin/binary_operator.go`. I only had to export `IsPadded` function.

Release note: None

**treewindow: extract some window constants into a new package out of tree**

This commit extracts some constants about the window functions into
a new package which allows us to finally break the dependency of
`execgen` on `tree`. Now the absence of such dependency is enforced with
`VerifyNoImports` in `colexec` package.

The only notable change that I had to make is making
`WindowFrameExclusion` not implement `tree.NodeFormatter` interface;
instead, the callsites that wanted to call `Format` on that type now
call `String` which provides the same behavior.

Release note: None

76384: spanconfig: mark reconciliation job as idle r=irfansharif a=irfansharif

Fixes #70538.

We have a forever running background AUTO SPAN CONFIG RECONCILIATION job
on tenant pods. To know when it's safe to wind down pods, we use the
number of currently running jobs as an indicator. Given the job is
forever running, we need an indicator to suggest that despite the job's
presence, it's safe to wind down.

In #74747 we added a thin API to the jobs subsystem to do just that,
with the intent of using it for idle changefeed jobs. We just cargo-cult
that same approach here to mark the reconciliation job as always idle.

Release note: None

Co-authored-by: Sherman Grewal <[email protected]>
Co-authored-by: Faizan Qazi <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: irfan sharif <[email protected]>
@craig craig bot closed this as completed in 1f75f55 Feb 10, 2022
RajivTS pushed a commit to RajivTS/cockroach that referenced this issue Mar 6, 2022
Fixes cockroachdb#70538.

We have a forever running background AUTO SPAN CONFIG RECONCILIATION job
on tenant pods. To know when it's safe to wind down pods, we use the
number of currently running jobs as an indicator. Given the job is
forever running, we need an indicator to suggest that despite the job's
presence, it's safe to wind down.

In cockroachdb#74747 we added a thin API to the jobs subsystem to do just that,
with the intent of using it for idle changefeed jobs. We just cargo-cult
that same approach here to mark the reconciliation job as always idle.

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants