jobsprofiler: add support for a job diagnostic bundle #105076
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-disaster-recovery
Comments
adityamaru
added
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-disaster-recovery
T-jobs
labels
Jun 16, 2023
cc @cockroachdb/disaster-recovery
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jun 22, 2023
Similar to statement bundles this change introduces the infrastructure to collect and read job profiler bundles. Right now, a job profiler bundle will only contain the latest DSP diagram for a job, but going forward this will give us a place to dump raw files such as: - cluster-wide job traces - cpu profiles - trace-driven aggregated stats - raw payload and progress protos Downloading this bundle will be exposed in a future patch in all of the places where statement bundles are today: - DBConsole - CLI shell - SQL shell This change introduces a builtin that constructs and writes the bundle for a job to the system.job_info table. It also introduces a new endpoint on the status server to read this constructed bundle. The next set of PRs will add the necessary components to allow downloading the bundle from the DBConsole. Informs: cockroachdb#105076 Release note: None
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jun 24, 2023
This change adds a new component to the `Profiler` tab of the job details page that supports collecting and viewing job profiler bundles. The component has a button to collect job profiler bundles. These bundles are then listed in a sorted table with the ability to download each bundle. The above operations are backed by the infrastructure added in cockroachdb#105384. Note, the `Profiler` tab is currently disabled for CC but this change allows for a future project to enable the collection of bundles through the CC console as well. Informs: cockroachdb#105076 Release note (ui change): collect and download job profiler bundles from the `Profiler` tab on the job details page.
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jun 24, 2023
Similar to statement bundles this change introduces the infrastructure to collect and read job profiler bundles. Right now, a job profiler bundle will only contain the latest DSP diagram for a job, but going forward this will give us a place to dump raw files such as: - cluster-wide job traces - cpu profiles - trace-driven aggregated stats - raw payload and progress protos Downloading this bundle will be exposed in a future patch in all of the places where statement bundles are today: - DBConsole - CLI shell - SQL shell This change introduces a builtin that constructs and writes the bundle for a job to the system.job_info table. It also introduces a new endpoint on the status server to read this constructed bundle. The next set of PRs will add the necessary components to allow downloading the bundle from the DBConsole. Informs: cockroachdb#105076 Release note: None
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jul 7, 2023
Similar to statement bundles this change introduces the infrastructure to request, collect and read the execution details for a particular job. Right now, the execution details will only contain the latest DSP diagram for a job, but going forward this will give us a place to dump raw files such as: - cluster-wide job traces - cpu profiles - trace-driven aggregated stats - raw payload and progress protos Downloading some or all of these execution details will be exposed in a future patch in all of the places where statement bundles are today: - DBConsole - CLI shell - SQL shell This change introduces a builtin that allows the caller to request the collection and persistence of a job's current execution details. This change also introduces a new endpoint on the status server to read the data corresponding to the execution details persisted for a job. The next set of PRs will add the necessary components to allow downloading the files from the DBConsole. Informs: cockroachdb#105076 Release note: None
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jul 11, 2023
Similar to statement bundles this change introduces the infrastructure to request, collect and read the execution details for a particular job. Right now, the execution details will only contain the latest DSP diagram for a job, but going forward this will give us a place to dump raw files such as: - cluster-wide job traces - cpu profiles - trace-driven aggregated stats - raw payload and progress protos Downloading some or all of these execution details will be exposed in a future patch in all of the places where statement bundles are today: - DBConsole - CLI shell - SQL shell This change introduces a builtin that allows the caller to request the collection and persistence of a job's current execution details. This change also introduces a new endpoint on the status server to read the data corresponding to the execution details persisted for a job. The next set of PRs will add the necessary components to allow downloading the files from the DBConsole. Informs: cockroachdb#105076 Release note: None
craig bot
pushed a commit
that referenced
this issue
Jul 11, 2023
105384: jobsprofiler: enable requesting a job's execution details r=dt a=adityamaru

Similar to statement bundles this change introduces the infrastructure to request, collect and read the execution details for a particular job. Right now, the execution details will only contain the latest DSP diagram for a job, but going forward this will give us a place to dump raw files such as:
- cluster-wide job traces
- cpu profiles
- trace-driven aggregated stats
- raw payload and progress protos

Downloading some or all of these execution details will be exposed in a future patch in all of the places where statement bundles are today:
- DBConsole
- CLI shell
- SQL shell

This change introduces a builtin that allows the caller to request the collection and persistence of a job's current execution details. This change also introduces a new endpoint on the status server to read the data corresponding to the execution details persisted for a job. The next set of PRs will add the necessary components to allow downloading the files from the DBConsole.

Informs: #105076

Release note: None

Co-authored-by: adityamaru <[email protected]>
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jul 11, 2023
In cockroachdb#105384 we added infrastructure to request and store execution details for a job. This currently only includes the DistSQL diagram generated during a job execution. Going forward this will include several files such as traces, goroutines, profiles etc. This change introduces an endpoint that allows listing all such files that are available for consumption. This list will be displayed on the job details page allowing the user to download any subset of the files collected during job execution. Informs: cockroachdb#105076 Release note: None
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jul 12, 2023
This change collects cluster-wide goroutines that have a pprof label tying them to the execution of the job whose execution details have been requested. This relies on the support added to the pprofui server to collect cluster-wide, labelled goroutines in cockroachdb#105916. Informs: cockroachdb#105076 Release note: None
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jul 17, 2023
In cockroachdb#105384 and cockroachdb#106629 we added support to collect and list files that had been collected as part of a job's execution details. These files are meant to provide improved observability into the state of a job. This change is the first of a few that exposes these endpoints on the DBConsole job details page. This change only adds support for listing files that have been requested as part of a job's execution details. A follow-up change will add support to request these files, sort them and download them from the job details page. This page is not available on the Cloud Console as it is meant for advanced debugging. This change also renames the `Profiler` tab to `Advanced Debugging` as the users of this tab are going to be internal CRDB support and engineering for the time being. Informs: cockroachdb#105076 Release note (ui change): add table in the Profiler job details page that lists all the available files describing a job's execution details
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jul 18, 2023
In cockroachdb#105384 and cockroachdb#106629 we added support to collect and list files that had been collected as part of a job's execution details. These files are meant to provide improved observability into the state of a job. This change is the first of a few that exposes these endpoints on the DBConsole job details page. This change only adds support for listing files that have been requested as part of a job's execution details. A follow-up change will add support to request these files, sort them and download them from the job details page. This page is not available on the Cloud Console as it is meant for advanced debugging. This change also renames the `Profiler` tab to `Advanced Debugging` as the users of this tab are going to be internal CRDB support and engineering for the time being. Informs: cockroachdb#105076 Release note (ui change): add table in the Profiler job details page that lists all the available files describing a job's execution details
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jul 19, 2023
In cockroachdb#105384 and cockroachdb#106629 we added support to collect and list files that had been collected as part of a job's execution details. These files are meant to provide improved observability into the state of a job. This change is the first of a few that exposes these endpoints on the DBConsole job details page. This change only adds support for listing files that have been requested as part of a job's execution details. A follow-up change will add support to request these files, sort them and download them from the job details page. This page is not available on the Cloud Console as it is meant for advanced debugging. This change also renames the `Profiler` tab to `Advanced Debugging` as the users of this tab are going to be internal CRDB support and engineering for the time being. Informs: cockroachdb#105076 Release note (ui change): add table in the Profiler job details page that lists all the available files describing a job's execution details
This was referenced Jul 19, 2023
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jul 24, 2023
In cockroachdb#105384 and cockroachdb#106629 we added support to collect and list files that had been collected as part of a job's execution details. These files are meant to provide improved observability into the state of a job. This change is the first of a few that exposes these endpoints on the DBConsole job details page. This change only adds support for listing files that have been requested as part of a job's execution details. A follow-up change will add support to request these files, sort them and download them from the job details page. This page is not available on the Cloud Console as it is meant for advanced debugging. This change also renames the `Profiler` tab to `Advanced Debugging` as the users of this tab are going to be internal CRDB support and engineering for the time being. Informs: cockroachdb#105076 Release note (ui change): add table in the Profiler job details page that lists all the available files describing a job's execution details
craig bot
pushed a commit
that referenced
this issue
Jul 24, 2023
106879: jobs: add table to display execution details r=maryliag a=adityamaru

In #105384 and #106629 we added support to collect and list files that had been collected as part of a job's execution details. These files are meant to provide improved observability into the state of a job. This change is the first of a few that exposes these endpoints on the DBConsole job details page. This change only adds support for listing files that have been requested as part of a job's execution details. A future change will add support to request these files, sort them and download them from the job details page. This page is not available on the Cloud Console as it is meant for advanced debugging.

Informs: #105076

Release note (ui change): add table in the Profiler job details page that lists all the available files describing a job's execution details

<img width="1505" alt="Screenshot 2023-07-18 at 2 26 50 PM" src="https://github.com/cockroachdb/cockroach/assets/13837382/aebe18a6-9c25-4c9a-ad7c-a94e2e4c97ff">
<img width="1510" alt="Screenshot 2023-07-18 at 2 27 03 PM" src="https://github.com/cockroachdb/cockroach/assets/13837382/da9b3a21-8dc6-47ca-ac02-24d8bb7d09e7">

107236: sql: use txn.NewBatch instead of &kv.Batch{} r=fqazi a=rafiss

This will make these requests properly pass along the admission control headers.

informs #79212
Epic: None
Release note: None

107447: sql: fix CREATE MATERIALIZED VIEW AS schema change job description r=fqazi a=ecwall

Fixes #107445

This changes the CREATE MATERIALIZED VIEW AS schema change job description SQL syntax. For example
```
CREATE VIEW "v" AS "SELECT t.id FROM movr.public.t";
```
becomes
```
CREATE MATERIALIZED VIEW defaultdb.public.v AS SELECT t.id FROM defaultdb.public.t WITH DATA;
```

Release note (bug fix): Fix CREATE MATERIALIZED VIEW AS schema change job description SQL syntax.

Co-authored-by: adityamaru <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Evan Wall <[email protected]>
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jul 26, 2023
In cockroachdb#106879 we added a table to the `Advanced Debugging` tab of the job details page. This table lists out all the execution detail files that are available for the given job. This change is a follow up to add download functionality to each row in the table. The format of the downloaded file is determined by the prefix of the filename. A final change to allow users to generate execution details will be added in the next follow up. Informs: cockroachdb#105076 Release note: None
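As a rough illustration of choosing a download format from the filename prefix, the sketch below maps a few prefixes to content types. The prefixes, content types and function name are all hypothetical; the commit message does not list the real ones.

```go
package main

import (
	"fmt"
	"strings"
)

// formatsByPrefix maps HYPOTHETICAL filename prefixes to content types.
// The real prefixes used by the job details page are not given in this thread.
var formatsByPrefix = []struct{ prefix, contentType string }{
	{"distsql", "text/html"},
	{"goroutines", "text/plain"},
	{"trace", "application/zip"},
}

// contentTypeFor returns the content type for the first matching prefix,
// falling back to a generic binary type when nothing matches.
func contentTypeFor(filename string) string {
	for _, f := range formatsByPrefix {
		if strings.HasPrefix(filename, f.prefix) {
			return f.contentType
		}
	}
	return "application/octet-stream"
}

func main() {
	fmt.Println(contentTypeFor("goroutines.example.txt"))
}
```

Keeping the mapping in an ordered slice rather than a map makes the match order explicit if one prefix is itself a prefix of another.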
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jul 26, 2023
In cockroachdb#106879 we added a table to the `Advanced Debugging` tab of the job details page. This table lists out all the execution detail files that are available for the given job. This change is a follow up to add download functionality to each row in the table. The format of the downloaded file is determined by the prefix of the filename. A final change to allow users to generate execution details will be added in the next follow up. Informs: cockroachdb#105076 Release note: None
craig bot
pushed a commit
that referenced
this issue
Jul 27, 2023
107198: jobsprofiler: stringify protobin files when requested r=dt a=adityamaru

This change is in preparation for a larger change that will allow downloading debug files from the `Advanced Debugging` tab on the job details page. With this change a `binpb` file will have a `binpb.txt` version of the file listed too. If the user requests to download a `binpb.txt` file we unmarshal and stringify the contents of the file before serving them to the user. Currently, there is only one protobin file type written by a job resumer on completion.

Informs: #105076

Release note: None

107700: netutil: fix a buglet r=erikgrinaker,stevendanna a=knz

I was noticing an excess number of conn objects remaining open after a test shutdown.

Release note: None
Epic: CRDB-28893

107711: backupccl: skip TestBackupRestoreTenant r=stevendanna a=adityamaru

Skip while we debug the timeouts in #107669.

Informs: #107669

Release note: None

Co-authored-by: adityamaru <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jul 27, 2023
In cockroachdb#106879 we added a table to the `Advanced Debugging` tab of the job details page. This table lists out all the execution detail files that are available for the given job. This change is a follow up to add download functionality to each row in the table. The format of the downloaded file is determined by the prefix of the filename. A final change to allow users to generate execution details will be added in the next follow up. Informs: cockroachdb#105076 Release note: None
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Jul 27, 2023
This is the last of the three PRs to add support for requesting, viewing and downloading execution details from the job details page. This change wires up the logic needed to request the execution details for a given job. The request is powered by the crdb_internal.request_job_execution_details builtin that triggers the collection of execution details. Fixes: cockroachdb#105076 Release note: None
craig bot
pushed a commit
that referenced
this issue
Jul 28, 2023
107210: jobs: enable downloading execution detail files r=maryliag a=adityamaru

In #106879 we added a table to the `Advanced Debugging` tab of the job details page. This table lists out all the execution detail files that are available for the given job. This change is a follow up to add download functionality to each row in the table. The format of the downloaded file is determined by the prefix of the filename. A final change to allow users to generate execution details will be added in the next follow up.

Informs: #105076

Release note: None

107760: spanconfigccl: fix tests under multitenancy r=yuzefovich a=rafiss

fixes #106818
fixes #106821

Release note: None

Co-authored-by: adityamaru <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
craig bot
pushed a commit
that referenced
this issue
Aug 2, 2023
107759: jobs: add button to request execution details r=maryliag a=adityamaru

This is the last of the three PRs to add support for requesting, viewing and downloading execution details from the job details page. This change wires up the logic needed to request the execution details for a given job. The request is powered by the crdb_internal.request_job_execution_details builtin that triggers the collection of execution details.

Fixes: #105076

Release note: None

107956: server: export distsender metrics from SQL pods r=knz a=nvanbenschoten

This commit exports the DistSender timeseries metrics from SQL pods.
```
distsender.batches
distsender.batches.partial
distsender.batch_requests.replica_addressed.bytes
distsender.batch_responses.replica_addressed.bytes
distsender.batch_requests.cross_region.bytes
distsender.batch_responses.cross_region.bytes
distsender.batch_requests.cross_zone.bytes
distsender.batch_responses.cross_zone.bytes
distsender.batches.async.sent
distsender.batches.async.throttled
distsender.rpc.sent
distsender.rpc.sent.local
distsender.rpc.sent.nextreplicaerror
distsender.errors.notleaseholder
distsender.errors.inleasetransferbackoffs
distsender.rangelookups
requests.slow.distsender
distsender.rpc.%s.sent # rpc name
distsender.rpc.err.%s # error name
distsender.rangefeed.total_ranges
distsender.rangefeed.catchup_ranges
distsender.rangefeed.error_catchup_ranges
distsender.rangefeed.restart_ranges
distsender.rangefeed.restart_stuck
```

Epic: None
Release note: None

Co-authored-by: adityamaru <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
This issue tracks the work to be able to collect a diagnostic bundle for a running or finished job. This is similar to the support we have for collecting statement bundles. The job diagnostic bundle will contain information such as:
This bundle will be downloadable from the DBConsole and via the SQL shell.
Epic: CRDB-8964
Jira issue: CRDB-28850