jobs: add a Profiler tab to the job details page #103945

Merged
merged 1 commit into from
Jun 15, 2023

Conversation

adityamaru
Contributor

This change adds a Profiler tab to the job details page. It also adds a row that allows collecting a 5-second, cluster-wide CPU profile of all the samples corresponding to the job's execution.

Fixes: #102735

Release note (ui change): job details page now has a profiler tab for more advanced observability into a job's execution. Currently, we support collecting a cluster-wide CPU profile of the job.

@adityamaru adityamaru requested review from dt, maryliag and a team May 26, 2023 14:16
@cockroach-teamcity
Member

This change is Reviewable

@adityamaru
Contributor Author

[Four screenshots attached: 2023-05-26, 9:17:16 AM to 9:18:19 AM]

@adityamaru
Contributor Author

I still need to test that this works on CC but considering it uses the same linking logic as Advanced Debug, I think it should. I'll report back.

Contributor

@maryliag maryliag left a comment

but considering it uses the same linking logic as Advanced Debug, I think it should

keep in mind that Advanced Debug doesn't exist on CC, so the routes might not work there at all

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt)

Contributor

@maryliag maryliag left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @dt)


pkg/ui/workspaces/cluster-ui/src/jobs/jobDetailsPage/jobDetails.tsx line 19 at r1 (raw file):

import Long from "long";
import Helmet from "react-helmet";
import { RouteComponentProps, useHistory, useLocation } from "react-router-dom";

I don't see these 2 imports being used anywhere. Maybe you forgot to remove them after some earlier tests?


pkg/ui/workspaces/cluster-ui/src/jobs/jobDetailsPage/jobDetails.tsx line 128 at r1 (raw file):

          <SummaryCard className={cardCx("summary-card")}>
            <SummaryCardItem
              label="Cluster-wide CPU Profile (profiles all nodes; MEMORY OVERHEAD)"

is the intention of this MEMORY OVERHEAD to be a warning?
If so, we have some components that make warnings more obvious, and you could add a better description.
For example, you could add something like this below the SummaryCardItem:

<InlineAlert intent="warning" title="Creation of a Profile consumes additional resources and can potentially negatively impact workload responsiveness." />

pkg/ui/workspaces/cluster-ui/src/jobs/jobDetailsPage/jobDetails.tsx line 267 at r1 (raw file):

                    {this.renderOverviewTabContent(hasNextRun, nextRun, job)}
                  </TabPane>
                  <TabPane tab={TabKeysEnum.PROFILER} key="profiler">

if the profiler doesn't work on CC, you can always hide this tab completely on CC. We have a context that you can use, such as
const isCockroachCloud = useContext(CockroachCloudContext);

{!isCockroachCloud && (<TabPane>...</TabPane>)}

@adityamaru
Contributor Author

sorry for the delay here @maryliag I'm going to get to this soon. Other things have bumped this PR from the queue 😓

@maryliag
Contributor

maryliag commented Jun 5, 2023

no worries! 😄

Contributor Author

@adityamaru adityamaru left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @maryliag)


pkg/ui/workspaces/cluster-ui/src/jobs/jobDetailsPage/jobDetails.tsx line 19 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

I don't see these 2 imports being used anywhere. Maybe you forgot to remove them after some earlier tests?

Done.


pkg/ui/workspaces/cluster-ui/src/jobs/jobDetailsPage/jobDetails.tsx line 128 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

is the intention of this MEMORY OVERHEAD to be a warning?
If so, we have some components that make warnings more obvious, and you could add a better description.
For example, you could add something like this below the SummaryCardItem:

<InlineAlert intent="warning" title="Creation of a Profile consumes additional resources and can potentially negatively impact workload responsiveness." />

Nice! Changed to use this now.


pkg/ui/workspaces/cluster-ui/src/jobs/jobDetailsPage/jobDetails.tsx line 267 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

if the profiler doesn't work on CC, you can always hide this tab completely on CC. We have a context that you can use, such as
const isCockroachCloud = useContext(CockroachCloudContext);

{!isCockroachCloud && (<TabPane>...</TabPane>)}

nice, added this check in. I can't seem to add the const at the top level because this is a class component? React shouts at me with:

Invalid hook call. Hooks can only be called inside of the body of a function component.
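
For context on the error above: hooks such as useContext only run inside function components. A class component can still read a React context without hooks, either through `static contextType` (which exposes the value as `this.context`) or through a `<Context.Consumer>` element. Which approach this PR ended up using is not shown in the thread. Once the flag is available as a plain boolean, the tab gating itself is ordinary logic. A minimal sketch, assuming hypothetical tab keys (the real component uses TabKeysEnum, whose full definition does not appear here):

```typescript
// Hypothetical tab keys for illustration only; the real component
// uses TabKeysEnum, which is not fully shown in this thread.
type TabKey = "overview" | "profiler";

// Returns the tabs to render on the job details page.
// The Profiler tab is left out when running on CockroachDB Cloud (CC).
function visibleTabs(isCockroachCloud: boolean): TabKey[] {
  const tabs: TabKey[] = ["overview"];
  if (!isCockroachCloud) {
    tabs.push("profiler");
  }
  return tabs;
}
```

With this sketch, visibleTabs(false) yields both tabs (the DB Console case) and visibleTabs(true) yields only the overview tab. A class component's render() could call it with whatever boolean it obtained from context.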

@adityamaru adityamaru force-pushed the job-details-cpu-profile branch from ca793e2 to 34332f0 Compare June 14, 2023 18:15
@adityamaru
Contributor Author

[Screenshot attached: 2023-06-14, 2:10:19 PM]

@adityamaru adityamaru requested a review from maryliag June 14, 2023 18:24
@adityamaru adityamaru force-pushed the job-details-cpu-profile branch from 34332f0 to 2fec7c0 Compare June 14, 2023 23:19
Contributor

@maryliag maryliag left a comment

Reviewed all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @dt)


pkg/ui/workspaces/cluster-ui/src/jobs/jobDetailsPage/jobDetails.tsx line 267 at r1 (raw file):

Previously, adityamaru (Aditya Maru) wrote…

nice, added this check in. I can't seem to add the const at the top level because this is a class component? React shouts at me with:

Invalid hook call. Hooks can only be called inside of the body of a function component.

After you made this change, is the page loading as expected? At least for DB Console


pkg/ui/workspaces/cluster-ui/src/jobs/jobDetailsPage/jobDetails.tsx line 134 at r3 (raw file):

            <InlineAlert
              intent="warning"
              title="This profiles all nodes in the cluster. This operation buffers profiles in memory and should only be run if there is sufficient overhead."

tagging @florence-crl to review the warning message
I don't know if the user will know what "sufficient overhead" means or how much that is, so maybe use something that matches a few other performance warnings we have:

"This operation buffers profiles in memory for all the nodes in the cluster and can potentially negatively impact workload responsiveness."

@florence-crl

I agree with @maryliag. Please use her recommendation for the warning message:
"This operation buffers profiles in memory for all the nodes in the cluster and can potentially negatively impact workload responsiveness."

Contributor Author

@adityamaru adityamaru left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @florence-crl, and @maryliag)


pkg/ui/workspaces/cluster-ui/src/jobs/jobDetailsPage/jobDetails.tsx line 267 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

After you made this change, is the page loading as expected? At least for DB Console

Yup, for the DB Console it's working as expected!


pkg/ui/workspaces/cluster-ui/src/jobs/jobDetailsPage/jobDetails.tsx line 134 at r3 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

tagging @florence-crl to review the warning message
I don't know if the user will know what "sufficient overhead" means or how much that is, so maybe use something that matches a few other performance warnings we have:

"This operation buffers profiles in memory for all the nodes in the cluster and can potentially negatively impact workload responsiveness."

I don't know whether I'd say it negatively impacts workload responsiveness. I've changed it to say:

"This operation buffers profiles in memory for all the nodes in the cluster and can result in increased memory usage."

@florence-crl is this okay?

This change adds a Profiler tab to the job details
page. This change also adds a row that allows collection
of a cluster-wide CPU profile for 5 seconds, of all the
samples corresponding to the job's execution.

Fixes: cockroachdb#102735

Release note (ui change): job details page now has a
profiler tab for more advanced observability into a job's
execution. Currently, we support collecting a cluster-wide
CPU profile of the job.
@adityamaru adityamaru force-pushed the job-details-cpu-profile branch from 2fec7c0 to acd6d23 Compare June 15, 2023 16:07
Contributor

@maryliag maryliag left a comment

Nice work on this!
:lgtm:
I will leave the approval of the message itself to @florence-crl

Reviewed all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @dt and @florence-crl)

@florence-crl

"This operation buffers profiles in memory for all the nodes in the cluster and can result in increased memory usage."
sounds good to me.

@adityamaru
Contributor Author

TFTR!

bors r=maryliag

@craig
Contributor

craig bot commented Jun 15, 2023

Build succeeded:

Development

Successfully merging this pull request may close these issues.

jobsprofiler: enable collection of cluster wide job-specific CPU profiles
4 participants