Ability to obtain debug.zip directly via http endpoint #51008

Closed
celiala opened this issue Jul 6, 2020 · 8 comments
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

@celiala
Collaborator

celiala commented Jul 6, 2020

Is your feature request related to a problem? Please describe.

As a Support Engineer, I want the ability to obtain debug.zip directly, without having to file a request ticket asking an SRE to do this task.

The process for obtaining debug.zip for a cloud customer currently involves many hops, including a manual task that only Cloud SRE can currently do:

The on-call SRE obtains debug.zip by running a cockroach command, which stores the output on the pod that's running CRDB. For cloud customers, when an SRE runs this, the output ends up on persistent disk; the SRE then copies it onto their laptop and sends it to the Support Engineer.

Describe the solution you'd like

The Cloud team would like to add the ability to download debug.zip directly from an HTTP endpoint (which Support could then cURL). [7/21 update: s/Admin UI/HTTP endpoint/, based on comment below]

We heard that the ability to generate debug.zip via a SQL shell command was on the roadmap -- it sounds like @knz is the best person to ask about this? If so, we would love to use/extend this somehow.

@ajwerner suggested hooking the debug zip logic into an HTTP endpoint which the Cloud team could use.
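
A minimal sketch of what such an endpoint might look like, written in Python purely for illustration (CockroachDB itself is written in Go). The path `/_debug/zip`, the handler name, and `collect_debug_files` are hypothetical; no interface was specified in this issue. A real endpoint would also need authentication and would likely stream the archive rather than buffer it in memory.

```python
import io
import zipfile
from http.server import BaseHTTPRequestHandler, HTTPServer


def collect_debug_files() -> dict[str, bytes]:
    """Hypothetical stand-in for whatever the debug zip logic gathers."""
    return {
        "debug/nodes/1/status.json": b'{"node_id": 1}',
        "debug/settings.txt": b"placeholder cluster settings\n",
    }


def build_debug_zip(files: dict[str, bytes]) -> bytes:
    """Pack a name -> content mapping into an in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buf.getvalue()


class DebugZipHandler(BaseHTTPRequestHandler):
    # Hypothetical path; the issue never names the real endpoint.
    ZIP_PATH = "/_debug/zip"

    def do_GET(self) -> None:
        if self.path != self.ZIP_PATH:
            self.send_error(404)
            return
        payload = build_debug_zip(collect_debug_files())
        self.send_response(200)
        self.send_header("Content-Type", "application/zip")
        self.send_header("Content-Disposition", 'attachment; filename="debug.zip"')
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Serve on localhost only; this sketch performs no authentication."""
    HTTPServer(("127.0.0.1", port), DebugZipHandler).serve_forever()
```

With `serve()` running, Support could fetch the archive with something like `curl -o debug.zip http://127.0.0.1:8080/_debug/zip`.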

Reference: Jun 30 Slack thread

Describe alternatives you've considered

Carlo/Ben proposed an alternative, but this requires a lot more net new pieces/unknowns:

Have a SQL-level command like GENERATE ZIP TO ` (similar to the backup command), which could return the zip as a blob and leave storage to the caller.
This would require SRE to do additional work to make this call, then store the result into some bucket (not yet existing), which would be shared for all clusters' debug zips. Each cluster would only have write access to the debug zip bucket. The bucket objects would have a short TTL (12 hours?). We can give the support peeps access to this bucket. We'd create an intrusion endpoint to trigger the command, maybe even available from the super user console.

Reference: Jun 18 Slack thread

Additional context

Bram also mentioned: It would also be cool to have a button to upload it to a write-only S3 bucket so we can get it that way.

@blathers-crl

blathers-crl bot commented Jul 6, 2020

Hi @celiala, I've guessed the C-ategory of your issue and suitably labeled it. Please re-label if inaccurate.

While you're here, please consider adding an A- label to help keep our repository tidy.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jul 6, 2020
@knz knz added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) and removed C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. labels Jul 6, 2020
@knz
Contributor

knz commented Jul 6, 2020

cc @tbg we discussed this today

@celiala
Collaborator Author

celiala commented Jul 6, 2020

cc @piyush-singh @kscurtis @knz -- writing up the issue describing the idea of getting debug.zip directly from the Admin UI.

@knz knz added the A-cli label Jul 6, 2020
@knz
Contributor

knz commented Jul 6, 2020

Thank you!

@celiala
Collaborator Author

celiala commented Jul 21, 2020

Some updates around this request, based on offline syncs:

  1. If the DB team is able to add an HTTP endpoint that the Support team can curl, this would be sufficient for the Support team's needs (i.e. UI work would no longer be needed for this request).
  2. The next step is for Dev Interfaces to clarify the interface for this endpoint.

I will update the description above to reflect the changed request (from point 1).
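
For illustration, the Support-side fetch described in point 1 could be as small as the sketch below. It is written in Python rather than curl so it can be scripted; the function name and any URL passed to it are hypothetical, since the endpoint's interface had not been defined at this point, and authentication is left out entirely.

```python
import os
import shutil
import urllib.request


def download_debug_zip(url: str, dest_path: str, timeout: float = 60.0) -> int:
    """Stream the response body at `url` into `dest_path`; return bytes written."""
    request = urllib.request.Request(url, headers={"Accept": "application/zip"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        with open(dest_path, "wb") as out:
            # Copy in chunks so a large archive is never held fully in memory.
            shutil.copyfileobj(response, out)
    return os.path.getsize(dest_path)
```

A call might look like `download_debug_zip("https://<node-address>/<endpoint-path>", "debug.zip")`, with both placeholders filled in once the interface is settled.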

cc @lunevalex @knz

@celiala celiala changed the title Ability to obtain debug.zip directly via Admin UI Ability to obtain debug.zip directly via http endpoint Jul 21, 2020
@aayushshah15 aayushshah15 self-assigned this Jul 22, 2020
@knz
Contributor

knz commented Jul 30, 2020

The more general case for context: #51454

@celiala
Collaborator Author

celiala commented Aug 12, 2020

Updating this ticket to note recent developments, which reduce the urgency of this ask, at least from the Cloud cluster support perspective:

Previously, the process for obtaining debug.zip for a cloud customer included manual toil for the SRE team, specifically:

The on-call SRE obtains debug.zip by running a cockroach command, which stores the output on the pod that's running CRDB. For cloud customers, when an SRE runs this, the output ends up on persistent disk; the SRE then copies it onto their laptop and sends it to the Support Engineer.

SREs have since automated this process (intrusion PR), so that now SREs can easily obtain the debug.zip by running a command against the intrusion service binary.

This means that the Cloud FE team can extend this work to surface the debug.zip for cloud clusters through the Cloud Internal Dashboard, which the support team can access.

Flagging @tim-o @piyush-singh @knz @lunevalex to see whether there are other use cases for, or stakeholders interested in, this specific task.

Otherwise, with Cloud FE soon being able to solve for support's request, we can probably close this ticket.

@tim-o
Contributor

tim-o commented Aug 12, 2020 via email

@celiala celiala closed this as completed Aug 13, 2020

4 participants