[Bug][Bitbucket Server] Activities collector fails when PR is corrupt #7601

Closed
2 of 3 tasks
sstojak1 opened this issue Jun 10, 2024 · 7 comments
Labels: component/ext, severity/p1, type/bug

Comments

@sstojak1
Contributor

Search before asking

  • I had searched in the issues and found no similar issues.

What happened

The Bitbucket Server data collection task fails because one pull request is corrupted. DevLake throws the following error:
| Retry exceeded 3 times calling rest/api/1.0/projects/{projectKey}/repos/{repoName}/pull-requests/{pullRequestId}/activities. The last error was: Http DoAsync error calling [method:GET path:rest/api/1.0/projects/{projectKey}/repos/{repoName}/pull-requests/{pullRequestId}/activities query:map[limit:[100] state:[all]]]. Response: {"errors":[{"context":null,"message":"'git update-ref --stdin -z --no-deref' exited with code 128 saying: fatal: cannot update ref 'stash-refs/pull-requests/{pullRequestId}/from': trying to write ref 'stash-refs/pull-requests/{pullRequestId}/from' with nonexistent object {commitSHA}","exceptionName":"com.atlassian.bitbucket.scm.CommandFailedException"}]} (500)

What do you expect to happen

I think it would make sense for the data collection to continue even if one pull request is corrupted since we don't want to lose all of those Bitbucket metrics for the repository.

How to reproduce

Our environment is currently in this state, but I am not sure how to reproduce it.

Anything else

No response

Version

v1.0.0-beta11

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@sstojak1 added the type/bug label Jun 10, 2024
@dosubot added the component/ext and severity/p1 labels Jun 10, 2024
@sstojak1
Contributor Author

This is similar to #7258.

dosubot bot commented Jun 10, 2024

To ensure that the Bitbucket Server data collection task in DevLake continues even if one pull request is corrupted, you can configure the Sync Policy to "Skip Failed Tasks." This setting allows the data collection process to skip over any failed tasks, preventing the entire pipeline from failing and ensuring that you do not lose all Bitbucket metrics for the repository.

Here is the relevant section from the document:

### Step 2.3 - Set the Sync Policy
There are three settings for Sync Policy:
- Data Time Range: You can select the time range of the data you wish to collect. The default is set to the past six months.
- Sync Frequency: You can choose how often you would like to sync your data in this step by selecting a sync frequency option or enter a cron code to specify your preferred schedule.
- Skip Failed Tasks: sometimes a few tasks fail in a long pipeline; you can choose to skip them to avoid spending more time running the whole pipeline again.

Additionally, the ConvertPullRequests function in the pr_convertor.go file is designed to handle errors gracefully. The api.NewDataConverter method in the ConvertPullRequests function is set up to manage errors effectively, allowing the data collection process to continue even if some pull requests are not found or are corrupted.

By enabling the "Skip Failed Tasks" option and leveraging the error handling in the ConvertPullRequests function, you can ensure that the data collection process continues even if some tasks fail [1][2].

@d4x1
Contributor

d4x1 commented Jun 14, 2024

@sstojak1
There is an option in the project's sync config; you can give it a try.

@klesh
Contributor

klesh commented Jun 14, 2024

I believe it is already fixed. The fix will be available in the upcoming RC version, which should be ready in a couple of days.

@sstojak1
Contributor Author

@d4x1 This option is already on for all our projects.
Here a single task is failing because one PR is corrupt in Bitbucket. As a result, the rest of the repository's information won't be collected.

@klesh
Are you referring to #7577?
#7577 concerns PR commits and handles 40X error statuses. This error has a 500 status and occurs when collecting PR activities.
If it's OK, we could handle it the way @abeizn did for commits, but match on a 500 status plus the body message com.atlassian.bitbucket.scm.CommandFailedException.
What do you think?
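To make the proposed check concrete, here is a minimal sketch of what "500 + body message" matching could look like. It is written in Python purely for illustration and is not DevLake code: the function name `is_corrupt_pr_response` and the constant `SKIPPABLE_EXCEPTION` are hypothetical, and the only details taken from this issue are the 500 status and the `errors[].exceptionName` field shown in the error response above.

```python
import json

# Hypothetical constant: the exception name reported in this issue's 500 response.
SKIPPABLE_EXCEPTION = "com.atlassian.bitbucket.scm.CommandFailedException"


def is_corrupt_pr_response(status_code: int, body: str) -> bool:
    """Return True only for an HTTP 500 whose JSON body lists the
    CommandFailedException seen in this issue (hypothetical helper)."""
    if status_code != 500:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # A 500 without a parseable JSON body is treated as a genuine server failure.
        return False
    if not isinstance(payload, dict):
        return False
    errors = payload.get("errors")
    if not isinstance(errors, list):
        return False
    # Skip only when at least one reported error carries the expected exception name.
    return any(
        isinstance(err, dict) and err.get("exceptionName") == SKIPPABLE_EXCEPTION
        for err in errors
    )
```

A collector could call such a helper on a failed activities response and skip just that pull request when it returns True, while still failing on any other 500. The match is deliberately narrow, but it still relies on the exception name alone, so it cannot tell a single corrupt PR apart from a wider Git failure on the server.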

@klesh
Contributor

klesh commented Jun 14, 2024

Ahh, 500 errors? I am not sure. A 500 is an Internal Server Error, which might suggest that the server is corrupted or down, so it is hard to say whether skipping the PR is appropriate.
It makes more sense to fix the 500 errors on the Bitbucket server than to ignore them on the DevLake end.

@sstojak1
Contributor Author

You're correct. Deciding whether to skip something based on the message content will be challenging. Resolving the ticket...
