I am using DevLake v1.0.1. I am currently ingesting the commits for a project through the GitLab integration, and then ingesting the deployments for that project via webhook (from Spinnaker). I do not have all-time deployment data for my project, only the last couple of months. After I ran the collect data job, I noticed that in the project_pr_metrics table, all of the commits for the project that occurred before I onboarded to Spinnaker seem to get associated with a "seemingly random" deployment.
It may be easier to visualize this. Take a look at this screenshot. The bottom blue dots represent commits. The top dots represent deployments. The lines represent the commit-to-deployment relationship in the project_pr_metrics table. As you can see, all of my commits that occurred before the beginning of my deployment data seem to ALL become associated with a single deployment, which skews certain metrics badly.
Does anyone have any advice on how I can:
Debug why DevLake chooses to associate these commits with this deploy? For example, why wouldn't it be the first deploy? Or, preferably, how can I avoid creating associations between commits and deployments when the commits occurred before the first deployment?
Work around this issue? One idea I have is to set the blueprint for each project only to ingest data after the second successful production deployment.
Thanks for your time and support!
Screenshots
Additional context
It's worth noting that I have found a similar issue. In this comment, @nicolavolpini mentioned that it "still shows several PRs associated to the same deployment webhook", but no additional context was posted on how this issue could be resolved.
qpawelc changed the title from "[Question][Module Name] Question title" to "[Question][Module Name] Commits before deployments misattributed to random deployment" on Dec 18, 2024
qpawelc changed the title from "[Question][Module Name] Commits before deployments misattributed to random deployment" to "[Question][DORA] Commits before deployments misattributed to random deployment" on Dec 18, 2024
qpawelc changed the title from "[Question][DORA] Commits before deployments misattributed to random deployment" to "[Question][DORA] Commits before first recorded deployments misattributed to random deployment" on Dec 18, 2024
Hey there, @qpawelc! 👋 I'm here to help you out with any bugs, questions, or even becoming a contributor. Let's tackle this together!
To address the issue of commits being associated with a "seemingly random" deployment in DevLake, you can consider the following steps:
Debugging the Association Logic: DevLake associates commits with deployments by computing the commit difference between two consecutive deployments and then connecting each PR/MR to that commit difference through merge_commit or pr_commit [1]. If historical deployment data is missing, DevLake might default to associating older commits with the earliest available deployment. Unfortunately, specific fallback mechanisms for missing deployment data are not documented, so you might need to explore the codebase or logs to understand the exact logic being applied.
Workaround with Blueprint Configuration: You can configure the data ingestion to start after a specific event, such as a deployment, by using the SyncPolicy component. This allows you to set a timeAfter property to limit data collection to a specific time range [2]. This approach can help you avoid associating older commits with recent deployments by only ingesting data after a certain point in time.
Review Related Issues: Since you've found a similar issue in the DevLake repository, it might be helpful to follow up on that issue or any related discussions to see if there have been updates or solutions provided by the community.
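To make the "commit difference between two consecutive deployments" idea concrete, here is a minimal sketch in plain Python. It does not use DevLake code or its schema: the function name, the linear `history` list, and the processing order are all assumptions made for illustration. It only shows that whichever deployment is treated as having no predecessor ends up inheriting every earlier commit.

```python
def diff_between_deployments(history, deployed_shas):
    """Attribute commits to deployments by diffing consecutive deployments.

    history: commit SHAs on one linear branch, oldest first (hypothetical data).
    deployed_shas: SHAs of the deployed commits, in the order they are processed.
    Returns a dict mapping each deployed SHA to the commits attributed to it.
    """
    position = {sha: i for i, sha in enumerate(history)}
    attributed = {}
    prev_index = -1  # no predecessor yet: the diff starts at the beginning of history
    for sha in deployed_shas:
        idx = position[sha]
        # Everything after the previous deployment, up to and including this one.
        attributed[sha] = history[prev_index + 1 : idx + 1]
        prev_index = idx
    return attributed


# Six commits; deployment tracking only started at commit c4.
history = ["c1", "c2", "c3", "c4", "c5", "c6"]
result = diff_between_deployments(history, ["c4", "c5", "c6"])
for deployed_sha, commits in result.items():
    print(deployed_sha, commits)
```

In this toy model the deployment of `c4` picks up `c1` through `c4`, while the later deployments get one commit each. If the real processing order is not strictly chronological, the deployment that absorbs the backlog would not necessarily be the earliest one, which may be worth checking when debugging.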
By implementing these steps, you should be able to better manage the commit-to-deployment associations and mitigate the skewed metrics issue.
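As a rough aid for picking a cutoff for the time-range workaround, the sketch below computes the timestamp of the n-th recorded deployment and splits PRs into those merged on or after it and those merged before it. It does not call DevLake; the row shape (`id`, `merged_at`) and the function name are hypothetical, and the resulting timestamp is only a candidate value to use as the collection start time.

```python
from datetime import datetime


def cutoff_after_nth_deployment(deployment_times, n=2):
    """Return the finish time of the n-th earliest deployment (1-based)."""
    ordered = sorted(deployment_times)
    if len(ordered) < n:
        raise ValueError("not enough deployments to compute a cutoff")
    return ordered[n - 1]


def split_prs(pr_rows, cutoff):
    """Split PR rows into (kept, dropped) around the cutoff timestamp."""
    kept = [row for row in pr_rows if row["merged_at"] >= cutoff]
    dropped = [row for row in pr_rows if row["merged_at"] < cutoff]
    return kept, dropped


# Hypothetical sample data, not taken from any real project.
deployment_times = [
    datetime(2024, 10, 15),
    datetime(2024, 10, 1),
    datetime(2024, 11, 1),
]
pr_rows = [
    {"id": "pr-1", "merged_at": datetime(2024, 6, 1)},
    {"id": "pr-2", "merged_at": datetime(2024, 10, 10)},
    {"id": "pr-3", "merged_at": datetime(2024, 10, 20)},
]

cutoff = cutoff_after_nth_deployment(deployment_times, n=2)
kept, dropped = split_prs(pr_rows, cutoff)
print("cutoff:", cutoff.isoformat())
print("kept:", [row["id"] for row in kept])
print("dropped:", [row["id"] for row in dropped])
```

Using the second deployment rather than the first mirrors the idea raised in the question: the first recorded deployment has no known predecessor, so PRs attributed to it are the least trustworthy.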