chore: Establish a baseline for the number of envs disabled per project #6807
❌ Code Health Quality Gates: FAILED
- Declining Code Health: 1 findings(s) 🚩
const projectEnvironmentsDisabled = createCounter({
    name: 'project_environments_disabled',
    help: 'How many "environment disabled" events we have received for each project',
    labelNames: ['project_id'],
});
❌ Getting worse: Complex Method
MetricsMonitor.startMonitoring already has high cyclomatic complexity, and now it increases in Lines of Code from 522 to 530
LGTM. Just a small comment around label name convention.
const projectEnvironmentsDisabled = createCounter({
    name: 'project_environments_disabled',
    help: 'How many "environment disabled" events we have received for each project',
    labelNames: ['project_id'],
Hmm. I see that we have multiple variants of label names in here. On line 97 we use camelCase, but on the SDK version counter we use snake_case. I wish we had kept to one convention, but I guess it's too late for that now. The question is: which convention do we want to stick with going forward?
Good point! I chose snake_case for two reasons:
- I didn't see line 97. I stopped when I found the ones for sdk_version.
- It appears that the rest of Prometheus (e.g. the name of the metric) uses snake_case ("project_environments_disabled"), so I figured we should be consistent across metric names and label names.
And on that last point, I suggest that we stay with snake_case so that metric names and label names use the same format.
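If we do settle on snake_case, a small check could keep new label names consistent. This is only a sketch: `isSnakeCase` and `nonSnakeCaseLabels` are hypothetical helpers that do not exist in the codebase, and `projectId` is just an illustrative camelCase name, not necessarily what line 97 uses.

```typescript
// Hypothetical helpers, not part of this PR or of any Prometheus client library.
// Matches lowercase words separated by single underscores, e.g. "project_id".
const SNAKE_CASE = /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/;

function isSnakeCase(name: string): boolean {
    return SNAKE_CASE.test(name);
}

// Returns the label names that break the snake_case convention.
function nonSnakeCaseLabels(labelNames: string[]): string[] {
    return labelNames.filter((name) => !isSnakeCase(name));
}

// 'projectId' is an illustrative camelCase label, used here as an assumption.
const offenders = nonSnakeCaseLabels(['project_id', 'sdk_version', 'projectId']);
console.log(offenders);
```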
This PR adds a counter in Prometheus for counting the number of "environment disabled" events we get per project. The purpose of this is to establish a baseline for one of the "project management UI" project's key results.
## On gauges vs counters
This PR uses a counter. Using a gauge would give you the total number of envs disabled, not the number of disable events. The difference is subtle, but important.
For projects that were created before the new feature, the gauge might be appropriate. Because each disabled env requires at least one disable event, the gauge gives us a floor for how many events were triggered for each project.
However, for projects created after we introduce the planned change, we're not interested in the total envs anymore, because you can disable a hundred envs on creation with a single action. In this case, a gauge showing 100 disabled envs would be misleading, because it didn't take 100 events to disable them.
So the interesting metric here is how many times you specifically disabled an environment in project settings, hence the counter.
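To make the difference concrete, here is a minimal in-memory sketch. It is not the code in this PR and does not use the Prometheus client: `DisableMetrics` and the project names are hypothetical, and it assumes that one user action counts as one event no matter how many environments it disables.

```typescript
// Hypothetical in-memory model of the two metric types; not the PR's implementation.
class DisableMetrics {
    // Counter: number of disable events per project. It only ever goes up.
    private readonly eventCount = new Map<string, number>();
    // Gauge source: the environments currently disabled per project.
    private readonly disabledEnvs = new Map<string, Set<string>>();

    // Assumption: a single action is one event, however many envs it disables.
    disable(projectId: string, environments: string[]): void {
        this.eventCount.set(projectId, this.counter(projectId) + 1);
        const current = this.disabledEnvs.get(projectId) ?? new Set<string>();
        environments.forEach((env) => current.add(env));
        this.disabledEnvs.set(projectId, current);
    }

    // Re-enabling lowers the gauge but leaves the counter untouched.
    enable(projectId: string, environment: string): void {
        this.disabledEnvs.get(projectId)?.delete(environment);
    }

    counter(projectId: string): number {
        return this.eventCount.get(projectId) ?? 0;
    }

    gauge(projectId: string): number {
        return this.disabledEnvs.get(projectId)?.size ?? 0;
    }
}

const metrics = new DisableMetrics();

// Older project: environments disabled one at a time in project settings.
metrics.disable('older-project', ['development']);
metrics.disable('older-project', ['staging']);

// Newer project: a hundred environments disabled with a single action.
const hundredEnvs: string[] = [];
for (let i = 0; i < 100; i++) {
    hundredEnvs.push(`env-${i}`);
}
metrics.disable('newer-project', hundredEnvs);

console.log(metrics.counter('older-project'), metrics.gauge('older-project')); // 2 2
console.log(metrics.counter('newer-project'), metrics.gauge('newer-project')); // 1 100
```

For the older project the two numbers agree, which is why the gauge works as a floor there. For the newer project the gauge reads 100 even though only one action took place.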
## Assumptions and future plans
To make this easier on ourselves, we make the following assumption: people primarily disable envs **when creating a project**.
This means that there might be a few lagging indicators granting some projects a smaller number of events than expected, but we may be able to filter those out.
Further, if we had a metric for each project and its creation date, we could correlate that with the metrics to answer the question "how many envs do people disable in the first week? Two weeks? A month?". Or worded differently: after creating a project, how long does it take for people to configure environments?
Similarly, gathering that data would make it much easier to filter out events for projects created **after** the new changes have been released.
The good news: Because the project creation metric with dates is a static aggregate, it can be applied at any time, even retroactively, to see the effects.
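As a rough illustration of that correlation, the sketch below counts disable events that fall within a given number of days after project creation. Everything in it is an assumption for illustration: the `DisableEvent` shape, the `countEventsWithinDays` helper, the `demo` project and its dates are all made up, and it presumes that timestamped events and creation dates are available as plain data.

```typescript
// Hypothetical data shape; the PR does not define a per-event record like this.
interface DisableEvent {
    projectId: string;
    at: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Counts, per project, the disable events that happened within `days` days
// after the project's creation date. Events for projects without a known
// creation date, or dated before creation, are skipped.
function countEventsWithinDays(
    events: DisableEvent[],
    createdAt: Map<string, Date>,
    days: number,
): Map<string, number> {
    const counts = new Map<string, number>();
    for (const event of events) {
        const created = createdAt.get(event.projectId);
        if (created === undefined) continue;
        const elapsedMs = event.at.getTime() - created.getTime();
        if (elapsedMs < 0 || elapsedMs > days * DAY_MS) continue;
        counts.set(event.projectId, (counts.get(event.projectId) ?? 0) + 1);
    }
    return counts;
}

// Made-up sample data: one project created on 1 January 2024.
const createdAt = new Map<string, Date>([
    ['demo', new Date('2024-01-01T00:00:00Z')],
]);
const events: DisableEvent[] = [
    { projectId: 'demo', at: new Date('2024-01-03T00:00:00Z') }, // day 2
    { projectId: 'demo', at: new Date('2024-01-10T00:00:00Z') }, // day 9
    { projectId: 'demo', at: new Date('2024-02-15T00:00:00Z') }, // day 45
];

const firstWeek = countEventsWithinDays(events, createdAt, 7).get('demo') ?? 0;
const twoWeeks = countEventsWithinDays(events, createdAt, 14).get('demo') ?? 0;
const firstMonth = countEventsWithinDays(events, createdAt, 30).get('demo') ?? 0;
console.log(firstWeek, twoWeeks, firstMonth); // 1 2 2
```

Because the creation dates are only joined in at query time, the same function can be run over events collected before the creation-date data existed.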