Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: add reason tag in failed requests and failed events stats #2430

Merged
merged 7 commits into from
Oct 11, 2022

Conversation

satishrudderstack
Copy link
Contributor

Currently, we have stats for failed requests and failed events. With this change we will also be capturing the reason for the failure. Example reasons are invalidJson, invalidWriteKey

@satishrudderstack satishrudderstack marked this pull request as draft September 13, 2022 03:13
@codecov
Copy link

codecov bot commented Sep 13, 2022

Codecov Report

Base: 45.30% // Head: 43.96% // Decreases project coverage by -1.34% ⚠️

Coverage data is based on head (a91654e) compared to base (60bc1be).
Patch coverage: 25.94% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2430      +/-   ##
==========================================
- Coverage   45.30%   43.96%   -1.35%     
==========================================
  Files         177      187      +10     
  Lines       36542    39108    +2566     
==========================================
+ Hits        16555    17193     +638     
- Misses      18911    20828    +1917     
- Partials     1076     1087      +11     
Impacted Files Coverage Δ
processor/transformer/transformer.go 65.76% <0.00%> (-0.46%) ⬇️
router/batchrouter/batchrouter.go 34.13% <0.00%> (-0.02%) ⬇️
services/filemanager/filemanager.go 53.09% <0.00%> (-1.45%) ⬇️
warehouse/admin.go 3.03% <0.00%> (ø)
warehouse/bigquery/bigquery.go 0.00% <0.00%> (ø)
warehouse/clickhouse/clickhouse.go 0.00% <0.00%> (ø)
warehouse/deltalake/deltalake.go 0.00% <0.00%> (ø)
warehouse/deltalake/query_builder.go 0.00% <0.00%> (ø)
warehouse/errors.go 0.00% <0.00%> (ø)
warehouse/identities.go 1.05% <0.00%> (-0.47%) ⬇️
... and 43 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Member

@lvrach lvrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve, but with a suggestion that would make our stats a bit cleaner.

@@ -1079,7 +1101,7 @@ func (gateway *HandleT) getPayloadAndWriteKey(_ http.ResponseWriter, r *http.Req
if !ok || writeKey == "" {
err = errors.New(response.NoWriteKeyInBasicAuth)
misc.IncrementMapByKey(sourceFailStats, "noWriteKey", 1)
gateway.updateSourceStats(sourceFailStats, "gateway.write_key_failed_requests", map[string]string{"noWriteKey": "noWriteKey", "reqType": reqType})
gateway.updateFailedSourceStats(sourceFailStats, "gateway.write_key_failed_requests", map[string]string{"noWriteKey": "noWriteKey", "reqType": reqType, "reason": "noWriteKeyInBasicAuth"})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some formatting to make the long line more readable:

Suggested change
gateway.updateFailedSourceStats(sourceFailStats, "gateway.write_key_failed_requests", map[string]string{"noWriteKey": "noWriteKey", "reqType": reqType, "reason": "noWriteKeyInBasicAuth"})
gateway.updateFailedSourceStats(sourceFailStats, "gateway.write_key_failed_requests", map[string]string{
"noWriteKey": "noWriteKey",
"reqType": reqType,
"reason": "noWriteKeyInBasicAuth"
})

Comment on lines 691 to +692
gateway.updateSourceStats(sourceSuccessStats, "gateway.write_key_successful_requests", sourceTagMap)
gateway.updateSourceStats(sourceFailStats, "gateway.write_key_failed_requests", sourceTagMap)
gateway.updateFailedSourceStats(sourceFailStats, "gateway.write_key_failed_requests", sourceTagMap)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it peculiar having 3 different metrics that capture similar things, with an introduction of a tag we can simplify things.

What if gateway.write_key_requests had two extra tags:

  • failed: true/false
  • reason - same to what you introduce here.

Using the failed tag above you can get any metric you like.

With this, we need can simplify our code and make our stats more compact.

We still need to keep:

		gateway.updateSourceStats(sourceSuccessStats, "gateway.write_key_successful_requests", sourceTagMap)
		gateway.updateFailedSourceStats(sourceFailStats, "gateway.write_key_failed_requests", sourceTagMap)

For a few release for backwards compatibility, but after that, they can be removed.

Copy link
Contributor Author

@satishrudderstack satishrudderstack Sep 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lvrach The similar metrics exist for gateway events, gateway.write_key_events, gateway.write_key_successful_events, gateway.write_key_failed_events

Let's take this a separate change ?

Also, I think we might need to revisit all the metrics once. I think we need to add rudderstack resourceIDs (sourceID, destinationID, transformationID, trackingPlanID) in the tags wherever possible. Going forward, this would enable us to send more notifications to the customers. With your changes to filter out these tags in free this should not be an issue ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for tags

@satishrudderstack
Copy link
Contributor Author

@lvrach Added a new commit to this PR. 53779e3. Please take a look

@satishrudderstack satishrudderstack marked this pull request as ready for review October 10, 2022 12:32
@@ -244,7 +244,7 @@ func Run(ctx context.Context) int {
}
// TODO: remove as soon as we update the configuration with statsExcludedTags where necessary
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can someone explain this TODO to me?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was discussing the same with @chandumlg ... So the basic idea is that for free customers we don't want to send so many tags to influx as it gets bloated up but essentially we should remove the redundant tags for every metric

@cisse21 cisse21 merged commit a6137a9 into master Oct 11, 2022
@cisse21 cisse21 deleted the chore.reasonInFailedSourceStats branch October 11, 2022 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants