feat: adding _OP_FAILED_KEYS and FAILED_KEYS #422

miki725 · 2024-09-11T14:50:18Z

Followed the steps in the contributor's guide: https://crashoverride.com/docs/other/contributing#filing-the-pull-request
PR title uses semantic commit messages
Filled out the template to a useful degree
Updated CHANGELOG.md if necessary

Issue

When we know the exact reason why IMDS failed, we should make it explicit in the report.

Description

Reports the failure reason when chalk knows exactly why the failure happened. Currently that is when we get an explicit 403, which implies metadata is disabled for that instance, or when we see a readLine timeout, which implies the hop limit was reached. We can't account for other failure modes, so we don't report anything for them.
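The two detectable cases above can be sketched roughly as follows. This is an illustrative Python sketch, not the Nim code in `src/plugins/cloudMetadata.nim`; the function name `classify_imds_failure` and both reason strings are assumptions made for the example.

```python
import socket
from typing import Optional

# Hypothetical reason strings; the wording chalk actually reports may differ.
REASON_DISABLED = "IMDS is disabled for this instance (HTTP 403)"
REASON_HOP_LIMIT = "IMDS read timed out; the hop limit was likely reached"


def classify_imds_failure(
    status: Optional[int] = None,
    error: Optional[BaseException] = None,
) -> Optional[str]:
    """Return a reason only when the failure mode is unambiguous, else None."""
    # An explicit 403 implies the metadata endpoint is disabled.
    if status == 403:
        return REASON_DISABLED
    # A read timeout implies the response never made it back (hop limit).
    if isinstance(error, (TimeoutError, socket.timeout)):
        return REASON_HOP_LIMIT
    # Any other failure mode is not accounted for, so nothing is reported.
    return None
```

Returning `None` for everything else mirrors the description: only failures with a known cause are surfaced.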

Testing

➜ make tests args="test_plugins.py::test_aws_no_imds --logs"

miki725 · 2024-09-11T15:09:08Z

https://app.shortcut.com/crashoverride-1/story/2332/chalk-report-disabled-metadata-endpoint

ee7

Great to surface this information. Looks good, aside from a couple of comments on ensuring the messages are correct.

src/plugins/cloudMetadata.nim

ee7 · 2024-09-11T16:36:33Z

https://app.shortcut.com/crashoverride-1/story/2315/chalk-detect-and-communicate-that-imds-is-disabled

(I took the liberty of marking the later story as a duplicate - please yell at me if we want something else).

ee7

As is being discussed on Slack, I like the idea of a single key whose value is an array of "key that couldn't be reported, and reason" as a long-term design. Otherwise we might end up wanting to add quite a few _FOO_FAILURE_REASON keys. But I think we can handle that later.
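As a rough illustration of that "single key holding an array of key and reason" idea, here is a minimal Python sketch. The entry shape (`key` and `reason` fields), the helper name `record_failed_key`, and the placeholder keys `FOO` and `BAR` are all assumptions; the format chalk ends up using may differ.

```python
from typing import Any, Dict


def record_failed_key(report: Dict[str, Any], key: str, reason: str) -> None:
    """Append a (key, reason) entry to a single generic FAILED_KEYS list."""
    # Hypothetical entry shape, chosen only for this example.
    report.setdefault("FAILED_KEYS", []).append({"key": key, "reason": reason})


report: Dict[str, Any] = {}
record_failed_key(report, "FOO", "metadata endpoint disabled")
record_failed_key(report, "BAR", "hop limit reached")
```

With this shape, a new failure mode only adds another entry to the list rather than requiring a new `_FOO_FAILURE_REASON` key.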

LGTM (though see nit).

This will allow any other future failures to be reported in a generic way.

ee7

I also like that this avoids complexity of whether a user who subscribes to FOO should consider whether they also subscribe to FOO_FAILURE_REASON (which may even be added after FOO, as it would've been for _OP_CLOUD_METADATA_FAILURE_REASON).

Sure, we could imagine behavior like "subscribing to FOO also subscribes to a corresponding FOO_FAILURE_REASON key", and have that behavior configurable by another key, but the current design seems better to me. I suppose a trade-off is that a user currently isn't able to opt-out of particular keys being reported in FAILED_KEYS if they don't care about them, but a user subscribed to FOO probably does often care about that key being missing. And that opt-out could always be added if anyone wanted it.

Looks good, but can we clarify the error description for the hop limit a little further?

src/plugins/cloudMetadata.nim

Co-authored-by: ee7 <[email protected]>

miki725 requested a review from viega as a code owner September 11, 2024 14:50

feat: adding _OP_CLOUD_METADATA_FAILURE_REASON

cb50474

miki725 force-pushed the imds-reason branch from 5b8ac40 to cb50474 Compare September 11, 2024 14:50

ee7 reviewed Sep 11, 2024

src/plugins/cloudMetadata.nim (outdated, resolved)

src/plugins/cloudMetadata.nim (outdated, resolved)

ee7 previously approved these changes Sep 11, 2024

feat: switching to generic FAILED_KEYS

c559642

This will allow any other future failures to be reported in a generic way.

miki725 dismissed ee7’s stale review via c559642 September 12, 2024 00:58

miki725 changed the title ~~feat: adding _OP_CLOUD_METADATA_FAILURE_REASON~~ feat: adding _OP_FAILED_KEYS and FAILED_KEYS Sep 12, 2024

ee7 reviewed Sep 12, 2024

src/plugins/cloudMetadata.nim (outdated, resolved)

docs: better hop limit description

3de2089

Co-authored-by: ee7 <[email protected]>

ee7 approved these changes Sep 12, 2024

miki725 merged commit 356c8c8 into main Sep 16, 2024
4 checks passed

miki725 deleted the imds-reason branch September 16, 2024 14:14
