Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: adding _OP_FAILED_KEYS and FAILED_KEYS #422

Merged
merged 3 commits into from
Sep 16, 2024
Merged

feat: adding _OP_FAILED_KEYS and FAILED_KEYS #422

merged 3 commits into from
Sep 16, 2024

Conversation

miki725
Copy link
Contributor

@miki725 miki725 commented Sep 11, 2024

Issue

when we know exact reason why imds failed we should make it explicit in the report

Description

reports when chalk knows exact reason for the failure. currently thats when we get explicit 403 which implies metadata is disabled for that instance of we see readLine timeout which implies hop limit is reached. Other failure modes we cant account for so dont report anything on

Testing

➜ make tests args="test_plugins.py::test_aws_no_imds --logs"

@miki725 miki725 requested a review from viega as a code owner September 11, 2024 14:50
@miki725
Copy link
Contributor Author

miki725 commented Sep 11, 2024

Copy link
Contributor

@ee7 ee7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to surface this information. Looks good, aside from a couple of comments on ensuring the messages are correct.

src/plugins/cloudMetadata.nim Outdated Show resolved Hide resolved
src/plugins/cloudMetadata.nim Outdated Show resolved Hide resolved
@ee7
Copy link
Contributor

ee7 commented Sep 11, 2024

https://app.shortcut.com/crashoverride-1/story/2315/chalk-detect-and-communicate-that-imds-is-disabled

(I took the liberty of marking the later story as a duplicate - please yell at me if we want something else).

ee7
ee7 previously approved these changes Sep 11, 2024
Copy link
Contributor

@ee7 ee7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As being discussed on Slack, I like the idea of a single key whose value is an array of "key that couldn't be reported, and reason" as a long-term design. Otherwise we might end up wanting to add quite a few _FOO_FAILURE_REASON keys. But I think we can handle that later.

LGTM (though see nit).

this will allow to report any other future failures in a generic way
@miki725 miki725 changed the title feat: adding _OP_CLOUD_METADATA_FAILURE_REASON feat: adding _OP_FAILED_KEYS and FAILED_KEYS Sep 12, 2024
Copy link
Contributor

@ee7 ee7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also like that this avoids complexity of whether a user who subscribes to FOO should consider whether they also subscribe to FOO_FAILURE_REASON (which may even be added after FOO, as it would've been for _OP_CLOUD_METADATA_FAILURE_REASON).

Sure, we could imagine behavior like "subscribing to FOO also subscribes to a corresponding FOO_FAILURE_REASON key", and have that behavior configurable by another key, but the current design seems better to me. I suppose a trade-off is that a user currently isn't able to opt-out of particular keys being reported in FAILED_KEYS if they don't care about them, but a user subscribed to FOO probably does often care about that key being missing. And that opt-out could always be added if anyone wanted it.

Looks good, but can we clarify the error description for the hop limit a little further?

src/plugins/cloudMetadata.nim Outdated Show resolved Hide resolved
@miki725 miki725 merged commit 356c8c8 into main Sep 16, 2024
4 checks passed
@miki725 miki725 deleted the imds-reason branch September 16, 2024 14:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants