Simplify attach process #322

Open
wants to merge 2 commits into
base: main

@guilhem guilhem commented Nov 25, 2024

This pull request focuses on simplifying the volume attachment logic in the ControllerServer by removing redundant checks and related methods. The most important changes include the removal of the checkAttachmentCapacity function and its associated methods, as well as the related test cases.

Context

A CSI caller (such as Kubernetes) should already respect the volume limit reported by NodeGetInfo.
We don't need a second check, with its own logic, at attach time.
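
To illustrate the point, here is a minimal sketch of the check a caller can make from the limit a node advertises. The `nodeInfo` type and `hasRoom` function are hypothetical stand-ins (not the CSI Go bindings or Kubernetes code), and the limit of 7 is an arbitrary example value. In the CSI spec, a `max_volumes_per_node` of zero means the caller decides, which the sketch treats as "no limit advertised".

```go
package main

import "fmt"

// nodeInfo is a hypothetical stand-in for the NodeGetInfo response;
// only the fields relevant to the volume limit are modelled.
type nodeInfo struct {
	NodeID            string
	MaxVolumesPerNode int64
}

// hasRoom reports whether one more volume may be published to the node,
// given how many volumes are already published there.
func hasRoom(info nodeInfo, published int64) bool {
	// Zero (or unset) means the node advertised no limit.
	if info.MaxVolumesPerNode <= 0 {
		return true
	}
	return published < info.MaxVolumesPerNode
}

func main() {
	info := nodeInfo{NodeID: "node-1", MaxVolumesPerNode: 7} // example limit
	fmt.Println(hasRoom(info, 6))
	fmt.Println(hasRoom(info, 7))
}
```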

Motivation

Reduce logic and complexity, and rely more on the Linode API calls themselves.
If there is an error, return it to the user for easier debugging.

Risk

More attach requests may fail, and we may hit a rate limit.
This should not be a problem, as the CSI attacher has backoff logic.
Moreover, the current limit check already makes its own API calls, so it would already have hit the API rate limit if that were a problem.

Possible Mitigations

  • Implementing rate limiting on API calls.
  • Implementing a csiErrorBackoff mechanism, as GCP does.

IMHO, neither is mandatory for the moment.

Simplification of volume attachment logic:

  • Removed the call to checkAttachmentCapacity from the ControllerPublishVolume method in internal/driver/controllerserver.go.
  • Deleted the canAttach function, which was used to determine if an additional volume could be attached to an instance, from internal/driver/controllerserver_helper.go.
  • Removed the checkAttachmentCapacity function, which checked if an instance could accommodate additional volume attachments, from internal/driver/controllerserver_helper.go.

Removal of related test cases:

  • Deleted the TestCheckAttachmentCapacity test function from internal/driver/controllerserver_helper_test.go, which tested the checkAttachmentCapacity logic.

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Pull Request Guidelines:

  1. Does your submission pass tests?
  2. Have you added tests?
  3. Are you addressing a single feature in this PR?
  4. Are your commits atomic, addressing one change per commit?
  5. Are you following the conventions of the language?
  6. Have you saved your large formatting changes for a different PR, so we can focus on your work?
  7. Have you explained your rationale for why this feature is needed?
  8. Have you linked your PR to an open issue?

@guilhem guilhem requested review from a team as code owners November 25, 2024 23:04
@guilhem guilhem force-pushed the simpleattach branch 2 times, most recently from 2bc84b3 to 7449925 Compare November 25, 2024 23:26

codecov bot commented Nov 25, 2024

Codecov Report

Attention: Patch coverage is 75.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 75.07%. Comparing base (17b7d1b) to head (efdefae).

Files with missing lines                      Patch %   Lines
internal/driver/controllerserver_helper.go    75.00%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #322      +/-   ##
==========================================
+ Coverage   74.79%   75.07%   +0.28%     
==========================================
  Files          22       22              
  Lines        2396     2359      -37     
==========================================
- Hits         1792     1771      -21     
+ Misses        499      489      -10     
+ Partials      105       99       -6     

@guilhem guilhem changed the title Simpleattach Simplify attach process Nov 26, 2024
switch {
case strings.Contains(apiErr.Message, "is already attached"):
	return errAlreadyAttached
case strings.Contains(apiErr.Message, "Maximum number of block storage volumes are attached to this Linode"):
Contributor

OK, but this seems brittle if the API changes its message text. How can we check for this error without relying on the exact wording of an API error message?

Contributor Author

That's an excellent question.
The API doesn't document its errors:
[image]
And linodego reports the response as is.

But note that this logic already existed in the CSI driver :/

if errors.As(err, &apiErr) && strings.Contains(apiErr.Message, "is already attached") {

Contributor

Yeah def not the ideal way to do it. Hopefully linodego can provide better error codes in future so we don't have to build logic based on the err strings :/

@@ -677,33 +650,6 @@ func (cs *ControllerServer) getInstance(ctx context.Context, linodeID int) (*lin
	return instance, nil
}

Contributor

Is there really no value in having these checks client side at all?

Contributor Author

@guilhem guilhem Nov 26, 2024

I tend to prefer relying on the API's internal logic and validation, with the CSI driver acting as a simple pass-through.

The problem I see could be rate limiting on the API side (GET vs. POST).
But the attacher already backs off on error:
https://github.com/kubernetes-csi/external-attacher?tab=readme-ov-file#csi-error-and-timeout-handling

Still, that may be questionable, or we may have to wait for linodego to implement rate limiting.
That's why this PR was in draft state at first.

@@ -34,6 +34,10 @@ var (
	// attachments allowed for the instance, call errMaxVolumeAttachments.
	errMaxAttachments = status.Error(codes.ResourceExhausted, "max number of volumes already attached to instance")

	// errAlreadyAttached is used to indicate that a volume is already attached
	// to a Linode instance.
	errAlreadyAttached = status.Error(codes.FailedPrecondition, "volume is already attached")
Contributor Author

Before merging, I want to wait for an answer on this issue
kubernetes-csi/external-attacher#604

Comment on lines +672 to +673
case strings.Contains(apiErr.Message, "is already attached"):
	return errAlreadyAttached
Contributor

Question: do we only get this error when the volume is attached to another Linode?

I think we should pass along the specific error message from the API, which could include the Linode ID of the other node. This would help with debugging issues in the future.

case strings.Contains(apiErr.Message, "is already attached"):
	return errAlreadyAttached
case strings.Contains(apiErr.Message, "Maximum number of block storage volumes are attached to this Linode"):
	return errMaxAttachments
Contributor

Similar to my suggestion in the other comment: we should try to pass along the error message from the API here too.

3 participants