Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tunable timeouts in provisioner #451

Closed
saikat-royc opened this issue Jul 1, 2020 · 17 comments
Closed

tunable timeouts in provisioner #451

saikat-royc opened this issue Jul 1, 2020 · 17 comments

Comments

@saikat-royc
Copy link

saikat-royc commented Jul 1, 2020

In current code we have a single timeout flag which is used for short csi calls like Probe(), GetDriverName() [https://github.com/kubernetes-csi/external-provisioner/blob/master/cmd/csi-provisioner/csi-provisioner.go#L160] and also for the CSI provisioning calls (which is potentially a long operation)
https://github.com/kubernetes-csi/external-provisioner/blob/master/cmd/csi-provisioner/csi-provisioner.go#L233

The proposal is the following:
Use a relative small timeout say current default of 10 secs, for short CSI calls like probe, driverinfo, and use the existing timeout flag to make CSI provisioning calls (as it exists in code today). Thus we make the short calls efficient, and still keep the number of tunable flags to 1.

Similar behavior seen in code for other csi sidecars like snapshotter, attacher.
CSI Drivers using the sidecars can tune only the timeouts for the csi calls (provisioning, attach, snapshot etc)

@saikat-royc saikat-royc changed the title Timeouts in provisioner tunable timeouts in provisioner Jul 1, 2020
@saikat-royc
Copy link
Author

/cc @msau42 @xing-yang

@msau42
Copy link
Collaborator

msau42 commented Jul 1, 2020

I think it's reasonable to use a short, unconfigurable timeout for short calls like probe, getdrivername.

@xing-yang
Copy link
Contributor

If it becomes unconfigurable, then I think it's better just to use the existing timeout parameter.

@saikat-royc
Copy link
Author

@xing-yang can you please elaborate. I didnt understand your comment completely.
The proposal is to use the existing "timeout" flag for only provisioning specific csi calls. (so individual csi drivers can configure it in their own deployments). For short csi calls like driver info , probe we use a default value of say 10 secs. (not exposed for tuning
via a flag).

Is that what you are suggesting too?

@xing-yang
Copy link
Contributor

Hi @saikat-royc, I meant we should either have a separate configurable timeout flag for short CSI calls like what you suggested, or just keep the status quo. I prefer to make them configurable, not using fixed values.

@msau42
Copy link
Collaborator

msau42 commented Jul 7, 2020

@xing-yang I don't see short calls like Probe() and GetDriverName() dependent on the storage system characteristics. I would like to avoid a proilferation of arguments if possible.

@saikat-royc
Copy link
Author

Had a discussion with @msau42 and @xing-yang about the trade offs of using a separate tunable or static value for probes.

Using a small timeout value, if the driver had issues, debugging becomes easier. However the functionality would still be stuck irrespective of the timeout values, and the side car will not be able to serve requests until the driver is ready.
Evaluating pros and cons of adding a new timeout value and making it tunable (thereby increasing the number of flags), we have decided to keep the existing behavior. A single "timeout" tunable flag value to control all CSI calls.

If any specific need arises for a separate timeout for probing , we can re-open this issue. Closing this issue.

@venkatsc
Copy link

venkatsc commented Jul 22, 2020

10 seconds timeout seems to be not enough if docker dns cache is not enabled. On kubespray cluster with k8s v1.18.5, requests are failing with timeout if we try to talk to our api with hostname.

Requested volume is created in storage backend but PVC is stuck in pending forever and PV is not created. Configuring API with IP instead of HOSTNAME worked fine. Therefore, I would like to request to increase the default timeout and provide configuration option.

@msau42
Copy link
Collaborator

msau42 commented Jul 22, 2020

@venkatsc there around is a "--timeout" option you can configure for your driver.

@msau42
Copy link
Collaborator

msau42 commented Jul 22, 2020

@xing-yang @saikat-royc I had another thought. Since 10s seems too small, but we can't really decide on a good default, what if we don't default anything and make the argument required?

@xing-yang
Copy link
Contributor

That means many drivers would fail at start time after upgrading to the new sidecar? That's not a good experience.
On the other hand, making it required would make driver owner aware of this setting.

@msau42
Copy link
Collaborator

msau42 commented Jul 22, 2020

Yes but we already are doing a major release and breaking a lot of parameters already, so this is the best time to do it.

@xing-yang
Copy link
Contributor

I'm fine with it. Just need to make sure we document this breaking change.

@jsafrane
Copy link
Contributor

If we're discussing DNS and its timeouts here, it's a property of cluster / its network / container runtime. How can a driver vendor set it better than us?

@venkatsc
Copy link

@venkatsc there around is a "--timeout" option you can configure for your driver.

I did try with "--timeout" but it did not work as this option required "time" type. What value/type does this option take? It could not take integer. Also, it is not very clear if the given value for this option is seconds/milliseconds.

@jsafrane
Copy link
Contributor

It takes golang's duration spec, --timeout 10s or --timeout 1h

@pohly
Copy link
Contributor

pohly commented Aug 3, 2020 via email

kbsonlong pushed a commit to kbsonlong/external-provisioner that referenced this issue Dec 29, 2023
…go_modules/google.golang.org/grpc-1.56.0

Bump google.golang.org/grpc from 1.55.0 to 1.56.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants