This repository has been archived by the owner on Nov 5, 2021. It is now read-only.

[Bug] Slow startup outside of public clouds #417

Closed
networkop opened this issue Jun 23, 2020 · 7 comments

@networkop
Contributor

What happened: When cloudprober runs outside of GCP or AWS, startup is slow, taking at least 20 seconds.

What you expected to happen: cloudprober should be able to start in under a second.

How to reproduce it (as minimally and precisely as possible): build the cloudprober binary and run it on a laptop

Anything else we need to know?: This is caused by the md.Available() call in sysvars_ec2.go, which tries to contact EC2's metadata service. The GCP check returns faster because the DNS lookup of metadata.google.internal fails quickly. Perhaps EC2 code can be refactored to check for ec2 string in product_uuid as documented here? <- this would require sudo privileges

@networkop networkop changed the title Slow startup outside of public clouds [BUG] Slow startup outside of public clouds Jun 23, 2020
@networkop networkop changed the title [BUG] Slow startup outside of public clouds [Bug] Slow startup outside of public clouds Jun 23, 2020
@manugarg
Contributor

@networkop Thanks for noticing and debugging this.

Perhaps EC2 code can be refactored to check for ec2 string in product_uuid as documented here? <- this would require sudo privileges

Yeah, I've seen this before; this option looks good to me.

This doesn't seem to require root privileges though. I gave it a try:
[ec2-user@ip-172-31-34-75 ~]$ id
uid=1000(ec2-user) gid=1000(ec2-user) groups=1000(ec2-user),4(adm),190(systemd-journal) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

[ec2-user@ip-172-31-34-75 ~]$ cat /sys/hypervisor/uuid
ec2f5c3e-9c7e-42c7-2ced-7e5884805967

@networkop
Contributor Author

Interesting, maybe it's OS-specific. I'm on Debian 10 and I don't have /sys/hypervisor/uuid, but I do have /sys/devices/virtual/dmi/id/product_uuid, which can only be read by root:

admin@ip-10-100-0-174:~$ cat /etc/*release* | grep -i name
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_CODENAME=buster
admin@ip-10-100-0-174:~$ uname -a
Linux ip-10-100-0-174 4.19.0-9-cloud-amd64 #1 SMP Debian 4.19.118-2 (2020-04-29) x86_64 GNU/Linux
admin@ip-10-100-0-174:~$ id
uid=1000(admin) gid=1000(admin) groups=1000(admin),4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),113(netdev)
admin@ip-10-100-0-174:~$ cat /sys/hypervisor/uuid
cat: /sys/hypervisor/uuid: No such file or directory
admin@ip-10-100-0-174:~$ cat /sys/devices/virtual/dmi/id/product_uuid
cat: /sys/devices/virtual/dmi/id/product_uuid: Permission denied
admin@ip-10-100-0-174:~$ sudo cat /sys/devices/virtual/dmi/id/product_uuid
ec2cbb79-3ff1-ab56-5b59-1c6a3dc05c2c
admin@ip-10-100-0-174:~$ 
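
The two sessions above suggest a check that tries both sysfs files and treats an unreadable or missing file as "unknown" rather than as an error. The following is only a minimal sketch of that idea, not cloudprober's actual code: the names `uuidPaths`, `isEC2UUID` and `looksLikeEC2` are made up for illustration, and the "ec2" prefix test is a heuristic taken from the UUIDs shown in this thread. Without root it would still report false on the Debian 10 host above, because product_uuid cannot be read there.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// uuidPaths lists the two sysfs files mentioned in this thread.
// (Hypothetical variable name, used only for this sketch.)
var uuidPaths = []string{
	"/sys/hypervisor/uuid",
	"/sys/devices/virtual/dmi/id/product_uuid",
}

// isEC2UUID reports whether a UUID string starts with "ec2",
// ignoring case and surrounding whitespace.
func isEC2UUID(uuid string) bool {
	return strings.HasPrefix(strings.ToLower(strings.TrimSpace(uuid)), "ec2")
}

// looksLikeEC2 returns true if any readable file in paths holds a UUID
// that starts with "ec2". Files that are missing or not readable
// (for example product_uuid without root) are skipped.
func looksLikeEC2(paths []string) bool {
	for _, p := range paths {
		b, err := os.ReadFile(p)
		if err != nil {
			continue // missing file or permission denied: try the next one
		}
		if isEC2UUID(string(b)) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("looks like EC2:", looksLikeEC2(uuidPaths))
}
```

Because the check only reads local files, it returns immediately on a laptop, which is the property the issue is after. A false result is not conclusive, so a real implementation would probably still need a fallback.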

@manugarg
Contributor

Ah, that complicates things.

Apparently we can disable retries in the Available method:
aws/aws-sdk-go#582

We can use 0 or some other small number of retries.

@networkop
Contributor Author

Yeah, setting it to 0 is an easy win. I'll be much happier with 5 seconds instead of 20.
What do you think about a flag that disables all these cloud checks completely? This would really help with local development if your build toolchain watches your source files and rebuilds them on the fly (e.g. I'm using tilt) - the feedback loop will be near-instant.
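
A flag like this could be as simple as a boolean that short-circuits every cloud check before it runs. The sketch below assumes a hypothetical flag name, `skip_cloud_checks`, and hypothetical `check` and `detectCloud` helpers. It does not reflect the flag that was eventually added to cloudprober, and the two detectors are stubs.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// check pairs a cloud name with a detection function that may be slow
// when run outside that cloud. (Hypothetical type for this sketch.)
type check struct {
	name   string
	detect func() bool
}

// detectCloud returns the name of the first check that succeeds.
// When skip is true no detector is called and "" is returned at once.
func detectCloud(skip bool, checks []check) string {
	if skip {
		return ""
	}
	for _, c := range checks {
		if c.detect() {
			return c.name
		}
	}
	return ""
}

func main() {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	// Hypothetical flag name, chosen only for illustration.
	skip := fs.Bool("skip_cloud_checks", false, "skip all cloud metadata checks")
	if err := fs.Parse(os.Args[1:]); err != nil {
		return // the FlagSet has already printed the parse error
	}

	// Stub detectors standing in for the real GCE and EC2 checks.
	checks := []check{
		{"gce", func() bool { return false }},
		{"ec2", func() bool { return false }},
	}
	fmt.Printf("cloud=%q\n", detectCloud(*skip, checks))
}
```

Running the example with `-skip_cloud_checks` returns before any detector is invoked, which is what would make a watch-and-rebuild loop feel instant.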

@networkop
Contributor Author

so are you ok for me to submit a PR for this?
should I do both things?
a) decrease AWS retries and
b) add a new flag
TBH, if the option with the flag is ok with you, then decreasing AWS retries may not be needed at all.

@manugarg
Contributor

Sure. Sounds good to me.

Let's reduce retries for AWS regardless, or we can do that separately. I think it makes sense to optimize the default behavior as well.

@networkop
Contributor Author

closed in #427
