This repository has been archived by the owner on Apr 4, 2018. It is now read-only.

aws/gce: Standardise LB health check configs #48

Merged: 1 commit into master from standardise_health_checks, May 14, 2015


dcarley
Contributor

@dcarley dcarley commented May 12, 2015

Standardise the configuration of all load balancer health checks on AWS and
GCE, using variables so that they are always the same. Most of these values
have been reduced because new instances were taking a very long time to come
into service.

Change the following:

  • interval to 5s which is the minimum supported by AWS. This has reduced AWS
    from 30s and increased GCE from 1s.
  • timeout to 2s which is the minimum supported by AWS. This has reduced AWS
    from 5s and increased GCE from 1s.
  • healthy threshold to 2 requests. This has not changed AWS or GCE.
  • unhealthy threshold to 2 requests. This has changed AWS from 10 and not
    changed GCE.
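
As a rough illustration of why the old AWS values made new instances slow to come into service, the sketch below multiplies each check interval by the relevant threshold. This is only an approximation: it assumes a state change needs that many consecutive checks spaced one interval apart, and it ignores timeouts, request latency, and provider-specific behaviour. The function and dictionary names are hypothetical; the numbers come from the list above.

```python
# Values taken from the description above; the formula is an approximation.
CONFIGS = {
    "AWS before": {"interval": 30, "healthy": 2, "unhealthy": 10},
    "AWS after": {"interval": 5, "healthy": 2, "unhealthy": 2},
    "GCE before": {"interval": 1, "healthy": 2, "unhealthy": 2},
    "GCE after": {"interval": 5, "healthy": 2, "unhealthy": 2},
}


def seconds_to_change_state(interval, threshold):
    """Estimate seconds for `threshold` consecutive checks, one per interval."""
    return interval * threshold


# Map each config name to (seconds to come into service, seconds to be removed).
results = {
    name: (
        seconds_to_change_state(c["interval"], c["healthy"]),
        seconds_to_change_state(c["interval"], c["unhealthy"]),
    )
    for name, c in CONFIGS.items()
}

for name, (up, down) in results.items():
    print(f"{name}: ~{up}s to come into service, ~{down}s to be taken out")
```

Under these assumptions the AWS estimate drops from about 60s to about 10s to come into service, while GCE rises from about 2s to about 10s, which is the trade-off of using one set of values for both.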

I'm not 100% confident about the values. They weren't thoroughly tested when
we first introduced them for GCE and I suspect we might want to experiment
with them in the future, but this is a good start.

I've changed the target for API on AWS from the default of TCP:8080 to
HTTP:8080/info in order to match GCE and give a more accurate check. Other
targets remain as their defaults but we have to pass them because they're
mandatory. They don't match GCE because GCE can't do TCP health checks.
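
To illustrate why an HTTP target can be a more accurate check than a TCP one, here is a minimal local sketch. It is not the load balancers' actual implementation: `BrokenApp`, `tcp_check`, and `http_check` are hypothetical names, and the sketch assumes that a check counts as healthy only when the response status is 200. The stand-in app accepts connections but answers every request with a 500, so a TCP-style check passes while an HTTP-style check on `/info` fails.

```python
import socket
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class BrokenApp(BaseHTTPRequestHandler):
    """Hypothetical app that accepts connections but fails every request."""

    def do_GET(self):
        self.send_response(500)
        self.end_headers()

    def log_message(self, *args):
        pass  # suppress request logging to keep output quiet


def tcp_check(host, port, timeout=2.0):
    """Healthy if a TCP connection can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(host, port, path, timeout=2.0):
    """Healthy only if GET on the path returns status 200 within the timeout."""
    url = f"http://{host}:{port}{path}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # HTTPError (e.g. a 500 response) is a subclass of URLError.
        return False


# Bind to port 0 so the OS picks a free port for this local demonstration.
server = HTTPServer(("127.0.0.1", 0), BrokenApp)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

tcp_ok = tcp_check("127.0.0.1", port)
http_ok = http_check("127.0.0.1", port, "/info")

server.shutdown()
server.server_close()

print(f"TCP check healthy: {tcp_ok}, HTTP check healthy: {http_ok}")
```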

This will not apply cleanly to existing GCE environments due to a bug in
Terraform. This should be fixed in the future by hashicorp/terraform#1894.
But for the time being I think it's important enough that we should delete
existing forwarding rules, target pools, and health checks, then let
Terraform recreate them with the correct config.

@dcarley dcarley force-pushed the standardise_health_checks branch from dcb7de6 to caaec5e Compare May 13, 2015 10:18
@annashipman
Contributor

👍 This is great. I particularly like pulling out the variables and making it consistent.

I agree that it's fine just to start with these values and see how we get on.

How do you suggest we merge/roll this out given that it won't apply cleanly?

@dcarley
Contributor Author

dcarley commented May 14, 2015

> How do you suggest we merge/roll this out given that it won't apply cleanly?

I guess we merge and let people sort out their own environments if they have one that they can't destroy right now. This contradicts what I said in another PR, and I don't think we'd want to do this regularly, but now is as good a time as any while we don't have a shared environment.

@annashipman
Contributor

OK. I'll email to alert the team.

@annashipman
Contributor

Haven't heard this is a problem for anyone, so merging.

annashipman added a commit that referenced this pull request May 14, 2015
aws/gce: Standardise LB health check configs
@annashipman annashipman merged commit 2fa39f4 into master May 14, 2015
@annashipman annashipman deleted the standardise_health_checks branch May 14, 2015 12:05