
google_app_engine_flexible_app_version seems to time out at 10m with 500 error #6194

Closed
bharathkkb opened this issue Apr 24, 2020 · 7 comments

@bharathkkb


Terraform Version

Terraform v0.12.20

  • provider.archive v1.3.0
  • provider.google v3.18.0
  • provider.google-beta v3.18.0
  • provider.null v2.1.2
  • provider.random v2.2.1

Affected Resource(s)

  • google_app_engine_flexible_app_version

Terraform Configuration Files

data "google_project" "project" {
  project_id = var.project_id
}

resource "google_app_engine_application" "app" {
  depends_on  = [var.appengine_depends_on]
  project     = var.project_id
  location_id = var.location_id
}

resource "google_project_iam_member" "gae_api" {
  for_each = toset(concat(["roles/compute.networkUser","roles/storage.objectViewer"], var.gae_service_account_roles))
  project  = var.project_id
  role     = each.value
  member   = "serviceAccount:service-${data.google_project.project.number}@gae-api-prod.google.com.iam.gserviceaccount.com"
}

resource "google_app_engine_flexible_app_version" "app" {
  depends_on = [var.appengine_depends_on, google_project_iam_member.gae_api]
  version_id = "v1"
  project    = var.project_id
  runtime    = "custom"

  deployment {
    container {
      image = var.container_image
    }
  }

  liveness_check {
    path = "/"
  }

  readiness_check {
    path = "/"
  }

  env_variables = var.env_vars

  network {
    name         = var.network_name
    subnetwork   = var.subnetwork_name
    instance_tag = "tf-aef"
  }

  automatic_scaling {
    cool_down_period    = "120s"
    min_total_instances = 2
    max_total_instances = 20
    cpu_utilization {
      target_utilization = 0.5
    }
  }

  noop_on_destroy = true
}

Debug Output

https://gist.github.com/bharathkkb/2c0801fc8c1840de1a60d0ea694cb59b

Panic Output

N/A

Expected Behavior

Either the resource is created, or an actionable error is returned.

Actual Behavior

No actionable error is returned, only:
Error creating FlexibleAppVersion: googleapi: Error 500: Internal error encountered.

Steps to Reproduce

  1. terraform apply

Important Factoids

  • There is an error in the debug logs that says
    produced an invalid plan for module.appengine.google_app_engine_flexible_app_version.app, but we are tolerating it

  • It could be that one of the values I am supplying is wrong, but I would expect the error to pinpoint it.

References

  • #0000
@ghost ghost added the bug label Apr 24, 2020
@edwardmedia edwardmedia self-assigned this Apr 24, 2020
@edwardmedia (Contributor)

@bharathkkb the paths of your liveness_check and readiness_check are questionable. Check whether that path is valid in your container. Below is HCL that works for me. If you still see problems, could you post the full debug log and the code for your image?

resource "google_app_engine_flexible_app_version" "foo" {
  version_id = "v1"
  service    = "tf6194trysvc1"
  runtime    = "custom"
  runtime_api_version = "1"
  resources {
    cpu       = 1
    memory_gb = 0.5
    disk_gb   = 10
  }
  deployment {
    container {
      image = "us.gcr.io/myproject/appengine/default.20200425t235145:latest"
    }
  }
  liveness_check {
    path = "."
  }
  readiness_check {
    path = "."
  }
  env_variables = {
    port = "8000"
  }
  instance_class = "B1"
  manual_scaling {
    instances = 1
  }
  noop_on_destroy = true
}

@bharathkkb (Author)

@edwardmedia
The paths for the liveness_check and readiness_check were lifted as-is from the docs, and they made sense since the app should respond 200 OK at /. In your example, what does path = "." imply?

I will send full debug log internally as it will be difficult for me to sanitize.

@edwardmedia (Contributor)

edwardmedia commented Apr 27, 2020

@bharathkkb Whether "/" or "." is right depends on which path you want the App Engine health check to call. In my example, I created the image from the nginx sample below and provided the relative location, which is /usr/share/nginx/www.
https://github.com/GoogleCloudPlatform/appengine-custom-runtimes-samples/tree/master/nginx
You might want to review your container implementation and decide what path to provide for the App Engine health check.
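
The advice above can be sketched as an HCL fragment. This is illustrative only: the check path must be a route your container actually answers with HTTP 200, and the timing values here are assumptions, not taken from this thread.

```hcl
# Illustrative fragment only. App Engine issues GET requests against
# "path" on each instance and expects an HTTP 200 response; the timing
# values below are assumed defaults for the sketch, not from the thread.
liveness_check {
  path           = "/"
  check_interval = "30s"
  timeout        = "4s"
}

readiness_check {
  path              = "/"
  app_start_timeout = "300s"
}
```

If the container serves content from a subdirectory (as with the nginx sample's /usr/share/nginx/www layout), the path should point at a location that actually resolves inside the container.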

@edwardmedia (Contributor)

@bharathkkb did you figure out what the problem was in your case?

@bharathkkb (Author)

hi @edwardmedia

We ended up switching over to GKE for unrelated reasons, so I was unable to test your last suggestion.

However, when we switched to GKE we did change our liveness and readiness probes to a different path, /actuator/health, so the original path may have been the cause.

If you are able to reproduce the error, I feel adding some documentation around this could save others time, since the error googleapi: Error 500: Internal error encountered. is not actionable on its own. Feel free to close this out, and thanks for your help!
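
For readers who hit the same 500, the change described above would look roughly like the fragment below in the flexible app version resource. This is a sketch, not a confirmed fix: /actuator/health is the Spring Boot Actuator endpoint mentioned in the comment, and it only works if the container image actually exposes it.

```hcl
# Illustrative fragment: point both checks at a health endpoint the
# container is known to serve. /actuator/health assumes Spring Boot
# Actuator is enabled in the image; substitute whatever route your
# app answers with HTTP 200.
liveness_check {
  path = "/actuator/health"
}

readiness_check {
  path = "/actuator/health"
}
```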

@ghost ghost removed the waiting-response label May 7, 2020
@edwardmedia (Contributor)

@bharathkkb thank you for the reply. I am closing this issue. Feel free to reopen it if you see this issue again.

@ghost ghost locked and limited conversation to collaborators Jun 7, 2020