Bug: very first deployment of RAM stuck due to race on resource manager quota #126

Closed
BrunoReboul opened this issue Feb 5, 2021 · 0 comments · Fixed by #134
Labels: type: bug (Error or flaw in code with unintended results)

BrunoReboul (Owner) commented Feb 5, 2021

Error

  • Deploying all microservice instances at the same time with a ram-vx.y.z-env tag does not guarantee the order in which the instances are deployed (each deployment is designed to be idempotent).
  • setfeeds deployments usually complete in about 1 min 30 s, while deployments based on Cloud Functions complete in about 4 min 30 s.
  • As a result, the real-time triggers are activated while the publish2fs cache has not yet been deployed.
  • The monitor instances are triggered by the real-time flows. As the cache is empty, each execution falls back on querying Resource Manager to resolve org / folder / project IDs into displayNames.
  • Resource Manager quotas are far smaller than the rate of real-time changes on many existing orgs, so the quotas are continuously exhausted.
  • The remaining deployments then all fail, as each deployment needs a couple of queries to Resource Manager to check IAM bindings.
  • This leads to a deadlock.
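
The failure mode above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `quotaLimitedAPI`, `resolveWithFallback`, and `simulate` are hypothetical names, the quota is modelled as a plain counter, and the fallback is assumed not to write its result back to the cache. It is not the RAM code itself.

```go
package main

import (
	"errors"
	"fmt"
)

// errQuotaExhausted is returned once the simulated quota is used up.
var errQuotaExhausted = errors.New("resource manager quota exhausted")

// quotaLimitedAPI is a hypothetical stand-in for Resource Manager:
// each call consumes one unit of a fixed quota.
type quotaLimitedAPI struct {
	remaining int
}

func (a *quotaLimitedAPI) displayName(id string) (string, error) {
	if a.remaining <= 0 {
		return "", errQuotaExhausted
	}
	a.remaining--
	return "name-of-" + id, nil
}

// resolveWithFallback looks in the cache first and, on a miss, falls back
// to the quota-limited API. Assumption: the result is not cached.
func resolveWithFallback(cache map[string]string, api *quotaLimitedAPI, id string) (string, error) {
	if name, ok := cache[id]; ok {
		return name, nil
	}
	return api.displayName(id)
}

// simulate pushes a number of real-time events through the resolver and
// reports how many lookups failed and how much quota is left.
func simulate(cache map[string]string, quota, events int) (failed, quotaLeft int) {
	api := &quotaLimitedAPI{remaining: quota}
	for i := 0; i < events; i++ {
		id := fmt.Sprintf("projects/%d", i%3)
		if _, err := resolveWithFallback(cache, api, id); err != nil {
			failed++
		}
	}
	return failed, api.remaining
}

func main() {
	failed, left := simulate(map[string]string{}, 10, 100)
	fmt.Println("empty cache: failed =", failed, "quota left =", left)

	warm := map[string]string{"projects/0": "a", "projects/1": "b", "projects/2": "c"}
	failed, left = simulate(warm, 10, 100)
	fmt.Println("warm cache: failed =", failed, "quota left =", left)
}
```

With an empty cache, every event consumes quota until none is left, which is the quota that the remaining deployments also depend on. With a populated cache, the API is never called.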

Workaround

  • Delete the CAI feeds.
  • Relaunch all deployments except setfeeds.
  • Manually trigger dumpinventory for orgs, folders, and projects, then wait for the Firestore cache to be populated.
  • Deploy setfeeds as the last microservice.
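
The ordering constraint in these steps can be sketched as follows. The helper names `setfeedsLast` and `cacheReady` are hypothetical, and the `counts` map is an assumed stand-in for whatever check is actually run against the Firestore cache. The service names are the ones mentioned in this issue.

```go
package main

import "fmt"

// setfeedsLast returns the services in their original order, except that
// "setfeeds" is moved to the end, so the real-time feeds are only created
// once everything else is deployed.
func setfeedsLast(services []string) []string {
	ordered := make([]string, 0, len(services))
	var deferred []string
	for _, s := range services {
		if s == "setfeeds" {
			deferred = append(deferred, s)
			continue
		}
		ordered = append(ordered, s)
	}
	return append(ordered, deferred...)
}

// cacheReady reports whether the cache holds at least one entry for each of
// organizations, folders, and projects. The counts map is hypothetical.
func cacheReady(counts map[string]int) bool {
	for _, kind := range []string{"organizations", "folders", "projects"} {
		if counts[kind] == 0 {
			return false
		}
	}
	return true
}

func main() {
	services := []string{"setfeeds", "publish2fs", "monitor", "dumpinventory"}
	fmt.Println(setfeedsLast(services))

	// Projects are not cached yet, so setfeeds should not be deployed.
	fmt.Println(cacheReady(map[string]int{"organizations": 1, "folders": 12, "projects": 0}))
}
```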

Fix

  • Remove the fallback mechanism that queries Resource Manager when the data is not found in the cache, so that the deadlock cannot occur.
  • Simplify the install doc accordingly.
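
A cache-only lookup along these lines might look like the sketch below. It is not the change merged for this issue: `displayNameFromCache` is a hypothetical name, and returning the raw ID on a miss is an assumption, since the issue does not say what the monitor reports when the name is missing.

```go
package main

import "fmt"

// displayNameFromCache resolves an org / folder / project ID using the
// cache only. On a miss it returns the ID itself and false (an assumed
// behaviour); it never calls Resource Manager, so it cannot consume quota.
func displayNameFromCache(cache map[string]string, id string) (string, bool) {
	if name, ok := cache[id]; ok {
		return name, true
	}
	return id, false
}

func main() {
	cache := map[string]string{"folders/123": "engineering"}
	for _, id := range []string{"folders/123", "projects/456"} {
		name, found := displayNameFromCache(cache, id)
		fmt.Printf("%s -> %s (cached: %t)\n", id, name, found)
	}
}
```

Because a miss is handled locally, the rate of real-time events no longer has any effect on the Resource Manager quota, whatever order the instances are deployed in.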
BrunoReboul added the type: bug and util: ramcli labels Feb 5, 2021
BrunoReboul added this to the 2021-01 milestone Feb 5, 2021
BrunoReboul self-assigned this Feb 5, 2021
BrunoReboul added the type: new feature, util: gcb, type: bug, and type: doc labels and removed the type: bug, util: ramcli, type: new feature, and util: gcb labels Feb 5, 2021
BrunoReboul changed the title from "Bug: very first deployment of RAM stuck due to race on resource manager quota" to "moved to config tempate" Feb 5, 2021
BrunoReboul reopened this Feb 18, 2021
BrunoReboul removed the type: doc label Feb 18, 2021
BrunoReboul changed the title from "moved to config tempate" to "Bug: very first deployment of RAM stuck due to race on resource manager quota" Feb 18, 2021
BrunoReboul mentioned this issue Feb 18, 2021