Dev cluster config #71

Merged
merged 1 commit into from
Sep 21, 2023

approxit
Contributor

What I've done:

  • Created golem-cluster-dev.yaml file, with disabled registry stats and enabled project files sync
  • Cleaned up some fields in cluster yaml files
  • Added support for disabling registry stats on image resolution
  • Added default value for auth/ssh_user field
  • Fixed the disable_existing_loggers setting in the logging facility, which had hidden a number of premature exit problems
  • Fixed premature exit of the managed yagna process caused by SIGINT being forwarded from the parent to the child subprocess, which prevented a clean GolemNode shutdown
  • Fixed premature webserver exit on self-shutdown, which made the final exit logs unreachable
  • Added more verbose logs of service lifetime
  • Reformatted code
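
For context on the disable_existing_loggers item, here is a minimal sketch of how that setting can hide log output. It uses only the standard library; the logger name and the helper function are hypothetical and are not taken from this repository's logging config.

```python
import logging
import logging.config


def logger_disabled_after_config(disable_existing: bool) -> bool:
    """Report whether a logger created before dictConfig() ends up disabled.

    Hypothetical helper for illustration only.
    """
    # Module-level loggers are usually created at import time,
    # before the application applies its logging configuration.
    early_logger = logging.getLogger("early_module")  # hypothetical name

    logging.config.dictConfig(
        {
            "version": 1,
            # With True (the default), every logger that already exists and is
            # not named in this config gets disabled, so its records vanish.
            "disable_existing_loggers": disable_existing,
            "handlers": {"console": {"class": "logging.StreamHandler"}},
            "root": {"level": "INFO", "handlers": ["console"]},
        }
    )
    return early_logger.disabled


if __name__ == "__main__":
    print(logger_disabled_after_config(True))
    print(logger_disabled_after_config(False))
```

With the setting left at True, errors logged by early-created loggers never reach any handler, which is one way a premature exit can go unnoticed.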

Notable remarks:

  • Forcibly exiting the webserver can leave the yagna process running in the background
  • Terminating nodes with ray down can race with the autoscaler running in the background, so webserver self-shutdown can be called multiple times and exit too early as a result, which raises an ugly exception on ray down
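
One possible way to make repeated self-shutdown calls harmless is an idempotent guard. The sketch below assumes an asyncio-based webserver; the SelfShutdown class and the fake shutdown routine are hypothetical and are not part of this PR.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple


class SelfShutdown:
    """Hypothetical guard that runs the shutdown routine at most once."""

    def __init__(self, do_shutdown: Callable[[], Awaitable[None]]) -> None:
        self._do_shutdown = do_shutdown
        self._lock = asyncio.Lock()
        self._done = False

    async def request(self) -> bool:
        """Return True if this call performed the shutdown, False otherwise."""
        # The lock makes concurrent callers wait until the first shutdown
        # has finished, so nobody returns while cleanup is still running.
        async with self._lock:
            if self._done:
                return False
            await self._do_shutdown()
            self._done = True
            return True


async def demo() -> Tuple[List[str], List[bool]]:
    calls: List[str] = []

    async def fake_shutdown() -> None:
        # Stand-in for the real cleanup, e.g. stopping the managed process.
        calls.append("shutdown")
        await asyncio.sleep(0.01)

    # Created inside the running loop so the lock binds to the right loop.
    guard = SelfShutdown(fake_shutdown)
    results = await asyncio.gather(
        guard.request(), guard.request(), guard.request()
    )
    return calls, list(results)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Holding the lock for the whole shutdown means later callers return only after cleanup has completed, which is the property the early-exit race described above lacks.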

@approxit approxit merged commit afea9db into main Sep 21, 2023
@approxit approxit deleted the approxit/dev-yaml branch September 21, 2023 09:02

3 participants