🐛Dynamic autoscaling: not scaling beside 1 machine #5026

@sanderegg sanderegg commented Nov 14, 2023

What do these changes do?

This PR fixes the following issue:

  • the dynamic autoscaling service is set up,
  • creating a dynamic service creates a new machine and the service is placed there,
  • creating another service does not create a new machine, and that service stays pending forever.

Root cause:

  • autoscaling analyzes the current cluster and divides it into active, pending, drained and missing nodes,
  • it then tries to assign the "unrunnable" tasks to these 4 types of nodes, in that order,
  • it was not subtracting the current usage of the active nodes, and therefore always concluded that there were enough machines around.
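
The effect described above can be illustrated with a minimal sketch. This is not the code from this PR: `Resources`, `Node`, `assign_tasks` and the `subtract_usage` flag are hypothetical names made up for the illustration, and the greedy placement is an assumption. It only shows why ignoring the usage of an active node makes a full machine look free.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """CPU/RAM amounts (hypothetical model, for illustration only)."""

    cpus: float
    ram: int

    def __sub__(self, other: "Resources") -> "Resources":
        return Resources(self.cpus - other.cpus, self.ram - other.ram)

    def fits(self, required: "Resources") -> bool:
        return self.cpus >= required.cpus and self.ram >= required.ram


@dataclass
class Node:
    total: Resources
    used: Resources  # what the services already running on the node consume


def assign_tasks(
    nodes: list[Node], tasks: list[Resources], *, subtract_usage: bool
) -> list[Resources]:
    """Greedily place tasks on nodes; return the tasks that still need a new machine."""
    # The bug corresponds to subtract_usage=False: the full capacity is taken as free.
    free = [n.total - n.used if subtract_usage else n.total for n in nodes]
    unassigned = []
    for task in tasks:
        for i, available in enumerate(free):
            if available.fits(task):
                free[i] = available - task  # reserve the resources for this task
                break
        else:
            unassigned.append(task)  # no node can take it: scale up
    return unassigned


# One active machine, fully used by the first service.
active = [Node(total=Resources(4, 16), used=Resources(4, 16))]
second_service = Resources(2, 8)

# Without subtracting usage, the second service seems to fit: no scale-up happens.
assert assign_tasks(active, [second_service], subtract_usage=False) == []
# With usage subtracted, the service is left over and a new machine is needed.
assert assign_tasks(active, [second_service], subtract_usage=True) == [second_service]
```

In this sketch the returned list is what would trigger the creation of new machines; with the usage ignored it is always empty once one machine exists, which matches the symptom reported above.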

Testing can only be done directly on AWS machines, which makes it a bit tedious.

To @pcrespov: I will not refactor the filter function yet, as I do not want to break this again now, especially since testing is tedious.

Related issue/s

How to test

Dev Checklist

DevOps Checklist

@sanderegg sanderegg added the a:autoscaling (autoscaling service in simcore's stack) label Nov 14, 2023
@sanderegg sanderegg added this to the 7peaks milestone Nov 14, 2023
@sanderegg sanderegg self-assigned this Nov 14, 2023

codecov bot commented Nov 14, 2023

Codecov Report

Merging #5026 (9f1e5c8) into master (992d8be) will decrease coverage by 13.9%.
The diff coverage is 100.0%.

@@            Coverage Diff            @@
##           master   #5026      +/-   ##
=========================================
- Coverage    80.3%   66.5%   -13.9%     
=========================================
  Files        1250     547     -703     
  Lines       51230   27727   -23503     
  Branches     1107     198     -909     
=========================================
- Hits        41168   18447   -22721     
+ Misses       9830    9230     -600     
+ Partials      232      50     -182     
Flag              Coverage Δ
integrationtests  64.9% <ø> (+13.2%) ⬆️
unittests         98.5% <100.0%> (+19.2%) ⬆️

Files                                                  Coverage Δ
...oscaling/src/simcore_service_autoscaling/models.py  100.0% <100.0%> (ø)
...e_service_autoscaling/modules/auto_scaling_core.py  95.0% <100.0%> (-0.9%) ⬇️
...vice_autoscaling/modules/auto_scaling_mode_base.py  100.0% <100.0%> (ø)
...scaling/modules/auto_scaling_mode_computational.py  100.0% <100.0%> (ø)
...e_autoscaling/modules/auto_scaling_mode_dynamic.py  100.0% <100.0%> (ø)
...ore_service_autoscaling/utils/auto_scaling_core.py  94.2% <100.0%> (-0.2%) ⬇️
...service_autoscaling/utils/computational_scaling.py  100.0% <100.0%> (ø)
...mcore_service_autoscaling/utils/dynamic_scaling.py  100.0% <100.0%> (ø)

... and 948 files with indirect coverage changes

@codeclimate codeclimate bot left a comment

The PR diff size of 5129 lines exceeds the maximum allowed for the inline comments feature.

@sanderegg sanderegg force-pushed the bugfix/dynamic-autoscaling-does-not-scale branch from ce51c58 to 1861d26 Compare November 14, 2023 10:09

@sanderegg sanderegg force-pushed the bugfix/dynamic-autoscaling-does-not-scale branch from 1861d26 to 0d68cf9 Compare November 14, 2023 11:25
@sanderegg sanderegg marked this pull request as ready for review November 14, 2023 11:33
@matusdrobuliak66 matusdrobuliak66 left a comment

thanks 👍

@sanderegg sanderegg force-pushed the bugfix/dynamic-autoscaling-does-not-scale branch from a16894d to ef34e05 Compare November 14, 2023 13:15

@sanderegg sanderegg force-pushed the bugfix/dynamic-autoscaling-does-not-scale branch from ef34e05 to d20e242 Compare November 14, 2023 13:33

sonarcloud bot commented Nov 14, 2023

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
0 Code Smells (rating A)

No Coverage information
1.6% Duplication

codeclimate bot commented Nov 14, 2023

Code Climate has analyzed commit 9f1e5c8 and detected 0 issues on this pull request.

@GitHK GitHK left a comment

👍

@sanderegg sanderegg merged commit 125ae0b into ITISFoundation:master Nov 14, 2023
56 checks passed
@sanderegg sanderegg deleted the bugfix/dynamic-autoscaling-does-not-scale branch November 14, 2023 14:54
@sanderegg sanderegg added the bug (buggy, it does not work as expected) label Nov 22, 2023
@matusdrobuliak66 matusdrobuliak66 mentioned this pull request Nov 23, 2023
29 tasks
Labels
a:autoscaling (autoscaling service in simcore's stack)
bug (buggy, it does not work as expected)
4 participants