Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

♻️/✨use placement constraints instead of generic resources ⚠️ #5255

Conversation

GitHK
Copy link
Contributor

@GitHK GitHK commented Jan 19, 2024

What do these changes do?

Enables the update of the docker engine.

The director and director-v2 can be instructed to no longer add generic resources, but
use placement constraints to start services on specific nodes.
NOTE: this is possible because we never take into account the quantity of a certain resource that needs to be consumed on the host machine, it's randomly set to a big number and we only use 1 unit of it.

The director and director-v2 have two new env vars which have possible substitutions for generic resources with constraints. If one of these items is found it will be replaced.

The env vars map the resource kind, ususally VRAM and AIRAM with a placement constraint.

DIRECTOR_GENERIC_RESOURCE_PLACEMENT_CONSTRAINTS_SUBSTITUTIONS = '{"VRAM": "node.labels.CUSTOM==CUSTOM_VALUE"}'
DIRECTOR_V2_GENERIC_RESOURCE_PLACEMENT_CONSTRAINTS_SUBSTITUTIONS= '{"VRAM": "node.labels.CUSTOM==CUSTOM_VALUE"}'

Related issue/s

How to test

Dev Checklist

DevOps Checklist

Copy link

codecov bot commented Jan 19, 2024

Codecov Report

Attention: 27 lines in your changes are missing coverage. Please review.

Comparison is base (1221670) 87.1% compared to head (b3cfd3a) 87.1%.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #5255     +/-   ##
========================================
- Coverage    87.1%   87.1%   -0.1%     
========================================
  Files        1256    1303     +47     
  Lines       52221   53356   +1135     
  Branches     1156    1169     +13     
========================================
+ Hits        45529   46501    +972     
- Misses       6443    6606    +163     
  Partials      249     249             
Flag Coverage Δ
integrationtests 63.6% <85.2%> (-1.4%) ⬇️
unittests 85.1% <61.0%> (+0.1%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...ckages/models-library/src/models_library/docker.py 90.1% <100.0%> (+0.5%) ⬆️
...ctor_v2/core/dynamic_services_settings/__init__.py 100.0% <100.0%> (ø)
...2/src/simcore_service_director_v2/core/settings.py 98.7% <100.0%> (-0.1%) ⬇️
...modules/dynamic_sidecar/scheduler/_core/_events.py 99.3% <100.0%> (+<0.1%) ⬆️
...rary/src/models_library/service_settings_labels.py 91.8% <75.0%> (-0.4%) ⬇️
...ector_v2/core/dynamic_services_settings/sidecar.py 95.9% <93.7%> (-0.8%) ⬇️
...es/dynamic_sidecar/docker_service_specs/sidecar.py 87.9% <85.7%> (-0.4%) ⬇️
...s/dynamic_sidecar/docker_service_specs/settings.py 81.0% <50.0%> (-0.9%) ⬇️
...es/director/src/simcore_service_director/config.py 95.2% <75.0%> (-4.8%) ⬇️
...rary/src/models_library/utils/common_validators.py 69.2% <20.0%> (-30.8%) ⬇️
... and 1 more

... and 59 files with indirect coverage changes

@GitHK GitHK self-assigned this Jan 19, 2024
@GitHK GitHK added a:director issue related with the director service a:director-v2 issue related with the director-v2 service labels Jan 19, 2024
@GitHK GitHK added this to the This is Sparta! milestone Jan 19, 2024
@GitHK GitHK changed the title ♻️/✨use placement constraints instead of generic resources ♻️/✨use placement constraints instead of generic resources ⚠️ Jan 22, 2024
@GitHK GitHK marked this pull request as ready for review January 22, 2024 12:43
Copy link
Member

@sanderegg sanderegg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what you did works but with this the whole generic resources gets blocked in case we re-use it at a latter time. Please check my proposal so that both might be used together without changing the code again. Also if suddenly we need one generic resource such as SSE or whatever this code will prevent it, while the proposal I did would not.

Copy link
Contributor

@YuryHrytsuk YuryHrytsuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are your ENV changes "safe" according to definition in the update env instruction?

Based on the answer, we should follow the steps described

Copy link
Member

@mrnicegyu11 mrnicegyu11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am very happy you enabled this @GitHK and unblocked the docker engine version upgrade with this.

But I regret that instead of getting rid of generic resources at the source (removing them where they are specified), we are introducing code that "circumvents" this by a replacement. Anyone who does not have the arcane knowledge of sparc y1-y6 will not understand why there is a weird replacement functionality introduced here I guess. If I was adding this code, I would post disclaimer-comments everywhere explaining that this logic is intentionally a "hack" somehow (if I understood it correctly), enabling a fast path beyond the generic-constraints and the associated bug in docker.

that being said I am happy that ops are now enabled to upgrade the docker engine version number :--)

.env-devel Outdated Show resolved Hide resolved
@GitHK GitHK requested review from pcrespov and sanderegg January 24, 2024 09:16
@GitHK
Copy link
Contributor Author

GitHK commented Jan 24, 2024

Are your ENV changes "safe" according to definition in the update env instruction?

Based on the answer, we should follow the steps described

I'd say yes, I've also replied in the associated PR.

@GitHK GitHK requested a review from YuryHrytsuk January 24, 2024 09:17
Copy link
Member

@sanderegg sanderegg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks a lot for the changes! it looks very clean. thanks!

Copy link

sonarcloud bot commented Jan 24, 2024

Quality Gate Passed Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@GitHK GitHK enabled auto-merge (squash) January 24, 2024 13:56
@GitHK GitHK merged commit f80b0c8 into ITISFoundation:master Jan 24, 2024
55 checks passed
@GitHK GitHK deleted the pr-osparc-generic-resoruces-via-placing-constraints branch January 24, 2024 14:54
@matusdrobuliak66 matusdrobuliak66 mentioned this pull request Feb 14, 2024
39 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a:director issue related with the director service a:director-v2 issue related with the director-v2 service
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dynamic services: Due to issues in docker >20.x VRAM generic resource may have to be used differently
6 participants