This repository has been archived by the owner on Aug 2, 2023. It is now read-only.

fix: Reduce possibility of stuck in PREPARING under high-load conditions #522

Merged: achimnol merged 7 commits into main from fix/preparing on Jan 22, 2022

achimnol
Member

  • fix: Apply nested transaction to check_scaling_group predicate
  • fix: minor code clean up
  • refactor: Remove legacy stat-sync codes and apply batching
  • fix: Improve synchronization of kernel creation postprocessing
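
The first item refers to a nested transaction, which in SQL terms is usually a SAVEPOINT inside an outer transaction. The PR's actual diff is not reproduced here, so the following is only a minimal sketch of the general technique, using the standard-library `sqlite3` module. The `sessions` table, the `run_predicate_in_savepoint` helper, and the failing predicate are all hypothetical names chosen for illustration and are not taken from the Backend.AI code.

```python
import sqlite3


def run_predicate_in_savepoint(conn, predicate):
    """Run predicate(conn) inside a SAVEPOINT (hypothetical helper).

    If the predicate raises, only the work done since the savepoint is
    rolled back; the enclosing transaction stays usable.
    """
    conn.execute("SAVEPOINT predicate_check")
    try:
        result = predicate(conn)
    except Exception:
        # Undo only the predicate's writes, then discard the savepoint.
        conn.execute("ROLLBACK TO SAVEPOINT predicate_check")
        conn.execute("RELEASE SAVEPOINT predicate_check")
        return False
    conn.execute("RELEASE SAVEPOINT predicate_check")
    return result


def failing_predicate(conn):
    # Hypothetical predicate that writes a row and then fails.
    conn.execute("INSERT INTO sessions (name, status) VALUES ('tmp', 'PENDING')")
    raise RuntimeError("simulated predicate failure")


def demo():
    # isolation_level=None disables implicit BEGINs so we control them.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE sessions (name TEXT, status TEXT)")
    conn.execute("BEGIN")
    conn.execute("INSERT INTO sessions (name, status) VALUES ('s1', 'PENDING')")
    ok = run_predicate_in_savepoint(conn, failing_predicate)
    # The outer transaction is still open and can continue after the failure.
    conn.execute("UPDATE sessions SET status = 'PREPARING' WHERE name = 's1'")
    conn.execute("COMMIT")
    rows = conn.execute("SELECT name, status FROM sessions ORDER BY name").fetchall()
    conn.close()
    return ok, rows
```

Calling `demo()` shows that the row written by the failing predicate is discarded while the outer transaction's own insert and update are still committed.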

@achimnol achimnol added this to the 21.03 milestone Jan 21, 2022
@achimnol achimnol added the bug label Jan 21, 2022
@achimnol achimnol self-assigned this Jan 21, 2022

codecov bot commented Jan 21, 2022

Codecov Report

Merging #522 (8e3689a) into main (9aa7616) will increase coverage by 0.03%.
The diff coverage is 4.16%.

@@            Coverage Diff             @@
##             main     #522      +/-   ##
==========================================
+ Coverage   48.76%   48.80%   +0.03%     
==========================================
  Files          54       54              
  Lines        9010     9001       -9     
==========================================
- Hits         4394     4393       -1     
+ Misses       4616     4608       -8     

| Impacted Files | Coverage Δ |
|---|---|
| src/ai/backend/manager/registry.py | 16.99% <0.00%> (+0.06%) ⬆️ |
| src/ai/backend/manager/scheduler/predicates.py | 28.40% <33.33%> (-0.33%) ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@achimnol achimnol merged commit 468ddbf into main Jan 22, 2022
@achimnol achimnol deleted the fix/preparing branch January 22, 2022 06:06
achimnol added a commit that referenced this pull request Jan 22, 2022
…ons (#522)

* fix: Apply nested transaction to `check_scaling_group` predicate
* refactor: Remove legacy stat-sync codes and apply batching
* fix: Improve synchronization of kernel creation postprocessing

Backported-From: main (22.03)
Backported-To: 21.09
achimnol added a commit that referenced this pull request Jan 22, 2022
…ons (#522)

* fix: Apply nested transaction to `check_scaling_group` predicate
* refactor: Remove legacy stat-sync codes and apply batching
* fix: Improve synchronization of kernel creation postprocessing

Backported-From: main (22.03)
Backported-To: 21.03