[App] Fix AutoScaler trying to replicate multiple works in a single machine #15991

Merged
akihironitta merged 8 commits into master from bugfix/autoscaler-cloudcompute
Dec 11, 2022

@akihironitta akihironitta commented Dec 9, 2022

What does this PR do?

See the title. This PR fixes the error below.

$ lightning run app examples/app_server_with_auto_scaler/app.py
Your Lightning App is starting. This won't take long.
ERROR: Found an exception when loading your application from examples/app_server_with_auto_scaler/app.py. Please, resolve it to run your app.

Traceback (most recent call last):
  File "examples/app_server_with_auto_scaler/app.py", line 75, in <module>
    MyAutoScaler(
  File "/home/aki/work/github.com/Lightning-AI/lightning/src/lightning/app/components/auto_scaler.py", line 444, in __init__
    self.add_work(work)
  File "/home/aki/work/github.com/Lightning-AI/lightning/src/lightning/app/components/auto_scaler.py", line 464, in add_work
    setattr(self, work_attribute, work)
  File "/home/aki/work/github.com/Lightning-AI/lightning/src/lightning/app/core/flow.py", line 171, in __setattr__
    value._register_cloud_compute()
  File "/home/aki/work/github.com/Lightning-AI/lightning/src/lightning/app/core/work.py", line 617, in _register_cloud_compute
    _CLOUD_COMPUTE_STORE[internal_id].add_component_name(self.name)
  File "/home/aki/work/github.com/Lightning-AI/lightning/src/lightning/app/utilities/packaging/cloud_compute.py", line 31, in add_component_name
    raise Exception(
Exception: A Cloud Compute can be assigned only to a single Work. Attached to root.worker_0_a7d14e1db8fc4e0380de041c7b5016ab
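
To illustrate the failure mode in the traceback above, here is a minimal sketch of a compute object that refuses to be attached to a second work. `ComputeSketch` and `attach` are hypothetical stand-ins written for this illustration, not Lightning's classes, and the real check in `cloud_compute.py` may differ in its details.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ComputeSketch:
    """Hypothetical stand-in for a compute config; not Lightning's CloudCompute."""

    name: str = "default"
    owner: Optional[str] = None  # name of the single work allowed to use this compute

    def attach(self, work_name: str) -> None:
        # Re-attaching the same work is fine; a different work is rejected.
        if self.owner is not None and self.owner != work_name:
            raise RuntimeError(
                f"A compute can be assigned only to a single work. Attached to {self.owner}"
            )
        self.owner = work_name


shared = ComputeSketch()
shared.attach("root.worker_0")
try:
    # Sharing one compute object between replicas triggers the error.
    shared.attach("root.worker_1")
except RuntimeError as err:
    print(err)
```

Under this assumption, giving every replica its own compute object avoids the error.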

Does your PR introduce any breaking changes? If yes, please list them.

None

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or minor internal changes/refactors)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

cc @Borda

@akihironitta akihironitta added the bug (Something isn't working) and components labels Dec 9, 2022
@akihironitta akihironitta added this to the v1.8.x milestone Dec 9, 2022
@akihironitta akihironitta requested a review from tchaton as a code owner December 9, 2022 16:19
@akihironitta akihironitta self-assigned this Dec 9, 2022
@github-actions github-actions bot added the app (removed) label (Generic label for Lightning App package) Dec 9, 2022
github-actions bot commented Dec 9, 2022

⚡ Required checks status: All passing 🟢

Groups summary

🟢 lightning_app: Tests workflow
| Check ID | Status |
| --- | --- |
| app-pytest (macOS-11, app, 3.8, latest) | success |
| app-pytest (macOS-11, app, 3.8, oldest) | success |
| app-pytest (macOS-11, lightning, 3.9, latest) | success |
| app-pytest (ubuntu-20.04, app, 3.8, latest) | success |
| app-pytest (ubuntu-20.04, app, 3.8, oldest) | success |
| app-pytest (ubuntu-20.04, lightning, 3.9, latest) | success |
| app-pytest (windows-2022, app, 3.8, latest) | success |
| app-pytest (windows-2022, app, 3.8, oldest) | success |
| app-pytest (windows-2022, lightning, 3.8, latest) | success |

These checks are required after the changes to src/lightning_app/components/auto_scaler.py, tests/tests_app/components/test_auto_scaler.py.

🟢 lightning_app: Examples
| Check ID | Status |
| --- | --- |
| app-examples (macOS-11, app, 3.9, latest) | success |
| app-examples (macOS-11, app, 3.9, oldest) | success |
| app-examples (macOS-11, lightning, 3.9, latest) | success |
| app-examples (ubuntu-20.04, app, 3.9, latest) | success |
| app-examples (ubuntu-20.04, app, 3.9, oldest) | success |
| app-examples (ubuntu-20.04, lightning, 3.9, latest) | success |
| app-examples (windows-2022, app, 3.9, latest) | success |
| app-examples (windows-2022, app, 3.9, oldest) | success |
| app-examples (windows-2022, lightning, 3.9, latest) | success |

These checks are required after the changes to src/lightning_app/components/auto_scaler.py.

🟢 lightning_app: Azure
| Check ID | Status |
| --- | --- |
| App.cloud-e2e | success |

These checks are required after the changes to src/lightning_app/components/auto_scaler.py.

🟢 lightning_app: Docs
| Check ID | Status |
| --- | --- |
| make-doctest (app) | success |
| make-html (app) | success |

These checks are required after the changes to src/lightning_app/components/auto_scaler.py.

🟢 mypy
| Check ID | Status |
| --- | --- |
| mypy | success |

These checks are required after the changes to src/lightning_app/components/auto_scaler.py.

🟢 install
| Check ID | Status |
| --- | --- |
| install-pkg (ubuntu-22.04, app, 3.7) | success |
| install-pkg (ubuntu-22.04, app, 3.10) | success |
| install-pkg (ubuntu-22.04, lite, 3.7) | success |
| install-pkg (ubuntu-22.04, lite, 3.10) | success |
| install-pkg (ubuntu-22.04, pytorch, 3.7) | success |
| install-pkg (ubuntu-22.04, pytorch, 3.10) | success |
| install-pkg (ubuntu-22.04, lightning, 3.7) | success |
| install-pkg (ubuntu-22.04, lightning, 3.10) | success |
| install-pkg (macOS-12, app, 3.7) | success |
| install-pkg (macOS-12, app, 3.10) | success |
| install-pkg (macOS-12, lite, 3.7) | success |
| install-pkg (macOS-12, lite, 3.10) | success |
| install-pkg (macOS-12, pytorch, 3.7) | success |
| install-pkg (macOS-12, pytorch, 3.10) | success |
| install-pkg (macOS-12, lightning, 3.7) | success |
| install-pkg (macOS-12, lightning, 3.10) | success |
| install-pkg (windows-2022, app, 3.7) | success |
| install-pkg (windows-2022, app, 3.10) | success |
| install-pkg (windows-2022, lite, 3.7) | success |
| install-pkg (windows-2022, lite, 3.10) | success |
| install-pkg (windows-2022, pytorch, 3.7) | success |
| install-pkg (windows-2022, pytorch, 3.10) | success |
| install-pkg (windows-2022, lightning, 3.7) | success |
| install-pkg (windows-2022, lightning, 3.10) | success |

These checks are required after the changes to src/lightning_app/components/auto_scaler.py.


Thank you for your contribution! 💜

Note
This comment is automatically generated and refreshes every 180 seconds for 60 minutes. If you have any other questions, contact carmocca for help.

@akihironitta akihironitta changed the title from "[App] Fixes AutoScaler trying to replicate multiple works in a running machine" to "[App] Fixes AutoScaler trying to replicate multiple works in a single machine" Dec 9, 2022
@akihironitta akihironitta changed the title from "[App] Fixes AutoScaler trying to replicate multiple works in a single machine" to "[App] Fix AutoScaler trying to replicate multiple works in a single machine" Dec 9, 2022

@awaelchli awaelchli left a comment

You could add a simple test that asserts the contents of self._work_kwargs by calling create_work.
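
A test along the lines suggested here could look roughly like the sketch below. It is self-contained and uses stand-in classes (`ComputeConfig`, `WorkSketch`, `ScalerSketch`), all of which are hypothetical; only the names `_work_kwargs` and `create_work` come from this thread. It assumes the fix gives each new work a cloned compute object, which may not match the merged code exactly.

```python
import copy
from dataclasses import dataclass


@dataclass
class ComputeConfig:
    """Hypothetical stand-in for a compute config with a clone() helper."""

    name: str = "gpu"

    def clone(self) -> "ComputeConfig":
        return copy.copy(self)


class WorkSketch:
    """Hypothetical stand-in for a work that accepts a compute config."""

    def __init__(self, cloud_compute=None):
        self.cloud_compute = cloud_compute


class ScalerSketch:
    """Hypothetical stand-in that stores the kwargs used to build each replica."""

    def __init__(self, work_cls, *work_args, **work_kwargs):
        self._work_cls = work_cls
        self._work_args = work_args
        self._work_kwargs = work_kwargs

    def create_work(self):
        # Assumption: each replica receives its own copy of the compute config.
        compute = self._work_kwargs.get("cloud_compute")
        self._work_kwargs["cloud_compute"] = compute.clone() if compute else None
        return self._work_cls(*self._work_args, **self._work_kwargs)


def test_create_work_clones_compute():
    compute = ComputeConfig("gpu")
    scaler = ScalerSketch(WorkSketch, cloud_compute=compute)
    work = scaler.create_work()
    # The stored kwargs and the new work must not reuse the original object.
    assert scaler._work_kwargs["cloud_compute"] is not compute
    assert work.cloud_compute is not compute
    # The clone still carries the same settings.
    assert work.cloud_compute == compute


test_create_work_clones_compute()
```

Asserting on identity rather than equality is the point of the check: the clone should carry the same settings but be a distinct object.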

@mergify mergify bot added the ready (PRs ready to be merged) label Dec 10, 2022

@lantiga lantiga left a comment

Approved pending Adrian's suggestion on the test to add

@akihironitta akihironitta enabled auto-merge (squash) December 11, 2022 00:28
@akihironitta akihironitta force-pushed the bugfix/autoscaler-cloudcompute branch from b1f8157 to 29f5448 Compare December 11, 2022 00:37
@akihironitta akihironitta merged commit c1d0156 into master Dec 11, 2022
@akihironitta akihironitta deleted the bugfix/autoscaler-cloudcompute branch December 11, 2022 00:56
Borda pushed a commit that referenced this pull request Dec 14, 2022
… machine (#15991)

* dont try to replicate new works in the existing machine

* update chglog

* Update comment

* Update src/lightning_app/components/auto_scaler.py

* add test

(cherry picked from commit c1d0156)
lantiga added a commit that referenced this pull request Dec 15, 2022
* update chlog

* CI: Add remote fetch (#16001)

Co-authored-by: thomas <[email protected]>
(cherry picked from commit 37fe3f6)

* Set the logger explicitly in tests (#15815)

(cherry picked from commit 9ed43c6)

* [App] Fix `AutoScaler` trying to replicate multiple works in a single machine (#15991)

* dont try to replicate new works in the existing machine

* update chglog

* Update comment

* Update src/lightning_app/components/auto_scaler.py

* add test

(cherry picked from commit c1d0156)

* Fix typo in PR titles generated by github-actions bot (#16003)

(cherry picked from commit 2dcebc2)

* Update docker requirement from <=5.0.3,>=5.0.0 to >=5.0.0,<6.0.2 in /requirements (#16007)

Update docker requirement in /requirements

Updates the requirements on [docker](https://github.com/docker/docker-py) to permit the latest version.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](docker/docker-py@5.0.0...6.0.1)

---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

(cherry picked from commit 4083b20)

* Update deepdiff requirement from <=5.8.1,>=5.7.0 to >=5.7.0,<6.2.3 in /requirements (#16006)

Update deepdiff requirement in /requirements

Updates the requirements on [deepdiff](https://github.com/seperman/deepdiff) to permit the latest version.
- [Release notes](https://github.com/seperman/deepdiff/releases)
- [Changelog](https://github.com/seperman/deepdiff/blob/master/docs/changelog.rst)
- [Commits](seperman/deepdiff@5.7.0...6.2.2)

---
updated-dependencies:
- dependency-name: deepdiff
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 5e705fa)

* app: update doctest_skip (#15997)

simple

Co-authored-by: hhsecond <[email protected]>
(cherry picked from commit 4fea6bf)

* CI: clean install & share pkg build (#15986)

* abstract pkg build
* share ci
* syntax
* Checkgroup
* folders
* whl 1st
* doctest

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <[email protected]>

(cherry picked from commit 18a4638)

* Adding hint to the logger's error messages (#16034)

Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: Carlos Mocholí <[email protected]>
Fixes #15143

(cherry picked from commit 7ce3825)

* fix publish

* Introduce `{Work,Flow}.lightningignore` (#15818)

(cherry picked from commit edd2b42)

* [App] Support running on multiple clusters (#16016)

(cherry picked from commit d3a7226)

* [App] Improve lightning connect experience (#16035)

(cherry picked from commit e522a12)

* Cleanup cluster waiting (#16054)

(cherry picked from commit 6458a5a)

* feature(cli): login flow fixes and improvements (#16052)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
(cherry picked from commit ebe7848)

* Add guards to cluster deletion from cli (#16053)

Adds guards to cluster deletion.
- If cluster has running apps -> throw an error
- If cluster has stopped apps -> confirm w/ user that apps and logs will be deleted

(cherry picked from commit 64d0ebb)

* Load app before setting LIGHTNING_DISPATCHED (#16057)

(cherry picked from commit 8d3339a)

* [App] Hot fix: Resolve detection of python debugger (#16068)

Co-authored-by: thomas <[email protected]>
Co-authored-by: Carlos Mocholí <[email protected]>
(cherry picked from commit eae56ee)

* fix(cloud): detect and ignore venv (#16056)

Co-authored-by: Ethan Harris <[email protected]>
(cherry picked from commit 3b323c8)

* version 1.8.5

* update chlog

Co-authored-by: thomas chaton <[email protected]>
Co-authored-by: Carlos Mocholí <[email protected]>
Co-authored-by: Akihiro Nitta <[email protected]>
Co-authored-by: Adrian Wälchli <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Huy Đỗ <[email protected]>
Co-authored-by: Ethan Harris <[email protected]>
Co-authored-by: Luca Furst <[email protected]>
Co-authored-by: Yurij Mikhalevich <[email protected]>
Co-authored-by: Luca Antiga <[email protected]>
Labels
app (removed): Generic label for Lightning App package
bug: Something isn't working
ready: PRs ready to be merged
4 participants