Sync debug branch with master changes (#16115)
* Remove the deprecated profiler imports (#16059)

* Revert "Load app before setting LIGHTNING_DISPATCHED" (#16064)

Revert "Load app before setting LIGHTNING_DISPATCHED (#16057)"

This reverts commit 8d3339a.

* [App] Hot fix: Resolve detection of python debugger (#16068)

Co-authored-by: thomas <[email protected]>
Co-authored-by: Carlos Mocholí <[email protected]>

* Load the app before setting `LIGHTNING_DISPATCHED` (#16071)

* fix(cloud): detect and ignore venv (#16056)

Co-authored-by: Ethan Harris <[email protected]>

* Add function to remove checkpoint to allow override for extended classes (#16067)

* Drop FairScale sharded parity tests (#16069)

* minor fix: indent spaces in comment-out (#16076)

* ci: print existing candidates (#16077)

* [App] Fix bug where previously deleted apps cannot be re-run from the CLI (#16082)

* Better check for programmatic lightningignore (#16080)

Co-authored-by: Jirka Borovec <[email protected]>

* [App] Removing single quote (#16079)

* [App] PoC: Add support for Request (#16047)

* Have checkgroup pull the latest runs (#16033)

* Update Multinode Warning (#16091)

* [App] Serve datatypes with better client code (#16018)

* docs: add PT version (#16010)

* docs: add PT version

* stable

Co-authored-by: Adrian Wälchli <[email protected]>

Co-authored-by: Adrian Wälchli <[email protected]>

* add 1.13.1 to adjust versions (#16099)

* Remove redundant `find_unused_parameters=False` in Lite (#16026)

* [App] Add display name property to the work (#16095)

Co-authored-by: thomas <[email protected]>

* Fix detection of whether app is running in cloud (#16045)

* [App] Add work.delete (#16103)

Co-authored-by: thomas <[email protected]>

* [App] Improve the autoscaler UI (#16063)

[App] Improve the autoscaler UI (#16063)

* Re-enable Lite CLI on Windows + PyTorch 1.13 (#15645)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justus Schock <[email protected]>

* [App] Min replica=0 would break autoscaler component (#16092)

* fixing the bug where num_replica=0 would fail

* changelog

* [App] Scale out/in interval for autoscaler (#16093)

* Adding arguments for scale out/in interval

* Tests

* Set the default work start method to spawn on MacOS (#16089)

* [App] Add status endpoint, enable `ready` (#16075)

Co-authored-by: thomas chaton <[email protected]>

* Clarify `work.stop()` limitation (#16073)

* fix merge errors

* Update torchvision requirement from <=0.14.0,>=0.11.1 to >=0.11.1,<0.15.0 in /requirements (#16108)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jirka <[email protected]>

* CI: settle file names (#16098)

* CI: settle file names

* rename

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix test failing on master due to bad auto-merge (#16118)

* fix merge error

Co-authored-by: Carlos Mocholí <[email protected]>
Co-authored-by: Akihiro Nitta <[email protected]>
Co-authored-by: thomas chaton <[email protected]>
Co-authored-by: thomas <[email protected]>
Co-authored-by: Yurij Mikhalevich <[email protected]>
Co-authored-by: Ethan Harris <[email protected]>
Co-authored-by: Sean Naren <[email protected]>
Co-authored-by: Qiushi Pan <[email protected]>
Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: Sherin Thomas <[email protected]>
Co-authored-by: Justus Schock <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jirka <[email protected]>
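
The torchvision bump listed above changes the upper bound from `<=0.14.0` to `<0.15.0`, which newly admits 0.14.x patch releases. A minimal sketch of the difference, assuming plain `X.Y.Z` version strings; the helpers `parse`, `allowed_old` and `allowed_new` are hypothetical and use simple tuple comparison rather than full PEP 440 semantics:

```python
def parse(version):
    """Turn 'X.Y.Z' into a tuple of ints (no pre-release or local tags)."""
    return tuple(int(part) for part in version.split("."))


def allowed_old(version):
    """Old specifier: <=0.14.0,>=0.11.1"""
    return parse("0.11.1") <= parse(version) <= parse("0.14.0")


def allowed_new(version):
    """New specifier: >=0.11.1,<0.15.0"""
    return parse("0.11.1") <= parse(version) < parse("0.15.0")


for candidate in ("0.11.1", "0.14.0", "0.14.1", "0.15.0"):
    print(candidate, allowed_old(candidate), allowed_new(candidate))
```

Only versions strictly between 0.14.0 and 0.15.0 change status: the old specifier rejects them and the new one accepts them.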
15 people authored Dec 20, 2022
1 parent 0fc7b82 commit 9f05f49
Showing 101 changed files with 1,453 additions and 999 deletions.
8 changes: 4 additions & 4 deletions .github/checkgroup.yml
@@ -6,7 +6,7 @@ subprojects:
- id: "pytorch_lightning: Tests workflow"
paths:
- ".actions/**"
- ".github/workflows/ci-pytorch-tests.yml"
- ".github/workflows/ci-tests-pytorch.yml"
- "requirements/fabric/**"
- "src/lightning_fabric/**"
- "requirements/pytorch/**"
@@ -178,7 +178,7 @@ subprojects:
- "src/lightning_fabric/**"
- "tests/tests_fabric/**"
- "setup.cfg" # includes pytest config
- ".github/workflows/ci-fabric-tests.yml"
- ".github/workflows/ci-tests-fabric.yml"
- "!requirements/*/docs.txt"
- "!*.md"
- "!**/*.md"
@@ -223,7 +223,7 @@ subprojects:
- id: "lightning_app: Tests workflow"
paths:
- ".actions/**"
- ".github/workflows/ci-app-tests.yml"
- ".github/workflows/ci-tests-app.yml"
- "src/lightning_app/**"
- "tests/tests_app/**"
- "requirements/app/**"
@@ -245,7 +245,7 @@ subprojects:
- id: "lightning_app: Examples"
paths:
- ".actions/**"
- ".github/workflows/ci-app-examples.yml"
- ".github/workflows/ci-examples-app.yml"
- "src/lightning_app/**"
- "tests/tests_examples_app/**"
- "examples/app_*/**"
8 changes: 4 additions & 4 deletions .github/workflows/README.md
@@ -4,10 +4,10 @@

## Unit and Integration Testing

| workflow name | workflow file | action | accelerator\* |
| -------------------------- | ------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- |
| Test PyTorch full | .github/workflows/ci-pytorch-tests.yml | Run all tests except for accelerator-specific, standalone and slow tests. | CPU |
| Test PyTorch slow | .github/workflows/ci-pytorch-tests-slow.yml | Run only slow tests. Slow tests usually need to spawn threads and cannot be speed up or simplified. | CPU |
| workflow name | workflow file | action | accelerator\* |
| ----------------- | -------------------------------------- | ------------------------------------------------------------------------- | ------------- |
| Test PyTorch full | .github/workflows/ci-tests-pytorch.yml | Run all tests except for accelerator-specific, standalone and slow tests. | CPU |

| pytorch-lightning (IPUs) | .azure-pipelines/ipu-tests.yml | Run only IPU-specific tests. | IPU |
| pytorch-lightning (HPUs) | .azure-pipelines/hpu-tests.yml | Run only HPU-specific tests. | HPU |
| pytorch-lightning (GPUs) | .azure-pipelines/gpu-tests-pytorch.yml | Run all CPU and GPU-specific tests, standalone, and examples. Each standalone test needs to be run in separate processes to avoid unwanted interactions between test cases. | GPU |
File renamed without changes.
@@ -9,7 +9,7 @@ on:
types: [opened, reopened, ready_for_review, synchronize] # added `ready_for_review` since draft is skipped
paths:
- ".actions/**"
- ".github/workflows/ci-app-examples.yml"
- ".github/workflows/ci-examples-app.yml"
- "src/lightning_app/**"
- "tests/tests_examples_app/**"
- "examples/app_*/**"
@@ -89,7 +89,8 @@ jobs:
- name: Install Lightning package
env:
PACKAGE_NAME: ${{ matrix.pkg-name }}
run: pip install -e .
# do not use -e because it will make both packages available since it adds `src` to `sys.path` automatically
run: pip install .

- name: Adjust tests
if: ${{ matrix.pkg-name == 'lightning' }}
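
The comment in the hunk above notes that an editable install adds `src` to `sys.path`, so every package under `src` becomes importable regardless of `PACKAGE_NAME`. A minimal sketch of that effect, assuming a `src` directory that holds both `lightning` and `lightning_app`; `importable_from` is a hypothetical helper, not part of the repository:

```python
from importlib.machinery import PathFinder


def importable_from(path_entry, names):
    """Return the names that Python could import if `path_entry` were on sys.path."""
    return [
        name
        for name in names
        if PathFinder.find_spec(name, [path_entry]) is not None
    ]
```

With a regular `pip install .`, only the built package lands in `site-packages`, so a check like this against `site-packages` would report just the package that was actually installed.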
@@ -9,7 +9,7 @@ on:
types: [opened, reopened, ready_for_review, synchronize] # added `ready_for_review` since draft is skipped
paths:
- ".actions/**"
- ".github/workflows/ci-app-tests.yml"
- ".github/workflows/ci-tests-app.yml"
- "src/lightning_app/**"
- "tests/tests_app/**"
- "requirements/app/**"
@@ -1,4 +1,4 @@
name: Test Lite
name: Test Fabric

# see: https://help.github.com/en/actions/reference/events-that-trigger-workflows
on:
@@ -9,11 +9,11 @@ on:
types: [opened, reopened, ready_for_review, synchronize] # added `ready_for_review` since draft is skipped
paths:
- ".actions/**"
- "requirements/lite/**"
- "src/lightning_lite/**"
- "tests/tests_lite/**"
- "requirements/fabric/**"
- "src/lightning_fabric/**"
- "tests/tests_fabric/**"
- "setup.cfg" # includes pytest config
- ".github/workflows/ci-lite-tests.yml"
- ".github/workflows/ci-tests-fabric.yml"
- "!requirements/*/docs.txt"
- "!*.md"
- "!**/*.md"
@@ -30,7 +30,7 @@ defaults:
shell: bash

jobs:
lite-cpu:
fabric-cpu:
runs-on: ${{ matrix.os }}
if: github.event.pull_request.draft == false
strategy:
@@ -39,21 +39,21 @@ jobs:
include:
# assign python and pytorch version combinations to operating systems (arbitrarily)
# note: there's no distribution of torch==1.10 for Python>=3.10
- {os: "macOS-11", pkg-name: "lite", python-version: "3.8", pytorch-version: "1.11"}
- {os: "macOS-11", pkg-name: "lite", python-version: "3.9", pytorch-version: "1.12"}
- {os: "ubuntu-20.04", pkg-name: "lite", python-version: "3.8", pytorch-version: "1.10"}
- {os: "ubuntu-20.04", pkg-name: "lite", python-version: "3.9", pytorch-version: "1.11"}
- {os: "ubuntu-20.04", pkg-name: "lite", python-version: "3.10", pytorch-version: "1.12"}
- {os: "windows-2022", pkg-name: "lite", python-version: "3.9", pytorch-version: "1.11"}
- {os: "windows-2022", pkg-name: "lite", python-version: "3.10", pytorch-version: "1.12"}
- {os: "macOS-11", pkg-name: "fabric", python-version: "3.8", pytorch-version: "1.11"}
- {os: "macOS-11", pkg-name: "fabric", python-version: "3.9", pytorch-version: "1.12"}
- {os: "ubuntu-20.04", pkg-name: "fabric", python-version: "3.8", pytorch-version: "1.10"}
- {os: "ubuntu-20.04", pkg-name: "fabric", python-version: "3.9", pytorch-version: "1.11"}
- {os: "ubuntu-20.04", pkg-name: "fabric", python-version: "3.10", pytorch-version: "1.12"}
- {os: "windows-2022", pkg-name: "fabric", python-version: "3.9", pytorch-version: "1.11"}
- {os: "windows-2022", pkg-name: "fabric", python-version: "3.10", pytorch-version: "1.12"}
# only run PyTorch latest with Python latest
- {os: "macOS-11", pkg-name: "lite", python-version: "3.10", pytorch-version: "1.13"}
- {os: "ubuntu-20.04", pkg-name: "lite", python-version: "3.10", pytorch-version: "1.13"}
- {os: "windows-2022", pkg-name: "lite", python-version: "3.10", pytorch-version: "1.13"}
- {os: "macOS-11", pkg-name: "fabric", python-version: "3.10", pytorch-version: "1.13"}
- {os: "ubuntu-20.04", pkg-name: "fabric", python-version: "3.10", pytorch-version: "1.13"}
- {os: "windows-2022", pkg-name: "fabric", python-version: "3.10", pytorch-version: "1.13"}
# "oldest" versions tests, only on minimum Python
- {os: "macOS-11", pkg-name: "lite", python-version: "3.7", pytorch-version: "1.10", requires: "oldest"}
- {os: "ubuntu-20.04", pkg-name: "lite", python-version: "3.7", pytorch-version: "1.10", requires: "oldest"}
- {os: "windows-2022", pkg-name: "lite", python-version: "3.7", pytorch-version: "1.10", requires: "oldest"}
- {os: "macOS-11", pkg-name: "fabric", python-version: "3.7", pytorch-version: "1.10", requires: "oldest"}
- {os: "ubuntu-20.04", pkg-name: "fabric", python-version: "3.7", pytorch-version: "1.10", requires: "oldest"}
- {os: "windows-2022", pkg-name: "fabric", python-version: "3.7", pytorch-version: "1.10", requires: "oldest"}
# "lightning" installs the monolithic package
- {os: "macOS-11", pkg-name: "lightning", python-version: "3.8", pytorch-version: "1.13"}
- {os: "ubuntu-20.04", pkg-name: "lightning", python-version: "3.8", pytorch-version: "1.13"}
@@ -87,8 +87,8 @@ jobs:
- name: Adjust PyTorch versions in requirements files
if: ${{ matrix.requires != 'oldest' }}
run: |
python ./requirements/pytorch/adjust-versions.py requirements/lite/base.txt ${{ matrix.pytorch-version }}
cat requirements/lite/base.txt
python ./requirements/pytorch/adjust-versions.py requirements/fabric/base.txt ${{ matrix.pytorch-version }}
cat requirements/fabric/base.txt
- name: Get pip cache dir
id: pip-cache
@@ -98,7 +98,7 @@ jobs:
uses: actions/cache@v3
with:
path: ${{ steps.pip-cache.outputs.dir }}
key: ${{ runner.os }}-pip-py${{ matrix.python-version }}-${{ matrix.pkg-name }}-${{ matrix.release }}-${{ matrix.requires }}-${{ hashFiles('requirements/lite/*.txt') }}
key: ${{ runner.os }}-pip-py${{ matrix.python-version }}-${{ matrix.pkg-name }}-${{ matrix.release }}-${{ matrix.requires }}-${{ hashFiles('requirements/fabric/*.txt') }}
restore-keys: |
${{ runner.os }}-pip-py${{ matrix.python-version }}-${{ matrix.pkg-name }}-${{ matrix.release }}-${{ matrix.requires }}-
@@ -109,27 +109,27 @@ jobs:
env:
PACKAGE_NAME: ${{ matrix.pkg-name }}
run: |
pip install -e . "pytest-timeout" -r requirements/lite/devel.txt --upgrade --find-links ${TORCH_URL}
pip install -e . "pytest-timeout" -r requirements/fabric/devel.txt --upgrade --find-links ${TORCH_URL}
pip list
- name: Adjust tests
if: ${{ matrix.pkg-name == 'lightning' }}
run: |
python .actions/assistant.py copy_replace_imports --source_dir="./tests" \
--source_import="lightning_lite" --target_import="lightning.lite"
--source_import="lightning_fabric" --target_import="lightning.fabric"
- name: Testing Warnings
# the stacklevel can only be set on >=3.7
if: matrix.python-version != '3.7'
working-directory: tests/tests_lite
working-directory: tests/tests_fabric
# needs to run outside of `pytest`
run: python utilities/test_warnings.py

- name: Switch coverage scope
run: python -c "print('COVERAGE_SCOPE=' + str('lightning' if '${{matrix.pkg-name}}' == 'lightning' else 'lightning_lite'))" >> $GITHUB_ENV
run: python -c "print('COVERAGE_SCOPE=' + str('lightning' if '${{matrix.pkg-name}}' == 'lightning' else 'lightning_fabric'))" >> $GITHUB_ENV

- name: Testing Lite
working-directory: tests/tests_lite
- name: Testing Fabric
working-directory: tests/tests_fabric
# NOTE: do not include coverage report here, see: https://github.com/nedbat/coveragepy/issues/1003
run: coverage run --source ${COVERAGE_SCOPE} -m pytest -v --timeout=30 --durations=50 --junitxml=results-${{ runner.os }}-py${{ matrix.python-version }}-${{ matrix.requires }}-${{ matrix.release }}.xml
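
The `Switch coverage scope` step in the hunk above is a one-line `python -c` expression whose output is appended to `$GITHUB_ENV`. Unrolled into a function it reads as follows; this is a sketch, and the names `coverage_scope` and `github_env_line` are hypothetical:

```python
def coverage_scope(pkg_name, standalone_source="lightning_fabric"):
    """Pick the package that `coverage run --source` should measure."""
    # The monolithic package is measured as `lightning`; any other matrix
    # entry falls back to the standalone source package.
    return "lightning" if pkg_name == "lightning" else standalone_source


def github_env_line(pkg_name):
    """Build the KEY=value line that the workflow appends to $GITHUB_ENV."""
    return "COVERAGE_SCOPE=" + coverage_scope(pkg_name)


print(github_env_line("fabric"))
print(github_env_line("lightning"))
```

Later steps then read `${COVERAGE_SCOPE}` both for `coverage run --source` and for the Codecov flags.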

@@ -138,11 +138,11 @@ jobs:
uses: actions/upload-artifact@v3
with:
name: unittest-results-${{ runner.os }}-py${{ matrix.python-version }}-${{ matrix.requires }}-${{ matrix.release }}
path: tests/tests_lite/results-${{ runner.os }}-py${{ matrix.python-version }}-${{ matrix.requires }}-${{ matrix.release }}.xml
path: tests/tests_fabric/results-${{ runner.os }}-py${{ matrix.python-version }}-${{ matrix.requires }}-${{ matrix.release }}.xml

- name: Statistics
if: success()
working-directory: tests/tests_lite
working-directory: tests/tests_fabric
run: |
coverage report
coverage xml
@@ -153,7 +153,7 @@ jobs:
continue-on-error: true
with:
token: ${{ secrets.CODECOV_TOKEN }}
file: tests/tests_lite/coverage.xml
file: tests/tests_fabric/coverage.xml
flags: ${COVERAGE_SCOPE},cpu,pytest,python${{ matrix.python-version }}
name: CPU-coverage
fail_ci_if_error: false
@@ -3,9 +3,9 @@ name: Test PyTorch
# see: https://help.github.com/en/actions/reference/events-that-trigger-workflows
on:
push:
branches: [master, "release/*", "lite/debug"]
branches: [master, "release/*"]
pull_request:
branches: [master, "release/*", "lite/debug"]
branches: [master, "release/*"]
types: [opened, reopened, ready_for_review, synchronize] # added `ready_for_review` since draft is skipped
paths:
- ".actions/**"
@@ -14,9 +14,9 @@ on:
- "tests/tests_pytorch/**"
- "tests/legacy/back-compatible-versions.txt"
- "setup.cfg" # includes pytest config
- ".github/workflows/ci-pytorch-tests.yml"
- "requirements/fabric/**"
- "src/lightning_fabric/**"
- ".github/workflows/ci-tests-pytorch.yml"
- "requirements/lite/**"
- "src/lightning_lite/**"
- "!requirements/pytorch/docs.txt"
- "!*.md"
- "!**/*.md"
@@ -104,7 +104,7 @@ jobs:
- name: Adjust PyTorch versions in requirements files
if: ${{ matrix.requires != 'oldest' }}
run: |
python ./requirements/pytorch/adjust-versions.py requirements/fabric/base.txt ${{ matrix.pytorch-version }}
python ./requirements/pytorch/adjust-versions.py requirements/lite/base.txt ${{ matrix.pytorch-version }}
python ./requirements/pytorch/adjust-versions.py requirements/pytorch/base.txt ${{ matrix.pytorch-version }}
python ./requirements/pytorch/adjust-versions.py requirements/pytorch/examples.txt ${{ matrix.pytorch-version }}
cat requirements/pytorch/base.txt
@@ -171,8 +171,8 @@ jobs:
if: ${{ matrix.pkg-name == 'lightning' }}
run: |
python .actions/assistant.py copy_replace_imports --source_dir="./tests" \
--source_import="pytorch_lightning,lightning_fabric" \
--target_import="lightning.pytorch,lightning.fabric"
--source_import="pytorch_lightning,lightning_lite" \
--target_import="lightning.pytorch,lightning.lite"
- name: Testing Warnings
# the stacklevel can only be set on >=3.7
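
The `Adjust tests` step above passes comma-separated `--source_import` and `--target_import` lists to `.actions/assistant.py copy_replace_imports`. The script itself is not shown in this diff, so the following is only an illustrative sketch of what such a rewrite could look like; `replace_imports` is a hypothetical function and the regex approach is an assumption, not the repository's implementation:

```python
import re


def replace_imports(code, source_imports, target_imports):
    """Rewrite `import X` / `from X ...` statements, pairing sources with targets."""
    for src, dst in zip(source_imports, target_imports):
        # Match the import keyword at the start of a line, then the exact
        # package name; `\b` keeps `lightning_lite_extra` from matching.
        pattern = re.compile(
            rf"^(\s*(?:from|import)\s+){re.escape(src)}\b", flags=re.MULTILINE
        )
        code = pattern.sub(lambda match, dst=dst: match.group(1) + dst, code)
    return code


sample = "import pytorch_lightning as pl\nfrom lightning_lite.utilities import seed\n"
print(
    replace_imports(
        sample,
        "pytorch_lightning,lightning_lite".split(","),
        "lightning.pytorch,lightning.lite".split(","),
    )
)
```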
1 change: 1 addition & 0 deletions .github/workflows/release-pypi.yml
@@ -109,6 +109,7 @@ jobs:
branch = f"origin/builds/{os.getenv('TAG')}"
while True:
remote_refs = [b.name for b in repo.remote().refs]
print([n for n in remote_refs if "builds" in n])
if branch in remote_refs:
break
time.sleep(60)
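
The loop above polls the remote once a minute until the `builds/<TAG>` branch appears, and the added `print` shows the candidates seen on each pass. The same pattern can be sketched as a standalone function. Two things here are assumptions that the workflow does not have: the `timeout` parameter and the injected `list_remote_refs` callable, which stands in for the `repo.remote().refs` lookup so the sketch runs without a Git repository:

```python
import time


def wait_for_branch(branch, list_remote_refs, interval=60.0, timeout=3600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until `branch` is among the remote refs; return False on timeout."""
    deadline = clock() + timeout
    while True:
        remote_refs = list_remote_refs()
        # Mirror the debugging aid from the diff: show the matching candidates.
        print([name for name in remote_refs if "builds" in name])
        if branch in remote_refs:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A bounded wait lets a release job fail with a clear signal if the build branch is never pushed, instead of running until the CI job limit is reached.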
4 changes: 2 additions & 2 deletions .gitignore
@@ -110,8 +110,8 @@ celerybeat-schedule

# dotenv
.env
.env_staging
.env_local
.env.staging
.env.local

# virtualenv
.venv
6 changes: 3 additions & 3 deletions README.md
@@ -96,9 +96,9 @@ Lightning is rigorously tested across multiple CPUs, GPUs, TPUs, IPUs, and HPUs
| Linux py3.7 \[TPUs\*\*\*\] | - | - |
| Linux py3.8 \[IPUs\] | - | - |
| Linux py3.8 \[HPUs\] | [![Build Status](<https://dev.azure.com/Lightning-AI/lightning/_apis/build/status/pytorch-lightning%20(HPUs)?branchName=master>)](https://dev.azure.com/Lightning-AI/lightning/_build/latest?definitionId=26&branchName=master) | - |
| Linux py3.{7,9} | - | [![Test](https://github.com/Lightning-AI/lightning/actions/workflows/ci-pytorch-tests.yml/badge.svg?branch=master&event=push)](https://github.com/Lightning-AI/lightning/actions/workflows/ci-pytorch-tests.yml) |
| OSX py3.{7,9} | - | [![Test](https://github.com/Lightning-AI/lightning/actions/workflows/ci-pytorch-tests.yml/badge.svg?branch=master&event=push)](https://github.com/Lightning-AI/lightning/actions/workflows/ci-pytorch-tests.yml) |
| Windows py3.{7,9} | - | [![Test](https://github.com/Lightning-AI/lightning/actions/workflows/ci-pytorch-tests.yml/badge.svg?branch=master&event=push)](https://github.com/Lightning-AI/lightning/actions/workflows/ci-pytorch-tests.yml) |
| Linux py3.{7,9} | - | [![Test](https://github.com/Lightning-AI/lightning/actions/workflows/ci-tests-pytorch.yml/badge.svg?branch=master&event=push)](https://github.com/Lightning-AI/lightning/actions/workflows/ci-tests-pytorch.yml) |
| OSX py3.{7,9} | - | [![Test](https://github.com/Lightning-AI/lightning/actions/workflows/ci-tests-pytorch.yml/badge.svg?branch=master&event=push)](https://github.com/Lightning-AI/lightning/actions/workflows/ci-tests-pytorch.yml) |
| Windows py3.{7,9} | - | [![Test](https://github.com/Lightning-AI/lightning/actions/workflows/ci-tests-pytorch.yml/badge.svg?branch=master&event=push)](https://github.com/Lightning-AI/lightning/actions/workflows/ci-tests-pytorch.yml) |

- _\*\* tests run on two NVIDIA P100_
- _\*\*\* tests run on Google GKE TPUv2/3. TPU py3.7 means we support Colab and Kaggle env._
2 changes: 1 addition & 1 deletion docs/source-app/api_references.rst
@@ -45,7 +45,7 @@ ___________________
~multi_node.lite.LiteMultiNode
~multi_node.pytorch_spawn.PyTorchSpawnMultiNode
~multi_node.trainer.LightningTrainerMultiNode
~auto_scaler.AutoScaler
~serve.auto_scaler.AutoScaler

----

@@ -10,7 +10,7 @@ def run(self):
trainer = L.Trainer(max_epochs=10, strategy="ddp")
trainer.fit(model)

# 8 GPU: (2 nodes of 4 x v100)
# 8 GPUs: (2 nodes of 4 x v100)
component = LightningTrainerMultiNode(
LightningTrainerDistributed,
num_nodes=4,