IFU 2023-03-24 #29

Merged
merged 242 commits into from
Apr 11, 2023
Conversation

jataylo

@jataylo jataylo commented Mar 29, 2023

atalman and others added 30 commits September 15, 2022 16:29
…torch#1124)

* Modify smoke test matrix

More vision smoke tests

Temporary pointing to my repo for testing

Try 2 use atalman builder

Modify path

Fixing commits

Testing

Testing

Smoke test modifications

Refactor test code

Fix typo

Fixing image read

A little more refactoring

Addressing comments

Testing

* Add same test for windows and macos

* Addressing comments
* Add manywheel special build

Testing

Builder change

Testing

Adding manywheel cuda workflow

Simplify

Fix expr

* address comments

* checking for general setting
pytorch#1144)

* add a reusable workflow to run all smoke tests/or smoke tests for a specific os/channel
* add workflows to schedule the periodic smoke tests for nightly and release channels
Need it to get several convenience functions
* Refactors rpath to externally set var. Adds mechanism to add metadata

* Sets RUNPATH when using cudnn and cublas wheels

* Escapes dollar sign

* Fix rpath for cpu builds

Co-authored-by: atalman <[email protected]>
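
The "Escapes dollar sign" step can be illustrated with a minimal sketch. It assumes the rpath value contains the loader token `$ORIGIN`; the exact value and the variable names below are hypothetical and not taken from the builder scripts. The point is only that the shell must not expand the token before it reaches the binary.

```shell
#!/usr/bin/env bash
# Hypothetical rpath value; the real one used by the build scripts is not shown in this PR.

# Single quotes keep $ORIGIN literal, so the dynamic loader (not the shell) resolves it.
rpath_literal='$ORIGIN:$ORIGIN/lib'

# Inside double quotes the dollar sign has to be escaped to get the same string.
rpath_escaped="\$ORIGIN:\$ORIGIN/lib"

# Without escaping, the shell substitutes its own ORIGIN variable (empty here).
ORIGIN=""
rpath_broken="$ORIGIN:$ORIGIN/lib"

printf 'literal: %s\n' "$rpath_literal"
printf 'escaped: %s\n' "$rpath_escaped"
printf 'broken:  %s\n' "$rpath_broken"
```

The first two variables hold the same string, while the third loses the token.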
* Update action.yml

* Update validate-macos-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml

* Update validate-linux-binaries.yml
* Fix check binary for arm64

* Update check_binary.sh

Co-authored-by: Nikita Shulga <[email protected]>
* Fix for including nvtx dll and cudart

* Fix for include nvtx

* Fix spaces
* Fix anaconda torchaudio smoke test

* Format using ufmt
The most recent version fails with an invalid cert error when trying to update
python
`.test/smoke_test` -> `test/smoke_test`

Noticed when pushed pytorch@3b93537 and no tests were run
* Updates to support rocm5.3 wheel builds (#6)

* Changes to support ROCm 5.3

* Updated as per comments

* Installing python before magma build

- In ROCm 5.3, libtorch builds fail during the magma build due to a
  missing python binary, so an install statement was added

* Move python install to libtorch/Dockerfile (#8)

* Updating the condition for noRCCL build (#9)

* Updating the condition for noRCCL build

* Updated changes as per comments

* Use MIOpen branch for ROCm5.3; Change all conditions to -eq

* Use staging branch of MIOpen for ROCm5.3

* Fix merge conflict

Fix merge conflict

Co-authored-by: Pruthvi Madugundu <[email protected]>
Co-authored-by: Jithun Nair <[email protected]>
* Validate python 3.11

* Validate linux binaries change

Add options

Import torchvision

Adding python 3.11 install

pass package to check nightly binaries date

Test

test

Add python 3.11 code

testing

Adding python 3.11 test

Add python 3.11 validation

Adding zlib develop install

Install zlib etc..

Adding zlib1g as well

testing

testing

Adding validate windows binary

Trying to workaround

testing

Refactor smoke test

Add import statement

fix datetime call

* Fix stripping dev

* fix import
* Strip pypi-cudnn from the version.py

* small fix
atalman and others added 25 commits March 14, 2023 15:24
This needs to be enabled for official wheel building.
Also, change default to ubuntu-20.04
Using following images:
```
% aws ec2 describe-images --image-ids ami-078eece1d8119409f ami-052eac90edaa9d08f ami-0c6c29c5125214c77 --query "Images[].[ImageId, Description]"
[
    [
        "ami-078eece1d8119409f",
        "Canonical, Ubuntu, 18.04 LTS, arm64 bionic image build on 2023-03-02"
    ],
    [
        "ami-0c6c29c5125214c77",
        "Canonical, Ubuntu, 22.04 LTS, arm64 jammy image build on 2023-03-03"
    ],
    [
        "ami-052eac90edaa9d08f",
        "Canonical, Ubuntu, 20.04 LTS, arm64 focal image build on 2023-03-01"
    ]
]
```
And call it to rebuild only domains if torch wheel is available
* Switch deprecated ubuntu-18.04 runner to self-hosted 2xlarge

* Leave build-nvidia-docker for now

* Apply suggestions from code review

Co-authored-by: Nikita Shulga <[email protected]>

* Use ephemeral runners

* Use ubuntu-latest

* Apply suggestions from code review

Co-authored-by: Nikita Shulga <[email protected]>

* Switch from latest to 22.04 to pin the version

---------

Co-authored-by: Nikita Shulga <[email protected]>
(cherry picked from commit d7f2a7c)

Alas, it's still used and causes nightly build failures
* Fix torchvision image extension compilation

* Fix torchvision image extension compilation

* Set enable_mkldnn to pypi build
Also, add testing (which is currently broken)
By adding explicit libz dependency
To make build consistent with Linux-x86_64
This reverts commit ae8e825.

As it does not want to be built on aarch64
I've noticed that build errors in `build_ArmComputeLibrary` would be
ignored as semicolon is used between the commands, instead of &&
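
The difference between the two separators can be shown with a minimal sketch. `failing_step` is a hypothetical stand-in for a failing build command and is not part of `build_ArmComputeLibrary`.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a build command that fails.
failing_step() { return 1; }

# With ';' the next command runs anyway, and the chain reports the
# exit status of the last command, so the failure is lost.
status_semicolon=0
( failing_step ; echo "next step ran" >/dev/null ) || status_semicolon=$?

# With '&&' the chain stops at the first failure and reports it.
status_and=0
( failing_step && echo "next step ran" >/dev/null ) || status_and=$?

echo "';' chain exit status:  ${status_semicolon}"
echo "'&&' chain exit status: ${status_and}"
```

The `;` chain exits with status 0 despite the failure, while the `&&` chain exits with status 1, which lets a caller detect the error.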
Also, evaluate the nightly version from the individual libraries instead
of relying on torch
Summary:
We are seeing failures during CI dealing with some headers that have
nested namespaces. This is expected to remedy them.

One such example:
https://github.com/pytorch/pytorch/actions/runs/4510336715/jobs/7942660912

Test Plan: Test this with CI.
@jataylo jataylo marked this pull request as ready for review March 31, 2023 16:08
@jataylo
Author

jataylo commented Mar 31, 2023

@jithunnair-amd Opening for review, 1.10.1 wheels and 1.13 rocm5.5 wheels produced

Extensive testing was already performed on the previously closed IFU in which the bulk of the commits came in
#25

@jithunnair-amd jithunnair-amd merged commit 2469f06 into main Apr 11, 2023
@jithunnair-amd

jithunnair-amd commented Apr 11, 2023

Mea culpa: I accidentally hit the "Squash and Merge" option instead of the "Merge pull request" option that's desirable for IFU PRs. So I force-pushed a regular merge of this PR's branch to main after this PR was merged. The regular merge commit is be17939
