[not for land] hardcoded two-stage reduction for max(abs(tensor)) #1205

vkuzo · 2024-10-31T17:29:50Z

Summary:

Based on investigations around pytorch/pytorch#128063, changing the user code to do the tensorwise max in two steps helps the compiler find better fusion opportunities.

TBD whether we want to land this or have torch.compile do this automatically; for now, putting up a PR to make this easier to explore and benchmark.
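For readers who have not seen the pattern, here is a minimal sketch of what a two-stage `max(abs(tensor))` could look like in user code. It is not the diff from this PR: the function names `amax_one_stage` and `amax_two_stage` and the `chunk_size` parameter are hypothetical, chosen only for illustration, and the chunking scheme is an assumption about how such a reduction might be split.

```python
import torch
import torch.nn.functional as F


def amax_one_stage(x: torch.Tensor) -> torch.Tensor:
    # Baseline: a single global reduction over the whole tensor.
    return torch.max(torch.abs(x))


def amax_two_stage(x: torch.Tensor, chunk_size: int = 1024) -> torch.Tensor:
    # Hypothetical two-stage variant (chunk_size is an illustrative knob,
    # not a value taken from the PR).
    flat = torch.abs(x).reshape(-1)
    pad = (-flat.numel()) % chunk_size
    if pad:
        # abs() values are >= 0, so zero padding cannot change the max.
        flat = F.pad(flat, (0, pad))
    # Stage 1: reduce each fixed-size chunk to one partial max.
    partial = flat.reshape(-1, chunk_size).amax(dim=1)
    # Stage 2: reduce the partial maxima to a single scalar.
    return partial.amax()


if __name__ == "__main__":
    x = torch.randn(4096, 4096)
    # Both variants should agree exactly, since max involves no rounding.
    print(torch.equal(amax_one_stage(x), amax_two_stage(x)))
```

To compare fusion behavior, either function could be wrapped in `torch.compile` and benchmarked; whether the two-stage form actually helps will depend on the shapes, the hardware, and the compiler version.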

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:


pytorch-bot · 2024-10-31T17:29:54Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1205

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 1 Unrelated Failure

As of commit 2ec4979 with merge base d252612:

NEW FAILURES - The following jobs have failed:

Run Regression Tests / test (CUDA 2.3, linux.g5.12xlarge.nvidia.gpu, torch==2.3.0, cuda, 12.1) / linux-job (gh)
RuntimeError: Command
Run Regression Tests / test (CUDA 2.4, linux.g5.12xlarge.nvidia.gpu, torch==2.4.0, cuda, 12.1) / linux-job (gh)
RuntimeError: Command
Run Regression Tests / test (CUDA 2.5, linux.g5.12xlarge.nvidia.gpu, torch==2.5.0 --index-url https://download.pytorch.o... / linux-job (gh)
RuntimeError: Command

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Run Regression Tests / test (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch --index-url https://download.pytorc... / linux-job (gh) (trunk failure)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label on Oct 31, 2024
