Merging IterDomains requires that their iteration types match. #2317
Comments
FWIW, https://github.com/Lightning-AI/lightning-thunder/blob/bc3925be04e7ec58d9b24fb7ac55fbe007862a65/thunder/tests/opinfos.py#L6095-L6117 could be enhanced to test 3D. Currently, it only tests 1D and 2D input shapes.
Agreed, we do test for 3D cases in our tests though: Fuser/tests/python/pytest_input_generators.py Lines 1515 to 1544 in a873bff
We likely need more tests, such as the reproducer of this issue and other larger fusions.
The issue seems related to: Issue #1659. The segment for transpose scheduler:
FYI, @Priya2698 , https://github.com/Lightning-AI/lightning-thunder/tree/wjy/bug2317 is the Thunder branch to reproduce the bug.
Traceback (most recent call last):
Error from segmentation group 5: Merging IterDomains requires that their iteration types match.
Outer: iS284{( ceilDiv(1600, 32) )}, Inner: rS257{i6}
Use NVFUSER_DISABLE=parallel_compile to simplify error message.
Update: never mind. I should have run
FYI, @Priya2698 , I synced https://github.com/Lightning-AI/lightning-thunder/tree/wjy/bug2317. You can still reproduce the same problem using |
This is happening when we segment at a reduction output. In the consumer segments, the edge is converted to an input that has a Reduction domain. Instead, I think we should filter out Reduction domains in |
Issue #2317. The issue arises in the following lines for reference 1, `[I0, I1, I2, r3]`.

After tiling:
```
reference1->split(inner_most_pos1_in_ref1, params.tile_size1);
reference1->reorder({{inner_most_pos1_in_ref1 + 1, -1}});
reference1->split(inner_most_pos2_in_ref1, params.tile_size2);
reference1->reorder({{inner_most_pos2_in_ref1 + 1, -1}});
```
Reference 1 is `[I0, I1/tile1, I2/tile2, r3, tile1, tile2]`.
```
// Merge remaining dimensions
int64_t lhs_i = -1;
for (int64_t i = reference1->nDims() - 2; i > 0; i--) {
  auto axis_i = i - 1;
  if (lhs_i == -1) {
    lhs_i = axis_i;
  } else {
    reference1->merge(axis_i, lhs_i);
    lhs_i = axis_i;
  }
}
```
The first merge pairs `I2/tile2` (an iteration-type IterDomain) with `r3` (a reduction IterDomain), which triggers the error. This PR ignores the reduction axis when merging all non-tile dimensions.
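The effect of skipping the reduction axis can be sketched with a toy model. Everything below (`AxisKind`, `Axis`, `merge`, `mergeNonTileAxes`) is a hypothetical stand-in written for illustration, not nvFuser's API and not the code from the PR; it only mirrors the loop structure quoted above and assumes the last two axes are the tiles.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an IterDomain's iteration type.
enum class AxisKind { Iteration, Reduction };

struct Axis {
  std::string name;
  AxisKind kind;
};

// Toy merge: folds axes[inner] into axes[outer] and erases axes[inner].
// Throws when the two kinds differ, mimicking the error in this issue.
void merge(std::vector<Axis>& axes, int64_t outer, int64_t inner) {
  assert(outer < inner);
  if (axes[outer].kind != axes[inner].kind) {
    throw std::invalid_argument(
        "Merging IterDomains requires that their iteration types match.");
  }
  axes[outer].name += "*" + axes[inner].name;
  axes.erase(axes.begin() + inner);
}

// Same loop shape as the scheduler snippet: merge every axis except the
// last two (the tiles). With skip_reductions set, reduction axes are left
// in place instead of being merged.
std::vector<Axis> mergeNonTileAxes(std::vector<Axis> axes,
                                   bool skip_reductions) {
  int64_t lhs_i = -1;
  for (int64_t i = static_cast<int64_t>(axes.size()) - 2; i > 0; i--) {
    int64_t axis_i = i - 1;
    if (skip_reductions && axes[axis_i].kind == AxisKind::Reduction) {
      continue;  // leave the reduction axis untouched
    }
    if (lhs_i == -1) {
      lhs_i = axis_i;
    } else {
      merge(axes, axis_i, lhs_i);
      lhs_i = axis_i;
    }
  }
  return axes;
}
```

Applied to `[I0, I1/tile1, I2/tile2, r3, tile1, tile2]`, the skipping variant leaves four axes, with the three iteration axes merged into one and `r3` untouched, while the non-skipping variant throws on its first merge.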
Currently, when we segment at the output of a reduction, the consumer segment will have an input tensor that has a `Reduction` axis in it. This can be problematic; see #2481 and #2317. This PR strips reduction axes from the root and allocation domains in these cases.

---------

Co-authored-by: Jingyue Wu <[email protected]>
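As a rough illustration of what stripping reduction axes from a segment input means, here is a minimal sketch on a toy tensor description. The types and helpers (`ToyTensor`, `stripReductions`, `asSegmentInput`) are hypothetical names invented for this example and do not correspond to the PR's implementation or to nvFuser's classes.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for an IterDomain's iteration type.
enum class AxisKind { Iteration, Reduction };

struct Axis {
  std::string name;
  AxisKind kind;
};

// Toy tensor description holding only the two domains the PR mentions.
struct ToyTensor {
  std::vector<Axis> root;
  std::vector<Axis> allocation;
};

// Returns a copy of `domain` without its reduction axes.
std::vector<Axis> stripReductions(const std::vector<Axis>& domain) {
  std::vector<Axis> kept;
  std::copy_if(
      domain.begin(), domain.end(), std::back_inserter(kept),
      [](const Axis& a) { return a.kind != AxisKind::Reduction; });
  return kept;
}

// Models converting a producer segment's reduction output into a consumer
// segment's input: the reduced axes no longer exist in the data, so they
// are dropped from both domains.
ToyTensor asSegmentInput(const ToyTensor& producer_output) {
  ToyTensor input{stripReductions(producer_output.root),
                  stripReductions(producer_output.allocation)};
  // Postcondition: the consumer never sees a reduction axis.
  assert(std::none_of(input.root.begin(), input.root.end(), [](const Axis& a) {
    return a.kind == AxisKind::Reduction;
  }));
  return input;
}
```

With a producer output of `[I0, I1, I2, r3]`, the consumer-side input becomes `[I0, I1, I2]`, so a scheduler working on the consumer segment has no reduction axis to merge by mistake.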
Check out `wjy/linear` and run `NVFUSER_DISABLE=parallel_compile python repro.py`.

This happened after I rebased https://github.com/Lightning-AI/lightning-thunder/tree/wjy/sharded for #2199. I suspect 3D linear isn't handled as well as reshape+2D_linear+reshape.