Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[compiler] strip execution context affinities in const eval #18663

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

sogartar
Copy link
Contributor

@sogartar sogartar commented Oct 2, 2024

During compile-time constant evaluation in pass
iree-consteval-jit-globals it does not make sense to assign device/queue affinities. We will be compiling and executing it on the compilation host.

The JITed IR is first stripped of all attributes that assign affinities and all flow.tensor.transfer ops.

During compile-time constant evaluation in pass
iree-consteval-jit-globals it does not make sense to assign
device/queue affinities. We will be compiling and executing it on the
compilation host.

The JITed IR is first stripped of all attributes that assign affinities
and all flow.tensor.transfer ops.

Signed-off-by: Boian Petkantchin <[email protected]>
Signed-off-by: Boian Petkantchin <[email protected]>
@sogartar sogartar force-pushed the jit-globals-strip-device-affinities branch from 0f6be21 to 5a294e4 Compare October 2, 2024 14:41
@hanhanW hanhanW requested a review from ScottTodd October 2, 2024 15:32
@hanhanW
Copy link
Contributor

hanhanW commented Oct 2, 2024

Adding @ScottTodd to reviewers because he reviewed those device transfer codes before.

@ScottTodd ScottTodd added the compiler/dialects Relating to the IREE compiler dialects (flow, hal, vm) label Oct 2, 2024
Copy link
Member

@ScottTodd ScottTodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general principle here makes sense to me, but I definitely want to hear from @benvanik . Not sure if I'm missing some architectural considerations.

Comment on lines 885 to 888
if (failed(stripExecutionContextAffinities(
programBuilder.getTargetModule()))) {
return signalPassFailure();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this code could instead go in importInitializer(). Do we have a preference for making the import function take on more responsibility or treating this as a cleanup step?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it there.

@sogartar sogartar force-pushed the jit-globals-strip-device-affinities branch from be72e43 to 5b0910f Compare October 2, 2024 17:59

LogicalResult matchAndRewrite(IREE::Stream::AffinityOpInterface op,
PatternRewriter &rewriter) const override {
// Shouldn't we reject ops for which `op.requiresAffinity() == true`?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you're confused when you're writing this then someone coming across this in the future will be doubly confused

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the comment. I was not sure if removing this attribute when it is required may result in an illegal operation.

Copy link
Contributor Author

@sogartar sogartar Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess since we are attaching the interface to Flow ops, having the attribute removed, must not lead to illegality.

@sogartar sogartar requested a review from benvanik October 3, 2024 19:32
sogartar added a commit to nod-ai/shark-ai that referenced this pull request Oct 3, 2024
… numerics (#237)

Verifies the IREE module numerical accuracy compared to execution with
PyTorch.
The prefill step result has very low absolute accuracy of around `1e-2`
for FP32.
The resulting cache state of prefill is way off.
The decode step accuracy is also completely off.

This test is market as skipped until
iree-org/iree#18663 is merged. Without it the
IREE compilation will crash.
@sogartar
Copy link
Contributor Author

sogartar commented Oct 8, 2024

@benvanik, will you be able to review again?

Copy link
Collaborator

@benvanik benvanik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be relying on specific ops and instead needs to walk attributes and replace in there. Any op in the IR can have an affinity assigned to it. You can use AttrTypeReplacer to walk and replace any AffinityAttr on any ops. Don't use pattern matching - just walk the ops and then use recursivelyReplaceElementsIn. Also make sure to call it on the function itself.

@sogartar
Copy link
Contributor Author

sogartar commented Nov 6, 2024

@benvanik, will you be able to review this again?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/dialects Relating to the IREE compiler dialects (flow, hal, vm)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants