Use IDifferentiablePtrType for function-scope vars that cannot be SSA'd for performance reasons #5001

Open
saipraveenb25 opened this issue Sep 3, 2024 · 0 comments
Labels
goal:client support Feature or fix needed for a current slang user.

Comments

Collaborator

saipraveenb25 commented Sep 3, 2024

We currently have a special SSA pass that converts variables of all types (arrays, structs, structs of structs, etc.) into SSA values and converts reads and writes into OpGetElement/OpUpdateElement instructions with access-chain information.

This was initially done in order to avoid dealing with OpVar and OpPtrType instructions. However, it makes sense to treat these as 'buffer' types (at a function-local level) and treat loads and stores from these as we would from a resource object (in the reverse pass, turn loads into "atomic accumulates" and stores into "load & set-zero").
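For illustration, here is a toy Python sketch of the buffer-style treatment, assuming the usual reverse-mode rules: the adjoint of a load accumulates into the gradient slot, and the adjoint of a store reads the gradient slot and then zeroes it. The names `forward`, `backward`, `buf` and `d_buf` are hypothetical and are not Slang APIs; plain `+=` stands in for what would be an atomic accumulate on a real resource.

```python
# Toy model: a function-local array treated as a buffer with a separate
# gradient buffer. Hypothetical names; this is not how Slang's IR looks.

def forward(x):
    buf = [0.0, 0.0]        # function-local variable (the "OpVar")
    buf[0] = x * 2.0        # store to buf[0]
    buf[1] = buf[0] * 3.0   # load buf[0], store to buf[1]
    return buf[1]           # load buf[1]


def backward(x, d_out):
    d_buf = [0.0, 0.0]      # gradient buffer paired with buf
    d_x = 0.0

    # Adjoint of `return buf[1]` (a load): accumulate into the gradient slot.
    d_buf[1] += d_out

    # Adjoint of `buf[1] = buf[0] * 3.0`:
    #   the store becomes "read the gradient, then set it to zero",
    #   the load of buf[0] becomes an accumulate into d_buf[0].
    d_v = d_buf[1]
    d_buf[1] = 0.0
    d_buf[0] += d_v * 3.0

    # Adjoint of `buf[0] = x * 2.0` (a store): read and zero, then propagate.
    d_v = d_buf[0]
    d_buf[0] = 0.0
    d_x += d_v * 2.0

    return d_x
```

Since `forward(x)` computes `6 * x`, `backward(x, 1.0)` should return the derivative 6 for any `x`. Only the touched elements of `d_buf` are read or written at each step.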

This avoids some catastrophic situations where an aggregate object that is modified within a loop causes the entire aggregate to be stored per iteration (unnecessarily), making the resulting kernel far too slow or causing it to fail to compile entirely.
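A rough way to see the cost difference, as a Python sketch. It assumes that "stored per-iteration" means each loop iteration keeps a full copy of the SSA'd aggregate for the backward pass, while the buffer-style version only needs to keep the element it overwrites. The function names and the tuple/list encoding are hypothetical.

```python
# Compare how many elements must be saved across a loop under the two schemes.

def loop_ssa(n, size):
    """SSA-style: every update produces a new aggregate value, and the old
    one is kept whole for this iteration."""
    agg = tuple([0.0] * size)
    saved = []
    for it in range(n):
        saved.append(agg)                  # whole aggregate kept per iteration
        i = it % size
        agg = agg[:i] + (float(it),) + agg[i + 1:]  # "update element"
    return list(agg), sum(len(s) for s in saved)


def loop_buffer(n, size):
    """Buffer-style: the variable is updated in place, and only the
    overwritten element is kept."""
    buf = [0.0] * size
    saved = []
    for it in range(n):
        i = it % size
        saved.append(buf[i])               # only the element being overwritten
        buf[i] = float(it)                 # in-place store
    return buf, len(saved)
```

Both versions produce the same final aggregate, but under these assumptions the SSA-style loop saves `n * size` elements while the buffer-style loop saves `n`.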

This issue will be easier to resolve once the IDifferentiablePtrType system is implemented (Issue #4998), as we can express PtrType<T : IDifferentiable> as an IDifferentiablePtrType.
This issue also depends on #4211 (which implements an intermediate OpAccumulate inst that stands in for the OpLoad/OpCall("dadd")/OpStore combo that causes a ton of clutter in our reverse-mode pass).
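To make the OpAccumulate idea concrete, here is a minimal Python sketch of lowering a single accumulate instruction into the load/dadd/store sequence it stands in for. The tuple encoding of instructions, the `%t` temporary naming, and the function `lower_accumulate` are all assumptions made for illustration; they do not reflect how #4211 or the Slang IR is actually implemented.

```python
import itertools


def lower_accumulate(insts):
    """Expand each ("accumulate", ptr, val) into load / call dadd / store.

    Instructions are plain tuples in a made-up encoding:
      ("load", result, ptr)
      ("call", result, callee, arg0, arg1)
      ("store", ptr, value)
    Any other instruction is passed through unchanged.
    """
    fresh = itertools.count()   # source of fresh temporary names
    out = []
    for inst in insts:
        if inst[0] == "accumulate":
            _, ptr, val = inst
            old = f"%t{next(fresh)}"
            new = f"%t{next(fresh)}"
            out.append(("load", old, ptr))               # read current gradient
            out.append(("call", new, "dadd", old, val))  # add the new contribution
            out.append(("store", ptr, new))              # write the sum back
        else:
            out.append(inst)
    return out
```

With this shape, the reverse-mode pass would only ever emit the single accumulate instruction, and the three-instruction expansion happens once in a later lowering step.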
