Add tt-explorer model execution #1210
Conversation
Most basic execution without overrides.
606d3d0 to 8bcd6c5
@@ -419,7 +419,7 @@ jobs:
fail-fast: false
I missed this the first time, but is the reason for having a separate build for explorer in this file that you need the built binaries, and explorer can't be installed the way ttrt is (i.e., as whls)?
It probably could be installed as a whl, but explorer depends on most of tt-mlir and will always be used with it. We don't have a use case for it as a standalone build, which is why we don't need the whl.
This does make the job repeat the tt-mlir build steps, but I don't see a clean way of reusing them for explorer. Any thoughts?
So I think it'll probably be faster to just build it from source and run it in the same test flow, but a longer-term tt-explorer whls solution would make sense, since you can effectively decouple it from the compiler, and ttrt already cleanly pip-installs anything you need from runtime.
TT-explorer uses the ttir_to_ttnn_backend_pipeline / ttnn_to_flatbuffer_file flows and ttrt, so it essentially depends on the whole compiler. That's why I don't see a use case in decoupling it and building it as a standalone.
golden_tensor_dict["value"]["stride"],
golden_tensor_dict["value"]["data"],
)
if "golden_info" in self.fbb_dict["programs"][i]["debug_info"]:
thanks for this catch!
Overall changes are all reasonable and awesome! Happy to see the functionality, I just wanted to see if there was a better way to represent overrides in Adapter... I'll work on a Proof of Concept and keep it on a separate branch for your review to see which makes more sense.
class OptimizationPolicy(enum.Enum):
    DFSharding = "DF Sharding"
    L1Interleaved = "L1 Interleaved"
    OptimizerDisabled = "Optimizer Disabled"
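For context, one way to avoid the "double definition" concern is to derive the set of valid display strings from the enum itself, so the two can never drift apart. A minimal sketch (the `OPTIMIZATION_POLICIES` derivation here is an assumption about how it could be defined, not necessarily how the PR defines it):

```python
import enum


class OptimizationPolicy(enum.Enum):
    DFSharding = "DF Sharding"
    L1Interleaved = "L1 Interleaved"
    OptimizerDisabled = "Optimizer Disabled"


# Hypothetical helper: derive the valid display strings from the enum
# values so there is a single source of truth.
OPTIMIZATION_POLICIES = {p.value for p in OptimizationPolicy}

# Round-trip a display string back to its enum member.
policy = OptimizationPolicy("DF Sharding")
print(policy.name)  # DFSharding
```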
I think this idea is reasonable, but I'm unsure if we want to "double" define the optimization policies. From the code below, it seems the logic will start branching very quickly on these different optimization policies, and enums alone may not be enough to contain the logic that the overrides want conveyed.
For example, when we get to "op-specific" overrides, each would need its information parsed from the settings, and each would have its own string arguments to be sent into the pipeline. I think this is a good stepping stone, and it is functionally sound for the current state of execution and overrides. However, I'm not sure it represents something that will scale to the rest of the overrides we have planned.
We can sync about this offline. My idea is something like an OverrideBuilder that constructs the overrides based on these overarching policies and op-specific changes. The skeleton can be implemented now for just the policies, and it can be scaled up to include overrides later. I'll work on a Proof of Concept off of this PR so you can see an example of what I'm talking about.
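To make the OverrideBuilder idea concrete, here is a minimal sketch. The class name comes from the comment above, but everything else (method names, the option strings it renders) is hypothetical, intended only to illustrate the shape of a builder that holds a policy plus op-specific overrides and renders them for the pipeline:

```python
class OverrideBuilder:
    """Hypothetical builder: accumulates an overarching policy and
    per-op overrides, then renders them as pipeline option strings."""

    def __init__(self, policy: str):
        self.policy = policy       # overarching optimization policy name
        self.op_overrides = {}     # op name -> {param: value}

    def override_op(self, op_name: str, **params):
        # Merge new parameters into any existing overrides for this op.
        self.op_overrides.setdefault(op_name, {}).update(params)
        return self

    def build(self) -> str:
        # Option names below are illustrative, not the real pipeline flags.
        opts = [f"memory-layout-analysis-policy={self.policy}"]
        for op, params in sorted(self.op_overrides.items()):
            rendered = ",".join(f"{k}={v}" for k, v in sorted(params.items()))
            opts.append(f"override-output-layout={op}:{rendered}")
        return " ".join(opts)


builder = OverrideBuilder("DFSharding").override_op("matmul_1", grid="2x2")
print(builder.build())
# memory-layout-analysis-policy=DFSharding override-output-layout=matmul_1:grid=2x2
```

The point of the builder is that op-specific overrides become additional `override_op` calls rather than new branches in Adapter.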
optimization_policy = settings.get("optimizationPolicy")
if optimization_policy not in OPTIMIZATION_POLICIES:
    raise ValueError(
        f"Invalid optimization policy selected: {optimization_policy}"
    )
optimization_policy = OptimizationPolicy(optimization_policy)

memory_layout_analysis_enabled = True
memory_layout_analysis_policy = optimization_policy.name

if optimization_policy == OptimizationPolicy.OptimizerDisabled:
    memory_layout_analysis_enabled = False
    memory_layout_analysis_policy = None
This is kind of what I mean by the quickly branching code that will go into Adapter. I just think these branches would be better offloaded to some other overrides handler so Adapter doesn't become cluttered. Let me know what you think, though.
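Even without a full overrides handler, the branching above can be pulled out of Adapter into a small pure function. This is a sketch under the assumption that a helper like `resolve_memory_layout_settings` would live in an overrides module; the function name is hypothetical:

```python
import enum
from typing import Optional, Tuple


class OptimizationPolicy(enum.Enum):
    DFSharding = "DF Sharding"
    L1Interleaved = "L1 Interleaved"
    OptimizerDisabled = "Optimizer Disabled"


def resolve_memory_layout_settings(
    policy: OptimizationPolicy,
) -> Tuple[bool, Optional[str]]:
    """Map a policy to (memory_layout_analysis_enabled, policy_name),
    keeping the per-policy branching out of Adapter."""
    if policy is OptimizationPolicy.OptimizerDisabled:
        return False, None
    return True, policy.name


print(resolve_memory_layout_settings(OptimizationPolicy.DFSharding))
# (True, 'DFSharding')
print(resolve_memory_layout_settings(OptimizationPolicy.OptimizerDisabled))
# (False, None)
```

Adapter would then unpack the tuple instead of mutating two locals across an `if`.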
Most basic model execution without overrides to serve as a base for adding and cleaning up functionality.
Add some tests and leave placeholders for upcoming features.