
Move explorer execution to background #1504

Merged: 2 commits into main from odjuricic/explorer-background on Dec 13, 2024
Conversation

@odjuricicTT (Contributor) commented on Dec 4, 2024

The Explorer execute command now spawns a new thread in which all model compilation and execution is handled. Results are stored as state on the server and are sent to the frontend with the next convert command.

Additionally:

  • Use subprocesses for compile and to_flatbuffer so that asserts do not kill the server
  • Set up high-level error messages for execution steps
  • Send incremental logs to the frontend
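
A minimal sketch of the flow described above, assuming hypothetical names (ModelRunner, compile_and_run, get_status); the real server code differs:

import threading


def compile_and_run(model_path):
    # Placeholder for the real compile + execute pipeline (hypothetical).
    return {"graphs": []}


class ModelRunner:
    # Sketch only: run compilation/execution in a background thread and keep
    # results as server-side state until the frontend asks for them.
    def __init__(self):
        self.progress = 0
        self.error = None
        self.result = None

    def run(self, model_path):
        # Spawn a worker so the HTTP handler can return immediately.
        worker = threading.Thread(target=self._execute, args=(model_path,), daemon=True)
        worker.start()

    def _execute(self, model_path):
        try:
            self.result = compile_and_run(model_path)
            self.progress = 100
        except Exception as e:
            self.error = str(e)

    def get_status(self):
        # Polled by the frontend; fields match the status dict used later in this PR.
        return {"isDone": self.progress == 100, "progress": self.progress,
                "total": 100, "error": self.error}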

assert result.ok
if "error" in result.json():
    print(result.json())
    assert False


def wait_for_execution_to_finish():
    for _ in range(1000):  # Try for up to 100 seconds
Contributor

Test model compilation should finish in under 100s? That might not hold in the future. Would you want to make the wait time configurable?

Contributor Author

Right. I will clean this up a bit and leave a clear error message if we time out. But I wouldn't increase this much until we have bigger models, so that we catch hang-like bugs faster for now.

I was hitting the metal regression locally while testing this PR, which is why I added timeouts to all requests in the tests.

Contributor

What I meant was defining the wait time per test via some attribute, rather than having a single value.

Contributor Author

Makes sense. Will add.
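
One way the per-test attribute could look (a sketch with assumed names, not the code in this PR): give the polling helper a timeout parameter with a small default and override it only for tests that need more time.

import time


def wait_for_execution_to_finish(get_status, timeout_seconds=100, poll_interval=0.1):
    # get_status is assumed to be a callable returning the server status dict.
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        status = get_status()
        if status.get("error"):
            raise RuntimeError(status["error"])
        if status.get("isDone"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError(f"Execution did not finish within {timeout_seconds}s")


# A test with a bigger model could then pass its own limit:
# wait_for_execution_to_finish(get_status, timeout_seconds=300)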

)
self.reset_state()

options = [
Contributor

Here the new override class should be plugged in with its toString method, also replacing the hardcoded values below. Of course, it needs to be populated from TTE frontend data first, feeding in the override values.

Contributor Author

Yes.
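
A rough illustration of the idea, with a hypothetical OptimizerOverrides class and option name; the real override handling would be populated from TTE frontend data:

from dataclasses import dataclass, field


@dataclass
class OptimizerOverrides:
    # Hypothetical container for override values fed from the TTE frontend.
    output_layouts: dict = field(default_factory=dict)  # op name -> layout string

    def to_string(self) -> str:
        # Render the overrides in the form expected by the pipeline option string.
        return ",".join(f"{op}={layout}" for op, layout in self.output_layouts.items())


# Usage: replace the hardcoded entries with the rendered string.
overrides = OptimizerOverrides(output_layouts={"add_0": "l1_interleaved"})
options = ["override-output-layout=" + overrides.to_string()]  # option name is an assumption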

error = "Error running compile TTIR to TTNN Backend Pipeline"
self.log(error)
raise ExplorerRunException(error)
self.progress = 20
Contributor

This will be more than 20 in the future, a lot more. :) It would be cool to extend it later with more granularity.

Contributor Author

Yes :) Once we have some compile and execution output, we can build a good indication of progress based on that.
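
For finer granularity later, the progress value could be derived from named stages rather than hardcoded constants; a small sketch with assumed stage names and weights:

# Hypothetical stage weights; real values would be driven by compile/execute output.
PROGRESS_BY_STAGE = {
    "ttir_to_ttnn": 20,
    "to_flatbuffer": 40,
    "run_on_device": 80,
    "collect_results": 100,
}


def update_progress(runner, stage):
    # Sketch only: set runner.progress from the stage that just completed.
    runner.log(f"Finished stage: {stage}")
    runner.progress = PROGRESS_BY_STAGE[stage]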

@@ -3,7 +3,7 @@ include(ExternalProject)
set(TT_EXPLORER_SCRIPT ${CMAKE_CURRENT_SOURCE_DIR}/run.py)
set(TTMLIR_BUILD_BIN_DIR ${TTMLIR_BINARY_DIR}/bin)

set(MODEL_EXPLORER_VERSION "95d79ec933643c3b145537ce6fd5d9f0e735683d")
set(MODEL_EXPLORER_VERSION "0ac804da35e6d42b2e946879e3b3bdd8bd17254a")
Contributor

We should probably coordinate with the model-explorer repo to sync this version to the one with the logging window (when it gets merged in)

@@ -70,9 +74,23 @@ def execute(
memory_layout_analysis_enabled = False
memory_layout_analysis_policy = None

-ttnn_ir = self.model_runner.run(
+self.model_runner.run(
model_path, memory_layout_analysis_enabled, memory_layout_analysis_policy
)

# TODO(odjuricic, #933) Parse TTNN IR and return the post optimized graph.
return {"graphs": []}
Contributor

Just to confirm: this is the expected return when execution has started, and if something goes wrong it should throw an error, right?

Contributor Author

Yes. Is that an OK flow from the frontend side?

Contributor

Yes, it is okay. I just wanted to confirm, as my previous implementation was expecting some response.

I made a change to accept an empty graphs array as the success response :)

"isDone": done,
"progress": progress,
"total": 100,
"error": error,
@madcampos (Contributor) commented on Dec 4, 2024

Would it be interesting to add properties for sending logs to the UI, i.e. actual text that gives a better picture of what is going on?

Here is the current interface for the UI, let me know if there is some property that doesn't make sense: https://github.com/tenstorrent/model-explorer/blob/724abaf54e3ae87b70b8869d1cf7716bbc565a63/src/ui/src/common/extension_command.ts#L99-L107

Contributor Author

Adding that today. Left a comment on your PR to discuss these properties.
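
As a sketch of what the status payload could look like once log text is included (the field name for logs is an assumption, not the final interface):

def get_status_payload(runner):
    # Sketch only: include accumulated log text so the UI can show what is going on.
    return {
        "isDone": runner.done,
        "progress": runner.progress,
        "total": 100,
        "error": runner.error,
        "stdout": runner.log_text,  # assumed field name for incremental logs
    }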

@odjuricicTT force-pushed the odjuricic/explorer-background branch 2 times, most recently from 405d848 to 83e6cb8 on December 11, 2024 at 11:40
-graph = mlir.build_graph(module)
-return {"graphs": [graph]}
+graph, perf_data = mlir.build_graph(module, perf_trace)
+return {"graphs": [graph], "perf_data": perf_data}
@vprajapati-tt (Contributor) commented on Dec 11, 2024

The perf_data will not be returned because it needs to be wrapped into an adapter_response and kept in graphs. It should look something like:

return to_adapter_format({
  "graphs": [graph],
  "perf_data": perf_data
})

In case the graph isn't passed, we should escalate this and change how the frontend parses the response as well, since the graphs property needs to exist.

Contributor Author

Will try it out. If that is not the case, I'll try to merge now to unblock your further work and create an issue for this.

@odjuricicTT (Contributor Author)

@svuckovicTT @mtopalovicTT The codeowners file got messed up and your review is now required here. Please take a look. I've put in a fix for codeowners in a separate PR.

@odjuricicTT force-pushed the odjuricic/explorer-background branch from 83e6cb8 to 303beef on December 12, 2024 at 14:55
@odjuricicTT enabled auto-merge (squash) on December 12, 2024 at 15:05
@odjuricicTT merged commit 30a7a9e into main on Dec 13, 2024
21 checks passed
azecevicTT pushed a commit that referenced this pull request on Dec 17, 2024