Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Hexagon] AoT with LLVM Codegen on Hexagon #11065

Merged
merged 6 commits into from
Apr 21, 2022

Conversation

mehrdadh
Copy link
Member

@mehrdadh mehrdadh commented Apr 19, 2022

This PR adds AoT executor implementation with LLVM codegen for Hexagon.

cc @areusch @kparzysz-quic @csullivan

@mehrdadh mehrdadh force-pushed the hexagon/aot-llvm-final branch from 6ebbb79 to 40765de Compare April 19, 2022 19:48
@mehrdadh mehrdadh changed the title [Hexagon] AoT with LLVM Codegen for on Hexagon [Hexagon] AoT with LLVM Codegen on Hexagon Apr 19, 2022
@mehrdadh mehrdadh force-pushed the hexagon/aot-llvm-final branch from 40765de to 3cd73c4 Compare April 19, 2022 21:42
Copy link
Contributor

@areusch areusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @mehrdadh , a couple of comments

python/tvm/contrib/hexagon/build.py Outdated Show resolved Hide resolved
python/tvm/contrib/hexagon/session.py Outdated Show resolved Hide resolved
python/tvm/contrib/hexagon/session.py Show resolved Hide resolved
python/tvm/contrib/hexagon/session.py Outdated Show resolved Hide resolved
src/relay/backend/aot_executor_codegen.cc Show resolved Hide resolved
src/runtime/hexagon/rpc/hexagon/rpc_server.cc Outdated Show resolved Hide resolved
tests/python/contrib/test_hexagon/test_launcher.py Outdated Show resolved Hide resolved
tests/python/contrib/test_hexagon/test_launcher.py Outdated Show resolved Hide resolved
src/runtime/hexagon/rpc/hexagon/rpc_server.cc Outdated Show resolved Hide resolved
@mehrdadh mehrdadh requested review from masahi and areusch April 20, 2022 15:59
@mehrdadh mehrdadh force-pushed the hexagon/aot-llvm-final branch from b5c22b0 to 278a78c Compare April 20, 2022 16:03
python/tvm/contrib/hexagon/session.py Show resolved Hide resolved
python/tvm/contrib/hexagon/session.py Show resolved Hide resolved
Comment on lines 274 to 283
def get_target_and_session(target_kind: str):
if target_kind == "c":
target_hexagon = tvm.target.hexagon("v68")
session_name = "hexagon-rpc"
elif target_kind.startswith("llvm"):
target_hexagon = target_kind
session_name = "cpu-rpc"
else:
assert False, "Incorrect target_kind: {target_kind}. Options are [c, llvm]."
return target_hexagon, session_name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good candidate for a test fixture

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added. thanks!

@@ -332,7 +346,7 @@ def test_aot_executor(hexagon_session):


@requires_hexagon_toolchain
def test_aot_executor_multiple_conv2d(hexagon_session):
def test_aot_executor_multiple_conv2d(hexagon_launcher, aot_target_kind):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we instead support aot_target_kind and its consumption via get_target_and_session in the hexagon_session fixture? That way we can keep the simplified calling code so that the user doesn't need to think about starting a hexagon session.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so AoT host targets are pytest parameters and they are not used for graph executor. If I add it in hexagon_session it will generate extra tests for graph executor which doesn't make sense. I added aot_target and aot_host_target as fixtures.
We could fix this by adding aot_hexagon_session fixture.

Copy link
Contributor

@Lunderberg Lunderberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we have some complications due to information flowing backwards, from the device type to the session initialization. I think we can have a cleaner interface by delaying the device initialization until after we know which device type we have, and have some proposed changes that should bring it about.

@@ -44,7 +44,9 @@ class Session:
Remote configs for RPC tracker.

session_name : str
Hexagon RPC session name.
Hexagon RPC session name. Options are [hexagon-rpc, cpu-rpc].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm somewhat concerned with the change in semantics here. Going from a human-readable name that can be arbitrarily specified to a list of specific options feels like there's got to be a better way.

python/tvm/contrib/hexagon/session.py Outdated Show resolved Hide resolved
python/tvm/contrib/hexagon/session.py Show resolved Hide resolved
@mehrdadh mehrdadh force-pushed the hexagon/aot-llvm-final branch from 8c26cad to 31a7e80 Compare April 21, 2022 16:48
Copy link
Contributor

@Lunderberg Lunderberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making the changes, and it looks much better! I have a couple additional changes to request, after which it would look good to me.

def device(self):
"""Session device."""

if hasattr(self, "_device") and self._device is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend either removing the hasattr(self, "_device") from this condition, or removing self._device = None from the initializer. As it is, there are two different ways to represent an uninitialized session, which could cause confusion in the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

if hasattr(self, "_device") and self._device is not None:
return self._device

if not hasattr(self, "_requires_cpu_device"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to supply a default for _requires_cpu_device, rather than requiring it to be explicitly set? I think general case is that we return a "hexagon" device, and that AOT is the special case that would require a CPU device. Rather than requiring it to be set for both paths, can we set self._requires_cpu_device = False in the initializer, and only set it to True in the AOT path. That way, the default is to return a "hexagon" device.

Also, as a general nitpick, I'd recommend assert(condition) instead of if not condition: assert(False).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think this default is required for some use cases. The load_module() function doesn't call set_device_type, so I expect the TE-based tests to fail. It is also legal to upload a module, and later refer to the uploaded module by its name. Since this can occur on an entirely different instance of TVM, there wouldn't be a way to determine the device type, and we can't avoid needing a default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current CI results have some failures for test_2d_physical_buffers and test_cache_read_write, which look to be caused by this lack of a default. https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-11065/8/tests

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, added the default and changed set_device_type to be an internal function.

@@ -226,6 +244,28 @@ def get_executor_from_factory(self, module: ExecutorFactoryModule):

raise TypeError(f"Unsupported executor type: {type(module)}")

def set_device_type(self, module: Union[str, pathlib.Path, GraphExecutorFactoryModule]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should set_device_type be a private function? If it is only needed from the AOT path, and should only be called from within _get_aot_executor_from_factory, we should rename to _set_device_type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

if not hasattr(self, "_requires_cpu_device"):
assert (
False
), "Device type is not set. 'set_device_type' should be called before accessing device."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message instructs users to call set_device_type, which we may want to be an internal function. The concern would be for future breakage, if code outside this class calls hexagon_session.set_device_type, we wouldn't be able to remove set_device_type without breaking that code. I think there are some plans to remove the kDLHexagon type and instead use options within the kDLCPU, at which point we'd want to remove the device choice without breaking external usage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed set_device_type to be an internal function and removed the assert since it has default value now.

@mehrdadh mehrdadh requested a review from Lunderberg April 21, 2022 20:32
Copy link
Contributor

@Lunderberg Lunderberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making the updates. LGTM!

@Lunderberg
Copy link
Contributor

Looks like the linter wasn't recognizing the hasattr as ensuring that there was a definition for self._device. From this thread, it looks like that's a bug with pylint, fixed at some point between March 2020 and December 2021. The CI is currently running pylint 2.4.4, released in November 2019, so it wouldn't be unreasonable to update the pylint version at some point.

Updating to use self._device = None instead of hasattr LGTM.

@mehrdadh
Copy link
Member Author

@areusch, @driazati is there a plan on updating pylint?

@mehrdadh mehrdadh merged commit effc23d into apache:main Apr 21, 2022
@mehrdadh mehrdadh deleted the hexagon/aot-llvm-final branch April 21, 2022 22:49
shtinsa pushed a commit to Deelvin/tvm that referenced this pull request May 17, 2022
* AOT with LLVM Codegen on Hexagon

* Address comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants