cuda executables: make optional #2546

Merged 2 commits on Oct 27, 2023
26 changes: 21 additions & 5 deletions python/setup.py
@@ -124,7 +124,9 @@ def get_thirdparty_packages(triton_cache_path):
 # ---- package data ---


-def download_and_copy(src_path, version, url_func):
+def download_and_copy(src_path, variable, version, url_func):
+    if variable in os.environ:
+        return
     base_dir = os.path.dirname(__file__)
     arch = platform.machine()
     if arch == "x86_64":
@@ -150,7 +152,6 @@ def download_and_copy(src_path, version, url_func):
             src_path = os.path.join(temp_dir, src_path)
             os.makedirs(os.path.split(dst_path)[0], exist_ok=True)
             shutil.copy(src_path, dst_path)
-    return dst_suffix
Contributor Author

The return value wasn't being used, so I dropped it. Is that OK?
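
For reference, the early-return guard added at the top of download_and_copy can be isolated as in the sketch below. The helper name `should_download` and its `environ` parameter are hypothetical, introduced only so the check can be exercised without touching the real environment; they are not part of setup.py.

```python
import os


def should_download(variable, environ=None):
    """Return False when `variable` is set, mirroring `if variable in os.environ: return`.

    `should_download` and `environ` are illustrative names, not part of setup.py.
    """
    env = os.environ if environ is None else environ
    # Membership only checks that the key exists; the value is not inspected.
    return variable not in env


if __name__ == "__main__":
    print(should_download("TRITON_PTXAS_PATH", {}))
    print(should_download("TRITON_PTXAS_PATH", {"TRITON_PTXAS_PATH": "/usr/local/cuda/bin/ptxas"}))
```

One consequence of testing membership rather than the value: a variable that is set to an empty string still suppresses the download.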


# ---- cmake extension ----

@@ -298,9 +299,24 @@ def build_extension(self, ext):
         subprocess.check_call(["cmake", "--build", ".", "--target", "mlir-doc"], cwd=cmake_dir)


-download_and_copy(src_path='bin/ptxas', version='12.1.105', url_func=lambda arch, version: f"https://conda.anaconda.org/nvidia/label/cuda-12.1.1/linux-{arch}/cuda-nvcc-{version}-0.tar.bz2")
-download_and_copy(src_path='bin/cuobjdump', version='12.1.111', url_func=lambda arch, version: f"https://conda.anaconda.org/nvidia/label/cuda-12.1.1/linux-{arch}/cuda-cuobjdump-{version}-0.tar.bz2")
-download_and_copy(src_path='bin/nvdisasm', version='12.1.105', url_func=lambda arch, version: f"https://conda.anaconda.org/nvidia/label/cuda-12.1.1/linux-{arch}/cuda-nvdisasm-{version}-0.tar.bz2")
+download_and_copy(
+    src_path="bin/ptxas",
+    variable="TRITON_PTXAS_PATH",
+    version="12.1.105",
+    url_func=lambda arch, version: f"https://conda.anaconda.org/nvidia/label/cuda-12.1.1/linux-{arch}/cuda-nvcc-{version}-0.tar.bz2",
+)
+download_and_copy(
+    src_path="bin/cuobjdump",
+    variable="TRITON_CUOBJDUMP_PATH",
+    version="12.1.111",
+    url_func=lambda arch, version: f"https://conda.anaconda.org/nvidia/label/cuda-12.1.1/linux-{arch}/cuda-cuobjdump-{version}-0.tar.bz2",
+)
+download_and_copy(
+    src_path="bin/nvdisasm",
+    variable="TRITON_NVDISASM_PATH",
+    version="12.1.105",
+    url_func=lambda arch, version: f"https://conda.anaconda.org/nvidia/label/cuda-12.1.1/linux-{arch}/cuda-nvdisasm-{version}-0.tar.bz2",
+)

 setup(
     name="triton",
10 changes: 7 additions & 3 deletions python/triton/common/backend.py
@@ -108,7 +108,7 @@ def get_backend(device_type: str):
 def _path_to_binary(binary: str):
     base_dir = os.path.join(os.path.dirname(__file__), os.pardir)
     paths = [
-        os.environ.get("TRITON_PTXAS_PATH", ""),
+        os.environ.get(f"TRITON_{binary.upper()}_PATH", ""),
         os.path.join(base_dir, "third_party", "cuda", "bin", binary)
     ]
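
To illustrate the effect of this hunk, here is a minimal sketch of how the candidate list is built once the variable name is derived from the binary name. The function `candidate_paths` and its `base_dir` and `environ` parameters are hypothetical stand-ins for the body of _path_to_binary; how the real function validates or picks among the candidates is not shown in this diff and is not reproduced here.

```python
import os


def candidate_paths(binary, base_dir, environ=None):
    """Build the search list: per-binary override first, bundled copy second.

    `candidate_paths`, `base_dir`, and `environ` are illustrative names only.
    """
    env = os.environ if environ is None else environ
    return [
        # e.g. "cuobjdump" -> TRITON_CUOBJDUMP_PATH; empty string when unset
        env.get(f"TRITON_{binary.upper()}_PATH", ""),
        os.path.join(base_dir, "third_party", "cuda", "bin", binary),
    ]


if __name__ == "__main__":
    fake_env = {"TRITON_CUOBJDUMP_PATH": "/opt/cuda/bin/cuobjdump"}
    print(candidate_paths("cuobjdump", "/opt/triton", fake_env))
    print(candidate_paths("ptxas", "/opt/triton", fake_env))
```

Before this change, every binary consulted TRITON_PTXAS_PATH first, so an override for ptxas would also have been offered as a candidate for cuobjdump and nvdisasm.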

Expand Down Expand Up @@ -174,6 +174,10 @@ def get_cuda_version_key():
global _cached_cuda_version_key
if _cached_cuda_version_key is None:
key = compute_core_version_key()
ptxas = path_to_ptxas()[0]
_cached_cuda_version_key = key + '-' + hashlib.sha1(subprocess.check_output([ptxas, "--version"])).hexdigest()
try:
ptxas = path_to_ptxas()[0]
ptxas_version = subprocess.check_output([ptxas, "--version"])
except RuntimeError:
ptxas_version = b"NO_PTXAS"
_cached_cuda_version_key = key + '-' + hashlib.sha1(ptxas_version).hexdigest()
Comment on lines +180 to +182
Contributor Author

I was under the impression that get_cuda_version_key() is called even for CPU-only users, so I modified the function itself rather than removing references to CUDA from the CPU-only branches.

This try-except block has an adverse side effect: users who actually want to use ptxas won't see this RuntimeError and will instead hit a failure later. I think they'll see the same error on a call to llir_to_ptx or ptx_to_cubin, so it's probably fine.
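
A standalone sketch of the fallback described above, so the behavior can be checked in isolation. The function `version_key` and its `find_ptxas` callback are hypothetical; in the PR the lookup is path_to_ptxas()[0] and the core key comes from compute_core_version_key(). The sketch assumes, as the diff implies, that a missing ptxas surfaces as a RuntimeError.

```python
import hashlib
import subprocess


def version_key(core_key, find_ptxas):
    """Combine the core key with a hash of `ptxas --version`, or of a sentinel.

    `version_key` and `find_ptxas` are illustrative names, not Triton APIs.
    """
    try:
        ptxas = find_ptxas()  # assumed to raise RuntimeError when ptxas is absent
        ptxas_version = subprocess.check_output([ptxas, "--version"])
    except RuntimeError:
        # Fixed sentinel keeps the key stable for users without CUDA tools.
        ptxas_version = b"NO_PTXAS"
    return core_key + "-" + hashlib.sha1(ptxas_version).hexdigest()


if __name__ == "__main__":
    def no_ptxas():
        raise RuntimeError("Cannot find ptxas")

    print(version_key("core", no_ptxas))
```

Note that only RuntimeError is caught. If the lookup returns a path that cannot be executed, subprocess.check_output raises FileNotFoundError or CalledProcessError, neither of which is a RuntimeError, so those failures would still propagate.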

Contributor Author

I see now that v2.0.0 did effectively the same thing: it caught the Exception and reset ptxas_version to a fixed (empty) string literal. Was there a reason to stop doing that?

     return _cached_cuda_version_key