Issues: NVIDIA/Fuser
Issues list
PTX code for async copy of bool type is not correctly generated
#3273
opened Oct 25, 2024 by
liqiangxl
Add dtype parameter to Tensor state in python frontend.
good first issue
Good for newcomers
Python API
Issues related to the Python API
#3269
opened Oct 24, 2024 by
rdspring1
Python test "at-exit" serialization has effect on repeat testing runs
serde
serde = serialization + deserialization
testing
e.g. improving test infra and test coverage
#3265
opened Oct 24, 2024 by
jacobhinkle
Examine the broadcast size for aliased input/output when identifying RW race
#3251
opened Oct 22, 2024 by
Priya2698
[matmul] Self-mapping error for some tile sizes when bias is present
bug
Something isn't working
matmul
#3213
opened Oct 18, 2024 by
jacobhinkle
ElectSync predicate is not working as expected due to index hoisting
H100 Perf
improve performance on H100
matmul
#3199
opened Oct 16, 2024 by
zasdfgbnm
Reshape/transpose operators in BFloat16 causing errors
Multidevice
#3194
opened Oct 16, 2024 by
cowanmeg
Tracking perf optimization of HopperMatmulTest.HSH_NT_128BSwizzle for problem size (M=16384, N=16384, K=1024), CTA tile size (64, 256)
H100 Perf
improve performance on H100
matmul
#3137
opened Oct 8, 2024 by
zasdfgbnm
Add back the UCC backend for Bcast_sharded/PipelineTestTwoStages tests.
Multidevice
#3124
opened Oct 7, 2024 by
wujingyue
CommunicationTest.SendRecv/UCC hangs.
bug
Something isn't working
Multidevice
#3120
opened Oct 7, 2024 by
wujingyue
Check Tensor.dims equals the rank of the corresponding TensorView.
#3117
opened Oct 5, 2024 by
wujingyue
Fused pointwise + layernorm kernel slower than unfused kernels
perf
#3095
opened Oct 3, 2024 by
cowanmeg
[RFC] Multi-Gpu Python Frontend API
Multidevice
Python API
Issues related to the Python API
#3094
opened Oct 3, 2024 by
rdspring1
A Python distributed test doesn't exit properly when one rank fails.
Multidevice
#3092
opened Oct 3, 2024 by
wujingyue