Issues: triton-inference-server/tensorrtllm_backend
Streaming Inference Failure
bug
Something isn't working
#626
opened Oct 20, 2024 by
imilli
2 of 4 tasks
The GPU memory usage is too high.
bug
Something isn't working
#625
opened Oct 19, 2024 by
imilli
2 of 4 tasks
Garbage response when input tokens is longer than 4096 on Llama-3.1-8B-Instruct
bug
Something isn't working
#624
opened Oct 18, 2024 by
winstxnhdw
2 of 4 tasks
Failed install in nvcr.io/nvidia/tritonserver:24.08-trtllm-python-py3
bug
Something isn't working
#623
opened Oct 18, 2024 by
wwx007121
4 tasks
fill_template.py and gpu_device_ids
bug
Something isn't working
#616
opened Oct 12, 2024 by
Alireza3242
2 of 4 tasks
Support dynamic path for gpt_model_path and token_dir based on Triton model repo
#615
opened Oct 11, 2024 by
rahchuenmonroe
An error that "Shape does not match true shape of 'data' field" occurs when using tensorrt_llm model alone in inflight_batcher_llm
bug
Something isn't working
#613
opened Oct 10, 2024 by
junstar92
1 of 4 tasks
Is ReDrafter supported by the TensorRT-LLM backend?
bug
Something isn't working
#610
opened Oct 5, 2024 by
vkc1vk
2 of 4 tasks
Bad quality in answers (repetition, non stop...) when using Llama3.1-8B-Instruct and Triton
bug
Something isn't working
#603
opened Sep 25, 2024 by
alvaroalfaro612
2 of 4 tasks
generation logits dtype bug
bug
Something isn't working
#598
opened Sep 11, 2024 by
binhtranmcs
2 of 4 tasks
request is blocked and non output when using tensor parallelism with multi gpus
bug
Something isn't working
#596
opened Sep 9, 2024 by
dwq370
4 tasks
Is no_repeat_ngram_size generation option supported?
bug
Something isn't working
#593
opened Sep 3, 2024 by
ghost
2 of 4 tasks
Error malloc(): unaligned tcache chunk detected Always Occur after tensorrt server handling a certain amount requests
bug
Something isn't working
#587
opened Aug 28, 2024 by
wangpeilin
2 of 4 tasks
Metrics "nv_inference_request_failure" value is always 0 even after getting 5xx at the client side
bug
Something isn't working
#582
opened Aug 22, 2024 by
ajagetia2001
1 of 4 tasks
The Docker container stops when using "python3 scripts/launch_triton_server.py --world_size 1 --model_repo=model_repo/" as the starting command in the Docker Compose YAML file.
bug
Something isn't working
#580
opened Aug 21, 2024 by
Aquasar11
2 of 4 tasks
Unable to launch triton server with TP
bug
Something isn't working
#577
opened Aug 19, 2024 by
dhruvmullick
2 of 4 tasks