Nvidia Nemotron integration with LangChain using TritonTensorRTLLM #16719
🤖 Based on the error message you provided, it seems like the Triton Inference Server is not able to find the expected output 'text_output' for the model 'Nemotron-rlhf'. The most common cause is a mismatch between the output names declared in the model's config.pbtxt and the output name ('text_output') that the TritonTensorRTLLM class requests.
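One way to confirm this is to query the server for the model's metadata and inspect the output names it actually exposes. A minimal sketch, assuming the Triton gRPC endpoint is reachable at localhost:8001 (the endpoint address here is an assumption):

```python
import tritonclient.grpc as grpcclient

# Connect to the Triton server's gRPC endpoint (assumed address).
client = grpcclient.InferenceServerClient(url="localhost:8001")

# Fetch the metadata Triton holds for the model named in the error message.
metadata = client.get_model_metadata(model_name="Nemotron-rlhf")

# List every output the server actually exposes; if 'text_output' is not
# among them, the config.pbtxt (or ensemble definition) needs updating.
for output in metadata.outputs:
    print(output.name, output.datatype, output.shape)
```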
Here is a sample model configuration for a text generation model:

```
name: "text_gen_model"
platform: "tensorrt_plan"
max_batch_size: 0
input [
  {
    name: "INPUT__0"
    data_type: TYPE_INT32
    dims: [ -1 ]
  }
]
output [
  {
    name: "text_output"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]
dynamic_batching { }
```

In this configuration, the output name is 'text_output', which matches the output name expected by the TritonTensorRTLLM class. If you've checked all these points and you're still encountering the error, please share more details about your model configuration so the problem can be narrowed down.

For more information on how to configure a model for the Triton server, you can refer to the Triton Inference Server documentation. For more information on how the TritonTensorRTLLM class works, you can refer to the LangChain Nvidia Triton LLMs documentation. I hope this helps! If you have any more questions or need further clarification, feel free to ask.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
Description
I'm trying to integrate my Nemotron LLM with LangChain. I use the source code in langchain_nvidia_trt.llms.py to get streaming, but it raises an exception.
Example Code
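The original snippet was not captured here, but a minimal sketch of the invocation described above might look like this (the server URL and prompt are assumptions; the constructor arguments follow the langchain_nvidia_trt.llms source the issue refers to):

```python
from langchain_nvidia_trt.llms import TritonTensorRTLLM

# Point the LLM wrapper at the Triton server hosting the Nemotron model.
llm = TritonTensorRTLLM(
    server_url="localhost:8001",  # assumed Triton gRPC endpoint
    model_name="Nemotron-rlhf",   # model name from the error message
)

# Streaming generation; this call raises the InferenceServerException below.
for chunk in llm.stream("Tell me about NVIDIA Nemotron."):
    print(chunk, end="", flush=True)
```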
Error Message and Stack Trace (if applicable)
The exception is an InferenceServerException:

```
unexpected inference output 'text_output' for model 'Nemotron-rlhf'
```
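For reference, the failing request can be reproduced outside LangChain with the raw Triton client, which helps tell whether the problem lies in the model configuration or in the wrapper. A sketch, assuming the input tensor name and dtype from the sample configuration above (the real model's input may differ):

```python
import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")  # assumed endpoint

# Build a dummy input; the name 'INPUT__0' and INT32 dtype follow the
# sample config.pbtxt above and may differ for the actual model.
tokens = np.array([1, 2, 3], dtype=np.int32)
infer_input = grpcclient.InferInput("INPUT__0", list(tokens.shape), "INT32")
infer_input.set_data_from_numpy(tokens)

# Explicitly request 'text_output'; if the model does not declare an output
# by that name, this call fails with the same InferenceServerException.
requested = grpcclient.InferRequestedOutput("text_output")
result = client.infer(
    model_name="Nemotron-rlhf",
    inputs=[infer_input],
    outputs=[requested],
)
print(result.as_numpy("text_output"))
```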
System Info
System Information
Package Information
Packages not installed (Not Necessarily a Problem)
The following packages were not found: