nvidia-trt:add TritonTensorRTLLM(verbose_client=False) #16848

Merged
merged 4 commits into from
Mar 5, 2024

mkhludnev
Contributor

  • Description: adds a verbose flag to TritonTensorRTLLM
  • Issue: none
  • Dependencies: none
  • Twitter handle:

@efriis efriis added the partner label Jan 31, 2024
@efriis efriis self-assigned this Jan 31, 2024
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Jan 31, 2024
@dosubot dosubot bot added Ɑ: models Related to LLMs or chat model modules 🤖:improvement Medium size change to existing code to handle new use-cases labels Jan 31, 2024
@mkhludnev mkhludnev mentioned this pull request Feb 1, 2024
captured = StringIO()
sys.stdout = captured
with pytest.raises(InferenceServerException):
    llm.client.is_server_live()
Contributor Author

This is not perfect: the test still tries to contact this address, which might cause cloud, CI, or security issues. I'm not sure. Open to other ideas.
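
For illustration only, here is a minimal sketch of one way a test like this could avoid the network call altogether. It is not the code from this PR: `FakeTritonClient` and `capture_client_output` are hypothetical names, and the stub only imitates a client that prints when verbose and then fails. It uses `contextlib.redirect_stdout`, which restores `sys.stdout` on exit, instead of assigning to `sys.stdout` directly.

```python
import contextlib
import io


class FakeTritonClient:
    """Hypothetical stand-in for a Triton client; makes no network calls."""

    def __init__(self, url: str, verbose: bool = False) -> None:
        self.url = url
        self.verbose = verbose

    def is_server_live(self) -> bool:
        # Imitate a verbose client by logging the call to stdout.
        if self.verbose:
            print(f"is_server_live, url={self.url}")
        # Simulate an unreachable server without touching the network.
        raise ConnectionError("server unreachable (simulated)")


def capture_client_output(client: FakeTritonClient) -> str:
    """Call the client and return what it printed, ignoring the simulated failure."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        try:
            client.is_server_live()
        except ConnectionError:
            pass
    return buffer.getvalue()


quiet_output = capture_client_output(FakeTritonClient("localhost:8001", verbose=False))
verbose_output = capture_client_output(FakeTritonClient("localhost:8001", verbose=True))
```

With the stub, the quiet client prints nothing and the verbose client prints the call, so a test can assert on the captured text without reaching any real address.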

Contributor Author

Improved in the most recent push.

Collaborator

@baskaryan baskaryan left a comment

LLMs already have a verbose attribute, which is meant for configuring callbacks. Should we give this a different name? Maybe client_verbose?
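
To make the naming concern concrete, here is a rough sketch with two separate flags. It assumes hypothetical classes (`SketchLLM`, `SketchClient`) and is not the real TritonTensorRTLLM implementation; the field name `verbose_client` follows the name the PR title was later changed to.

```python
from dataclasses import dataclass


@dataclass
class SketchClient:
    """Hypothetical transport client with its own logging switch."""

    url: str
    verbose: bool = False


@dataclass
class SketchLLM:
    """Hypothetical LLM wrapper; not the real TritonTensorRTLLM."""

    server_url: str
    verbose: bool = False         # framework-level flag (callbacks)
    verbose_client: bool = False  # forwarded to the transport client only

    def make_client(self) -> SketchClient:
        # Only verbose_client reaches the client; verbose is left untouched.
        return SketchClient(url=self.server_url, verbose=self.verbose_client)


llm = SketchLLM(server_url="localhost:8001", verbose=True, verbose_client=False)
client = llm.make_client()
```

Keeping the two fields apart means turning on callback verbosity does not also turn on client logging, and the reverse.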

@mkhludnev mkhludnev requested a review from baskaryan February 13, 2024 20:35
@mkhludnev mkhludnev changed the title nvidia-trt:add TritonTensorRTLLM(verbose=False) nvidia-trt:add TritonTensorRTLLM(verbose_client=False) Feb 13, 2024
@mkhludnev mkhludnev force-pushed the nvidia-trt-verbose branch from ee7d24d to 9f05293 Compare March 3, 2024 07:42
@mkhludnev
Contributor Author

@baskaryan let me know if I can improve it further.

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Mar 5, 2024
@baskaryan baskaryan merged commit d039dcb into langchain-ai:master Mar 5, 2024
20 checks passed
thebhulawat pushed a commit to thebhulawat/langchain that referenced this pull request Mar 6, 2024
gkorland pushed a commit to FalkorDB/langchain that referenced this pull request Mar 30, 2024