add Completions API support #88

Merged
merged 10 commits into from
Aug 28, 2024

Conversation

mattf
Collaborator

@mattf mattf commented Aug 16, 2024

        Create a new NVIDIA LLM for Completions APIs.

        This class provides access to an NVIDIA NIM for completions. By default, it
        connects to a hosted NIM, but can be configured to connect to a local NIM
        using the `base_url` parameter. An API key is required to connect to the
        hosted NIM.

        Args:
            model (str): The model to use for completions.
            nvidia_api_key (str): The API key to use for connecting to the hosted NIM.
            api_key (str): Alternative to nvidia_api_key.
            base_url (str): The base URL of the NIM to connect to.

        API Key:
        - The recommended way to provide the API key is through the `NVIDIA_API_KEY`
            environment variable.

        Additional arguments that can be passed to the Completions API:
        - max_tokens (int): The maximum number of tokens to generate.
        - stop (str or List[str]): The stop sequence to use for generating completions.
        - temperature (float): The temperature to use for generating completions.
        - top_p (float): The top-p value to use for generating completions.
        - frequency_penalty (float): The frequency penalty to apply to the completion.
        - presence_penalty (float): The presence penalty to apply to the completion.
        - seed (int): The seed to use for generating completions.
        - best_of (int): The number of completions to generate, from which the best is returned.
        - echo (bool): Whether to echo the prompt in the completion.
        - logit_bias (Dict[str, float]): The logit bias to apply to the completion.
        - logprobs (int): The number of logprobs to return.
        - n (int): The number of completions to generate.
        - suffix (str): The suffix to use for generating completions.
        - user (str): The user ID to use for generating completions.

        These additional arguments can also be passed with `bind()`, e.g.
        `NVIDIA().bind(max_tokens=512)`, or passed directly to `invoke()` or `stream()`,
        e.g. `NVIDIA().invoke("prompt", max_tokens=512)`.
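
To illustrate how bound and call-time arguments might combine, here is a minimal standalone sketch. It does not use the class from this PR: `build_completion_payload` and the model name are hypothetical. The OpenAI-style request body (`model`, `prompt`, plus optional fields) and the rule that call-time values override bound ones are assumptions, not a description of the actual implementation.

```python
from typing import Any, Dict, Optional

# Parameter names copied from the docstring above.
COMPLETION_PARAMS = {
    "max_tokens", "stop", "temperature", "top_p", "frequency_penalty",
    "presence_penalty", "seed", "best_of", "echo", "logit_bias",
    "logprobs", "n", "suffix", "user",
}


def build_completion_payload(
    model: str,
    prompt: str,
    bound: Optional[Dict[str, Any]] = None,
    **call_kwargs: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: merge bound and call-time arguments into a request body."""
    # Assumption: arguments given at call time override those given via bind().
    merged = {**(bound or {}), **call_kwargs}

    # Reject names that are not listed in the docstring.
    unknown = set(merged) - COMPLETION_PARAMS
    if unknown:
        raise ValueError(f"unsupported arguments: {sorted(unknown)}")

    # Assumption: an OpenAI-style completions body; unset (None) values are omitted.
    payload: Dict[str, Any] = {"model": model, "prompt": prompt}
    payload.update({k: v for k, v in merged.items() if v is not None})
    return payload


if __name__ == "__main__":
    # "example-model" is a placeholder, not a real model name.
    body = build_completion_payload(
        "example-model",
        "prompt",
        bound={"max_tokens": 512, "temperature": 0.2},
        max_tokens=64,
    )
    print(body)
```

In this sketch the `bound` dictionary plays the role of `bind(max_tokens=512)`, and the keyword arguments play the role of `invoke("prompt", max_tokens=64)`.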

@mattf mattf requested review from dglogo and raspawar August 16, 2024 11:05
@mattf mattf self-assigned this Aug 16, 2024
@mattf mattf force-pushed the mattf/add-completions-support branch from d9096cf to fdae5ab Compare August 16, 2024 11:12
Collaborator

lgtm!

@mattf mattf requested a review from dglogo August 19, 2024 16:53
Collaborator

@dglogo dglogo left a comment

Thanks for putting this together!

@mattf mattf force-pushed the mattf/add-completions-support branch from 0b9f521 to ac5d18a Compare August 27, 2024 21:09
@mattf mattf requested a review from raspawar August 28, 2024 10:56
@mattf mattf merged commit 9f9b762 into main Aug 28, 2024
12 checks passed
@mattf mattf deleted the mattf/add-completions-support branch August 28, 2024 16:14

3 participants