feat(api): add token logprobs to chat completions (#980)
stainless-bot authored Dec 17, 2023
1 parent 215476a commit f50e962
Showing 14 changed files with 255 additions and 61 deletions.
1 change: 1 addition & 0 deletions api.md
@@ -38,6 +38,7 @@ from openai.types.chat import (
ChatCompletionNamedToolChoice,
ChatCompletionRole,
ChatCompletionSystemMessageParam,
ChatCompletionTokenLogprob,
ChatCompletionTool,
ChatCompletionToolChoiceOption,
ChatCompletionToolMessageParam,
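The newly exported name can be imported for type annotations in downstream code. A minimal sketch; the helper function below is hypothetical and not part of this commit:

from typing import List

from openai.types.chat import ChatCompletionTokenLogprob


def mean_logprob(tokens: List[ChatCompletionTokenLogprob]) -> float:
    # Average log probability over the content tokens of one choice.
    if not tokens:
        return float("-inf")
    return sum(t.logprob for t in tokens) / len(tokens)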
122 changes: 104 additions & 18 deletions src/openai/resources/chat/completions.py

Large diffs are not rendered by default.
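The rendered diff is omitted above, but this is the file where chat `create()` gains the new `logprobs` and `top_logprobs` keyword arguments. A minimal usage sketch, assuming an API key in the environment; the model name and prompt are placeholders:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

completion = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    messages=[{"role": "user", "content": "Say hello"}],
    logprobs=True,   # return a log probability for every generated token
    top_logprobs=2,  # also return the 2 most likely alternatives at each position
)

choice = completion.choices[0]
if choice.logprobs is not None and choice.logprobs.content is not None:
    for tok in choice.logprobs.content:
        print(tok.token, tok.logprob)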

66 changes: 36 additions & 30 deletions src/openai/resources/completions.py
@@ -119,14 +119,15 @@ def create(
As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
from being generated.
logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
chosen tokens. For example, if `logprobs` is 5, the API will return a list of
the 5 most likely tokens. The API will always return the `logprob` of the
sampled token, so there may be up to `logprobs+1` elements in the response.
logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
well the chosen tokens. For example, if `logprobs` is 5, the API will return a
list of the 5 most likely tokens. The API will always return the `logprob` of
the sampled token, so there may be up to `logprobs+1` elements in the response.
The maximum value for `logprobs` is 5.
max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
completion.
The token count of your prompt plus `max_tokens` cannot exceed the model's
context length.
@@ -288,14 +289,15 @@ def create(
As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
from being generated.
logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
chosen tokens. For example, if `logprobs` is 5, the API will return a list of
the 5 most likely tokens. The API will always return the `logprob` of the
sampled token, so there may be up to `logprobs+1` elements in the response.
logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
well the chosen tokens. For example, if `logprobs` is 5, the API will return a
list of the 5 most likely tokens. The API will always return the `logprob` of
the sampled token, so there may be up to `logprobs+1` elements in the response.
The maximum value for `logprobs` is 5.
max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
completion.
The token count of your prompt plus `max_tokens` cannot exceed the model's
context length.
@@ -450,14 +452,15 @@ def create(
As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
from being generated.
logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
chosen tokens. For example, if `logprobs` is 5, the API will return a list of
the 5 most likely tokens. The API will always return the `logprob` of the
sampled token, so there may be up to `logprobs+1` elements in the response.
logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
well the chosen tokens. For example, if `logprobs` is 5, the API will return a
list of the 5 most likely tokens. The API will always return the `logprob` of
the sampled token, so there may be up to `logprobs+1` elements in the response.
The maximum value for `logprobs` is 5.
max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
completion.
The token count of your prompt plus `max_tokens` cannot exceed the model's
context length.
@@ -687,14 +690,15 @@ async def create(
As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
from being generated.
logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
chosen tokens. For example, if `logprobs` is 5, the API will return a list of
the 5 most likely tokens. The API will always return the `logprob` of the
sampled token, so there may be up to `logprobs+1` elements in the response.
logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
well the chosen tokens. For example, if `logprobs` is 5, the API will return a
list of the 5 most likely tokens. The API will always return the `logprob` of
the sampled token, so there may be up to `logprobs+1` elements in the response.
The maximum value for `logprobs` is 5.
max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
completion.
The token count of your prompt plus `max_tokens` cannot exceed the model's
context length.
@@ -856,14 +860,15 @@ async def create(
As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
from being generated.
logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
chosen tokens. For example, if `logprobs` is 5, the API will return a list of
the 5 most likely tokens. The API will always return the `logprob` of the
sampled token, so there may be up to `logprobs+1` elements in the response.
logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
well the chosen tokens. For example, if `logprobs` is 5, the API will return a
list of the 5 most likely tokens. The API will always return the `logprob` of
the sampled token, so there may be up to `logprobs+1` elements in the response.
The maximum value for `logprobs` is 5.
max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
completion.
The token count of your prompt plus `max_tokens` cannot exceed the model's
context length.
@@ -1018,14 +1023,15 @@ async def create(
As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
from being generated.
logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
chosen tokens. For example, if `logprobs` is 5, the API will return a list of
the 5 most likely tokens. The API will always return the `logprob` of the
sampled token, so there may be up to `logprobs+1` elements in the response.
logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
well the chosen tokens. For example, if `logprobs` is 5, the API will return a
list of the 5 most likely tokens. The API will always return the `logprob` of
the sampled token, so there may be up to `logprobs+1` elements in the response.
The maximum value for `logprobs` is 5.
max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
completion.
The token count of your prompt plus `max_tokens` cannot exceed the model's
context length.
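By contrast, the legacy completions endpoint documented above keeps its integer-valued `logprobs` parameter, capped at 5. A hedged sketch; the instruct model name and prompt are assumptions:

from openai import OpenAI

client = OpenAI()

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # assumed completions-capable model
    prompt="The capital of France is",
    max_tokens=5,
    logprobs=3,  # integer: return the 3 most likely tokens at each position
)
print(response.choices[0].text)
print(response.choices[0].logprobs)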
6 changes: 4 additions & 2 deletions src/openai/resources/files.py
@@ -51,7 +51,8 @@ def create(
The size of all the
files uploaded by one organization can be up to 100 GB.
The size of individual files can be a maximum of 512 MB. See the
The size of individual files can be a maximum of 512 MB or 2 million tokens for
Assistants. See the
[Assistants Tools guide](https://platform.openai.com/docs/assistants/tools) to
learn more about the types of files supported. The Fine-tuning API only supports
`.jsonl` files.
@@ -314,7 +315,8 @@ async def create(
The size of all the
files uploaded by one organization can be up to 100 GB.
The size of individual files can be a maximum of 512 MB. See the
The size of individual files can be a maximum of 512 MB or 2 million tokens for
Assistants. See the
[Assistants Tools guide](https://platform.openai.com/docs/assistants/tools) to
learn more about the types of files supported. The Fine-tuning API only supports
`.jsonl` files.
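The docstring change above only raises the documented per-file limit for Assistants files; the upload call itself is unchanged. A small sketch with a placeholder filename:

from openai import OpenAI

client = OpenAI()

# Upload a JSONL file for fine-tuning; the filename is a placeholder.
uploaded = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune",
)
print(uploaded.id)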
@@ -16,4 +16,4 @@ class MessageCreationStepDetails(BaseModel):
message_creation: MessageCreation

type: Literal["message_creation"]
"""Always `message_creation``."""
"""Always `message_creation`."""
2 changes: 1 addition & 1 deletion src/openai/types/beta/threads/runs/run_step.py
@@ -66,7 +66,7 @@ class RunStep(BaseModel):
"""

object: Literal["thread.run.step"]
"""The object type, which is always `thread.run.step``."""
"""The object type, which is always `thread.run.step`."""

run_id: str
"""
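The two docstring fixes above touch run-step models from the beta Assistants API. A hedged sketch of listing run steps and branching on the step type; the thread and run IDs are placeholders:

from openai import OpenAI

client = OpenAI()

steps = client.beta.threads.runs.steps.list(
    run_id="run_abc123",
    thread_id="thread_abc123",
)
for step in steps:
    if step.step_details.type == "message_creation":
        print(step.id, step.step_details.message_creation.message_id)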
3 changes: 3 additions & 0 deletions src/openai/types/chat/__init__.py
@@ -13,6 +13,9 @@
from .chat_completion_message_param import (
ChatCompletionMessageParam as ChatCompletionMessageParam,
)
from .chat_completion_token_logprob import (
ChatCompletionTokenLogprob as ChatCompletionTokenLogprob,
)
from .chat_completion_message_tool_call import (
ChatCompletionMessageToolCall as ChatCompletionMessageToolCall,
)
11 changes: 10 additions & 1 deletion src/openai/types/chat/chat_completion.py
@@ -6,8 +6,14 @@
from ..._models import BaseModel
from ..completion_usage import CompletionUsage
from .chat_completion_message import ChatCompletionMessage
from .chat_completion_token_logprob import ChatCompletionTokenLogprob

__all__ = ["ChatCompletion", "Choice"]
__all__ = ["ChatCompletion", "Choice", "ChoiceLogprobs"]


class ChoiceLogprobs(BaseModel):
content: Optional[List[ChatCompletionTokenLogprob]]
"""A list of message content tokens with log probability information."""


class Choice(BaseModel):
@@ -24,6 +30,9 @@ class Choice(BaseModel):
index: int
"""The index of the choice in the list of choices."""

logprobs: Optional[ChoiceLogprobs]
"""Log probability information for the choice."""

message: ChatCompletionMessage
"""A chat completion message generated by the model."""

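With `logprobs` now attached to every non-streaming choice, the per-token values can be aggregated, for example into a rough perplexity estimate. A hypothetical helper, only meaningful when the request was made with `logprobs=True`:

import math

from openai.types.chat import ChatCompletion


def sequence_perplexity(completion: ChatCompletion) -> float:
    # Requires the request to have been made with logprobs=True.
    logprobs = completion.choices[0].logprobs
    tokens = logprobs.content if logprobs and logprobs.content else []
    if not tokens:
        return float("nan")
    avg_logprob = sum(t.logprob for t in tokens) / len(tokens)
    return math.exp(-avg_logprob)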
10 changes: 10 additions & 0 deletions src/openai/types/chat/chat_completion_chunk.py
@@ -4,6 +4,7 @@
from typing_extensions import Literal

from ..._models import BaseModel
from .chat_completion_token_logprob import ChatCompletionTokenLogprob

__all__ = [
"ChatCompletionChunk",
@@ -12,6 +13,7 @@
"ChoiceDeltaFunctionCall",
"ChoiceDeltaToolCall",
"ChoiceDeltaToolCallFunction",
"ChoiceLogprobs",
]


@@ -70,6 +72,11 @@ class ChoiceDelta(BaseModel):
tool_calls: Optional[List[ChoiceDeltaToolCall]] = None


class ChoiceLogprobs(BaseModel):
content: Optional[List[ChatCompletionTokenLogprob]]
"""A list of message content tokens with log probability information."""


class Choice(BaseModel):
delta: ChoiceDelta
"""A chat completion delta generated by streamed model responses."""
@@ -87,6 +94,9 @@ class Choice(BaseModel):
index: int
"""The index of the choice in the list of choices."""

logprobs: Optional[ChoiceLogprobs] = None
"""Log probability information for the choice."""


class ChatCompletionChunk(BaseModel):
id: str
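Streamed chunks now carry the same optional structure on each choice, so log probabilities can be read incrementally. A hedged sketch; model and prompt are placeholders:

from openai import OpenAI

client = OpenAI()

stream = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    messages=[{"role": "user", "content": "Count to three"}],
    logprobs=True,
    stream=True,
)
for chunk in stream:
    if not chunk.choices:
        continue
    logprobs = chunk.choices[0].logprobs
    if logprobs and logprobs.content:
        for tok in logprobs.content:
            print(tok.token, tok.logprob)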
@@ -2,13 +2,14 @@

from __future__ import annotations

from typing import Optional
from typing_extensions import Literal, Required, TypedDict

__all__ = ["ChatCompletionFunctionMessageParam"]


class ChatCompletionFunctionMessageParam(TypedDict, total=False):
content: Required[str]
content: Required[Optional[str]]
"""The contents of the function message."""

name: Required[str]
47 changes: 47 additions & 0 deletions src/openai/types/chat/chat_completion_token_logprob.py
@@ -0,0 +1,47 @@
# File generated from our OpenAPI spec by Stainless.

from typing import List, Optional

from ..._models import BaseModel

__all__ = ["ChatCompletionTokenLogprob", "TopLogprob"]


class TopLogprob(BaseModel):
token: str
"""The token."""

bytes: Optional[List[int]]
"""A list of integers representing the UTF-8 bytes representation of the token.
Useful in instances where characters are represented by multiple tokens and
their byte representations must be combined to generate the correct text
representation. Can be `null` if there is no bytes representation for the token.
"""

logprob: float
"""The log probability of this token."""


class ChatCompletionTokenLogprob(BaseModel):
token: str
"""The token."""

bytes: Optional[List[int]]
"""A list of integers representing the UTF-8 bytes representation of the token.
Useful in instances where characters are represented by multiple tokens and
their byte representations must be combined to generate the correct text
representation. Can be `null` if there is no bytes representation for the token.
"""

logprob: float
"""The log probability of this token."""

top_logprobs: List[TopLogprob]
"""List of the most likely tokens and their log probability, at this token
position.
In rare cases, there may be fewer than the number of requested `top_logprobs`
returned.
"""
23 changes: 21 additions & 2 deletions src/openai/types/chat/completion_create_params.py
@@ -78,7 +78,7 @@ class CompletionCreateParamsBase(TypedDict, total=False):
particular function via `{"name": "my_function"}` forces the model to call that
function.
`none` is the default when no functions are present. `auto`` is the default if
`none` is the default when no functions are present. `auto` is the default if
functions are present.
"""

@@ -99,8 +99,18 @@ class CompletionCreateParamsBase(TypedDict, total=False):
or exclusive selection of the relevant token.
"""

logprobs: Optional[bool]
"""Whether to return log probabilities of the output tokens or not.
If true, returns the log probabilities of each output token returned in the
`content` of `message`. This option is currently not available on the
`gpt-4-vision-preview` model.
"""

max_tokens: Optional[int]
"""The maximum number of [tokens](/tokenizer) to generate in the chat completion.
"""
The maximum number of [tokens](/tokenizer) that can be generated in the chat
completion.
The total length of input tokens and generated tokens is limited by the model's
context length.
@@ -127,6 +137,8 @@ class CompletionCreateParamsBase(TypedDict, total=False):
response_format: ResponseFormat
"""An object specifying the format that the model must output.
Compatible with `gpt-4-1106-preview` and `gpt-3.5-turbo-1106`.
Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
message the model generates is valid JSON.
@@ -180,6 +192,13 @@ class CompletionCreateParamsBase(TypedDict, total=False):
functions the model may generate JSON inputs for.
"""

top_logprobs: Optional[int]
"""
An integer between 0 and 5 specifying the number of most likely tokens to return
at each token position, each with an associated log probability. `logprobs` must
be set to `true` if this parameter is used.
"""

top_p: Optional[float]
"""
An alternative to sampling with temperature, called nucleus sampling, where the
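Each entry returned under `top_logprobs` carries a raw log probability, so `math.exp` converts the alternatives at a position into percentages. A hedged sketch building on the request shape shown earlier; model and prompt are placeholders:

import math

from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    messages=[{"role": "user", "content": "Answer with one word: is water wet?"}],
    max_tokens=1,
    logprobs=True,
    top_logprobs=5,  # only valid together with logprobs=True
)

# logprobs.content is present because logprobs=True was requested.
first = completion.choices[0].logprobs.content[0]
for alt in first.top_logprobs:
    print(f"{alt.token!r}: {math.exp(alt.logprob):.1%}")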
12 changes: 7 additions & 5 deletions src/openai/types/completion_create_params.py
@@ -88,16 +88,18 @@ class CompletionCreateParamsBase(TypedDict, total=False):

logprobs: Optional[int]
"""
Include the log probabilities on the `logprobs` most likely tokens, as well the
chosen tokens. For example, if `logprobs` is 5, the API will return a list of
the 5 most likely tokens. The API will always return the `logprob` of the
sampled token, so there may be up to `logprobs+1` elements in the response.
Include the log probabilities on the `logprobs` most likely output tokens, as
well the chosen tokens. For example, if `logprobs` is 5, the API will return a
list of the 5 most likely tokens. The API will always return the `logprob` of
the sampled token, so there may be up to `logprobs+1` elements in the response.
The maximum value for `logprobs` is 5.
"""

max_tokens: Optional[int]
"""The maximum number of [tokens](/tokenizer) to generate in the completion.
"""
The maximum number of [tokens](/tokenizer) that can be generated in the
completion.
The token count of your prompt plus `max_tokens` cannot exceed the model's
context length.