Add top_p_size step fn, StepFunctionArgs class #206
Merged
Description
This PR adds the following capabilities to the Inseq library:

- A new `top_p_size_fn` (identifier: `"top_p_size"`) step function returning the number of tokens needed to reach probability `p` for a specific generation step (e.g. 5 with `p=0.95` means that the top 5 tokens in the probability distribution over the vocabulary are needed to reach a cumulative probability of 95%).
- The `kl_divergence` step function now supports a new parameter `top_p: float` defined in [0, 1] (default 1, full distribution) to preserve only tokens in the top p of either the original or the contrastive output distributions before computing the KL divergence between the two.
- 🔥 Breaking change: This PR introduces a new `StepFunctionArgs` class to better structure the inputs to step function methods. This doesn't change anything in the usage of pre-registered functions, but functions that were previously registered with explicit default params (`attribution_model`, `forward_output`, `encoder_input_ids`, etc.) will now break and should be converted to use `StepFunctionArgs`. The step function registration tutorial will be updated accordingly.
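
To make the semantics of the two step functions described above concrete, here is a minimal pure-Python sketch. It is not the Inseq implementation: the names `top_p_indices`, `top_p_size` and `kl_divergence_top_p` are hypothetical. It also assumes that the kept tokens are renormalized before the divergence is computed, and that the divergence is taken from the original to the contrastive distribution. This description does not state either of those details.

```python
import math
from typing import Sequence, Set


def top_p_indices(probs: Sequence[float], top_p: float) -> Set[int]:
    """Indices of the smallest set of most probable tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept: Set[int] = set()
    cumulative = 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        # Small tolerance so float rounding does not pull in an extra token.
        if cumulative >= top_p - 1e-12:
            break
    return kept


def top_p_size(probs: Sequence[float], top_p: float) -> int:
    """Number of tokens needed to reach cumulative probability top_p."""
    return len(top_p_indices(probs, top_p))


def kl_divergence_top_p(
    original: Sequence[float], contrast: Sequence[float], top_p: float = 1.0
) -> float:
    """KL(original || contrast) restricted to tokens in the top p of either distribution.

    Assumption (not stated in the PR): the kept tokens are renormalized
    so that each restricted distribution sums to 1.
    """
    keep = top_p_indices(original, top_p) | top_p_indices(contrast, top_p)
    p_mass = sum(original[i] for i in keep)
    q_mass = sum(contrast[i] for i in keep)
    kl = 0.0
    for i in keep:
        p = original[i] / p_mass
        q = contrast[i] / q_mass
        if p == 0.0:
            continue  # a zero-probability term contributes nothing
        if q == 0.0:
            return math.inf  # original has mass where contrast has none
        kl += p * math.log(p / q)
    return kl
```

With `top_p=1.0` every token with non-zero mass is kept, which matches the "full distribution" default in the description. Taking the union of the two top-p sets follows the "either the original or the contrastive" wording.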