Hi @swj0419 and other authors, thanks for making this code available and easy to run so we can explore contamination in various open-source models.
Given that this repo/approach has gained some adoption in the community in terms of reporting contamination scores on benchmark datasets, I would like to clarify some things about how these scores are calculated:
1. The value computed and returned by the script does not seem to match the min-k%-prob formula given in the paper (my understanding of the formula is sketched after these questions). Is there a reason for this?
2. The README says: "If `the result < 0.1` with a percentage greater than 0.85, it is highly likely that the dataset has been trained." How was this threshold determined?
3. For the min-k%-prob number itself, assuming we were to compute it, is there similar guidance on thresholds above which we can have high confidence that the dataset has been trained on?
I could not find these specific details in the original paper - apologies if I missed them. Thanks in advance!
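
For reference, here is a minimal sketch of the min-k%-prob computation as I understand it from the paper: take the per-token log-likelihoods under the model, keep the k% of tokens with the lowest log-likelihood, and average them. The function name, the `k=0.2` default, and the use of PyTorch are my own illustrative choices, not this repo's actual code, so please correct me if this differs from what the script computes:

```python
import torch
import torch.nn.functional as F

def min_k_percent_prob(logits: torch.Tensor, input_ids: torch.Tensor, k: float = 0.2) -> float:
    """Average log-likelihood of the k% least-likely tokens in a sequence.

    logits: (seq_len, vocab_size) from a causal LM; input_ids: (seq_len,).
    """
    # Next-token log-probabilities: the logits at position i predict token i+1.
    log_probs = F.log_softmax(logits[:-1], dim=-1)
    token_log_probs = log_probs.gather(-1, input_ids[1:].unsqueeze(-1)).squeeze(-1)
    # Select the bottom k% of token log-likelihoods and average them.
    n = max(1, int(k * token_log_probs.numel()))
    bottom_k = torch.topk(token_log_probs, n, largest=False).values
    return bottom_k.mean().item()
```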