-
Notifications
You must be signed in to change notification settings - Fork 27.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding top_k
argument to text-classification
pipeline.
#17606
Conversation
The documentation is not available anymore as the PR was closed or merged. |
- Deprecate `return_all_scores` as `top_k` is more uniform with other pipelines, and a superset of what `return_all_scores` can do. BC is maintained though. `return_all_scores=True` -> `top_k=None` `return_all_scores=False` -> `top_k=1` - Using `top_k` will imply sorting the results, but using no argument will keep the results unsorted for backward compatibility.
132030b
to
fb69da1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, top_k
is a nice addition! Thanks for working on that, @Narsil!
…ce#17606) * Adding `top_k` and `sort` arguments to `text-classification` pipeline. - Deprecate `return_all_scores` as `top_k` is more uniform with other pipelines, and a superset of what `return_all_scores` can do. BC is maintained though. `return_all_scores=True` -> `top_k=None` `return_all_scores=False` -> `top_k=1` - Using `top_k` will imply sorting the results, but using no argument will keep the results unsorted for backward compatibility. * Remove `sort`. * Fixing the test. * Remove bad doc.
…ce#17606) * Adding `top_k` and `sort` arguments to `text-classification` pipeline. - Deprecate `return_all_scores` as `top_k` is more uniform with other pipelines, and a superset of what `return_all_scores` can do. BC is maintained though. `return_all_scores=True` -> `top_k=None` `return_all_scores=False` -> `top_k=1` - Using `top_k` will imply sorting the results, but using no argument will keep the results unsorted for backward compatibility. * Remove `sort`. * Fixing the test. * Remove bad doc.
…e#17606 - The legacy test actually tested `return_all_scores=False` (the actual default) instead of `return_all_scores=True` (the actual weird case). This commit adds the correct legacy test and fixes it.
…e#17606 - The legacy test actually tested `return_all_scores=False` (the actual default) instead of `return_all_scores=True` (the actual weird case). This commit adds the correct legacy test and fixes it.
…e#17606 Fixing a regression with `return_all_scores` introduced in huggingface#17606 - The legacy test actually tested `return_all_scores=False` (the actual default) instead of `return_all_scores=True` (the actual weird case). This commit adds the correct legacy test and fixes it. Tmp legacy tests. Actually fix the regression (also contains lists) Less diffed code.
…e#17606 Fixing a regression with `return_all_scores` introduced in huggingface#17606 - The legacy test actually tested `return_all_scores=False` (the actual default) instead of `return_all_scores=True` (the actual weird case). This commit adds the correct legacy test and fixes it. Tmp legacy tests. Actually fix the regression (also contains lists) Less diffed code.
…e#17606 Fixing a regression with `return_all_scores` introduced in huggingface#17606 - The legacy test actually tested `return_all_scores=False` (the actual default) instead of `return_all_scores=True` (the actual weird case). This commit adds the correct legacy test and fixes it. Tmp legacy tests. Actually fix the regression (also contains lists) Less diffed code.
…7906) Fixing a regression with `return_all_scores` introduced in #17606 - The legacy test actually tested `return_all_scores=False` (the actual default) instead of `return_all_scores=True` (the actual weird case). This commit adds the correct legacy test and fixes it. Tmp legacy tests. Actually fix the regression (also contains lists) Less diffed code.
…e#17606 (huggingface#17906) Fixing a regression with `return_all_scores` introduced in huggingface#17606 - The legacy test actually tested `return_all_scores=False` (the actual default) instead of `return_all_scores=True` (the actual weird case). This commit adds the correct legacy test and fixes it. Tmp legacy tests. Actually fix the regression (also contains lists) Less diffed code.
…e#17606 (huggingface#17906) Fixing a regression with `return_all_scores` introduced in huggingface#17606 - The legacy test actually tested `return_all_scores=False` (the actual default) instead of `return_all_scores=True` (the actual weird case). This commit adds the correct legacy test and fixes it. Tmp legacy tests. Actually fix the regression (also contains lists) Less diffed code.
This PR should be cleaned up, it changed the behavior of the text classification pipeline and the documentation is not clear. Can we get more explanations about the way we should expect to modify the code to get the same behavior with the top_k arg or continue supporting the return_all_scores with the same output order? Passing |
What does this PR do?
A lot of users are wondering why the API does not return results sorted.
This PR enables
transformers
to do that and enables functionnalityto users without causing any regression.
The API will simply override the default argument to get sorted results.
Deprecate
return_all_scores
astop_k
is more uniform with otherpipelines, and a superset of what
return_all_scores
can do.BC is maintained though.
return_all_scores=True
->top_k=None
return_all_scores=False
->top_k=1
Using
top_k
will imply sorting the results, but using no argumentwill keep the results unsorted for backward compatibility.
Fixes # (issue)
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.