Add truncate_text option to tokenize #126

Merged
jongwook merged 2 commits into openai:main
Jul 19, 2021

rom1504 (Contributor) commented Jul 8, 2021

This makes it possible to run tokenize on texts that produce more tokens than fit in
the context length, without having to guess beforehand how many characters to cut.
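
For illustration, the intended behavior can be sketched roughly as below. This is a minimal, self-contained sketch and not the code from this PR: the helper name `pad_or_truncate`, the zero padding value, and the toy token ids are assumptions made for the example. It follows the later commit note in this thread (rename the option to `truncate`, use the end-of-text token): an over-long sequence is cut to the context length and its last slot is overwritten with the end-of-text token, so the sequence still terminates properly.

```python
from typing import List


def pad_or_truncate(
    token_ids: List[int],
    context_length: int,
    eot_token: int,
    truncate: bool = False,
) -> List[int]:
    """Fit one tokenized text into a fixed-size context window.

    Hypothetical helper for illustration only; names and padding value
    are assumptions, not the library's API.
    """
    if len(token_ids) > context_length:
        if not truncate:
            # Without truncation, an over-long input is an error.
            raise RuntimeError(
                f"Input is too long for context length {context_length}"
            )
        # Slicing copies the list, so the caller's list is not modified.
        token_ids = token_ids[:context_length]
        # Keep the sequence terminated with the end-of-text token.
        token_ids[-1] = eot_token
    # Right-pad short sequences with zeros up to the context length.
    return token_ids + [0] * (context_length - len(token_ids))


# Toy ids: 1 stands in for start-of-text, 2 for end-of-text.
long_ids = [1, 10, 11, 12, 13, 2]
fitted = pad_or_truncate(long_ids, context_length=4, eot_token=2, truncate=True)
```

With truncation enabled the caller no longer needs to estimate a character cutoff up front; the cut happens in token space, where the limit is actually defined.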

rom1504 and others added 2 commits July 8, 2021 17:34
@jongwook jongwook merged commit a2737ac into openai:main Jul 19, 2021
rom1504 added a commit to rom1504/CLIP that referenced this pull request Jul 24, 2021
* Using non-JIT by default; compat fix with 1.8+

* Add truncate option to tokenize (openai#126)

* Add truncate_text option to tokenize

This makes it possible to run tokenize on texts that are longer than the number of tokens
that fit the context length without having to try to guess how to cut in number of 
characters beforehand

* add doc, rename to just "truncate", use eot_token

Co-authored-by: Jong Wook Kim <[email protected]>

* test fix

* Rename VisualTransformer -> VisionTransformer (openai#97)

Fixes openai#94

* add ViT-B/16 and RN50x16 models

* Update README.md

* truncate

Co-authored-by: Jong Wook Kim <[email protected]>
Co-authored-by: Romain Beaumont <[email protected]>
Co-authored-by: Haofan Wang <[email protected]>
Co-authored-by: Sam Sepiol <>