-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move preprocessing to base classes #1807
Move preprocessing to base classes #1807
Conversation
Probably still some test breakages to work though, not mailing this out quite yet. |
ce944da
to
9250f79
Compare
I think this will overall be a nice simplification for maintenance. Push whatever logic we can down onto the base preprocessing classes. Saves a lot of code. To assist with this, I am adding a `special_tokens` property to tokenizers, which I think will be useful anyway.
9250f79
to
ea5e96c
Compare
Ok! Passing besides the nightly failure (which is unrelated). Mailing out. |
The nightly breakage is unrelated btw. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Matt! It's really nice to move all these common logic to the base classes!
Just left some nit comments!
Thanks for review! Will pull this in once tests pass. |
I think this will overall be a nice simplification for maintenance. Push whatever logic we can down onto the base preprocessing classes. Saves a lot of code. To assist with this, I am adding a
special_tokens
property to tokenizers, which I think will be useful anyway.