feat: edit making tokenizers #50

Merged
merged 1 commit into from
Mar 11, 2022

Conversation

HyeonhoonLee
Member

@HyeonhoonLee HyeonhoonLee commented Mar 11, 2022

  1. Made it possible to create the tokenizer for the deberta large model in a new env.
    I added the code in commented-out form, so uncommenting it creates the tokenizer right away.

  2. Configured the environment so that tokenizer parallelism can be used.
    Running run_train.py currently prints the following message:
    huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks... To disable this warning, you can either: - Avoid using tokenizers before the fork if possible - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
    Following the current top Kaggle code, I added an env setting that enables tokenizer parallelism by default.
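
The PR's diff is not shown in this conversation, so the following is only a minimal sketch of how such an env setting could look, not the code that was merged. It assumes the variable is set at the top of the training script, before `transformers` or `tokenizers` is imported; the helper name `enable_tokenizer_parallelism` is hypothetical.

```python
import os


def enable_tokenizer_parallelism(value: str = "true") -> str:
    """Set TOKENIZERS_PARALLELISM unless it is already defined.

    Hypothetical helper for illustration. setdefault keeps any value that
    was exported in the shell, so the setting can still be overridden
    without touching the code.
    """
    return os.environ.setdefault("TOKENIZERS_PARALLELISM", value)


# Call this before importing transformers/tokenizers so the variable is
# already set when the tokenizer is first used.
active_value = enable_tokenizer_parallelism()
print(f"TOKENIZERS_PARALLELISM={active_value}")
```

Setting the variable explicitly to either value is what the message itself suggests for silencing the warning. Since the message mentions deadlocks after a fork, passing `"false"` is the more conservative choice if the training run hangs with worker processes.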

@HyeonhoonLee HyeonhoonLee self-assigned this Mar 11, 2022
Contributor

@Kingthegarden Kingthegarden left a comment

You've narrowed the gap with the Kaggle environment, haha. Thank you.

Contributor

@ympaik87 ympaik87 left a comment

Thanks for adding this!

@Kingthegarden Kingthegarden merged commit 32ff2c0 into develop Mar 11, 2022
@HyeonhoonLee HyeonhoonLee deleted the develop-token branch March 13, 2022 14:18
@jerife jerife added the enhancement New feature or request label Mar 13, 2022
@ympaik87 ympaik87 mentioned this pull request Mar 17, 2022
Labels
enhancement New feature or request
4 participants