QuoQA-NLP/T5_Translation

T5 Machine Translation: English ↔️ Korean

Result

| Direction | BLEU Score | Translation Result |
| --- | --- | --- |
| Korean ➡️ English | 45.148 | KE-T5-Ko2En-Base Inference Result |
| English ➡️ Korean | - | |
  • The evaluation script is in metric.py.
  • The Korean ➡️ English result was evaluated on 553,500 sentence pairs that are disjoint from the training set.

How to Use

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Korean -> English Machine Translation
ko2en_tokenizer = AutoTokenizer.from_pretrained("QuoQA-NLP/KE-T5-Ko2En-Base")
ko2en_model = AutoModelForSeq2SeqLM.from_pretrained("QuoQA-NLP/KE-T5-Ko2En-Base")

# English -> Korean Machine Translation
# (separate variable names so loading this direction does not overwrite the one above)
en2ko_tokenizer = AutoTokenizer.from_pretrained("QuoQA-NLP/KE-T5-En2Ko-Base")
en2ko_model = AutoModelForSeq2SeqLM.from_pretrained("QuoQA-NLP/KE-T5-En2Ko-Base")
  • For batch translation, see inference.py.
    • A P100 (16 GB) supports inference with 250 pairs per batch on device.
    • An A100 (40 GB) supports inference with 600 pairs per batch on device.
  • For single-sentence translation, see inference_single.py.
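
As a rough illustration of the batch sizes above, here is a minimal sketch of batched translation with the standard `transformers` generation API. It is not the code in inference.py. The helper names (`chunks`, `translate`, `load_and_translate`), the `max_length` value, and the default batch size are all assumptions made for this example. It also assumes the model takes raw sentences with no task prefix; check inference_single.py for the exact input format.

```python
from typing import Iterator, List


def chunks(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def translate(texts: List[str], tokenizer, model,
              batch_size: int = 32, max_length: int = 128) -> List[str]:
    """Translate texts batch by batch (hypothetical helper, not inference.py)."""
    import torch  # imported lazily so chunks() is usable without torch

    results: List[str] = []
    for batch in chunks(texts, batch_size):
        # Pad to the longest sentence in the batch; truncate overly long inputs.
        encoded = tokenizer(batch, return_tensors="pt", padding=True,
                            truncation=True, max_length=max_length)
        # Move the input tensors to the device the model lives on.
        encoded = {name: tensor.to(model.device) for name, tensor in encoded.items()}
        with torch.no_grad():
            generated = model.generate(**encoded, max_length=max_length)
        results.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return results


def load_and_translate(model_name: str, texts: List[str],
                       batch_size: int = 32) -> List[str]:
    """Load a checkpoint from the Hub and translate texts with it."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    model.eval()
    return translate(texts, tokenizer, model, batch_size=batch_size)
```

For example, `load_and_translate("QuoQA-NLP/KE-T5-Ko2En-Base", sentences, batch_size=250)` would match the P100 figure quoted above. The batch size that actually fits depends on sentence length and `max_length`, so treat those numbers as a starting point.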

References