A record of my coursework for NLPDL at PKU (Fall 2023)
Task2 Corpus:
from datasets import load_dataset
# Simple English Wikipedia snapshot dated 2022-03-01
dataset = load_dataset("wikipedia", "20220301.simple")
# Article texts from the train split
corpus = dataset['train']['text']
Task1 NMT GitHub repo: https://github.com/linhaowei1/NLPDL/tree/main/Assignment_2/nmt
The corpus and model were downloaded from Hugging Face and Google Drive:
allenai/scibert_scivocab_uncased
Because of network issues, I downloaded them in advance, so my code loads everything from local files and needs no internet access.
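For readers who want to reproduce the internet-free setup, here is a minimal sketch of one way to load a pre-downloaded model offline. It is not taken from the repo: the directory `./models/scibert_scivocab_uncased` and the helper names `enable_offline_mode` and `load_local_model` are hypothetical, and the sketch assumes the `transformers` package is installed and the model files have already been saved to that directory.

```python
import os
from pathlib import Path

# Hypothetical location of the pre-downloaded model files (an assumption;
# the notes above do not say where the files are stored).
LOCAL_MODEL_DIR = Path("./models/scibert_scivocab_uncased")


def enable_offline_mode():
    """Ask the Hugging Face libraries not to contact the Hub."""
    os.environ["HF_HUB_OFFLINE"] = "1"
    os.environ["TRANSFORMERS_OFFLINE"] = "1"
    os.environ["HF_DATASETS_OFFLINE"] = "1"


def load_local_model(model_dir=LOCAL_MODEL_DIR):
    """Load a tokenizer and model from a local directory only."""
    # Imported lazily so the offline flags are set before the library loads.
    from transformers import AutoModel, AutoTokenizer

    # local_files_only=True makes loading fail instead of trying to download.
    tokenizer = AutoTokenizer.from_pretrained(model_dir, local_files_only=True)
    model = AutoModel.from_pretrained(model_dir, local_files_only=True)
    return tokenizer, model
```

Calling `enable_offline_mode()` once at the start of a script and then `load_local_model()` should be enough in this setup; if the files are missing, loading raises an error rather than falling back to the network.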
In addition, the loss, accuracy, and other training curves can be viewed on the W&B site: NLPDL-Assignment