Multi-label dataset classification using RoBERTa with an added LSTM layer

shareourenthusiasm/lstm_roberta_dacon

BI_LSTM with RoBERTa Embedding

Dacon competitions

How to Use

  • Run train.py

Requirements

  • transformers == 4.25.1
  • pandas == 1.3.5
  • numpy == 1.21.6
  • torch == 1.13.1+cu116
  • scikit-learn == 1.0.2
  • tqdm == 4.64.1
  • pyperclip == 1.8.2
  • selenium == 4.7.2

Metric

  • weighted F1 score
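
As a rough illustration of the metric, weighted F1 is the per-class F1 averaged with weights proportional to how often each class appears in the true labels. The sketch below is framework-free and is not taken from this repository; the function name `weighted_f1` and the toy labels are made up for the example. scikit-learn's `f1_score(y_true, y_pred, average="weighted")` computes the same quantity.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's share of y_true."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += (n_true / total) * f1
    return score


# Toy labels (hypothetical, not competition data)
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]
score = weighted_f1(y_true, y_pred)
print(round(score, 4))  # 0.8333
```

Because the weights follow class frequency, frequent classes dominate this score, which is one reason class imbalance matters for the task.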

Score

  • Public score: 74.78 (rank 35/565), with a koElectra ensemble, seed = 777 (this is not the public 6th-place solution)
  • Private score: not checked

Future works

  • The loss does not converge with klue/RoBERTa-large (probably because of the hyperparameters; with better tuning it could reach a higher score)

Workers

  • Seed ensemble
  • CV ensemble
  • Exploratory Data Analysis
  • Code refactoring
  • Project management
  • Focal Loss function debug
  • RoBERTaForSequenceClassification debug
  • Back-translation function for data augmentation
  • Focal Loss function (to address data imbalance)
  • Added a BI_LSTM layer to RoBERTaForSequenceClassification
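
The last two items can be sketched roughly as follows. This is a minimal PyTorch illustration under stated assumptions, not the code in train.py: the names `BiLSTMHead` and `focal_loss`, the sizes (768-dimensional embeddings, 128 LSTM units, 4 classes), and `gamma = 2.0` are all hypothetical, and random tensors stand in for the RoBERTa token embeddings so the snippet runs without downloading a model. It also treats the task as a single multi-class head, which may differ from the repository's setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BiLSTMHead(nn.Module):
    """Hypothetical classifier head: BiLSTM over token embeddings, then a linear layer."""

    def __init__(self, hidden_size: int, lstm_size: int, num_labels: int):
        super().__init__()
        self.lstm = nn.LSTM(hidden_size, lstm_size, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * lstm_size, num_labels)

    def forward(self, token_embeddings: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, seq_len, hidden_size), e.g. an encoder's last hidden state
        _, (h_n, _) = self.lstm(token_embeddings)
        # h_n: (2, batch, lstm_size); index 0 is the final forward state, index 1 the final backward state
        pooled = torch.cat([h_n[0], h_n[1]], dim=-1)
        return self.classifier(pooled)  # (batch, num_labels) logits


def focal_loss(logits: torch.Tensor, targets: torch.Tensor, gamma: float = 2.0) -> torch.Tensor:
    """Multi-class focal loss: mean of (1 - p_t)^gamma * (-log p_t)."""
    ce = F.cross_entropy(logits, targets, reduction="none")  # -log p_t per sample
    p_t = torch.exp(-ce)  # predicted probability of the true class
    return ((1.0 - p_t) ** gamma * ce).mean()


torch.manual_seed(0)
head = BiLSTMHead(hidden_size=768, lstm_size=128, num_labels=4)  # assumed sizes
fake_embeddings = torch.randn(2, 16, 768)  # stand-in for encoder output
targets = torch.tensor([0, 3])
logits = head(fake_embeddings)
loss = focal_loss(logits, targets)
```

The factor `(1 - p_t) ** gamma` shrinks the loss of examples the model already classifies confidently, so rare or hard classes contribute relatively more; with `gamma = 0` the function reduces to plain cross-entropy. Note that this sketch does not mask padding tokens, so with padded batches the final forward state may include padding positions.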

Citation
