
Add Bert mesh experiments #160

Merged: 32 commits into main, Feb 25, 2022

Conversation

nsorros (Contributor) commented Nov 24, 2021

Description

Fixes #127

Checklist

  • Linked to Notion or GitHub issue
  • Added tests
  • Updated README
  • DVC up to date

@nsorros nsorros changed the title Bert mesh Add Bert mesh experiments Nov 24, 2021

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
X_vec = tokenizer(X, truncation=True, padding=True)["input_ids"]
X_vec = np.array(X_vec)
Contributor:

Why transform it to an np.array here? You can make the same .shape calls and also save it as a tensor.

nsorros (Author):

I guess so that we can use it for both TensorFlow and PyTorch training.
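
For illustration, a minimal sketch of this point, with hypothetical example texts: ids stored as a NumPy array can feed either framework later.

```python
import numpy as np
import torch
from transformers import BertTokenizerFast

# Sketch of the idea discussed above; `X` is a hypothetical list of texts.
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
X = ["first abstract", "second abstract"]
X_vec = np.array(tokenizer(X, truncation=True, padding=True)["input_ids"])

torch_inputs = torch.from_numpy(X_vec)  # PyTorch: zero-copy conversion
# TensorFlow would accept the array directly, e.g. tf.constant(X_vec).
```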

epochs: int=5, pretrained_model="bert-base-uncased"):
dataset = MeshDataset(x_path, y_path)

model = AutoModelForSequenceClassification.from_pretrained(pretrained_model, num_labels=dataset.num_labels)
Contributor:
Did you check that this trains OK on GPU?
(You might have to call .to(device) on the model and the inputs.)
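
A minimal sketch of the reviewer's suggestion; `model` is the AutoModelForSequenceClassification loaded above, and `data_loader` is a hypothetical DataLoader over MeshDataset yielding (inputs, labels) in the format the model expects.

```python
import torch

# Move the model to GPU if one is available, then move each batch as well.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

for inputs, labels in data_loader:
    inputs, labels = inputs.to(device), labels.to(device)
    loss = model(input_ids=inputs, labels=labels).loss
    loss.backward()
```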

@nsorros nsorros changed the base branch from master to main December 7, 2021 09:54
@nsorros nsorros mentioned this pull request Dec 8, 2021
nsorros (Author) commented Dec 12, 2021

Results have not been submitted so far. The first PubMed BERT run gave 0.40, and with multi-label attention, 0.49.

nsorros (Author) commented Dec 12, 2021

Example from Weights & Biases tracking GPU utilization out of the box:

[screenshot: Weights & Biases GPU utilization dashboard]

super().__init__()
self.pretrained_model = pretrained_model
self.num_labels = num_labels
self.multilabel_attention = multilabel_attention
pdan93 (Contributor) commented Dec 13, 2021:

Is this line needed if you overwrite it two lines below? Or maybe an if statement was intended there but forgotten?

nsorros (Author):

🐛
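
A minimal sketch of the likely fix, assuming the attention layer should only be built when the flag is set; `use_multilabel_attention` is a hypothetical rename so the boolean is not clobbered by the module assigned two lines later.

```python
import torch

class BertMesh(torch.nn.Module):
    def __init__(self, pretrained_model, num_labels, multilabel_attention=False):
        super().__init__()
        self.pretrained_model = pretrained_model
        self.num_labels = num_labels
        # Keep the flag and the layer under different names.
        self.use_multilabel_attention = multilabel_attention
        if self.use_multilabel_attention:
            # MultiLabelAttention is defined elsewhere in this PR.
            self.multilabel_attention = MultiLabelAttention(768, num_labels)
        self.linear_1 = torch.nn.Linear(768, 512)
```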

self.multilabel_attention = MultiLabelAttention(
768, num_labels
) # num_labels, 768
self.linear_1 = torch.nn.Linear(768, 512) # num_labels, 512
pdan93 (Contributor) commented Dec 13, 2021:

Consider using the Sequential API, nn.Sequential(...), for these three layers?
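
A sketch of the suggestion; the layer sizes follow the comments in the diff (768 -> 512 -> num_labels), but the final layer is an assumption.

```python
import torch

num_labels = 28_000  # hypothetical label count, for illustration

# Linear -> ReLU -> Linear packed into a single module.
classifier = torch.nn.Sequential(
    torch.nn.Linear(768, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, num_labels),
)
```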

nsorros (Author):

👍

nsorros (Author):

Actually, in this case we use different layers depending on whether we add multilabel attention or not; not all of them are used in each path, so keeping them separate reduces duplication. Generally a good shout, though.

else:
cls = self.bert(input_ids=inputs)[1]
outs = torch.nn.functional.relu(self.linear_1(cls))
outs = torch.nn.functional.relu(self.linear_2(outs))
nsorros (Author):

This should be removed, as it is only relevant for multilabel attention.
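
Putting the two review points together, a sketch of what the branched forward might look like; this is a method body only, and the multilabel-attention path and the `linear_out` projection are assumptions for illustration, not the PR's actual code.

```python
import torch

def forward(self, inputs):
    if self.use_multilabel_attention:
        hidden_states = self.bert(input_ids=inputs)[0]   # per-token states
        outs = self.multilabel_attention(hidden_states)  # per-label context
        outs = torch.nn.functional.relu(self.linear_1(outs))
        outs = torch.nn.functional.relu(self.linear_2(outs))
    else:
        cls = self.bert(input_ids=inputs)[1]             # pooled [CLS] vector
        outs = torch.nn.functional.relu(self.linear_1(cls))
    return torch.sigmoid(self.linear_out(outs))
```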

nsorros (Author) commented Jan 10, 2022

Opening this PR for review. Note that a flake8 error has surfaced a bug 🐛 in how I evaluate the snapshot ensemble.

The best result is achieved in commit 4c2f68e

aCampello (Contributor):

Results look good; it would be good to include time to train.

aCampello (Contributor):

What's your intention with this PR: merge as is, or will you still work more on it?

nsorros (Author) commented Jan 17, 2022

> What's your intention with this PR: merge as is, or will you still work more on it?

This should be merged as is; it represents all the experiments I have run so far. It is experimental and not production-ready, i.e. there is no class abstraction representing BertMesh and the code duplicates some things, but it serves the purpose of moving fast towards trying a couple of ideas from the literature. It should not be that far from production-ready anyway.

aCampello (Contributor) left a comment:

I had read this PR before; I don't know why I didn't approve.

How long did it take to train? I am keen to try to reproduce this with my AWS account.

@@ -0,0 +1 @@
{"p": 0.6955707011887685, "r": 0.569327457868485, "f1": 0.626149221959104, "th": 0.5}
Contributor:

It'd be nice to have training information, such as time to train, if I want to reproduce this.

nsorros (Author):

Sure, it can be added. To accelerate experimentation I was not using our main train script, but I can add it easily.

nsorros (Author) commented Feb 25, 2022

> I had read this PR before; I don't know why I didn't approve.
>
> How long did it take to train? I am keen to try to reproduce this with my AWS account.

Well, actually, there is a good opportunity to do this as soon as I change the model to produce a Hugging Face model, which will make uploading to the hub super easy. So let me open this PR today and you can run it. It takes ~4 days on a g4dn.metal instance. You can track progress with wandb.
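
A minimal sketch of tracking such a run with wandb; the project name, config, and the `train_one_epoch` helper are hypothetical.

```python
import wandb

# Start a tracked run; config records the hyperparameters for reproducibility.
wandb.init(project="bert-mesh", config={"pretrained_model": "bert-base-uncased"})

for epoch in range(5):
    train_loss = train_one_epoch()  # hypothetical training helper
    wandb.log({"epoch": epoch, "loss": train_loss})
```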

@nsorros nsorros merged commit 087c7da into main Feb 25, 2022
@nsorros nsorros deleted the bert-mesh branch February 25, 2022 06:41
Successfully merging this pull request may close these issues.

Train default BertMesh