Commit

[feat] Adding LXMERT to list of available VL models for finetuning and pretraining (#339)

Summary:
Adds the LXMERT model.

To pretrain on VQA run:

`mmf_run config=projects/lxmert/configs/vqa2/pretrain.yaml run_type=train_val dataset=masked_vqa2 model=lxmert`

Added config files for GQA, COCO, VQA2.0, and Visual Genome pretraining, and for VQA2 finetuning.

Pull Request resolved: #339

Reviewed By: apsdehal

Differential Revision: D22858282

Pulled By: vedanuj

fbshipit-source-id: 09c8c04458df88866bf95ed6ed466042eb969900
eltoto1219 authored and facebook-github-bot committed Aug 19, 2020
1 parent 962a7ea commit 6b94a8b
Showing 16 changed files with 1,337 additions and 1 deletion.
64 changes: 64 additions & 0 deletions mmf/configs/models/lxmert/defaults.yaml
@@ -0,0 +1,64 @@
model_config:
  lxmert:
    bert_model_name: bert-base-uncased
    training_head_type: pretraining
    random_initialize: false
    num_labels: 3129
    gqa_labels: 1534
    num_hidden_layers: 12
    num_attention_heads: 12
    intermediate_size: 3072
    hidden_size: 768
    hidden_act: gelu
    hidden_dropout_prob: 0.1
    attention_probs_dropout_prob: 0.1
    max_position_embeddings: 512
    type_vocab_size: 2
    initializer_range: 0.02
    pad_token_id: 0
    layer_norm_eps: 1e-12
    mode: 'lxr'
    l_layers: 9 # 12
    x_layers: 5 # 5
    r_layers: 5 # 0
    visual_feat_dim: 2048
    visual_pos_dim: 4
    task_matched: true
    task_mask_lm: true
    task_obj_predict: true
    task_qa: true
    visual_losses:
    - obj
    - feat
    visual_loss_config:
      obj:
      - 3129
      - ce
      - [-1,]
      - 6.67
      attr:
      - 400
      - ce
      - [-1,]
      - 6.67
      feat:
      - 2048
      - l2
      - [-1, 2048]
      - 6.67
    special_visual_initialize: true # TODO: document what this does
    hard_cap_seq_len: 36
    cut_first: text
    embedding_strategy: plain
    bypass_transformer: false
    output_attentions: false # not yet implemented
    output_hidden_states: false # not yet implemented
    text_only: false
    freeze_base: false
    finetune_lr_multiplier: 1
    vocab_size: 30522
    fast_mode: false
    dynamic_attention: false # not yet implemented
    in_batch_pairs: false
    visualization: false # not yet implemented
    model: "bert"
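Each entry in `visual_loss_config` above packs four positional fields: output dimension, loss type, target shape, and loss weight. A minimal sketch of reading that layout in plain Python (the dict mirrors the YAML values; the helper is illustrative, not MMF code):

```python
# Mirror of the visual_loss_config block above as a plain dict
# (values copied from defaults.yaml; only "obj" and "feat" are enabled).
visual_loss_config = {
    "obj": [3129, "ce", [-1], 6.67],
    "attr": [400, "ce", [-1], 6.67],
    "feat": [2048, "l2", [-1, 2048], 6.67],
}
visual_losses = ["obj", "feat"]

def active_losses(names, config):
    """Return (name, output_dim, loss_type, target_shape, weight)
    tuples for each enabled visual loss."""
    out = []
    for name in names:
        dim, loss_type, shape, weight = config[name]
        out.append((name, dim, loss_type, shape, weight))
    return out

for name, dim, loss_type, shape, weight in active_losses(visual_losses, visual_loss_config):
    print(f"{name}: dim={dim} loss={loss_type} shape={shape} weight={weight}")
```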
2 changes: 2 additions & 0 deletions mmf/configs/models/lxmert/pretrain.yaml
@@ -0,0 +1,2 @@
includes:
- configs/models/lxmert/defaults.yaml
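`pretrain.yaml` only pulls in the defaults via `includes`: the included file is resolved first and the including file's keys are applied on top. A hedged sketch of that deep-merge rule (illustrative only; MMF's actual config resolution is built on OmegaConf):

```python
def deep_merge(base, override):
    """Recursively merge `override` into `base`; override wins on
    conflicts. Illustrative stand-in for MMF's includes handling."""
    merged = dict(base)
    for key, value in override.items():
        if key in merged and isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Hypothetical example: a config that includes the defaults and
# overrides one key while inheriting the rest.
defaults = {"model_config": {"lxmert": {"training_head_type": "pretraining",
                                        "num_labels": 3129}}}
override = {"model_config": {"lxmert": {"training_head_type": "classification"}}}
merged = deep_merge(defaults, override)
```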
3 changes: 2 additions & 1 deletion mmf/datasets/builders/vqa2/masked_dataset.py
@@ -40,7 +40,8 @@ def load_item(self, idx):
        current_sample.update(features)

        current_sample = self._add_masked_question(sample_info, current_sample)
        if self._add_answer:
            current_sample = self.add_answer_info(sample_info, current_sample)
        return current_sample

    def _add_masked_question(self, sample_info, current_sample):
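The one-line change above gates answer annotations behind `self._add_answer`, so masked pretraining samples can be built without answer targets. A standalone sketch of that control flow (dummy class and data, not the real MMF dataset):

```python
class MaskedVQADatasetSketch:
    """Toy stand-in for the masked VQA2 dataset; method names
    mirror the diff, everything else is invented for illustration."""

    def __init__(self, add_answer):
        self._add_answer = add_answer

    def _add_masked_question(self, sample_info, sample):
        sample["masked_question"] = sample_info["question"]
        return sample

    def add_answer_info(self, sample_info, sample):
        sample["answer"] = sample_info["answer"]
        return sample

    def load_item(self, sample_info):
        sample = {}
        sample = self._add_masked_question(sample_info, sample)
        if self._add_answer:  # new in this commit: answer targets are optional
            sample = self.add_answer_info(sample_info, sample)
        return sample

info = {"question": "what color is the cat?", "answer": "black"}
pretrain_sample = MaskedVQADatasetSketch(add_answer=False).load_item(info)
finetune_sample = MaskedVQADatasetSketch(add_answer=True).load_item(info)
```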
