Merge pull request #23 from wj-Mcat/master
@wj-Mcat Congratulations on finishing [PaddlePaddle Hackathon 2 Task 93](#17)!
Many thanks for your interest in this repo. Together, we can make it better!
tata1661 authored May 9, 2022
2 parents 688ca6e + b2f5e44 commit 4e0dae5
Showing 11 changed files with 893 additions and 1 deletion.
3 changes: 2 additions & 1 deletion .gitignore
@@ -115,11 +115,12 @@ dmypy.json
.pyre/

# pycharm
*/.DS_Store
*.DS_Store
**/__pycache__/
.idea/
FETCH_HEAD

# vscode
.vscode
*.DS_Store
PaddleFSL/raw_data/
48 changes: 48 additions & 0 deletions PaddleFSL/examples/optim/README.md
@@ -0,0 +1,48 @@
# Image Classification Tasks

Here, we provide examples of applying PaddleFSL to few-shot image classification tasks; they are similar to the [model_zoo](../image_classification/README.md) examples.


## Datasets

We evaluate performance on six benchmark datasets: Omniglot, *mini*ImageNet, CIFAR-FS, FC100, CUB and Tiered-ImageNet, which can be accessed as described in [raw_data/README.md](../../raw_data/README.md).


## Results

We report the results of MAML [1] and ANIL [2] below. The exact model configurations and pretrained models, which reproduce these results, can be downloaded from [here](https://drive.google.com/file/d/1pmCI-8cwLsadG6JOcubufrQ2d4zpK9B-/view?usp=sharing).

### [MAML](http://proceedings.mlr.press/v70/finn17a/finn17a.pdf)


| Dataset | Backbone | Way | Shot | Original paper | Other reports | model zoo (first order) | Optim (first order) |
| :-------------: | :------: | :--: | :--: | :------------: | :----------------------------------------------------------: | :--------------------: | :----------------: |
| Omniglot | MLP | 5 | 1 | 89.7 ± 1.1 | 88.9<br>([learn2learn](http://learn2learn.net/)) | 88.88 ± 2.99 | -- |
| Omniglot | MLP | 5 | 5 | 97.5 ± 0.6 | -- | 97.50 ± 0.47 | -- |
| Omniglot | CNN | 5 | 1 | 98.7 ± 0.4 | 99.1<br/>([learn2learn](http://learn2learn.net/)) | 97.13 ± 1.25 | 92.7 |
| Omniglot | CNN | 5 | 5 | 99.9 ± 0.1 | 99.9 ± 0.1<br/>([R2D2](https://arxiv.org/pdf/1805.08136.pdf)) | 99.23 ± 0.40 | ***93.1*** |
| *mini*ImageNet | CNN | 5 | 1 | 48.70 ± 1.84 | 48.3<br/>([learn2learn](http://learn2learn.net/)) | 49.81 ± 1.78 | |
| *mini*ImageNet | CNN | 5 | 5 | 63.11 ± 0.92 | 65.4<br/>([learn2learn](http://learn2learn.net/)) | 64.21 ± 1.33 | -- |
| CIFAR-FS | CNN | 5 | 1 | -- | 58.9 ± 1.9<br/>([R2D2](https://arxiv.org/pdf/1805.08136.pdf)) | 57.06 ± 3.83 | 49.1 |
| CIFAR-FS | CNN | 5 | 5 | -- | 76.6<br/>([learn2learn](http://learn2learn.net/)) | 72.24 ± 1.71 | -- |
| FC100 | CNN | 5 | 1 | -- | -- | 37.63 ± 2.23 | 30.2 |
| FC100 | CNN | 5 | 5 | -- | 49.0<br/>([learn2learn](http://learn2learn.net/)) | 49.14 ± 1.58 | -- |
| CUB | CNN | 5 | 1 | -- | 54.73 ± 0.97<br/>([CloseLookFS](https://arxiv.org/pdf/1904.04232.pdf)) | 53.31 ± 1.77 | 20.7 |
| CUB | CNN | 5 | 5 | -- | 75.75 ± 0.76<br/>([CloseLookFS](https://arxiv.org/pdf/1904.04232.pdf)) | 69.88 ± 1.47 | -- |
| Tiered-ImageNet | CNN | 5 | 5 | -- | -- | 67.56 ± 1.80 | -- |

### [ANIL](https://openreview.net/pdf?id=rkgMkCEtPB)

| Dataset | Backbone | Way | Shot | Author report | Other reports | model zoo (first order) | Optim (first order) |
| :------------: | :------: | :--: | :--: | :-----------: | :-----------------------------------------------: | :--------------------: | :----------------: |
| Omniglot | CNN | 5 | 1 | -- | -- | 96.06 ± 1.00 | 96.34 ± 1.98 |
| Omniglot | CNN | 5 | 5 | -- | -- | 98.74 ± 0.48 | |
| *mini*ImageNet | CNN | 5 | 1 | 46.7 ± 0.4 | -- | 48.31 ± 2.83 | 45.31 ± 1.43 |
| *mini*ImageNet | CNN | 5 | 5 | 61.5 ± 0.5 | -- | 62.38 ± 1.96 | 61.81 ± 1.2 |
| CIFAR-FS | CNN | 5 | 1 | -- | -- | 56.19 ± 3.39 | ***30.8 ± 2.5*** |
| CIFAR-FS | CNN | 5 | 5 | -- | 68.3<br/>([learn2learn](http://learn2learn.net/)) | 68.60 ± 1.25 | 48.6 |
| FC100 | CNN | 5 | 1 | -- | -- | 40.69 ± 3.32 | 38.4 ± 1.3 |
| FC100 | CNN | 5 | 5 | -- | 47.6<br/>([learn2learn](http://learn2learn.net/)) | 48.01 ± 1.22 | 35.0 |
| CUB | CNN | 5 | 1 | -- | -- | 53.25 ± 2.18 | -- |
| CUB | CNN | 5 | 5 | -- | -- | 69.09 ± 1.12 | -- |
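
ANIL differs from MAML in that the inner loop adapts only the classification head while the backbone stays frozen. A toy, framework-free sketch of that inner loop (hypothetical names, linear head, squared loss with its exact gradient; not the paddlefsl implementation):

```python
def inner_adapt(head_w, feats, ys, lr=0.1, steps=5):
    """ANIL-style inner loop: gradient steps on the head weights only.

    `feats` are the frozen backbone outputs for the support set; only
    `head_w` is updated, which is the defining property of ANIL.
    """
    w = list(head_w)
    for _ in range(steps):
        for x, y in zip(feats, ys):
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = pred - y  # gradient of 0.5 * (pred - y)**2 w.r.t. pred
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

# Frozen features for a 2-sample support set; the head adapts to fit the labels.
feats = [[1.0, 0.0], [0.0, 1.0]]
ys = [1.0, -1.0]
w_adapted = inner_adapt([0.0, 0.0], feats, ys, lr=0.5, steps=20)
```

With orthonormal features the two weights decouple and converge toward the labels (1 and -1), illustrating how the head alone can fit a support set on top of fixed features.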

143 changes: 143 additions & 0 deletions PaddleFSL/examples/optim/anil_example.py
@@ -0,0 +1,143 @@
"""ANIL example for optimization"""
from __future__ import annotations
import os
import paddle
from paddle import nn
from paddle.optimizer import Adam
import paddlefsl
from paddlefsl.metaopt.anil import ANILLearner
from examples.optim.meta_trainer import Config, Trainer, load_datasets


def init_models(config: Config):
    """Initialize the backbone and classification head for the chosen dataset."""
    if config.dataset == 'cub':
        config.meta_lr = 0.002
        config.inner_lr = 0.01
        config.test_epoch = 10
        config.meta_batch_size = 32
        config.train_inner_adapt_steps = 5
        config.test_inner_adapt_steps = 10
        config.epochs = 10000

        if config.k_shot == 5:
            config.meta_lr = 0.003
            config.inner_lr = 0.05
            config.epochs = 10000

        feature_model = paddlefsl.backbones.Conv(input_size=(3, 84, 84), output_size=config.n_way, conv_channels=[32, 32, 32, 32])
        feature_model.output = paddle.nn.Flatten()
        head_layer = paddle.nn.Linear(in_features=feature_model.feature_size, out_features=config.n_way,
                                      weight_attr=feature_model.init_weight_attr, bias_attr=feature_model.init_bias_attr)

    elif config.dataset == 'cifarfs':
        config.meta_lr = 0.001
        config.inner_lr = 0.02
        config.test_epoch = 10
        config.meta_batch_size = 32
        config.train_inner_adapt_steps = 5
        config.test_inner_adapt_steps = 10
        config.epochs = 20000
        if config.k_shot == 5:
            config.meta_lr = 0.001
            config.inner_lr = 0.08

        feature_model = paddlefsl.backbones.Conv(input_size=(3, 32, 32), output_size=config.n_way, conv_channels=[32, 32, 32, 32])
        feature_model.output = paddle.nn.Flatten()
        head_layer = paddle.nn.Linear(in_features=32, out_features=config.n_way,
                                      weight_attr=feature_model.init_weight_attr, bias_attr=feature_model.init_bias_attr)

    elif config.dataset == 'miniimagenet':
        config.meta_lr = 0.002
        config.inner_lr = 0.05
        config.test_epoch = 10
        config.meta_batch_size = 32
        config.train_inner_adapt_steps = 5
        config.test_inner_adapt_steps = 10
        config.epochs = 30000

        feature_model = paddlefsl.backbones.Conv(input_size=(3, 84, 84), output_size=config.n_way, conv_channels=[32, 32, 32, 32])
        feature_model.output = paddle.nn.Flatten()
        head_layer = paddle.nn.Linear(in_features=feature_model.feature_size, out_features=config.n_way,
                                      weight_attr=feature_model.init_weight_attr, bias_attr=feature_model.init_bias_attr)

    elif config.dataset == 'omniglot':
        config.meta_lr = 0.005
        config.inner_lr = 0.5
        config.test_epoch = 10
        config.meta_batch_size = 32
        config.train_inner_adapt_steps = 1
        config.test_inner_adapt_steps = 3
        config.epochs = 30000

        # the 5-shot settings override the 1-shot defaults above
        if config.k_shot == 5:
            config.meta_lr = 0.06
            config.inner_lr = 0.12
            config.train_inner_adapt_steps = 3
            config.test_inner_adapt_steps = 5

        feature_model = paddlefsl.backbones.Conv(input_size=(1, 28, 28), output_size=config.n_way, pooling=False)
        feature_model.output = paddle.nn.Flatten()
        head_layer = paddle.nn.Linear(in_features=feature_model.feature_size, out_features=config.n_way,
                                      weight_attr=feature_model.init_weight_attr, bias_attr=feature_model.init_bias_attr)

    elif config.dataset == 'fc100':
        config.meta_lr = 0.005
        config.inner_lr = 0.1
        config.test_epoch = 10
        config.meta_batch_size = 32
        config.train_inner_adapt_steps = 5
        config.test_inner_adapt_steps = 10
        config.epochs = 5000
        if config.k_shot == 5:
            config.meta_lr = 0.002
            config.epochs = 2000

        feature_model = paddlefsl.backbones.Conv(input_size=(3, 32, 32), output_size=config.n_way)
        feature_model.output = paddle.nn.Flatten()
        head_layer = paddle.nn.Linear(in_features=feature_model.feature_size, out_features=config.n_way,
                                      weight_attr=feature_model.init_weight_attr, bias_attr=feature_model.init_bias_attr)

    else:
        raise ValueError(f'unsupported dataset: {config.dataset}')

    return feature_model, head_layer


if __name__ == '__main__':

    config = Config().parse_args(known_only=True)
    config.device = 'gpu'
    config.k_shot = 1

    # config.dataset = 'omniglot'
    config.dataset = 'miniimagenet'
    # config.dataset = 'cifarfs'
    # config.dataset = 'fc100'
    # config.dataset = 'cub'

    config.tracking_uri = os.environ.get('TRACKING_URI', None)
    config.experiment_id = os.environ.get('EXPERIMENT_ID', None)

    # Build the datasets, backbone and head for the selected dataset
    train_dataset, valid_dataset, test_dataset = load_datasets(config.dataset)
    feature_model, head_layer = init_models(config)

    criterion = nn.CrossEntropyLoss()
    learner = ANILLearner(
        feature_model=feature_model,
        head_layer=head_layer,
        learning_rate=config.inner_lr,
    )
    scheduler = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=config.meta_lr, T_max=config.epochs)
    optimizer = Adam(parameters=learner.parameters(), learning_rate=scheduler)
    trainer = Trainer(
        config=config,
        train_dataset=train_dataset,
        dev_dataset=valid_dataset,
        test_dataset=test_dataset,
        learner=learner,
        optimizer=optimizer,
        scheduler=scheduler,
        criterion=criterion
    )
    trainer.train()
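
The meta learning rate here follows a cosine annealing schedule. A minimal sketch of the decay rule, assuming the usual formulation (base rate decayed to `eta_min` over `t_max` steps, matching what Paddle's `CosineAnnealingDecay` computes per step):

```python
import math

def cosine_annealing_lr(base_lr: float, t: int, t_max: int, eta_min: float = 0.0) -> float:
    """Cosine annealing: smoothly decay base_lr to eta_min over t_max steps."""
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * t / t_max)) / 2

# Starts at base_lr, reaches the midpoint halfway through, ends at eta_min.
lr_start = cosine_annealing_lr(0.002, 0, 30000)      # 0.002
lr_mid = cosine_annealing_lr(0.002, 15000, 30000)    # 0.001
lr_end = cosine_annealing_lr(0.002, 30000, 30000)    # 0.0
```

Unlike a step schedule, the cosine curve spends most of its budget near the initial rate and decays rapidly only toward the end of training.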
62 changes: 62 additions & 0 deletions PaddleFSL/examples/optim/anil_text_classification.py
@@ -0,0 +1,62 @@
"""ANIL example for optimization"""
from __future__ import annotations
import os
import paddle
from paddle import nn
from paddle.optimizer import Adam
import paddlefsl
from paddlefsl.metaopt.anil import ANILLearner
from paddlenlp.transformers.ernie.modeling import ErnieModel
from paddlenlp.transformers.ernie.tokenizer import ErnieTokenizer

from examples.optim.meta_trainer import Config, Trainer, load_datasets

class SequenceClassifier(nn.Layer):
    """Dropout followed by a linear classification head."""
    def __init__(self, hidden_size: int, output_size: int, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(hidden_size, output_size)

    def forward(self, embedding):
        """Map a sequence embedding to class logits."""
        embedding = self.dropout(embedding)
        logits = self.classifier(embedding)
        return logits


if __name__ == '__main__':

    config = Config().parse_args(known_only=True)
    config.device = 'gpu'

    train_dataset = paddlefsl.datasets.few_rel.FewRel('train')
    valid_dataset = paddlefsl.datasets.few_rel.FewRel('valid')
    # the valid split is reused for testing
    test_dataset = paddlefsl.datasets.few_rel.FewRel('valid')

    config.tracking_uri = os.environ.get('TRACKING_URI', None)
    config.experiment_id = os.environ.get('EXPERIMENT_ID', None)

    tokenizer = ErnieTokenizer.from_pretrained('ernie-1.0')
    feature_model = ErnieModel.from_pretrained('ernie-1.0')
    head_layer = SequenceClassifier(hidden_size=768, output_size=config.n_way)

    criterion = nn.CrossEntropyLoss()
    learner = ANILLearner(
        feature_model=feature_model,
        head_layer=head_layer,
        learning_rate=config.inner_lr,
    )
    scheduler = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=config.meta_lr, T_max=config.epochs)
    optimizer = Adam(parameters=learner.parameters(), learning_rate=scheduler)
    trainer = Trainer(
        config=config,
        train_dataset=train_dataset,
        dev_dataset=valid_dataset,
        test_dataset=test_dataset,
        learner=learner,
        optimizer=optimizer,
        scheduler=scheduler,
        criterion=criterion,
        tokenizer=tokenizer
    )
    trainer.train()
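
The `SequenceClassifier` head above is just dropout followed by an affine map. A framework-free sketch of those two operations (toy shapes; the real head uses a 768-d ERNIE embedding):

```python
import random

def linear(x, weight, bias):
    """Affine map: logits[j] = sum_i x[i] * weight[i][j] + bias[j]."""
    return [sum(xi * wij for xi, wij in zip(x, col)) + b
            for col, b in zip(zip(*weight), bias)]

def dropout(x, p, training):
    """Inverted dropout: zero each unit with prob p at train time, rescale by 1/(1-p).

    At eval time it is the identity, which is why inference needs no rescaling.
    """
    if not training:
        return list(x)
    return [0.0 if random.random() < p else xi / (1 - p) for xi in x]

# A toy 4-d "embedding" mapped to 3 class logits (eval mode: dropout is a no-op).
x = [1.0, 2.0, 3.0, 4.0]
weight = [[0.1, 0.0, 0.0],
          [0.0, 0.1, 0.0],
          [0.0, 0.0, 0.1],
          [0.1, 0.1, 0.1]]
bias = [0.0, 0.0, 0.0]
logits = linear(dropout(x, 0.1, training=False), weight, bias)  # [0.5, 0.6, 0.7]
```

In ANIL terms, this tiny head is the only part whose weights are adapted in the inner loop; the ERNIE backbone producing `x` stays fixed there.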
48 changes: 48 additions & 0 deletions PaddleFSL/examples/optim/data_utils.py
@@ -0,0 +1,48 @@
"""Data utilities for meta-optimization algorithms"""
from __future__ import annotations
from typing import Tuple, Dict
import paddlefsl
from paddlefsl.datasets.cv_dataset import CVDataset


def load_datasets(name: str) -> Tuple[CVDataset, CVDataset, CVDataset]:
    """Load a CV dataset by name; only 'omniglot' is currently enabled below.

    Args:
        name (str): the name of the dataset.

    Returns:
        Tuple[CVDataset, CVDataset, CVDataset]: train, dev, and test datasets.
    """
    datasets_map: Dict[str, Tuple[CVDataset, CVDataset, CVDataset]] = {
        "omniglot": (
            paddlefsl.datasets.Omniglot(mode='train', image_size=(28, 28)),
            paddlefsl.datasets.Omniglot(mode='valid', image_size=(28, 28)),
            paddlefsl.datasets.Omniglot(mode='test', image_size=(28, 28))
        ),
        # "miniimagenet": (
        #     paddlefsl.datasets.MiniImageNet(mode='train'),
        #     paddlefsl.datasets.MiniImageNet(mode='valid'),
        #     paddlefsl.datasets.MiniImageNet(mode='test')
        # ),
        # "cifarfs": (
        #     paddlefsl.datasets.CifarFS(mode='train', image_size=(28, 28)),
        #     paddlefsl.datasets.CifarFS(mode='valid', image_size=(28, 28)),
        #     paddlefsl.datasets.CifarFS(mode='test', image_size=(28, 28))
        # ),
        # "fc100": (
        #     paddlefsl.datasets.FC100(mode='train'),
        #     paddlefsl.datasets.FC100(mode='valid'),
        #     paddlefsl.datasets.FC100(mode='test')
        # ),
        # "cub": (
        #     paddlefsl.datasets.CubFS(mode='train'),
        #     paddlefsl.datasets.CubFS(mode='valid'),
        #     paddlefsl.datasets.CubFS(mode='test')
        # )
    }
    if name not in datasets_map:
        names = ", ".join(datasets_map)
        raise ValueError(f"{name} is not a valid dataset name; expected one of: {names}")

    return datasets_map[name]
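
One design note: a dict literal like the one above eagerly constructs every enabled dataset even when only one is requested. A lazy variant defers construction until lookup by storing zero-arg factories instead of instances. A sketch, with hypothetical stand-ins for the paddlefsl constructors:

```python
from typing import Callable, Dict, Tuple

def make_split(dataset: str, mode: str) -> str:
    """Hypothetical stand-in for a paddlefsl dataset constructor."""
    return f"{dataset}:{mode}"

def load_datasets_lazy(name: str) -> Tuple[str, str, str]:
    """Map names to factories so only the requested dataset is ever built."""
    factories: Dict[str, Callable[[], Tuple[str, str, str]]] = {
        "omniglot": lambda: tuple(make_split("omniglot", m) for m in ("train", "valid", "test")),
        "miniimagenet": lambda: tuple(make_split("miniimagenet", m) for m in ("train", "valid", "test")),
    }
    if name not in factories:
        names = ", ".join(factories)
        raise ValueError(f"{name} is not a valid dataset name; expected one of: {names}")
    return factories[name]()  # construction happens only here

train, valid, test = load_datasets_lazy("omniglot")
```

For heavyweight datasets that download or index files on construction, this keeps startup cost proportional to what is actually used.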