forked from PaddlePaddle/Paddle
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request PaddlePaddle#220 from nemonameless/fix_docs
fix docs of eva02 and evaclip
Showing 5 changed files with 67 additions and 68 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -40,13 +40,6 @@ python setup.py install --prefix=$INSTALL_DIR | |
export $PATH=$PATH:$INSTALL_DIR | ||
``` | ||
|
||
4) Install paddlemix | |
|
||
``` | ||
git clone git@github.com:PaddlePaddle/PaddleMIX.git | |
cd PaddleMix | ||
python setup.py install | ||
``` | ||
|
||
## 3. Data preparation | |
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -42,14 +42,6 @@ python setup.py install --prefix=$INSTALL_DIR | |
export $PATH=$PATH:$INSTALL_DIR | ||
``` | ||
|
||
4) Install paddlemix | |
|
||
``` | ||
git clone git@github.com:PaddlePaddle/PaddleMIX.git | |
cd PaddleMix | ||
python setup.py install | ||
``` | ||
|
||
## 3. Data preparation | |
|
||
1) COCO data | |
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -99,33 +99,29 @@ python setup.py install --prefix=$INSTALL_DIR | |
export $PATH=$PATH:$INSTALL_DIR | ||
``` | ||
|
||
4) Install paddlemix | |
|
||
``` | ||
git clone git@github.com:PaddlePaddle/PaddleMIX.git | |
cd PaddleMix | ||
python setup.py install | ||
``` | ||
|
||
## 2. Datasets and pretrained weights | |
|
||
1) ImageNet 1k data | |
1) ImageNet-1k data | |
|
||
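We use the standard ImageNet-1K dataset (ILSVRC 2012, 1.2 million images in 1000 classes), downloaded from http://image-net.org, and then use this [shell script](https://github.com/pytorch/examples/blob/main/imagenet/extract_ILSVRC.sh) to move and extract the training and validation images into labeled subfolders. | |
We use the standard ImageNet-1K dataset (ILSVRC 2012, 1.2 million images in 1000 classes), downloaded from http://image-net.org, and then use this [shell script](https://github.com/pytorch/examples/blob/main/imagenet/extract_ILSVRC.sh) to move and extract the training and validation images into labeled subfolders. Note that the train and val folders must each contain 1000 subfolders, one per class. | |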
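
The 1000-subfolder requirement can be checked before launching a run. The following is a minimal sketch, not part of PaddleMIX: the helper names `count_class_dirs` and `check_imagenet_layout` are hypothetical, and it assumes the dataset root contains `train` and `val` directories as described above.

```python
from pathlib import Path


def count_class_dirs(split_dir):
    """Count the immediate subdirectories of split_dir (one per class).

    Raises FileNotFoundError if split_dir does not exist.
    """
    return sum(1 for entry in Path(split_dir).iterdir() if entry.is_dir())


def check_imagenet_layout(root, expected_classes=1000):
    """Return {split: class_folder_count}; raise if any split is off.

    Hypothetical helper: `root` is the dataset directory that holds
    the `train` and `val` folders.
    """
    counts = {
        split: count_class_dirs(Path(root) / split)
        for split in ("train", "val")
    }
    bad = {split: n for split, n in counts.items() if n != expected_classes}
    if bad:
        raise ValueError(
            f"expected {expected_classes} class folders per split, got {bad}"
        )
    return counts

# Example call, using the DATA_PATH value from the scripts in this document:
#   check_imagenet_layout("./dataset/ILSVRC2012")
```

A failure here usually means the validation images were not yet sorted into per-class folders by the extraction script.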
|
||
|
||
## 4. Usage | |
|
||
### 4.1 Pretraining | |
|
||
Use `paddlemix/examples/eva02/run_eva02_pretrain_dist.py` | |
Use `paddlemix/examples/eva02/run_eva02_pretrain_dist.py`. | |
|
||
Example training command and parameter configuration: | |
Notes: | |
|
||
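This example uses a single machine with 8 GPUs: | |
1. When a distributed strategy is used, the parallel degrees must satisfy `nnodes * nproc_per_node == tensor_parallel_degree * sharding_parallel_degree * dp_parallel_degree`. `dp_parallel_degree` is computed from the other values, so make sure that `nnodes * nproc_per_node >= tensor_parallel_degree * sharding_parallel_degree`; | |
2. `model_name` can be used on its own to create the model. To change the teacher, edit the `teacher_config` field in the config.json and model_config.json of `paddlemix/EVA/EVA02/eva02_Ti_for_pretrain` yourself, for example changing the default `paddlemix/EVA/EVA01-CLIP-g-14` to "paddlemix/EVA/EVA02-CLIP-bigE-14". `student_config` is a dict, and the student model itself is trained from scratch; | |
3. If model_name=None, the model can also be created from teacher_name and student_name, but each of them must have its own config.json and model_state.pdparams. The model_name=None form is typically used for eval or for debugging with the full weights loaded; | |
4. `TEA_PRETRAIN_CKPT` is normally set to None, because the teacher's pretrained weights from `teacher_name` are already loaded before training. However, **if MP_DEGREE > 1**, the `TEA_PRETRAIN_CKPT` path must be set again to load them. An absolute path is usually used; the corresponding `model_state.pdparams` can also be downloaded separately from its download link and placed there; | |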
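
The parallel-degree relation in note 1 can be sketched as a small calculation. This is an illustration only, not the launcher's actual code: `derive_dp_degree` is a hypothetical helper, and the divisibility check is an assumption that follows from requiring an integer `dp_parallel_degree`.

```python
def derive_dp_degree(nnodes, nproc_per_node,
                     tensor_parallel_degree=1, sharding_parallel_degree=1):
    """Return dp_parallel_degree implied by
    nnodes * nproc_per_node == tensor * sharding * dp.

    Hypothetical helper for sanity-checking a launch configuration.
    """
    world_size = nnodes * nproc_per_node
    non_dp = tensor_parallel_degree * sharding_parallel_degree
    if world_size < non_dp:
        raise ValueError(
            f"need nnodes * nproc_per_node >= {non_dp}, got {world_size}"
        )
    if world_size % non_dp != 0:
        # Assumption: dp must be an integer, so the product must divide evenly.
        raise ValueError(f"{world_size} processes not divisible by {non_dp}")
    return world_size // non_dp


# Single machine with 8 GPUs, MP_DEGREE=1, SHARDING_DEGREE=1, as in the example:
print(derive_dp_degree(nnodes=1, nproc_per_node=8))  # 8
```

With `tensor_parallel_degree=2` and `sharding_parallel_degree=2` on the same 8 GPUs, the same relation gives a data-parallel degree of 2.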
|
||
Note that when a distributed strategy is used, the parallel degrees must satisfy `nnodes * nproc_per_node == tensor_parallel_degree * sharding_parallel_degree * dp_parallel_degree`. `dp_parallel_degree` is computed from the other values, so make sure that `nnodes * nproc_per_node >= tensor_parallel_degree * sharding_parallel_degree`. | |
|
||
Example training command and parameter configuration, here using a single machine with 8 GPUs: | |
```shell | ||
export FLAGS_embedding_deterministic=1 | ||
export FLAGS_cudnn_deterministic=1 | ||
|
@@ -149,24 +145,23 @@ TRAINING_MODEL_RESUME="None" | |
TRAINER_INSTANCES='127.0.0.1' | ||
MASTER='127.0.0.1:8080' | ||
|
||
TRAINERS_NUM=1 | ||
TRAINING_GPUS_PER_NODE=8 | ||
DP_DEGREE=8 | ||
MP_DEGREE=1 | ||
SHARDING_DEGREE=1 | ||
TRAINERS_NUM=1 # nnodes, machine num | ||
TRAINING_GPUS_PER_NODE=8 # nproc_per_node | ||
DP_DEGREE=8 # dp_parallel_degree | ||
MP_DEGREE=1 # tensor_parallel_degree | ||
SHARDING_DEGREE=1 # sharding_parallel_degree | ||
|
||
model_name="paddlemix/EVA/EVA02/eva02_Ti_for_pretrain" | ||
# model_name=None # if set None, will use teacher_name and student_name from_pretrained, both should have config and pdparams | ||
teacher_name="paddlemix/EVA/EVA01-CLIP-g-14" | ||
#teacher_name="paddlemix/EVA/EVA02-CLIP-bigE-14" | ||
student_name="paddlemix/EVA/EVA02/eva02_Ti_pt_in21k_p14" | ||
|
||
TEA_PRETRAIN_CKPT=https://bj.bcebos.com/v1/paddlenlp/models/community/paddlemix/EVA/EVA01-CLIP-g-14/model_state.pdparams # must add if MP is used | ||
TEA_PRETRAIN_CKPT=None # /root/.paddlenlp/models/paddlemix/EVA/EVA01-CLIP-g-14/model_state.pdparams # must add if MP_DEGREE > 1 | ||
STU_PRETRAIN_CKPT=None | ||
|
||
OUTPUT_DIR=./output/pretrain_eva02_ti | ||
OUTPUT_DIR=./output/eva02_Ti_pt_in21k_p14 | ||
|
||
DATA_PATH=./dataset/ILSVRC2012 | ||
DATA_PATH=./dataset/ILSVRC2012 # put your ImageNet-1k val data path | ||
input_size=224 | ||
num_mask_patches=105 ### 224*224/14/14 * 0.4 | ||
batch_size=10 # 100(bsz_per_gpu)*8(#gpus_per_node)*5(#nodes)*1(update_freq)=4000(total_bsz) | ||
|
@@ -223,23 +218,38 @@ ${TRAINING_PYTHON} paddlemix/examples/eva02/run_eva02_pretrain_dist.py \ | |
|
||
|
||
The default teacher is `paddlemix/EVA/EVA01-CLIP-g-14`. To change the teacher, modify the settings along these lines: | |
|
||
``` | ||
model_name=None | ||
model_name="paddlemix/EVA/EVA02/eva02_Ti_for_pretrain" # should modify teacher_config in config.json and model_config.json | ||
# model_name=None # if set None, will use teacher_name and student_name from_pretrained, both should have config and pdparams | ||
teacher_name="paddlemix/EVA/EVA02-CLIP-bigE-14" | ||
student_name="paddlemix/EVA/EVA02/eva02_Ti_pt_in21k_p14" | ||
TEA_PRETRAIN_CKPT=paddlemix/EVA/EVA02-CLIP-bigE-14/model_state.pdparams | ||
TEA_PRETRAIN_CKPT=None # /root/.paddlenlp/models/paddlemix/EVA/EVA02-CLIP-bigE-14/model_state.pdparams # must add if MP_DEGREE > 1 | ||
STU_PRETRAIN_CKPT=None | ||
``` | ||
|
||
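Note that `model_name` can be used on its own to create the model. The default teacher_config is `paddlemix/EVA/EVA01-CLIP-g-14`, while student_config is a dict and the student model itself is trained from scratch; | |
If model_name=None, the model can also be created from teacher_name and student_name, but each of them must have its own config.json and model_state.pdparams. The model_name=None form is typically used for eval or for debugging with the full weights loaded; | |
Notes: | |
1. `model_name` can be used on its own to create the model. To change the teacher, edit the `teacher_config` field in the config.json and model_config.json of `paddlemix/EVA/EVA02/eva02_Ti_for_pretrain` yourself, for example changing the default `paddlemix/EVA/EVA01-CLIP-g-14` to "paddlemix/EVA/EVA02-CLIP-bigE-14". `student_config` is a dict, and the student model itself is trained from scratch; | |
2. If model_name=None, the model can also be created from teacher_name and student_name, but each of them must have its own config.json and model_state.pdparams. The model_name=None form is typically used for eval or for debugging with the full weights loaded; | |
3. `TEA_PRETRAIN_CKPT` is normally set to None, because the teacher's pretrained weights from `teacher_name` are already loaded before training. However, **if MP_DEGREE > 1**, the `TEA_PRETRAIN_CKPT` path must be set again to load them. An absolute path is usually used; the corresponding `model_state.pdparams` can also be downloaded separately from its download link and placed there; | |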
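
The `TEA_PRETRAIN_CKPT` rule from the notes above can be written down as a small guard. This is a sketch under stated assumptions: `resolve_teacher_ckpt` is a hypothetical helper, not a PaddleMIX function, and it only encodes the rule "None is fine unless MP_DEGREE > 1, in which case a path is required".

```python
import os


def resolve_teacher_ckpt(mp_degree, ckpt_path=None):
    """Return the value to use for TEA_PRETRAIN_CKPT.

    Hypothetical helper: with mp_degree <= 1 the path is optional and
    passed through unchanged; with mp_degree > 1 a path is mandatory
    and is converted to an absolute path.
    """
    if mp_degree > 1:
        if ckpt_path is None:
            raise ValueError("TEA_PRETRAIN_CKPT must be set when MP_DEGREE > 1")
        return os.path.abspath(ckpt_path)
    return ckpt_path


# MP_DEGREE=1, as in the example configuration: no explicit teacher checkpoint.
assert resolve_teacher_ckpt(1) is None
```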
|
||
|
||
|
||
### 4.2 Finetune training | |
|
||
Use `paddlemix/examples/eva02/run_eva02_finetune_dist.py`. | |
|
||
Notes: | |
|
||
1. When a distributed strategy is used, the parallel degrees must satisfy `nnodes * nproc_per_node == tensor_parallel_degree * sharding_parallel_degree * dp_parallel_degree`. `dp_parallel_degree` is computed from the other values, so make sure that `nnodes * nproc_per_node >= tensor_parallel_degree * sharding_parallel_degree`; | |
|
||
### 4.2 Finetuning | |
2. To train `paddlemix/EVA/EVA02/eva02_Ti_pt_in21k_ft_in1k_p14`, you must load **its corresponding pretrained weights**, `paddlemix/EVA/EVA02/eva02_Ti_pt_in21k_p14`. Set the absolute path of the pretrained `model_state.pdparams`, or download it separately from [this link](https://bj.bcebos.com/v1/paddlenlp/models/community/paddlemix/EVA/EVA02/eva02_Ti_pt_in21k_ft_in1k_p14/model_state.pdparams) and place it there. | |
|
||
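Use `paddlemix/examples/eva02/run_eva02_finetune_dist.py` | |
3. The tiny/s models are trained at 336 resolution and the B/L models at 448 resolution, while their pretrained weights were all obtained by training at 224 resolution. | |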
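
To make the resolution gap between pretraining and finetuning concrete, the patch-token count for each input size can be computed directly. This is a sketch that assumes a patch size of 14, inferred from the `p14` suffix in the model names; `num_patches` is a hypothetical helper.

```python
PATCH_SIZE = 14  # assumption: taken from the "p14" suffix in the model names


def num_patches(input_size, patch_size=PATCH_SIZE):
    """Number of non-overlapping square patches in an input_size x input_size image."""
    if input_size % patch_size != 0:
        raise ValueError(f"{input_size} is not a multiple of {patch_size}")
    side = input_size // patch_size
    return side * side


for size in (224, 336, 448):
    print(size, num_patches(size))
# 224 256
# 336 576
# 448 1024
```

Under that assumption, finetuning sees 576 (tiny/s) or 1024 (B/L) patch tokens per image, compared with 256 during pretraining.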
|
||
|
||
Example training command and parameter configuration, here using a single machine with 8 GPUs: | |
```shell | ||
export FLAGS_embedding_deterministic=1 | ||
export FLAGS_cudnn_deterministic=1 | ||
|
@@ -257,30 +267,31 @@ CLIP_GRAD=0.0 | |
num_train_epochs=100 | ||
save_epochs=2 # save every 2 epochs | ||
|
||
warmup_epochs=5 # set 0 will fast convergence in 0 epoch | ||
warmup_epochs=5 # set 0 will fast convergence in 1 epoch | ||
warmup_steps=0 | ||
drop_path=0.1 | ||
|
||
TRAINING_MODEL_RESUME="None" | ||
TRAINER_INSTANCES='127.0.0.1' | ||
MASTER='127.0.0.1:8080' | ||
|
||
TRAINERS_NUM=1 | ||
TRAINING_GPUS_PER_NODE=8 | ||
DP_DEGREE=8 | ||
MP_DEGREE=1 | ||
SHARDING_DEGREE=1 | ||
TRAINERS_NUM=1 # nnodes, machine num | ||
TRAINING_GPUS_PER_NODE=8 # nproc_per_node | ||
DP_DEGREE=8 # dp_parallel_degree | ||
MP_DEGREE=1 # tensor_parallel_degree | ||
SHARDING_DEGREE=1 # sharding_parallel_degree | ||
|
||
MODEL_NAME="paddlemix/EVA/EVA02/eva02_Ti_pt_in21k_ft_in1k_p14" | ||
PRETRAIN_CKPT=https://bj.bcebos.com/v1/paddlenlp/models/community/paddlemix/EVA/EVA02/eva02_Ti_pt_in21k_ft_in1k_p14/model_state.pdparams | ||
PRETRAIN_CKPT=/root/.paddlenlp/models/paddlemix/EVA/EVA02/eva02_Ti_pt_in21k_p14/model_state.pdparams # pretrained model, input_size is 224 | ||
|
||
OUTPUT_DIR=./output/eva02_Ti_pt_in21k_ft_in1k_p14 | ||
|
||
OUTPUT_DIR=./output/finetune_eva02_ti | ||
DATA_PATH=./dataset/ILSVRC2012 # put your ImageNet-1k val data path | ||
|
||
DATA_PATH=./dataset/ILSVRC2012 | ||
input_size=336 | ||
batch_size=128 # 128(bsz_per_gpu)*8(#gpus_per_node)*1(#nodes)*1(update_freq)=1024(total_bsz) | ||
num_workers=10 | ||
accum_freq=2 # update_freq | ||
accum_freq=1 # update_freq | ||
logging_steps=10 # print_freq | ||
seed=0 | ||
|
||
|
@@ -298,8 +309,6 @@ ${TRAINING_PYTHON} paddlemix/examples/eva02/run_eva02_finetune_dist.py \ | |
--input_size ${input_size} \ | ||
--layer_decay ${layer_decay} \ | ||
--drop_path ${drop_path} \ | ||
--smoothing ${smoothing} \ | ||
--do_train \ | ||
--optim ${optim} \ | ||
--learning_rate ${lr} \ | ||
--weight_decay ${weight_decay} \ | ||
|
@@ -332,26 +341,31 @@ ${TRAINING_PYTHON} paddlemix/examples/eva02/run_eva02_finetune_dist.py \ | |
--fp16 ${USE_AMP} \ | ||
``` | ||
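
The `batch_size` comment in the script above multiplies four factors to get the total batch size. A minimal sketch of that arithmetic follows; `total_batch_size` is a hypothetical helper, and the parameter names mirror the shell variables rather than any PaddleMIX API.

```python
def total_batch_size(bsz_per_gpu, gpus_per_node, nodes=1, accum_freq=1):
    """Effective batch size per optimizer update.

    Mirrors the script comment:
    bsz_per_gpu * #gpus_per_node * #nodes * update_freq = total_bsz
    """
    return bsz_per_gpu * gpus_per_node * nodes * accum_freq


# Finetune example: batch_size=128, 8 GPUs, 1 node, accum_freq=1
print(total_batch_size(128, 8, nodes=1, accum_freq=1))  # 1024
```

Raising `accum_freq` to 2 with the other values unchanged would double the total to 2048.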
|
||
Note that tiny/s are trained at 336 resolution and B/L at 448 resolution, while their pretrained weights are all at 224 resolution. | |
|
||
### 4.3 Evaluation | |
|
||
Use `paddlemix/examples/eva02/run_eva02_finetune_eval.py`. | |
|
||
### 4.3 Evaluation | |
Notes: | |
|
||
1. By default, the trained weights from the downloaded `paddlemix/EVA/EVA02/eva02_Ti_pt_in21k_ft_in1k_p14` are loaded, so PRETRAIN_CKPT=None. For newly trained local weights, set PRETRAIN_CKPT to their path to load and evaluate them; | |
|
||
Use `paddlemix/examples/eva02/run_eva02_finetune_eval.py` | |
|
||
```shell | ||
MODEL_NAME="paddlemix/EVA/EVA02/eva02_Ti_pt_in21k_ft_in1k_p14" | ||
DATA_PATH=./datasets/ILSVRC2012 | ||
DATA_PATH=./dataset/ILSVRC2012 # put your ImageNet-1k val data path | ||
OUTPUT_DIR=./outputs | ||
|
||
input_size=336 | ||
batch_size=128 | ||
num_workers=10 | ||
|
||
PRETRAIN_CKPT=None # output/eva02_Ti_pt_in21k_ft_in1k_p14/checkpoint-best/model_state.pdparams | ||
|
||
CUDA_VISIBLE_DEVICES=0 python paddlemix/examples/eva02/run_eva02_finetune_eval.py \ | ||
--do_eval \ | ||
--model ${MODEL_NAME} \ | ||
--pretrained_model_path ${PRETRAIN_CKPT} \ | ||
--eval_data_path ${DATA_PATH}/val \ | ||
--input_size ${input_size} \ | ||
--per_device_eval_batch_size ${batch_size} \ | ||
|
@@ -364,7 +378,7 @@ CUDA_VISIBLE_DEVICES=0 python paddlemix/examples/eva02/run_eva02_finetune_eval.p | |
``` | ||
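# Parameter description | |
--model # the model to use, e.g. `EVA/EVA02/eva02_Ti_pt_in21k_ft_in1k_p14`; it must start with `EVA/EVA02/`, and the model name after that prefix can be replaced as needed | |
--model # the model to use, e.g. `paddlemix/EVA/EVA02/eva02_Ti_pt_in21k_ft_in1k_p14`, which is downloaded automatically; a path on the local machine also works, and the model name can be replaced as needed | |
--eval_data_path # path to the evaluation data | |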
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -67,14 +67,6 @@ python setup.py install --prefix=$INSTALL_DIR | |
export $PATH=$PATH:$INSTALL_DIR | ||
``` | ||
|
||
4) Install paddlemix | |
|
||
``` | ||
git clone git@github.com:PaddlePaddle/PaddleMIX.git | |
cd PaddleMix | ||
python setup.py install | ||
``` | ||
|
||
## 3. Data preparation | |
|
||
1) COCO data | |
|