Guidelines on how to train the model your own dataset. #14

Closed
IemProg opened this issue May 10, 2020 · 4 comments
Labels
good first issue Good for newcomers

Comments

@IemProg

IemProg commented May 10, 2020

Could you please improve the documentation on how to use the library with a pre-trained model?

I would like to use it on my own dataset if possible.
Thanks

@PkuRainBow
Contributor

@IemProg Thanks for the suggestion; we will improve the documentation.

Do you mean the details on how to train the models on your own dataset?

@IemProg
Author

IemProg commented May 10, 2020

Yes, please, especially for the case where the dataset is not in "Yaml" format. My dataset is in JPG format.

Thanks !

@PkuRainBow
Contributor

PkuRainBow commented May 11, 2020

@IemProg In fact, the dataset does not need to be in "Yaml" format; JPG is totally OK.

Below are some overall (coarse) guidelines on how to train the model on your own dataset. We hope they help.

  • First, create a set of config files under the folder openseg.pytorch/configs/your_dataset_name, following the configs of the other datasets. The coco_stuff config is shown below as an example:

"dataset": "coco_stuff",
"method": "fcn_segmentor",
"data": {
  "image_tool": "cv2",
  "input_mode": "BGR",
  "num_classes": 171,
  "label_list": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20,
                 21, 22, 23, 24, 25, 27, 28, 31, 32, 33, 34, 35, 36, 37, 38, 39,
                 40, 41, 42, 43, 44, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
                 59, 60, 61, 62, 63, 64, 65, 67, 70, 72, 73, 74, 75, 76, 77,
                 78, 79, 80, 81, 82, 84, 85, 86, 87, 88, 89, 90, 92, 93, 94, 95, 96,
                 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112,
                 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128,
                 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144,
                 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160,
                 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176,
                 177, 178, 179, 180, 181, 182],
  "reduce_zero_label": true,
  "data_dir": "~/DataSet/pascal_context",
  "workers": 8
},
"train": {
  "batch_size": 16,
  "data_transformer": {
    "size_mode": "fix_size",
    "input_size": [520, 520],
    "align_method": "only_pad",
    "pad_mode": "random"
  }
},
"val": {
  "batch_size": 4,
  "mode": "ss_test",
  "data_transformer": {
    "size_mode": "diverse_size",
    "align_method": "only_pad",
    "pad_mode": "pad_right_down"
  }
},
"test": {
  "mode": "ss_test",
  "batch_size": 4,
  "crop_size": [520, 520],
  "scale_search": [0.5, 0.75, 1, 1.25, 1.5, 1.75, 2],
  "data_transformer": {
    "size_mode": "diverse_size"
  }
},
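In the coco_stuff excerpt, "label_list" has 172 entries while "num_classes" is 171 and "reduce_zero_label" is true. One plausible reading, which is an assumption and is not confirmed in this thread, is that "label_list" enumerates the raw pixel values found in the label images and that the value 0 is not counted as a class when "reduce_zero_label" is set. Under that assumption, the fields for a new dataset could be derived roughly as follows (derive_label_fields is a hypothetical helper, not part of openseg.pytorch):

```python
def derive_label_fields(raw_values, reduce_zero_label):
    """Derive label-related config fields from raw label pixel values.

    ASSUMPTION (not confirmed by the maintainers): label_list holds every raw
    value that occurs in the label images, and value 0 is excluded from
    num_classes when reduce_zero_label is true.
    """
    label_list = sorted({int(v) for v in raw_values})
    num_classes = len(label_list)
    if reduce_zero_label and 0 in label_list:
        num_classes -= 1  # value 0 is treated as "unlabeled" under the assumption
    return {
        "num_classes": num_classes,
        "label_list": label_list,
        "reduce_zero_label": reduce_zero_label,
    }
```

Applied to the values 0 to 182 minus the eleven ids absent from the coco_stuff list, this gives 172 list entries and 171 classes, which matches the excerpt. Please check the result against the data loader before relying on it.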

You need to change several keys in the JSON file, including "dataset", "num_classes", "label_list", "reduce_zero_label", "input_size", "crop_size", "base_lr", and so on. You can also override these parameters in the training script (excerpt below):

if [ "$1"x == "train"x ]; then
  # Launch training; variables such as ${CONFIGS} and ${DATA_DIR} are defined in the script.
  ${PYTHON} -u main.py --configs ${CONFIGS} \
    --drop_last y \
    --nbb_mult 10 \
    --phase train \
    --gathered n \
    --loss_balance y \
    --log_to_file n \
    --backbone ${BACKBONE} \
    --model_name ${MODEL_NAME} \
    --gpu 0 1 2 3 \
    --data_dir ${DATA_DIR} \
    --loss_type ${LOSS_TYPE} \
    --max_iters ${MAX_ITERS} \
    --checkpoints_name ${CHECKPOINTS_NAME} \
    --pretrained ${PRETRAINED_MODEL} \
    2>&1 | tee ${LOG_FILE}
fi
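As a rough illustration of the config step, the keys named above could be replaced programmatically starting from an existing config. This is only a sketch: make_dataset_config and write_config are hypothetical helpers, the nesting of the keys is taken from the coco_stuff excerpt, and "base_lr" is left untouched because its location in the config is not shown in this thread.

```python
import copy
import json
from pathlib import Path


def make_dataset_config(template, name, num_classes, label_list,
                        reduce_zero_label, data_dir, size):
    """Return a copy of `template` with the dataset-specific keys replaced.

    `template` is a dict loaded from an existing config; it is not modified.
    Key locations follow the coco_stuff excerpt and are otherwise an assumption.
    """
    cfg = copy.deepcopy(template)
    cfg["dataset"] = name
    cfg["data"]["num_classes"] = num_classes
    cfg["data"]["label_list"] = list(label_list)
    cfg["data"]["reduce_zero_label"] = reduce_zero_label
    cfg["data"]["data_dir"] = data_dir
    cfg["train"]["data_transformer"]["input_size"] = list(size)
    cfg["test"]["crop_size"] = list(size)
    return cfg


def write_config(cfg, path):
    """Write `cfg` as JSON, creating parent folders such as configs/your_dataset_name."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(cfg, indent=2))
    return path
```

A typical use would be to load an existing dataset's JSON with json.load, pass the resulting dict to make_dataset_config, and write the result under openseg.pytorch/configs/your_dataset_name with a file name of your choice.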

  • Second, organize your training/validation dataset in a folder structure like the one below:
├── your_dataset_name
│   ├── train
│   │   ├── image
│   │   └── label
│   ├── val
│   │   ├── image
│   │   └── label
  • Third, prepare a training script following the example below, changing DATA_DIR, SAVE_DIR, CONFIGS, and all other settings accordingly:

https://github.com/openseg-group/openseg.pytorch/blob/db0d3894673015e9350881db2d02175b0a263368/scripts/coco_stuff/run_h_48_d_4_ocr_train.sh
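Before launching training, it may help to create the folder layout described in the second step and confirm that every image has a label. The sketch below assumes that an image and its label share the same file stem (for example a.jpg and a.png); that pairing rule is an assumption, not something stated in this thread, and make_layout and find_unpaired are hypothetical helpers.

```python
from pathlib import Path

SPLITS = ("train", "val")
SUBDIRS = ("image", "label")


def make_layout(root):
    """Create <root>/{train,val}/{image,label} if they do not exist yet."""
    root = Path(root)
    for split in SPLITS:
        for sub in SUBDIRS:
            (root / split / sub).mkdir(parents=True, exist_ok=True)


def find_unpaired(root):
    """Return, per split, the sorted file stems present in only one of image/ and label/.

    ASSUMPTION: an image and its label share the same file stem.
    """
    root = Path(root)
    report = {}
    for split in SPLITS:
        images = {p.stem for p in (root / split / "image").iterdir() if p.is_file()}
        labels = {p.stem for p in (root / split / "label").iterdir() if p.is_file()}
        report[split] = sorted(images ^ labels)  # symmetric difference
    return report
```

An empty list for both splits means every file has a counterpart; any stems listed are worth fixing before pointing DATA_DIR at the folder.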

PkuRainBow added the "good first issue" (Good for newcomers) label May 11, 2020
PkuRainBow pinned this issue May 11, 2020
PkuRainBow changed the title from "Documentation" to "Guidelines on how to train the model your own dataset." May 11, 2020
IemProg closed this as completed May 14, 2020
@jhyin12

jhyin12 commented Oct 9, 2022

It seems these guidelines are not suitable for training segfix on my own dataset.
