This guide outlines the steps for fine-tuning the SAM model on a custom dataset. The process is particularly useful for complex scenarios that involve target-specific training within images: by excluding non-target areas, it allows for more precise model behavior.
- Preparing Your Dataset
- Training the Model
- Making Predictions
- Evaluating the Model
- Additional Tools
- Contact Information
- Dataset Preparation: Store your annotated images and their corresponding JSON annotation files in the `datasets\before` folder.
- Data Conversion:
  - Execute `json_to_dataset.py` or `json_to_dataset_only.py` to generate PNG-format annotated masks and JPEG-format original images. `json_to_dataset.py` generates semantic-segmentation-style masks and original images; `json_to_dataset_only.py` produces separate annotated masks for the individual entities within an image, alongside the corresponding JSON annotations (a conversion sketch follows this list).
  - Original images go to `datasets\JPEGImages`, JSON files to `datasets\json`, and annotated masks to `datasets\SegmentationClass`.
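As a rough illustration of what the conversion step does, here is a minimal sketch that rasterizes labelme-style JSON polygons into a single-channel mask. The `LABEL_TO_INDEX` class map and the file names are illustrative assumptions, not the actual contents of `json_to_dataset.py`.

```python
# Sketch only: rasterize labelme-style polygon annotations into a class-index mask.
import json
from PIL import Image, ImageDraw

LABEL_TO_INDEX = {"_background_": 0, "pore": 1}  # hypothetical class-to-index map

def json_to_mask(json_path):
    """Rasterize the polygons of one annotation file into a single-channel mask."""
    with open(json_path, "r", encoding="utf-8") as f:
        ann = json.load(f)
    # labelme JSON stores the image size alongside the polygon shapes.
    mask = Image.new("L", (ann["imageWidth"], ann["imageHeight"]), 0)
    draw = ImageDraw.Draw(mask)
    for shape in ann.get("shapes", []):
        value = LABEL_TO_INDEX.get(shape["label"], 0)
        draw.polygon([tuple(pt) for pt in shape["points"]], fill=value)
    return mask

# Hypothetical usage:
# mask = json_to_mask("datasets/before/example.json")
# mask.save("datasets/SegmentationClass/example.png")
```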
- Preparing the VOCdevkit/VOC2007 Folder:
  - Use `resize.py` to resize the original images and masks and place them in `VOCdevkit\VOC2007\JPEGImages` and `VOCdevkit\VOC2007\SegmentationClass`.
  - Use `json_scaling.py` to resize the JSON files for `VOCdevkit\VOC2007\json`.
  - Note: Inconsistent image dimensions can misalign masks during training, so normalize every image, mask, and JSON annotation in `VOCdevkit/VOC2007` to the same size with `resize.py` and `json_scaling.py` (a resizing sketch follows this list).
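A minimal sketch of the resizing idea behind `resize.py` and `json_scaling.py`, assuming a fixed 512×512 target size; the actual scripts may use a different size, interpolation, or directory layout.

```python
# Sketch only: resize an image/mask pair and scale the matching JSON polygons.
import json
from PIL import Image

TARGET_SIZE = (512, 512)  # assumed target resolution

def resize_image(src_path, dst_path, is_mask=False):
    img = Image.open(src_path)
    # Nearest-neighbour keeps mask label values intact; bilinear is fine for photos.
    resample = Image.NEAREST if is_mask else Image.BILINEAR
    img.resize(TARGET_SIZE, resample).save(dst_path)

def scale_json(src_path, dst_path):
    with open(src_path, "r", encoding="utf-8") as f:
        ann = json.load(f)
    sx = TARGET_SIZE[0] / ann["imageWidth"]
    sy = TARGET_SIZE[1] / ann["imageHeight"]
    for shape in ann["shapes"]:
        shape["points"] = [[x * sx, y * sy] for x, y in shape["points"]]
    ann["imageWidth"], ann["imageHeight"] = TARGET_SIZE
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(ann, f)
```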
- Dataset Splitting: Run `voc_annotation.py` to divide the dataset into training and validation sets; the resulting split lists are written to `VOCdevkit\VOC2007\ImageSets\Segmentation` (see the splitting sketch below).
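A minimal sketch of a VOC-style train/validation split of the kind `voc_annotation.py` produces; the 9:1 ratio and the fixed random seed are assumptions.

```python
# Sketch only: write train.txt / val.txt lists of mask file stems.
import os
import random

root = "VOCdevkit/VOC2007"
names = [f[:-4] for f in os.listdir(os.path.join(root, "SegmentationClass")) if f.endswith(".png")]
random.seed(0)
random.shuffle(names)

split = int(0.9 * len(names))  # assumed 90/10 split
sets_dir = os.path.join(root, "ImageSets", "Segmentation")
os.makedirs(sets_dir, exist_ok=True)
with open(os.path.join(sets_dir, "train.txt"), "w") as f:
    f.write("\n".join(names[:split]))
with open(os.path.join(sets_dir, "val.txt"), "w") as f:
    f.write("\n".join(names[split:]))
```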
- Modifying the Training File (`train.py`):
  - Choose the optimizer: `optimizer_type = "adam"`.
  - Select the SAM model type: `model_type = 'vit_h'`.
  - Specify the model path: `model_path = "checkpoint/sam_vit_h_4b8939.pth"`. Download the checkpoint from the SAM GitHub repository.
  - Set the number of training epochs: `epoch = 200`.
  - Set the initial learning rate: `Init_lr = 1e-5`.
  - Set the model saving frequency: `save_period = 10`.
  - Set the training results directory: `save_dir = 'logs'`.
  - Set the training dataset path: `train_dataset_path = 'VOCdevkit/VOC2007'`.
  - If applicable, set `class_weight` to balance the loss function (the settings are collected in the sketch after this list).
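For reference, the settings listed above might be collected near the top of `train.py` roughly as follows; the values are the ones suggested above, and the `class_weight` example is hypothetical.

```python
# Training configuration (values as suggested in the list above).
optimizer_type     = "adam"
model_type         = "vit_h"
model_path         = "checkpoint/sam_vit_h_4b8939.pth"   # downloaded SAM checkpoint
epoch              = 200        # number of training epochs
Init_lr            = 1e-5       # initial learning rate
save_period        = 10         # save a checkpoint every N epochs
save_dir           = "logs"     # where weights and training logs are written
train_dataset_path = "VOCdevkit/VOC2007"
# Optional, hypothetical example: per-class weights to balance the loss,
# ordered by class index (e.g. background, pore).
# class_weight = [1.0, 2.0]
```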
- Starting Training: Launch `train.py` to begin training.

After training, use `predict.py` to make predictions:
- Set the prediction dataset path: `test_dataset_path = 'img'`.
- Specify the target class: `label_filter = 'pore'`.
- Test different bounding-box sizes by setting `min_scale = 1` and `max_scale = 1`.
- Run `predict.py`; the results are saved in `img_out` (a box-prompt prediction sketch follows this list).
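A minimal sketch of box-prompted prediction with the official `segment_anything` API, assuming `predict.py` builds a bounding box around the target and scales it by `min_scale`/`max_scale`; the file names and the exact prompt construction here are assumptions.

```python
# Sketch only: run SAM with a scaled box prompt and save the resulting mask.
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="checkpoint/sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# Hypothetical test image; SamPredictor expects an RGB uint8 array.
image = cv2.cvtColor(cv2.imread("img/example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

box = np.array([50, 60, 200, 220])   # hypothetical [x0, y0, x1, y1] around the target
scale = 1.0                           # min_scale = max_scale = 1 keeps the box unchanged
cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
w, h = (box[2] - box[0]) * scale, (box[3] - box[1]) * scale
scaled_box = np.array([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])

masks, scores, _ = predictor.predict(box=scaled_box, multimask_output=False)
# Save the binary mask (img_out is assumed to exist).
cv2.imwrite("img_out/example_mask.png", (masks[0] * 255).astype(np.uint8))
```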
- Ground-truth masks of the test images are located in `img\test`, while the predictions reside in `img_out`.
- Run `Metric Calculation.py` to compute the evaluation metrics (an IoU/Dice sketch follows this list).
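A minimal sketch of how IoU and Dice could be computed between a ground-truth mask in `img\test` and a prediction in `img_out`; `Metric Calculation.py` may report additional metrics, and the threshold and file names here are assumptions.

```python
# Sketch only: binary IoU and Dice between a ground-truth mask and a prediction.
import numpy as np
import cv2

def iou_and_dice(gt_path, pred_path, threshold=127):
    gt = cv2.imread(gt_path, cv2.IMREAD_GRAYSCALE) > threshold
    pred = cv2.imread(pred_path, cv2.IMREAD_GRAYSCALE) > threshold
    inter = np.logical_and(gt, pred).sum()
    union = np.logical_or(gt, pred).sum()
    total = gt.sum() + pred.sum()
    iou = inter / union if union else 1.0
    dice = 2 * inter / total if total else 1.0
    return iou, dice

# Hypothetical usage:
# print(iou_and_dice("img/test/example.png", "img_out/example_mask.png"))
```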
- `extract_matching_images.py`: extracts images from a folder based on the file names listed in a txt file (a minimal sketch follows).
- `convert_labels_color.py`: aligns training mask colors with prediction mask colors for consistency.
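A minimal sketch of the idea behind `extract_matching_images.py`, assuming the txt file lists one image file name per line; the actual script's arguments and behavior may differ.

```python
# Sketch only: copy the images named in a txt file from one folder to another.
import os
import shutil

def extract_matching_images(list_txt, src_dir, dst_dir):
    os.makedirs(dst_dir, exist_ok=True)
    with open(list_txt, "r", encoding="utf-8") as f:
        names = [line.strip() for line in f if line.strip()]
    for name in names:
        src = os.path.join(src_dir, name)
        if os.path.isfile(src):
            shutil.copy(src, os.path.join(dst_dir, name))
```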
For any issues or queries regarding the code or dataset, feel free to reach out.
- Email: [email protected]