Content:
- Build the Docker image:
  ```
  docker build -t ai4vn-teamrealcottoncandy .
  ```
- Run the Docker container:
  ```
  docker run ai4vn-teamrealcottoncandy
  ```
- Mount a folder in the Docker container to a real folder (change the source folder to an appropriate one):
  ```
  docker run -d -it --name ai4vn-teamrealcottoncandy --mount source=/mnt/disk1/ai4vn-teamrealcottoncandy,target=/workspace ai4vn-teamrealcottoncandy
  ```
- Download the OCR training data, then extract it to the folder `training/ocr/vietocr_data`. Note that the images in this dataset are cropped from the original prescription images of the VAIPE contest. After this step, the folder should have the following structure:
  ```
  training/ocr/vietocr_data/
  |---images/
  |---train_annotations.txt
  |---valid_annotations.txt
  ```
- Run the OCR training with the following command:
  ```
  python training/ocr/train_ocr.py
  ```
- When training finishes successfully, the weights will be stored at `seq2seq_finetuned.pth`. To use them for inference, copy the file to `weights/ocr/seq2seq_finetuned.pth`.
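The copy step described above can be sketched as follows (paths taken from this README; adjust if your training run writes the checkpoint elsewhere):

```shell
# Copy the fine-tuned OCR weights into the folder that inference expects.
mkdir -p weights/ocr
if [ -f seq2seq_finetuned.pth ]; then
  cp seq2seq_finetuned.pth weights/ocr/seq2seq_finetuned.pth
  echo "copied OCR weights to weights/ocr/"
else
  echo "seq2seq_finetuned.pth not found; run the OCR training first"
fi
```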
- We will train 2 versions of YOLOv5, with different types of dataset:
  - Download the training data with 107 classes, then extract it to the folder `training/detection/yolo_107/yolo_107_data`. We created this dataset by cutting out the pills of class id 107 (out of prescription) from the images, so we have 107 class ids, from 0 to 106. After this step, the folder should have the following structure:
    ```
    training/detection/yolo_107/yolo_107_data/
    |---train/
    |---val/
    ```
  - Follow similar steps for the training data with 1 class and the folder `training/detection/yolo_1/yolo_1_data`. This dataset is created by replacing the class_id of all pills with 0 (this YOLOv5 is used only for bounding-box detection).
- Run the following commands to train the YOLOv5 models:
  - YOLOv5 with 107 classes:
    ```
    python training/detection/yolo_107/train.py --img 640 --batch 16 --epochs 100 --data training/detection/yolo_107/vaipe.yaml --weights yolov5s.pt
    ```
  - YOLOv5 with 1 class:
    ```
    python training/detection/yolo_1/train.py --img 640 --batch 16 --epochs 100 --data training/detection/yolo_1/vaipe.yaml --weights yolov5s.pt
    ```
- When the trainings finish successfully, the weights will be stored in these locations:
  - `training/detection/yolo_107/runs/exp/train/weights/best.pt` for YOLOv5 with 107 classes. We move it to the path `weights/detection/yolov5_weights_with_label.pt` for inference (the weights file is renamed in the process).
  - `training/detection/yolo_1/runs/exp/train/weights/best.pt` for YOLOv5 with 1 class. Do the same as in the previous step, using the file name `yolov5_weights_without_label.pt`.
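The two copy-and-rename steps above can be sketched as follows (paths from this README; each copy is skipped if the corresponding training run has not produced a `best.pt` yet):

```shell
# Move both YOLOv5 checkpoints to the renamed paths used for inference.
mkdir -p weights/detection
if [ -f training/detection/yolo_107/runs/exp/train/weights/best.pt ]; then
  cp training/detection/yolo_107/runs/exp/train/weights/best.pt \
     weights/detection/yolov5_weights_with_label.pt
fi
if [ -f training/detection/yolo_1/runs/exp/train/weights/best.pt ]; then
  cp training/detection/yolo_1/runs/exp/train/weights/best.pt \
     weights/detection/yolov5_weights_without_label.pt
fi
```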
- Download the training data and extract it to the folder `training/classification/cls_data`. We created this dataset by cropping the pills from the original images of the VAIPE contest. After this step, the folder should have the following structure:
  ```
  training/classification/cls_data/
  |---single_pill/
  |---single_pill.json
  ```
- Run the following command:
  ```
  python training/classification/train_cls.py
  ```
For inference, we can use the weights from section 2. Training, or download the weights from our Drive.
Note: we have also included these weights in the Docker container. If you can't find the `weights` folder, follow section 3.1. Download trained weights.
- OCR weights:
  - Download the text detector weights from PaddleOCR and extract them, then download the text classifier weights (fine-tuned from the pre-trained model of vietocr).
  - Put these weights in the path `weights/ocr/`. After these steps, the folder `weights/ocr` should have the following structure:
    ```
    weights/ocr/
    |---ch_PP-OCRv3_det_infer/
    |    |--- files
    |---seq2seq_finetuned.pth
    ```
- Pill detection:
  - Download the YOLOv5 weights without label and the YOLOv5 weights with label, then put them in the path `weights/detection/`.
- Pill classification:
  - Download the entire weight folders of Swin Transformer V2 and Swin Tiny, then put them in the path `weights/cls/`.
  - After that, the folder `weights/cls/` should have the following structure:
    ```
    weights/cls/
    |---swinv2_kfold/
    |    |--- *.pth
    |---swin_tiny_kfold/
    |    |--- *.pth
    ```
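As a quick sanity check before inference, the weight files and folders described above can be verified like this (file names taken from this README):

```shell
# Count how many of the expected weight files/folders are still missing.
missing=0
for w in weights/ocr/seq2seq_finetuned.pth \
         weights/detection/yolov5_weights_with_label.pt \
         weights/detection/yolov5_weights_without_label.pt \
         weights/cls/swinv2_kfold \
         weights/cls/swin_tiny_kfold; do
  if [ ! -e "$w" ]; then
    echo "missing: $w"
    missing=$((missing + 1))
  fi
done
echo "$missing of 5 expected weight paths are missing"
```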
The data folder `data_path`'s structure has to be in the following format:
```
data_path/
|---pill/
|    |---image/
|---prescription/
|    |---image/
|---pill_pres_map.json
```
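A sketch that checks the data folder has this layout before running inference (`DATA_PATH` defaults here to `data/public_test`; override it for your own data folder):

```shell
# Verify the expected sub-paths of data_path exist.
DATA_PATH="${DATA_PATH:-data/public_test}"
ok=1
for p in pill/image prescription/image pill_pres_map.json; do
  if [ ! -e "$DATA_PATH/$p" ]; then
    echo "missing: $DATA_PATH/$p"
    ok=0
  fi
done
if [ "$ok" -eq 1 ]; then
  echo "data folder layout looks correct"
fi
```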
Before running the generate-results script, we have to specify the path to the data folder in the file `configs/config_inference.yaml`, at these 3 lines (the following values are for illustration):
```
pill_image_dir: data_path/pill/image
pres_image_dir: data_path/prescription/image
pill_pres_map_path: data_path/pill_pres_map.json
```
Currently, we set `data_path` to `data/public_test`.
To generate the results, run the following command:
```
python generate_results.py
```
The result file will be stored at `results/csv/results.csv`.
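Once `generate_results.py` finishes, the output can be spot-checked like this (path from this README):

```shell
# Print the first few rows of the generated CSV, if it exists.
RESULTS_CSV=results/csv/results.csv
if [ -f "$RESULTS_CSV" ]; then
  head -n 5 "$RESULTS_CSV"
else
  echo "$RESULTS_CSV not found; run generate_results.py first"
fi
```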