This repository contains the code that, given an image or a series of images, fits a tennis court model inside it.
- `modelFitting.py`: run with the HAWP model and line scoring.
- `modelFitting_letr.py`: run using the LETR model and line scoring.
- `training`: contains all the scripts to build the datasets used for the transfer learning of the neural networks.
- `testing-dataset`: dataset with sample images of tennis, basketball and football courts; it is used to test the performance of the algorithm and also includes the annotation file for the tennis court fields.
- `training-dataset`: dataset with sample images of tennis courts from various viewpoints; it also contains the annotation file used for the transfer learning of LETR.
The other branches contain the same file structure but use different systems to score the line fitting. We suggest using the main branch, which is the one that has shown the best results.
First clone the repository:
git clone https://github.com/TEXflip/sport-court-detection.git
git submodule init
git submodule update --remote
To run the system you need Python installed. Currently the system supports Python <= 3.8 (mainly because of the HAWP part; the LETR part also supports Python 3.9). Most of the packages listed below can be installed with recent versions of pip.
The required packages are:
- `scikit-learn`: tested with 0.24.2.
- `torch`: tested with 1.7.1 (but more recent versions are known to work). The program automatically detects whether a GPU is available and otherwise runs on the CPU.
- `torchvision`: tested with 0.8.2. Versions older than this are missing some required functions.
- `numpy`: tested with 1.19.5. Other versions probably work without problems.
- `matplotlib`: tested with 3.3.4. Other versions probably work without problems.
- `PIL`: tested with 8.1.0. Other versions probably work without problems.
- `CocoAPI`: we suggest installing it directly from source with the command `pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'`.
- `docopt`: tested with 0.6.2. Other versions probably work without problems.
- `cv2`: tested with 4.5.1. We usually recommend installing OpenCV from source. Other versions probably work as well, but special attention must be paid to compatibility with `PyTorch`.
- `networkx`
- `shapely`
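For convenience, the tested versions above can be collected into a requirements file. A possible sketch (the pins reflect the tested versions, not hard requirements; `opencv-python` here stands in for the from-source OpenCV build recommended above):

```
scikit-learn==0.24.2
torch==1.7.1
torchvision==0.8.2
numpy==1.19.5
matplotlib==3.3.4
Pillow==8.1.0
docopt==0.6.2
opencv-python==4.5.1.48
networkx
shapely
git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI
```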
The pretrained model is available here.
To create the dataset, run the following command, replacing the parts within square brackets with the file locations.
python3 training/build_dataset_letr.py [training annotations filepath] [training image directory] [output directory dirpath] --test_cvat_annotations_filepath [filepath to the testing annotation]
--test_img_directory [dirpath to the directory of the images to use as test]
LETR must be trained in stages. While the entire model is available pretrained, the individual stages are not, and it is not possible to start the transfer learning of the first stage from the final model. So it is first necessary to download the Wireframe Dataset; please check the LETR repository on how to do it. Then run this command to train the first part, replacing the square-bracket parts with the correct paths:
PYTHONPATH=$PYTHONPATH:./LETR/src python -m torch.distributed.launch \
--master_port=$((1000 + RANDOM % 9999)) --nproc_per_node=8 --use_env src/main.py --coco_path [wireframe processed dirpath] \
--output_dir [output dirpath] --backbone resnet50 --resume https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth \
--batch_size 1 --epochs 500 --lr_drop 200 --num_queries 1000 --num_gpus 8 --layer1_num 3 | tee -a [output dirpath]/history.txt
Then, to start the transfer learning, run the following command, replacing the dirpaths as before:
PYTHONPATH=$PYTHONPATH:./LETR/src python -m torch.distributed.launch \
--master_port=$((1000 + RANDOM % 9999)) --nproc_per_node=8 --use_env src/main.py --coco_path [tennis court dataset dirpath] \
--output_dir [output dirpath] --backbone resnet50 --resume [Last output dirpath] \
--batch_size 1 --epochs 1000 --lr_drop 200 --num_queries 1000 --num_gpus 8 --layer1_num 3 | tee -a [output dirpath]/history.txt
You can then train the other layers:
PYTHONPATH=$PYTHONPATH:./LETR/src python -m torch.distributed.launch \
--master_port=$((1000 + RANDOM % 9999)) --nproc_per_node=8 --use_env src/main.py --coco_path [tennis court dataset dirpath] \
--output_dir [output dirpath] --LETRpost --backbone resnet50 --layer1_frozen --frozen_weights [stage 1 TL checkpoint] --no_opt \
--batch_size 1 --epochs 300 --lr_drop 120 --num_queries 1000 --num_gpus 8 | tee -a [output dirpath]/history.txt
PYTHONPATH=$PYTHONPATH:./LETR/src python -m torch.distributed.launch \
--master_port=$((1000 + RANDOM % 9999)) --nproc_per_node=8 --use_env src/main.py --coco_path [tennis court dataset dirpath] \
--output_dir [output dirpath] --LETRpost --backbone resnet50 --layer1_frozen --resume [stage 2 TL checkpoint] \
--no_opt --batch_size 1 --epochs 25 --lr_drop 25 --num_queries 1000 --num_gpus 8 --lr 1e-5 --label_loss_func focal_loss \
--label_loss_params '{"gamma":2.0}' --save_freq 1 | tee -a [output dirpath]/history.txt
At this point it is possible to use the checkpoint of the last stage to evaluate the performance of the system.
To run the LETR-based system with line scoring, use the following command, replacing the square brackets with the corresponding paths.
PYTHONPATH=$PYTHONPATH:./LETR/src python modelFitting_letr.py --checkpoint-filepath [last stage checkpoint filepath] --img [image filepath] --output_path [dirpath where to save the result]
To run the HAWP-based pipeline it is first necessary to download the pretrained model; please refer to the HAWP repository for it.
Then it is possible to run the pipeline using the following command (replace the square brackets with the corresponding paths):
python modelFitting.py --config-file [hawp config filepath, with the default config it should be hawp/config-files/hawp.yaml] --img [image filepath] --output_path [dirpath where to save the result]
- The image is fed to one of the line detection neural networks (LETR or HAWP)
- The output is a set of lines, each defined as a pair of 2D points
- The lines are then filtered using 3 different filters
- The resulting lines are used to find a homography:
- for each pair of lines, from both the detected lines and the model template, 4 points are used to compute a homography; the projection (using the resulting matrix) is then evaluated with one of the two scoring techniques
- The projection with the best score is the final result
The image below illustrates the flow of an image through the pipeline.
In order to reduce the number of lines, 3 filters have been implemented:
The first filter mainly removes overlapped lines: for every pair of lines AB and CD, if the angle of the intersection between AB and CD is smaller than a threshold and min(AC, AD, BC, BD) < threshold, it keeps the shorter line.
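A minimal Python sketch of this filter, assuming lines are given as pairs of (x, y) endpoints (the threshold values here are illustrative, not the ones used in the repository):

```python
import numpy as np

def overlap_filter(lines, angle_thresh=np.deg2rad(5), dist_thresh=10.0):
    """Remove overlapped lines: for near-parallel pairs whose endpoints are
    close, keep only the shorter line (sketch of the filter above)."""
    def length(line):
        return np.linalg.norm(np.subtract(line[1], line[0]))

    def angle(l1, l2):
        v1, v2 = np.subtract(l1[1], l1[0]), np.subtract(l2[1], l2[0])
        cos = abs(np.dot(v1, v2)) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
        return np.arccos(np.clip(cos, 0.0, 1.0))  # acute angle between directions

    kept = list(lines)
    for i, l1 in enumerate(lines):
        for l2 in lines[i + 1:]:
            if l1 not in kept or l2 not in kept:
                continue
            (A, B), (C, D) = l1, l2
            min_end = min(np.linalg.norm(np.subtract(p, q))
                          for p in (A, B) for q in (C, D))  # min(AC, AD, BC, BD)
            if angle(l1, l2) < angle_thresh and min_end < dist_thresh:
                kept.remove(max(l1, l2, key=length))  # keep the shorter line
    return kept
```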
The second filter removes the lines that do not overlap the white pixels (or, more generally, the pixels with the color of the court lines). It works as follows:
- create a mask LinesMask from an n×m black image and draw the lines (with thickness = 6px)
- apply LinesMask on the image
- initialize a Gaussian mixture with 3 Gaussians on the masked image
- get the Gaussian g fitting the (255,255,255) color, or the court line color
- produce a mask CandidateLinesMask by selecting the pixels fitted by g and applying LinesMask
- for each line:
  - produce an n×m black image and draw the line
  - get the number of pixels p overlapping CandidateLinesMask
- keep the lines with p > 0.5·(length of the line)
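A condensed Python sketch of this filter using `sklearn.mixture.GaussianMixture` and OpenCV drawing (here the white component is picked as the one whose mean is closest to (255, 255, 255); function and variable names are illustrative):

```python
import cv2
import numpy as np
from sklearn.mixture import GaussianMixture

def color_filter(lines, image, thickness=6):
    """Keep only the lines whose pixels mostly belong to the Gaussian
    component closest to white (sketch of the filter described above)."""
    h, w = image.shape[:2]
    lines_mask = np.zeros((h, w), dtype=np.uint8)
    for A, B in lines:
        cv2.line(lines_mask, tuple(map(int, A)), tuple(map(int, B)), 255, thickness)

    # fit a 3-component Gaussian mixture on the pixels under LinesMask
    samples = image[lines_mask > 0].reshape(-1, 3).astype(np.float64)
    gmm = GaussianMixture(n_components=3, random_state=0).fit(samples)
    g = np.argmin(np.linalg.norm(gmm.means_ - 255.0, axis=1))  # component nearest white

    # CandidateLinesMask: masked pixels predicted as component g
    candidate = np.zeros((h, w), dtype=bool)
    candidate[lines_mask > 0] = gmm.predict(samples) == g

    kept = []
    for A, B in lines:
        single = np.zeros((h, w), dtype=np.uint8)
        cv2.line(single, tuple(map(int, A)), tuple(map(int, B)), 255, thickness)
        p = np.count_nonzero(candidate & (single > 0))  # pixels overlapping the mask
        if p > 0.5 * np.linalg.norm(np.subtract(B, A)):  # p > 0.5 * line length
            kept.append((A, B))
    return kept
```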
The third filter removes isolated lines and keeps only large groups of intersecting lines:
- extend the lines by min(n,m)/20
- initialize a graph G with the set of nodes equal to the set of lines
- for each pair of lines a, b:
  - if a intersects b, connect them in the graph G
- compute the connected components of G
- keep only the components with more than 3 lines (or, in hard mode, keep only the 2 biggest connected components)
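Since `networkx` and `shapely` are already dependencies, this filter can be sketched directly with them (names are illustrative; the extension length follows the min(n,m)/20 rule above):

```python
import networkx as nx
import numpy as np
from shapely.geometry import LineString

def graph_filter(lines, img_shape, min_component=3, hard_mode=False):
    """Keep only lines belonging to large groups of mutually intersecting
    lines (sketch of the connectivity filter above)."""
    ext = min(img_shape[0], img_shape[1]) / 20.0

    def extend(line):
        a, b = np.asarray(line[0], float), np.asarray(line[1], float)
        d = (b - a) / (np.linalg.norm(b - a) + 1e-9)
        return LineString([a - d * ext, b + d * ext])  # extend both endpoints

    segments = [extend(l) for l in lines]
    G = nx.Graph()
    G.add_nodes_from(range(len(lines)))
    for i in range(len(lines)):
        for j in range(i + 1, len(lines)):
            if segments[i].intersects(segments[j]):
                G.add_edge(i, j)

    components = sorted(nx.connected_components(G), key=len, reverse=True)
    if hard_mode:
        keep = set().union(*components[:2])  # only the 2 biggest components
    else:
        keep = {i for c in components if len(c) > min_component for i in c}
    return [lines[i] for i in keep]
```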
The homography is computed with the findHomography() function of OpenCV (so RANSAC or other algorithms can be used). The best homography is then searched by trying all the combinations of the points of 2 detected lines with 2 lines of the model template, scoring each candidate projection.
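A simplified sketch of this search (endpoint orderings are fixed here for brevity, while the actual search tries all point combinations; `score_fn` stands for whichever scoring technique is used, treated here as higher-is-better, so the comparison should be inverted for the line-based score):

```python
from itertools import combinations
import cv2
import numpy as np

def best_homography(detected_lines, template_lines, score_fn):
    """Try every (2 detected lines, 2 template lines) correspondence,
    estimate a homography from the 4 point pairs and keep the best
    scoring projection (illustrative sketch)."""
    best_H, best_score = None, -np.inf
    for a, b in combinations(detected_lines, 2):
        for c, d in combinations(template_lines, 2):
            src = np.float32([c[0], c[1], d[0], d[1]])  # 4 template points
            dst = np.float32([a[0], a[1], b[0], b[1]])  # 4 detected points
            H, _ = cv2.findHomography(src, dst)  # pass cv2.RANSAC etc. as method
            if H is None:
                continue
            s = score_fn(H)
            if s > best_score:
                best_H, best_score = H, s
    return best_H
```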
Three scoring techniques have been implemented in this project: template-matching based, Gaussian-mixture based and line based, but only the last two give reasonable results and will be described.
The Gaussian mixture based scoring works as follows:
- create a mask LinesMask from an n×m black image and draw the lines (with thickness = 6px)
- apply LinesMask on the image
- initialize a Gaussian mixture with 3 Gaussians on the previously masked image
- get the Gaussian g fitting the (255,255,255) color, or the court line color
- get the projected lines of the model template using the homography matrix
- create a mask ModelLinesMask using the projected lines
- apply ModelLinesMask on the image
- the score is computed as the sum of the pixels of the previously masked image predicted with g
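A sketch of this score in Python, reusing a fitted mixture `gmm` and its white component index `g` (as in the color filter sketch above; names illustrative):

```python
import cv2
import numpy as np

def gmm_score(image, H, template_lines, gmm, g, thickness=6):
    """Project the model template through H, mask the image with the
    projected lines (ModelLinesMask) and count the pixels predicted as
    the white component g (sketch; higher is better)."""
    h, w = image.shape[:2]
    mask = np.zeros((h, w), dtype=np.uint8)
    for A, B in template_lines:
        pts = cv2.perspectiveTransform(np.float32([[A, B]]), H)[0]  # project endpoints
        cv2.line(mask, tuple(int(v) for v in pts[0]),
                 tuple(int(v) for v in pts[1]), 255, thickness)
    pixels = image[mask > 0].reshape(-1, 3).astype(np.float64)
    if len(pixels) == 0:
        return 0
    return int(np.count_nonzero(gmm.predict(pixels) == g))
```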
The line based scoring works as follows:
- for each pair of lines, compute the local score only if distance(AB, CD) < distance threshold and α < angle threshold
- local score(AB, CD) = min(AC, BD, AD, BC)² + (min(AC, CB) + min(BD, BC))² − 200α
- the total score is the sum of the smallest local score of each pair of lines (so smaller is better)
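The local score translates directly from the formula above (α in radians; AB and CD are the two lines being compared; a minimal sketch):

```python
import numpy as np

def local_score(AB, CD):
    """local_score(AB, CD) = min(AC, BD, AD, BC)^2
                            + (min(AC, CB) + min(BD, BC))^2 - 200*alpha
    (smaller is better), written from the formula above."""
    A, B = np.asarray(AB[0], float), np.asarray(AB[1], float)
    C, D = np.asarray(CD[0], float), np.asarray(CD[1], float)
    d = lambda p, q: float(np.linalg.norm(p - q))
    AC, AD, BC, BD = d(A, C), d(A, D), d(B, C), d(B, D)

    v1, v2 = B - A, D - C
    cos = abs(np.dot(v1, v2)) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
    alpha = np.arccos(np.clip(cos, 0.0, 1.0))  # angle between the two lines

    return min(AC, BD, AD, BC) ** 2 + (min(AC, BC) + min(BD, BC)) ** 2 - 200 * alpha
```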