This repository is part of a research project conducted at Esslingen University of Applied Sciences from March 2020 to March 2022. It is based on a fork of the original FasterSeg repository to train the semantic segmentation network with a custom dataset.
The goal of the project was to compare the accuracy of two different perspectives and to investigate the real-time capability of FasterSeg on embedded hardware. To this end, a synthetic dataset was generated in a simulation environment in the context of the Carolo-Cup. This synthetic dataset was extended with some real data to obtain a more realistic environment. To examine the accuracy of the two perspectives, two FasterSeg models were trained: one with images from the first-person perspective and the other with images from the bird's eye view perspective. For a 320x256 input image, FasterSeg achieved 65.44% mean Intersection over Union (mIoU) from the first-person perspective and 64.08% mIoU from the bird's eye view perspective. Both models reached 247.11 frames per second on NVIDIA Jetson AGX Xavier embedded hardware.
The image sequence below shows a test drive of the model car on the test track in the it:movES lab at Esslingen University of Applied Sciences; the first-person image is on the left and the bird's eye view image is on the right. Each color indicates the prediction of a predefined class such as left lane, right lane, crosswalk, etc.
For more details, please refer to our paper: "Semantic Segmentation for Autonomous Driving: Model Evaluation, Dataset Generation, Perspective Comparison, and Real-Time Capability"
Perspective comparison of two FasterSeg models trained with custom data. The first-person view is on the left and the bird's eye view is on the right.
The following steps describe how FasterSeg can be trained with custom data.
FasterSeg: Searching for Faster Real-time Semantic Segmentation [PDF]
Wuyang Chen, Xinyu Gong, Xianming Liu, Qian Zhang, Yuan Li, Zhangyang Wang
FasterSeg is an automatically designed semantic segmentation network with not only state-of-the-art performance but also faster speed than current methods.
- CUDA-capable NVIDIA GPU (>= 11 GB graphics memory)
This repository has been tested on Tesla V100s. Configurations (e.g batch size, image patch size) may need to be changed on different platforms.
- Install Docker and the NVIDIA container runtime, which are needed to use the preconfigured training environment.
- Download TensorRT 5.1.5.0 GA for Ubuntu 16.04 and CUDA 10.1 tar package.
- Clone this repo:
cd path/where/to/store/the/repository
git clone https://github.com/Gaussianer/FasterSeg.git
- Move TensorRT-5.1.5.0.Ubuntu-16.04.5.x86_64-gnu.cuda-10.1.cudnn7.5.tar.gz to the previously cloned FasterSeg repository.
- Build image from the Dockerfile:
cd FasterSeg
sudo docker build -t fasterseg:latest -f Dockerfile .
- Run the container
sudo docker run --rm --gpus all -it -p 6006:6006 fasterseg:latest
We also provide an image for this on DockerHub. To run the container image from our repository, execute the following command:
sudo docker run --rm -e NVIDIA_VISIBLE_DEVICES=all -it -p 6006:6006 docker.io/gaussianer/fasterseg:5
- If Docker is not a solution for you, we also offer a solution using Podman. Here you can find the setup for a rootless GPU container solution using Podman. Since Docker and Podman use the same commands, replace "sudo docker" with "podman" for each docker command in this guide.
Note: To open another shell in the running container instance at a later time, you can execute the following command:
sudo docker exec -it <Container ID> bash
- Workflow: pretrain the supernet → search the architecture → train the teacher → train the student.
- You can monitor the whole process in TensorBoard.
cd /home/FasterSeg/dataset
- If you want to train with custom data, your dataset should consist of annotations and raw images. The patch size (height/width) should be divisible by 64; see the check sketched after this list. For example, we have included raw images and the corresponding annotations for training, validation and test data in the repository. (See here for the annotations or here for the raw images.) Split your dataset into the folders train, val and test and place them there.
- Create a file with the name labelDefinitions.csv or modify the existing one. It should contain your classes and the other attributes that you can find in the example file. For detailed information, check here.
- If you want to clean up the dataset folder, run the following command:
# Clean up the directories so that the example data no longer exists.
python clean_up_datasets.py
- To prepare the annotations of the dataset, execute the following command:
# Prepare the annotations
python createTrainIdLabelImgs.py
- To create the mapping lists for the dataset, execute the following command:
# Create the mapping lists for the data
python create_mapping_lists.py
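As noted above, the image height and width must be divisible by 64. The following minimal sketch checks this for the raw images; the dataset root, the train/val/test subfolder layout, the .png extension, and the use of Pillow are assumptions based on the setup described above.

```python
# Minimal sketch: check that every raw image's width and height are divisible by 64.
# The dataset root, subfolder layout, and .png extension are assumptions based on
# the split described above -- adjust them to your own dataset.
import glob
import os

from PIL import Image

DATASET_ROOT = "/home/FasterSeg/dataset"  # assumed location inside the container

for split in ("train", "val", "test"):
    pattern = os.path.join(DATASET_ROOT, split, "**", "*.png")
    for path in glob.glob(pattern, recursive=True):
        width, height = Image.open(path).size
        if width % 64 != 0 or height % 64 != 0:
            print(f"{path}: {width}x{height} is not divisible by 64")
```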
cd /home/FasterSeg/search
We first pretrain the supernet without updating the architecture parameter for 20 epochs. Adjust the following parameters for this:
- Set C.num_classes = 4 in config_search.py. In this example we have 4 custom classes (including the unlabeled class).
- Make sure that in this step the parameter C.pretrain = True is set in config_search.py.
You may need to adjust the following parameters depending on your hardware or dataset:
- Set C.niters_per_epoch = max(C.num_train_imgs // 2 // C.batch_size, 400) in config_search.py. You may want to reduce the value 400 to one that suits your circumstances.
- Set C.nepochs = 20 in config_search.py. You may want to reduce the value 20 to one that suits your circumstances.
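Putting these settings together, a sketch of the relevant lines in config_search.py for the pretraining step might look as follows. C is the config object already defined in that file, and the values are the examples from the steps above; adapt them to your setup.

```python
# config_search.py -- example values for supernet pretraining (adapt to your setup).
# C is the config object already defined in config_search.py.
C.num_classes = 4      # custom classes, including the unlabeled class
C.pretrain = True      # pretrain the supernet without updating the architecture parameter
C.nepochs = 20         # reduce if your hardware or dataset requires it
C.niters_per_epoch = max(C.num_train_imgs // 2 // C.batch_size, 400)
```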
Now start the pretraining process:
CUDA_VISIBLE_DEVICES=0 python train_search.py
- The pretrained weights will be saved in a folder like /home/FasterSeg/search/search-pretrain-256x512_F12.L16_batch3-20200101-012345.
- If you want to monitor the process with TensorBoard, run the following commands in a new terminal:
sudo docker exec -it <Container ID> bash
cd /home/FasterSeg/search
tensorboard --bind_all --port 6006 --logdir search-pretrain-256x512_F12.L16_batch3-20200101-012345
Open http://localhost:6006/ on your host to monitor the process with TensorBoard.
We now run the architecture search for 30 epochs.
- Set the name of your pretrain folder (see above) as C.pretrain = "search-pretrain-256x512_F12.L16_batch3-20200101-012345" in config_search.py.
You may need to adjust the following parameters depending on your hardware or dataset:
- Set C.niters_per_epoch = max(C.num_train_imgs // 2 // C.batch_size, 400) in config_search.py. You may want to reduce the value 400 to one that suits your circumstances.
- Set C.nepochs = 30 in config_search.py. You may want to reduce the value 30 to one that suits your circumstances.
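For the search step, the corresponding lines in config_search.py might look like this (again, C is the existing config object and the folder name is the example from above):

```python
# config_search.py -- example values for the architecture search (adapt to your setup).
C.pretrain = "search-pretrain-256x512_F12.L16_batch3-20200101-012345"  # your pretrain folder
C.nepochs = 30         # reduce if your hardware or dataset requires it
C.niters_per_epoch = max(C.num_train_imgs // 2 // C.batch_size, 400)
```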
Start the search process:
CUDA_VISIBLE_DEVICES=0 python train_search.py
- The searched architecture will be saved in a folder like /home/FasterSeg/search/search-224x448_F12.L16_batch2-20200102-123456.
- arch_0 and arch_1 contain the architectures for the teacher and student networks, respectively.
- If you want to monitor the process with TensorBoard, cancel the previously started TensorBoard process in the other terminal and execute the following commands there:
cd /home/FasterSeg/search
tensorboard --bind_all --port 6006 --logdir search-224x448_F12.L16_batch2-20200102-123456/
Open http://localhost:6006/ on your host to monitor the process with TensorBoard.
- Copy the folder which contains the searched architecture into /home/FasterSeg/train/ or create a symlink via ln -s ../search/search-224x448_F12.L16_batch2-20200102-123456 ./. Use the following commands to copy the folder into /home/FasterSeg/train/:
cd /home/FasterSeg/search
cp -r search-224x448_F12.L16_batch2-20200102-123456/ /home/FasterSeg/train/
- Change to the train directory:
cd /home/FasterSeg/train
- Set C.num_classes = 4 in config_train.py. In this example we have 4 custom classes (including the unlabeled class).
- Set C.mode = "teacher" in config_train.py.
- Set the name of your searched folder (see above) as C.load_path = "search-224x448_F12.L16_batch2-20200102-123456" in config_train.py. This folder contains arch_0.pt and arch_1.pt for the teacher's and student's architectures.
You may need to adjust the following parameters depending on your hardware or dataset:
- Set C.nepochs = 600 in config_train.py. You may want to reduce the value 600 to one that suits your circumstances.
- Set C.niters_per_epoch = 1000 in config_train.py. You may want to reduce the value 1000 to one that suits your circumstances.
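Taken together, a sketch of the relevant lines in config_train.py for the teacher might look like this (example values from the steps above; adapt them to your setup):

```python
# config_train.py -- example values for training the teacher (adapt to your setup).
C.num_classes = 4
C.mode = "teacher"
C.load_path = "search-224x448_F12.L16_batch2-20200102-123456"  # folder with arch_0.pt / arch_1.pt
C.nepochs = 600
C.niters_per_epoch = 1000
```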
Start the teacher's training process:
CUDA_VISIBLE_DEVICES=0 python train.py
- If you want to monitor the process with TensorBoard, cancel the previously executed TensorBoard process in the other terminal and execute the following commands there:
cd /home/FasterSeg/train
tensorboard --bind_all --port 6006 --logdir train-512x1024_teacher_batch12-20200103-234501/
Open http://localhost:6006/ on your host to monitor the process with TensorBoard.
- The trained teacher will be saved in a folder like train-512x1024_teacher_batch12-20200103-234501.
- Set C.mode = "student" in config_train.py.
- Set the name of your searched folder (see above) as C.load_path = "search-224x448_F12.L16_batch2-20200102-123456" in config_train.py. This folder contains arch_0.pt and arch_1.pt for the teacher's and student's architectures.
- Set the name of your teacher's folder (see above) as C.teacher_path = "train-512x1024_teacher_batch12-20200103-234501" in config_train.py. This folder contains weights0.pt, the teacher's pretrained weights.
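For the student, the corresponding lines in config_train.py might look like this (example folder names from the steps above):

```python
# config_train.py -- example values for training the student (adapt to your setup).
C.mode = "student"
C.load_path = "search-224x448_F12.L16_batch2-20200102-123456"      # searched architectures
C.teacher_path = "train-512x1024_teacher_batch12-20200103-234501"  # folder with the teacher's weights0.pt
```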
Start the student's training process:
CUDA_VISIBLE_DEVICES=0 python train.py
- If you want to monitor the process with TensorBoard, cancel the previously executed TensorBoard process in the other terminal and execute the following commands there:
cd /home/FasterSeg/train
tensorboard --bind_all --port 6006 --logdir train-512x1024_student_batch12-20200103-234501/
Open http://localhost:6006/ on your host to monitor the process with TensorBoard.
To evaluate your custom FasterSeg model, follow the steps below:
cd /home/FasterSeg/train
- Copy arch_0.pt and arch_1.pt into /home/FasterSeg/train/fasterseg. For this, execute the following commands:
cd /home/FasterSeg/train/search-224x448_F12.L16_batch2-20200102-123456
cp {arch_0.pt,arch_1.pt} /home/FasterSeg/train/fasterseg/
- Copy weights0.pt and weights1.pt into /home/FasterSeg/train/fasterseg. For this, execute the following commands:
cd /home/FasterSeg/train/train-512x1024_student_batch12-20200103-234501
cp {weights0.pt,weights1.pt} /home/FasterSeg/train/fasterseg/
- Set C.is_eval = True in config_train.py.
- Set the name of the searched folders as C.load_path = "fasterseg" and C.teacher_path = "fasterseg" in config_train.py.
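In summary, the evaluation settings in config_train.py might look like this (example values; adapt as needed):

```python
# config_train.py -- example values for evaluating the trained models.
C.is_eval = True
C.mode = "teacher"            # or "student", depending on which model you want to evaluate
C.load_path = "fasterseg"     # folder containing arch_0.pt / arch_1.pt
C.teacher_path = "fasterseg"  # folder containing weights0.pt / weights1.pt
```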
Start the evaluation process:
CUDA_VISIBLE_DEVICES=0 python train.py
- You can switch between evaluating the teacher and the student by changing C.mode in config_train.py.
We support generating prediction files (masks as images) during training.
- Set C.is_test = True and C.is_eval = True in config_train.py.
- If you want to create RGB predictions during the test, set show_prediction=True in train.py.
- During the training process, the prediction files will be periodically saved in a folder like train-512x1024_student_batch12-20200104-012345/test_1_#epoch.
To evaluate the accuracy of the model trained with customized labels, we use the evalPixelLevelSemanticLabeling.py script from Cityscapes. Install the Cityscapes scripts with pip3 install cityscapesscripts on your machine. For evaluating FasterSeg with customized labels, you need to replace the default evalPixelLevelSemanticLabeling.py and labels.py with the modified versions from ~/tools/cityscapes/.
The paths to the ground truth images, the predicted images, the label definitions, and the result folder are set as environment variables. You can add the following lines to your ~/.profile to set the environment variables permanently.
- path to ground truth images:
export CITYSCAPES_DATASET="/home/user/.../groundtruth"
- path to predicted images:
export CITYSCAPES_RESULTS="/home/user/.../predictions"
- path to label definitions:
export DATASET_PATH="/home/user/FasterSeg/dataset"
- path to the evaluation result:
export CITYSCAPES_EXPORT_DIR="/home/user/.../resultfolder"
If the name of the predicted image differs from the name of the ground truth image, the filter has to be adjusted in this line.
Open evalPixelLevelSemanticLabeling.py in an IDE like VS Code and run it. The results will be printed in the terminal window and stored in the CITYSCAPES_EXPORT_DIR.
cd /home/FasterSeg/latency
- If you have successfully installed TensorRT, it will automatically be used for the following latency tests (see function here).
- Otherwise, PyTorch will be used for the latency tests (see function here).
If you want to run the latency test on a Jetson Xavier, we have a guide for that.
- Run the script:
CUDA_VISIBLE_DEVICES=0 python run_latency.py
cd FasterSeg/latency
- Run the script:
CUDA_VISIBLE_DEVICES=0 python latency_lookup_table.py
which will generate an .npy file. Be careful not to overwrite the provided latency_lookup_table.npy in this repo.
- The .npy file contains a Python dictionary mapping each operator to its latency (in ms) under specific conditions (input size, stride, channel number, etc.).
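As a rough sketch (assuming the file stores a pickled Python dictionary saved with numpy, as described above), the lookup table can be inspected like this:

```python
# Minimal sketch: load and inspect the latency lookup table.
# Assumes latency_lookup_table.npy stores a pickled Python dictionary, as described above.
import numpy as np

table = np.load("latency_lookup_table.npy", allow_pickle=True).item()

# Each key identifies an operator under specific conditions (input size, stride,
# channel number, etc.); each value is its measured latency in milliseconds.
for key, latency_ms in list(table.items())[:5]:
    print(key, "->", latency_ms, "ms")
```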
- FasterSeg: Searching for Faster Real-time Semantic Segmentation.