Object Detection
We are using the TensorFlow Object Detection API to retrain the models with the objects that we need. Check its documentation for installation and more info. All of this has been tested with Python 2/3 and tensorflow==1.15.
Colab offers free access to a machine with a GPU that is reasonable for training. It is a cloud service based on Jupyter notebooks, and internet connectivity is required for access.
The first step is to take pictures of the objects to train on. After doing that, you have to reduce the resolution of the pictures (this reduces training time) and split the images into train/test folders.
Run the transform_image_resolution.py script:
python transform_image_resolution.py -d ../images/ -s 800 600
- -d is the directory containing all the images.
- -s is the resolution that will be applied to the images.
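In case you want to see what the resizing step amounts to, here is a minimal sketch of an equivalent script, assuming OpenCV is installed; the real transform_image_resolution.py may differ in details:
import argparse
import glob
import os

import cv2

parser = argparse.ArgumentParser()
parser.add_argument('-d', '--directory', required=True)
parser.add_argument('-s', '--size', type=int, nargs=2, required=True)  # width height
args = parser.parse_args()

for path in glob.glob(os.path.join(args.directory, '*.jpg')):
    image = cv2.imread(path)
    resized = cv2.resize(image, tuple(args.size))  # dsize is (width, height)
    cv2.imwrite(path, resized)  # overwrite the original with the smaller version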
Run the split_images.py script:
python split_images.py -d ../images/complete_dataset -o ../images --train 80
- -d is the directory containing all the images.
- -o is the directory where the train and test folders will be created.
- --train is the percentage of images that will be used for training; 80% for train and 20% for test is recommended.
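Again, a minimal sketch of what the split amounts to, in case you want to reproduce it by hand (the real split_images.py may differ):
import argparse
import glob
import os
import random
import shutil

parser = argparse.ArgumentParser()
parser.add_argument('-d', '--directory', required=True)
parser.add_argument('-o', '--output', required=True)
parser.add_argument('--train', type=int, default=80)  # percentage used for training
args = parser.parse_args()

images = glob.glob(os.path.join(args.directory, '*.jpg'))
random.shuffle(images)  # avoid ordering bias in the split
cut = len(images) * args.train // 100

for subset, subset_images in (('train', images[:cut]), ('test', images[cut:])):
    subset_dir = os.path.join(args.output, subset)
    os.makedirs(subset_dir, exist_ok=True)
    for image_path in subset_images:
        shutil.copy(image_path, subset_dir)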
Now you need to label the images. Use the labelImg open source tool to label all the pictures in both train/test directories (this is a tedious and long process).
At this point you should have an images directory that contains your train and test images, with a respective xml file for each image.
Download the generate_tfrecord.py script and change the class_text_to_int function by adding your own labels.
def class_text_to_int(row_label):
    if row_label == 'powerade':
        return 1
    elif row_label == 'chocolate':
        return 2
    elif row_label == 'dr_pepper':
        return 3
    elif row_label == 'danup':
        return 4
    else:
        return None
Download the labelmap file and change the id and name fields to your own labels. NOTE: Be consistent with the ids you wrote in the class_text_to_int function.
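For the labels used in the class_text_to_int example above, the labelmap.pbtxt would look like this:
item {
  id: 1
  name: 'powerade'
}
item {
  id: 2
  name: 'chocolate'
}
item {
  id: 3
  name: 'dr_pepper'
}
item {
  id: 4
  name: 'danup'
}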
We are using the Faster-RCNN-Inception-V2 model. Download the model here. Open the downloaded faster_rcnn_inception_v2_coco_2018_01_28.tar.gz file with a file archiver and extract the faster_rcnn_inception_v2_coco_2018_01_28 folder.
Download the faster_rcnn_inception_v2_pets.config file. Here you can find several parameters like batch size, learning rate, etc. Then, change:
- Line 9. Change num_classes to the number of different objects you want the classifier to detect.
- Line 130. Change num_examples to the number of images you have in the \images\test directory.
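For reference, the relevant parts of the config look roughly like this (the values shown are illustrative, matching the four-class example above):
model {
  faster_rcnn {
    num_classes: 4  # line 9: one per label in class_text_to_int
    ...
  }
}
eval_config: {
  num_examples: 67  # line 130: number of images in \images\test
  ...
}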
- Create a directory in your Google Drive.
- Download the train_model notebook from the RoBorregos @Home repo.
- Go to Colab, sign in with the same account you used to create the directory, create a new notebook and open the train_model notebook.
- In the notebook, go to Runtime > Change Runtime Type and make sure to select GPU as the hardware accelerator.
- Click Connect to start using the notebook.
The first command is to check that you are using a GPU.
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))
You should see Found GPU at: /device:GPU:0
The second command mounts Google Drive in the notebook: click on the link, sign in with your Google Drive account, and grant it access. You will be redirected to a page; copy the code on that page and paste it into the text box of the Colab session you are running.
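The mount command itself is the standard Colab one:
from google.colab import drive
drive.mount('/content/gdrive')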
The following commands are straightforward; just make sure to adapt the paths by replacing the name of your folder. E.g.:
cd /content/gdrive/My Drive/@Home/models/research/
To:
cd /content/gdrive/My Drive/Your_Folder/models/research/
After cloning the repo, you have to upload the previous files you edited to the object_detection folder.
- Upload the generate_tfrecord.py script to your_folder/models/research/object_detection
- Download the xml_to_csv.py script and upload it to your_folder/models/research/object_detection
- Create a folder in your_folder/models/research/object_detection named training and upload the labelmap.pbtxt and faster_rcnn_inception_v2_pets.config files.
- Create a folder in your_folder/models/research/object_detection named images and upload the test/train folders containing the images and xml files.
- Upload the faster_rcnn_inception_v2_coco_2018_01_28 folder to the object_detection folder.
Follow the notebook and continue with the commands.
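For reference, the xml_to_csv and generate_tfrecord conversion steps are typically invoked like this from the object_detection folder (the exact commands are already in the notebook, and the flag names may differ slightly in your copy of generate_tfrecord.py):
python xml_to_csv.py
python generate_tfrecord.py --csv_input=images/train_labels.csv --image_dir=images/train --output_path=train.record
python generate_tfrecord.py --csv_input=images/test_labels.csv --image_dir=images/test --output_path=test.record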
Once you run the xml_to_csv and generate_tfrecord scripts, the next step is to run the train.py script to start the training. Keep following the notebook; after you run the train script, in the absence of errors you should see an output like this:
INFO:tensorflow:global step 1: loss = 25.45 (5.327 sec/step)
........
........
INFO:tensorflow:global step 1350: loss = 0.6345 (0.231 sec/step)
INFO:tensorflow:global step 1351: loss = 0.5220 (0.332 sec/step)
INFO:tensorflow:global step 1352: loss = 0.6718 (0.133 sec/step)
INFO:tensorflow:global step 1353: loss = 0.6758 (0.432 sec/step)
INFO:tensorflow:global step 1354: loss = 0.7454 (0.452 sec/step)
INFO:tensorflow:global step 1355: loss = 0.8354 (0.323 sec/step)
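The command that produces this output is already in the notebook; it is typically the legacy TF1 training script, along the lines of:
python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config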
As we are running this in Colab, 3-4 hours of training is usually enough; approximately every 5 minutes the progress is automatically saved as checkpoints in the training folder. Stop the training with CTRL + C.
The final step is to export your inference graph. The command is already in the notebook; the only thing you have to change is the --trained_checkpoint_prefix flag. Go to the training folder and you will see the last checkpoints the model saved; copy the id of the latest checkpoint and change it in the command:
--trained_checkpoint_prefix training/model.ckpt-158879
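With that change, the full export command typically looks like this (the output directory name here is illustrative):
python export_inference_graph.py --input_type image_tensor --pipeline_config_path training/faster_rcnn_inception_v2_pets.config --trained_checkpoint_prefix training/model.ckpt-158879 --output_directory inference_graph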
Finally, you can run the code in the last cell of the notebook to test your model. Take new pictures to test the model and place them in a folder named test_images. Remember that for inference you should use at least the same TensorFlow version as the one used for training.
Change the IMAGE_NAME in the script:
IMAGE_NAME = 'test_images/IMG_0681.jpg'
Run the cell and you should see the image with the objects detected.
In order to use the pre-trained object detection model, please make sure to follow the release forms below.
Some useful scripts have been written under scripts/ to handle different types of output detections.
In order to run them, just run the following command in the shell with the desired script:
python scripts/object_detection_image.py
Most of the Python scripts have been optimized to be run from any directory within robocup-home:
import os

base_directory = 'object_detection'
path_to_file = os.path.abspath(__file__)  # absolute path to the file being run
index_of_base_directory = path_to_file.find(base_directory)
# Keep the path up to and including the base directory name
WORKING_DIR = path_to_file[0:index_of_base_directory + len(base_directory)]
The model and label_map files are located relative to the file being run and the working directory:
MODEL_NAME = 'saved_model'
CWD_PATH = os.path.join(WORKING_DIR, 'models', 'model_tf2')
PATH_TO_SAVED_MODEL = os.path.join(CWD_PATH, MODEL_NAME)
PATH_TO_LABELS = os.path.join(CWD_PATH, 'label_map.pbtxt')
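With those paths defined, loading the model and the label map typically looks like this (assuming the TF2 Object Detection API utilities are installed):
import tensorflow as tf
from object_detection.utils import label_map_util

# Load the TF2 SavedModel and build an id -> class name lookup
detect_fn = tf.saved_model.load(PATH_TO_SAVED_MODEL)
category_index = label_map_util.create_category_index_from_labelmap(
    PATH_TO_LABELS, use_display_name=True)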
video_detection.py is a script dedicated to performing object detection on a live feed from either a PC webcam or the Intel D435i:
# Load the model from get_object_and_coordinates
run_inferance_on_image = load_model()
if use_intelRS_camera:
    pipeline = create_intelrs_pipeline()
    run_inference_with_intel_camera(run_inferance_on_image, pipeline, show_video)
else:
    cap = cv2.VideoCapture(0)
    run_inference_with_pc_camera(run_inferance_on_image, cap, show_video)
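The webcam branch might look roughly like the sketch below; the helper name comes from the snippet above, but the body here is an assumption, not the actual implementation:
import cv2

def run_inference_with_pc_camera(run_inference_on_image, cap, show_video):
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        detections = run_inference_on_image(frame)  # hypothetical signature
        if show_video:
            cv2.imshow('detections', frame)
            if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
                break
    cap.release()
    cv2.destroyAllWindows()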
To decide which device is going to be used, the -i flag chooses whether the Intel camera will be used, and the -s flag enables streaming of the video to see the live detections:
python scripts/video_detection.py -i True -s False
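Since argparse does not parse 'True'/'False' strings as booleans by default, the flag handling is presumably something like this sketch (the real script may differ):
import argparse

def str2bool(value):
    # Interpret 'True', 'true', '1', etc. as boolean True
    return str(value).lower() in ('true', '1', 'yes')

parser = argparse.ArgumentParser()
parser.add_argument('-i', '--intel', type=str2bool, default=False,
                    help='use the Intel RealSense camera instead of the webcam')
parser.add_argument('-s', '--stream', type=str2bool, default=False,
                    help='stream the video to see the live detections')
args = parser.parse_args()
use_intelRS_camera, show_video = args.intel, args.stream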