
How can i train my own model? #2

Closed
fjzpcmj opened this issue Dec 14, 2017 · 8 comments

@fjzpcmj

fjzpcmj commented Dec 14, 2017

How can I train my own model? I'd appreciate it!

@cguindel
Owner

The original py-faster-rcnn allows training using Pascal VOC 2007/2010/2012 and COCO, and I have modified it to include the KITTI object dataset, where viewpoint labels are available. To train using KITTI:

  1. Create a training script in experiments/scripts, similar to leaderboard.sh.
  2. Add a configuration file with the training parameters in experiments/cfgs; see leaderboard.yml for an example.
  3. Create a model folder in models/kitti/{VGG16,ZF}/ with a solver and its train/test prototxt files.

Then, run the training script. If you run leaderboard.sh as it is, it will train the model used to obtain the official results here. Trained networks are saved under output/{experiment directory}/{dataset name}/. If you need to evaluate the results, you can create a test_*.sh script in experiments/scripts and use it to obtain the official KITTI metrics.

In case you want to use your own dataset, the process is unfortunately not so straightforward, and you will probably need to understand and modify the code, particularly the part in lib/datasets. You can get some help in the official py-faster-rcnn repository; e.g., this issue. Another option is adapting the annotations of the new dataset to follow the format of the KITTI (or Pascal, or COCO) annotations and "faking" its folder structure. Please keep in mind that my modifications should be dataset-agnostic, but I have not actually tested them outside the KITTI dataset.
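For reference, each line of a KITTI object label file (label_2/*.txt) describes one object in a fixed 15-field layout, which is the format a new dataset would need to mimic. A minimal parsing sketch (the helper name and the sample line below are illustrative, not part of this repository):

```python
def parse_kitti_label(line):
    # Field layout per the official KITTI object development kit:
    # type truncated occluded alpha bbox(4) dimensions(3) location(3) rotation_y
    f = line.split()
    return {
        'type': f[0],                               # e.g. 'Car', 'Pedestrian', 'DontCare'
        'truncated': float(f[1]),                   # 0.0 (visible) .. 1.0 (fully truncated)
        'occluded': int(f[2]),                      # 0..3 occlusion level
        'alpha': float(f[3]),                       # observation angle (viewpoint), [-pi, pi]
        'bbox': [float(v) for v in f[4:8]],         # left, top, right, bottom (pixels)
        'dimensions': [float(v) for v in f[8:11]],  # height, width, length (meters)
        'location': [float(v) for v in f[11:14]],   # x, y, z in camera coordinates (meters)
        'rotation_y': float(f[14]),                 # rotation around the camera Y axis
    }
```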

If you provide more details about what you are trying to do, maybe I can further help you.

@fjzpcmj
Author

fjzpcmj commented Dec 15, 2017

Hi @cguindel, thanks sincerely for your detailed answer. I used train_kitti_lsi_frcnn.sh to retrain the model. Is there any difference between leaderboard.sh and train_kitti_lsi_frcnn.sh? Can I use test_kitti_lsi_frcnn.sh to test my model? I am a beginner in object detection; for now I just want to finish my class project on car detection and viewpoint estimation. Thanks again very much.

@cguindel
Owner

train_kitti_lsi_frcnn.sh is an older version that does not include some improvements I introduced later. I would strongly recommend using leaderboard.sh if you want to detect Car/Pedestrian/Cyclist, or leaderboard_7cls.sh to train the model using all the available classes (Person_sitting, Tram, etc.).

Do you want to obtain quantitative results (i.e., Average Precision, etc.)? If not, you do not need to use a test script. Otherwise, you will need to split the training set into train/validation subsets. To that end, you will need to modify this line in leaderboard.sh:
TRAIN_IMDB="kitti_training"
to:
TRAIN_IMDB="kitti_trainsplit"
and then use test_leaderboard.sh replacing:

TRAIN_IMDB="kitti_training"
TEST_IMDB="kitti_test_testing"

by

TRAIN_IMDB="kitti_trainsplit"
TEST_IMDB="kitti_valsplit"

That way, you will be training with 3682 images and obtaining results on 3799 validation images. I think this is the only change you will need to make in order to re-train the model after the last commit 63dbdc3. Please try it and let me know what you get; once you have the model, I can help you obtain detection stats if you need it.

@fjzpcmj
Author

fjzpcmj commented Dec 19, 2017

It's very nice of you to help me so much. My project only needs to detect cars and the car's viewpoint, so I want to change your code to classify only two classes (car and background). Also, have you ever used continuous viewpoint estimation, that is to say, treating the problem as regression? First, I should read your code carefully; then I will try my best to make the changes needed to fit my project. I would be grateful if you could help me obtain quantitative results.

@cguindel
Owner

To modify the number of classes, you need to change the CLASSES parameter in the configuration file under experiments/cfgs/ (in your case, you should set it to CLASSES: ['__background__', 'Car']) and the train/test prototxt files under models/. For the latter, I tried to add a comment just above every value that depends on the number of classes; I recommend searching for N_CLASSES and modifying the corresponding values from 4 to 2.
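For a two-class setup, the relevant change in the experiment configuration would look like the fragment below (the file name is hypothetical; every other parameter stays as in leaderboard.yml):

```yaml
# experiments/cfgs/my_car_experiment.yml (hypothetical name)
# Only the class list is shown; keep the remaining parameters
# as in leaderboard.yml.
CLASSES: ['__background__', 'Car']
```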

Regarding continuous viewpoint estimation: yes, I am currently trying different alternatives (e.g., regression with a Smooth L1 loss), but I have found that they do not improve the results, at least in terms of Average Orientation Similarity (the metric used in the KITTI benchmark). I plan to release the code to replicate these experiments, though that is a longer-term project.
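For reference, the Smooth L1 loss mentioned above is the robust loss used for bounding-box regression in Fast/Faster R-CNN; a minimal sketch follows. Note that applying it directly to angles is a naive choice, since it ignores wrap-around at ±pi; a common workaround is to regress (sin, cos) of the angle instead:

```python
import numpy as np

def smooth_l1(x, sigma=1.0):
    """Element-wise Smooth L1 loss as defined in the Fast R-CNN paper:

    0.5 * (sigma * x)^2   if |x| < 1 / sigma^2
    |x| - 0.5 / sigma^2   otherwise
    """
    s2 = sigma ** 2
    ax = np.abs(x)
    return np.where(ax < 1.0 / s2, 0.5 * s2 * x * x, ax - 0.5 / s2)
```

With sigma = 1, the loss is quadratic for residuals below 1 and linear beyond, which makes it less sensitive to outliers than a plain L2 loss.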

If you finally achieve some results using this code, please let me know!

@fjzpcmj
Author

fjzpcmj commented Dec 21, 2017

Should I change the matrix H? A problem occurs in infogain_loss_layer.cpp, line 72: infogain->count() == num_labels_ * num_labels_ (16 vs 4). It seems that I should build the matrix H for my case. Following your paper, I found the formulation of matrix H, but how can I get f_k and f_min?

@cguindel
Owner

Oh, yes, sorry; I forgot about the infogain matrix. As you may have read in the paper, I use the infogain loss to tackle the effect of imbalanced classes in the dataset; particularly, the presence of more cars than pedestrians and cyclists. If you are only using the 'Car' label, you can probably replace the infogain loss with a regular 'SoftmaxWithLoss' layer. If you still want to manage the imbalance between 'Car' and background samples, you will need to edit infogainH.binaryproto; the relevant values that I use in the H matrix are h00 for background regions and h11 for cars. Anyway, note that equation 9 from the paper cannot be applied here because you only have one foreground class.
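A 2x2 H matrix of this kind could be prepared with a short script. The sketch below is assumption-laden: the sample counts and the inverse-frequency weighting are illustrative placeholders (not the paper's equation 9, which needs more than one foreground class), and only the diagonal entries h00/h11 are set, as described in the comment above:

```python
import numpy as np

# Hypothetical per-class sample counts; replace with statistics
# from your own training split.
n_bg, n_car = 100000, 30000

# Inverse-frequency weights, normalized so the largest weight is 1.
w = np.array([1.0 / n_bg, 1.0 / n_car])
w /= w.max()

# Caffe's InfogainLoss expects H as a blob of shape (1, 1, K, K);
# off-diagonal entries stay 0, as in a weighted cross-entropy.
H = np.diag(w).reshape(1, 1, 2, 2).astype(np.float32)

# To write infogainH.binaryproto with pycaffe (untested assumption):
#   import caffe
#   blob = caffe.io.array_to_blobproto(H)
#   open('infogainH.binaryproto', 'wb').write(blob.SerializeToString())
```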

@cguindel
Owner

I am closing this issue due to the lack of activity.
