Can you share a trained ucf24 weight file? #11
I did most of the experiments on old uni lab computers, and I no longer have access to those. I will ask someone about it. Can you check the GPU utilisation? Also, check the number of workers being used.
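To check utilisation without watching `nvidia-smi` by hand, one option is to poll it from Python during training and parse the CSV output. A minimal sketch, not part of this repo — the helper names are my own, and the `nvidia-smi` call naturally requires an NVIDIA driver:

```python
import subprocess

def parse_gpu_util(csv_text):
    """Parse the output of
    `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits`
    into a list of per-GPU utilisation percentages."""
    return [int(line.strip()) for line in csv_text.strip().splitlines() if line.strip()]

def query_gpu_util():
    """Run nvidia-smi once and return utilisation for every visible GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_util(out)

print(parse_gpu_util("98\n2\n100\n"))  # -> [98, 2, 100]
```

If the numbers sit near zero while the disk is busy, the bottleneck is data loading (HDD and/or too few `DataLoader` workers), not the GPUs.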
Thanks for your reply. It ran faster later on: in the past roughly 30 hours it finished 6 epochs, so not too bad. I believe the reason is that my data is saved on an HDD, not an SSD.
My validation results at the 6th epoch are as follows: [INFO: val.py: 121]: Evaluating detections for epoch number 6. However, in Table 5 of your paper you report a frame mAP of 75.2, which is a lot lower than the numbers I got. So is there anything wrong with my understanding? Thanks.
Those are probably validation results, not testing. Validation is done on a subset, if I remember right. Run test_det.py.
Can you report what the results are when you run
I stopped it. It took too long to run. Multi-GPU does not work.
My data is saved on an HDD, not an SSD. But still, it is too slow if it takes a week to run on the test set. I changed it to multiple GPUs, but only one GPU is used. I do not have time to track down why.
I think testing should work on multiple GPUs. It might be worth spending 100 USD on an SSD. Yes, I know that works; I haven't tried their code. It seems like a step backward from 3D-RetinaNet. I was waiting for https://github.com/ShoufaChen/WOO, but it seems the author is busy there.
I already tried CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --MODE gen_dets. Multiple GPUs worked for training but not for testing; I think it should be something simple. The YOWO backbone is a mixture of a 3D ResNet and 2D YOLOv3, and it works great: the frame mAP on UCF24 is >80. I think your paper should cite YOWO; if you want to publish, the reviewers will ask for it. I know the WOO paper, but I do not think they will ever open-source their code. Can you give me your email so we can communicate more? Thanks. Or you can contact me at [email protected].
I used SSDs before. I was an HDD/SSD chip design engineer for 13 years before moving to machine learning. SSDs can fail easily, particularly when the data is changed frequently, which is what I do.
We wrote the paper before YOWO, at the start of 2021, and it is already published: https://www.computer.org/csdl/journal/tp/5555/01/09712346/1AZL0P4dL1e
Working great alone is not good enough; you need simplicity and usability as well, and those are better in WOO and 3D-RetinaNet than in YOWO. Don't take me wrong, YOWO is good, but I don't like things that are overly complicated; the same goes for the interaction head in WOO. It seems shortsighted: good for the AVA dataset but unnecessary for sports-based datasets like UCF24 and MultiSports.
You should test YOWO yourself. It is not complicated, and it runs very fast; the paper states that as a contribution, see Table 9 of https://arxiv.org/pdf/1911.06644.pdf. It is faster than your earlier paper with 2D-SSD.
On the same machine, it takes less than 2 hours to process all the test videos with 4 1080Ti GPUs. I will double-check that the test video list (testlist_video.txt) is the same as in your 3D-RetinaNet. They use the same dataset you cleaned up in your earlier repo.
I just checked your annotation file 'pyannot_with_class_names.pkl' and your code. There is no separate test set; the test set is the same as the validation set. Here is where the dataset is defined in data/datasets.py:
You only have two datasets defined in main.py, so the validation results in the middle of training are meaningful test results. After 6 epochs I got results better than the ones you published in your paper. Also, there is no test_det.py in your repo.
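One way to verify the "no separate test set" claim is to load the pickle and count videos per split. A minimal sketch, using a toy dict in place of the real pyannot_with_class_names.pkl — the per-video "split" field is an assumption about the file layout, not confirmed against the repo:

```python
import pickle
from collections import Counter

def count_subsets(annots):
    """Count videos per split in a {video_name: {"split": ...}} dict.
    (This layout is an illustrative assumption, not the verified schema.)"""
    return Counter(v["split"] for v in annots.values())

# Toy stand-in for the annotation file, round-tripped through pickle
# the same way the real file would be loaded:
toy = {
    "v_Basketball_g01_c01": {"split": "train"},
    "v_Basketball_g05_c02": {"split": "test"},
    "v_Surfing_g02_c03": {"split": "test"},
}
loaded = pickle.loads(pickle.dumps(toy))
print(count_subsets(loaded))
```

On the real file, the absence of any "val" split distinct from "test" would confirm that validation and test are the same set.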
I found one issue. With the command you give on your front page, gen_dets runs on all videos, including both training and testing videos; that is why it takes so long. We should add --TEST_SUBSET=test. I already added TEST_BATCH_SIZE=4 in my previous test.
When "--TEST_SUBSET=test" is not added, the above two lines do not run, and all videos are processed.
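The effect of the flag can be sketched as a simple subset filter over the video list. This is an illustrative reconstruction, not the repo's actual code (the helper name and the per-video split labels are hypothetical):

```python
def filter_by_subset(videos, test_subsets):
    """Keep only videos whose split is in test_subsets.
    With no subsets given, every video (train + test) is processed,
    which is what made gen_dets so slow."""
    if not test_subsets:  # flag omitted -> no filtering at all
        return list(videos)
    return [name for name, split in videos.items() if split in test_subsets]

videos = {"v1": "train", "v2": "test", "v3": "train", "v4": "test"}
print(filter_by_subset(videos, []))        # -> ['v1', 'v2', 'v3', 'v4']
print(filter_by_subset(videos, ["test"]))  # -> ['v2', 'v4']
```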
The multi-GPU issue may be a PyTorch version problem, I am not sure. Anyway, I made some minor changes, and gen_dets is now running on 4 GPUs. I will let you know the results later.
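If DataParallel will not pick up the extra devices for inference, another simple route is to shard the test-video list and launch one gen_dets process per GPU. A sketch of the sharding only — `shard_videos` is my own helper, not something in this repo:

```python
def shard_videos(videos, num_shards):
    """Round-robin split of the video list, one shard per GPU process.
    Each process would then be launched with CUDA_VISIBLE_DEVICES set
    to its own device index so the shards run fully in parallel."""
    return [videos[i::num_shards] for i in range(num_shards)]

shards = shard_videos([f"vid{i:03d}" for i in range(10)], 4)
for gpu_id, shard in enumerate(shards):
    print(f"GPU {gpu_id}: {shard}")
```

Round-robin keeps the shards balanced to within one video, so no single GPU becomes the straggler.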
There is no test set in ucf24; the subset is still
OK, got it. My full-test evaluation is still running; it will take a few more hours. I will post the results once I have them.
Here are the results of eval_framewise_dets(). I do not know which number is the one reported in Table 5. And this is after 6 epochs of training.
[INFO: evaluation.py: 545]: Evaluating frames for datasets ucf24
[INFO: evaluation.py: 626]: MAP:: 18.945189223935206
Results for test & action_ness
[INFO: gen_dets.py: 379]: action_ness : 170580 : 5567603 : 22.376298904418945 : 95.04982829093933
Results for test & action
[INFO: gen_dets.py: 379]: Basketball : 1323 : 2103555 : 11.37794777750969 : 97.73242473602295
Results for test & frame_actions
[INFO: gen_dets.py: 379]: Non_action : 20762 : 159289 : 70.35434605863412
Here are the results of build_eval_tubes(). Again, this is after 6 epochs of training.
[INFO: evaluation.py: 347]: Evaluating tubes for datasets ucf24
Results for test & action @ 0.20 stiou
[INFO: tubes.py: 87]: Basketball : 35 : 19411 : 10.160548239946365 : 91.42857193946838
Results for test & action @ 0.50 stiou
[INFO: tubes.py: 87]: Basketball : 35 : 19411 : 0.5247263237833977 : 48.571428656578064
Honestly, what is going on here? I will rerun it myself on my new machine and let you know the results. The results look too bad to be true.
I did not make changes to the algorithm itself. I guess the biggest factor is that I only trained the model for 6 epochs. How many epochs should I train to get good performance?
CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py /home/user/ /home/user/ /home/user/kinetics-pt/ --MODE=train --ARCH=resnet50 --MODEL_TYPE=I3D --DATASET=ucf24 --TRAIN_SUBSETS=train --VAL_SUBSETS=val --SEQ_LEN=8 --TEST_SEQ_LEN=8 --BATCH_SIZE=4 --LR=0.00245 --MILESTONES=6,8 --MAX_EPOCHS=10
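The command above trains for MAX_EPOCHS=10 with MILESTONES=6,8, i.e. a MultiStepLR-style stepped schedule: the learning rate drops at epochs 6 and 8, so a 6-epoch checkpoint has not yet seen either LR drop. A sketch of that schedule (the step factor of 0.1 is an assumption; check the repo's optimizer setup):

```python
def stepped_lr(base_lr, milestones, epoch, gamma=0.1):
    """Learning rate after `epoch` epochs under a MultiStepLR-style
    schedule: multiplied by gamma at each milestone already passed."""
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops

for epoch in range(10):
    print(epoch, stepped_lr(0.00245, [6, 8], epoch))
```

This is why stopping at epoch 6 tends to understate final accuracy: most of the fine-tuning happens after the LR drops.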
I went to have a look at YOWO; its performance on AVA is lacking by quite a bit, at almost 2/3 of SlowFast's.
The ucf24 Google Drive directory is empty. I am training now with 4 x 1080Ti GPUs on ucf24. It runs very slowly: "Itration [1/10]006291/121470" took almost 2 hours. At this speed, 1 epoch will take 40 hours. I do not know what is wrong and why it runs so slowly.
Thanks.