
Possible causes for low StrongSORT++ results on MOT17 evaluation #74

Closed
Edo745 opened this issue Feb 17, 2023 · 4 comments


Edo745 commented Feb 17, 2023

Hi, I would like to use StrongSORT for a university project, and my current goal is to replicate the results reported in your paper. I have started with the validation set, as follows:

  1. I ran StrongSORT on the MOT17 validation set using the following command:

!python strong_sort.py MOT17 val --BoT --ECC --NSA --EMA --MC --woC --AFLink --GSI --save_dir StrongSORT/results/MOT17-val_results/StrongSORT++

  2. I evaluated the performance of StrongSORT++ using the gt.txt files from the MOT17/train directory, which I downloaded from https://motchallenge.net/data/MOT17/:

!python scripts/run_mot_challenge.py \
  --GT_FOLDER StrongSORT/TrackEval/data/gt/mot_challenge/ \
  --BENCHMARK MOT17 \
  --SPLIT_TO_EVAL train \
  --TRACKERS_TO_EVAL StrongSORT/results/MOT17-val_results/StrongSORT++ \
  --TRACKER_SUB_FOLDER '' \
  --METRICS HOTA CLEAR Identity VACE \
  --USE_PARALLEL False \
  --NUM_PARALLEL_CORES 1 \
  --GT_LOC_FORMAT '{gt_folder}/{seq}/gt/gt.txt' \
  --OUTPUT_SUMMARY False \
  --OUTPUT_EMPTY_CLASSES False \
  --OUTPUT_DETAILED False \
  --PLOT_CURVES False \
  --SEQMAP_FILE StrongSORT/TrackEval/data/gt/mot_challenge/seqmaps/MOT17-train.txt

Although these results are for the validation set, they are much lower than those reported in the paper:

HOTA DetA AssA DetRe DetPr AssRe AssPr LocA OWTA HOTA(0) LocA(0) HOTALocA(0)
41.587 33.999 51.061 35.731 81.15 54.548 84.575 86.627 42.682 48.385 83.596 40.448

MOTA MOTP MODA CLR_Re CLR_Pr MTR PTR MLR sMOTA CLR_TP CLR_FN CLR_FP IDSW MT PT ML Frag
37.753 84.933 37.878 40.955 93.013 20.696 31.136 48.168 31.582 45.991 66.306 3.455 141 113 170 263 183

IDF1 IDR IDP IDTP IDFN IDFP
53.254 38.351 87.099 43.067 69.230 6.379

Dets GT_Dets IDs GT_IDs
49.446 112.297 395 546

I would appreciate any guidance on what might be the cause of this discrepancy.

dyhBUPT (Owner) commented Feb 17, 2023

Hi, the gt files you downloaded contain all frames of each sequence.
However, our tracking results on the validation set cover only the second half of the frames.
Please use our provided gt files instead.
Details in https://github.com/dyhBUPT/StrongSORT#evaluation.

Best wishes.
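The mismatch the owner describes (full-sequence gt vs. half-sequence tracker output) can be spotted quickly by comparing frame ranges, since MOT-format files are comma-separated with the frame index in the first column. A minimal diagnostic sketch (the helper name and paths are illustrative, not part of StrongSORT or TrackEval):

```python
import csv

def frame_range(path):
    """Return (min_frame, max_frame) seen in a MOT-format txt file,
    where each row starts with the frame index."""
    frames = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if row:
                frames.append(int(float(row[0])))
    return min(frames), max(frames)

# Example usage (paths are hypothetical):
# gt_lo, gt_hi = frame_range("gt/MOT17-02/gt/gt.txt")
# tr_lo, tr_hi = frame_range("results/MOT17-02.txt")
# if tr_lo > gt_lo:
#     print("gt covers the full sequence but the tracker output starts later:"
#           " evaluate against the half-split gt files instead")
```

If the tracker output starts well after frame 1 while the gt starts at frame 1, every first-half gt object is scored as a miss, which explains the depressed metrics.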

@Edo745 Edo745 closed this as completed Feb 17, 2023

Jumabek commented Oct 6, 2023

@dyhBUPT even so, aren't these results too low compared to those on the MOT17 test set?

I mean, 40 HOTA vs 63 HOTA suggests to me there could be some bug.


Jumabek commented Oct 6, 2023

@dyhBUPT I see now.

It is because detections are only provided for the second half of each video.
Since there are no detections in the first half, the tracker performs poorly there, which drags down the overall scores.


Jumabek commented Oct 6, 2023

For those who have a similar issue:

I solved it by downloading these author-provided splits of MOT17:
https://drive.google.com/drive/folders/1wdG-vJOMGynf5QjqEa1jNZNYtz0PQqPu

Then put them into the gt folder of each of the 7 original MOT17 train sequences as gt_val_half_v2.txt.

Then running the command below produces the expected results:

python scripts/run_mot_challenge.py --BENCHMARK MOT17 --SPLIT_TO_EVAL train --TRACKERS_TO_EVAL /home/juma/code/StrongSORT/results/StrongSORT_Git/tmp/ --TRACKER_SUB_FOLDER '' --METRICS HOTA CLEAR Identity VACE --USE_PARALLEL False --NUM_PARALLEL_CORES 1 --GT_LOC_FORMAT '{gt_folder}/{seq}/gt/gt_val_half_v2.txt' --OUTPUT_SUMMARY False --OUTPUT_EMPTY_CLASSES False --OUTPUT_DETAILED False --PLOT_CURVES False
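If the author-provided split files are unavailable, a second-half gt can be approximated from the full gt.txt. This is a hedged sketch, not the author's code: it assumes the common convention of splitting at `seqLength // 2` (read from each sequence's seqinfo.ini), which may not match the official gt_val_half_v2.txt files exactly, so prefer those when possible. The function name and output filename are made up for illustration:

```python
import configparser
import os

def write_val_half_gt(seq_dir, out_name="gt_val_half.txt"):
    """Write a gt file containing only second-half annotations of one
    MOT17 sequence directory (which holds seqinfo.ini and gt/gt.txt)."""
    # read the sequence length from seqinfo.ini shipped with each sequence
    ini = configparser.ConfigParser()
    ini.read(os.path.join(seq_dir, "seqinfo.ini"))
    seq_len = int(ini["Sequence"]["seqLength"])
    split = seq_len // 2  # frames 1..split = train half, the rest = val half

    gt_path = os.path.join(seq_dir, "gt", "gt.txt")
    out_path = os.path.join(seq_dir, "gt", out_name)
    with open(gt_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            frame = int(line.split(",")[0])  # MOT format: frame is column 1
            if frame > split:
                fout.write(line)
    return out_path
```

Point `--GT_LOC_FORMAT` at the resulting filename, as done with gt_val_half_v2.txt above.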

**Some notes on why the MOT17 train set is split into train and val:**
So that the detector sees the backgrounds of MOT17 by training on the first half of each of the 7 train sequences/videos.

Isn't this an invalid evaluation?
(+) In practical scenarios such as a mounted camera, the background is already known. So it is actually useful if we can obtain better performance by incorporating background, object scale, and other information from the first half (some portion) of the videos to be tracked.
(-) MOT17 sequences are already short for evaluating trackers. Longer sequences can reveal which trackers are better at association. However, splitting them into train/val makes the sequences even shorter. This in turn biases the evaluation toward the detector side, meaning trackers built on strong detectors perform well.
