Reading MOT dataset with seqinfo produces 0-based indexing in frames #560

Closed
maartenvds opened this issue Nov 23, 2021 · 5 comments · Fixed by #564
Labels
BUG Something isn't working data formats PR is related to dataset formats

Comments

@maartenvds
Contributor

According to the MOT data format spec, "All frame numbers, target IDs and bounding boxes are 1-based" (quote from https://github.com/dendorferpatrick/MOTChallengeEvalKit/tree/master/MOT#data-format). If this claim is correct, there is a bug in mot_format.py.

When testing the CVAT tool, which uses datumaro, I could not import a MOT dataset: the import failed with the error "ValueError: Unknown internal frame id -1\n". To debug, I installed CVAT in developer mode (with VS Code) and found two issues that together cause this error, one of which affects this repository. In mot_format.py, on line 147 (latest develop branch), there are the following lines:

for row in csv.DictReader(csv_file, fieldnames=MotPath.FIELDS):
    frame_id = int(row['frame_id'])
    item = items.get(frame_id)

Since items contains frame ids that start at zero, the item with id zero is never accessed in this loop, because frame_id starts at 1 when a MOT annotation file is loaded. Also, an out-of-range lookup can occur when frame_id equals the length of the sequence (which is possible since it is 1-based).
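To make the mismatch concrete, here is a minimal sketch of the lookup described above. It is not the datumaro code: plain dicts stand in for DatasetItem, FIELDS is a shortened, hypothetical version of MotPath.FIELDS, and match_rows is a helper invented for illustration.

```python
import csv
import io

# Hypothetical stand-in for MotPath.FIELDS (shortened for the example).
FIELDS = ["frame_id", "track_id", "x", "y", "w", "h"]


def match_rows(items, gt_text, offset=0):
    """Split annotation frame ids into (matched, missed) against items."""
    matched, missed = [], []
    for row in csv.DictReader(io.StringIO(gt_text), fieldnames=FIELDS):
        frame_id = int(row["frame_id"]) - offset
        if frame_id in items:
            matched.append(frame_id)
        else:
            missed.append(frame_id)
    return matched, missed


seq_length = 3
# Items keyed 0-based (0, 1, 2), as produced when a seqinfo file is used.
items = {i: {"id": i} for i in range(seq_length)}
# Annotation rows with 1-based frame numbers (1, 2, 3), per the MOT spec.
gt_text = "1,1,10,10,5,5\n2,1,11,10,5,5\n3,1,12,10,5,5\n"

# Without an offset, item 0 is never matched and frame 3 finds no item.
print(match_rows(items, gt_text))
# Subtracting 1 from each annotation frame id lines the two up again.
print(match_rows(items, gt_text, offset=1))
```

With these inputs the first call reports frames 1 and 2 as matched and frame 3 as missed, while the second matches all three rows.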

My suggested fix:

for row in csv.DictReader(csv_file, fieldnames=MotPath.FIELDS):
    frame_id = int(row['frame_id']) - 1 # one based frame ids
    item = items.get(frame_id)
@zhiltsov-max
Contributor

zhiltsov-max commented Nov 23, 2021

Hi, thank you for reporting the problem! We probably need to review the indexing logic. As far as I can see, we already subtract 1 in the CVAT format handler. Do you use a seqinfo file?

@maartenvds
Contributor Author

maartenvds commented Nov 23, 2021

Yes, I use a seqinfo file. I also reported a related bug on the CVAT repo, cvat-ai/cvat#3940 (I included a description of the .zip file I used over there). Since datumaro returned zero-indexed item ids, regardless of my suggested fix, the -1 in the CVAT format handler is wrong and also contributed to this problem. But indeed, it's a good thing to review the indexing logic. However, it would be nice if this issue got fixed as soon as possible.
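The interaction between the two off-by-one assumptions can be illustrated with a small sketch. The function below is hypothetical and is not the CVAT handler; it only assumes, as described above, that the consumer subtracts 1 from each item id and rejects frame numbers it does not know.

```python
def to_internal_frame(item_id: int, known_frames: range) -> int:
    """Hypothetical consumer step that assumes 1-based item ids."""
    frame = item_id - 1
    if frame not in known_frames:
        raise ValueError(f"Unknown internal frame id {frame}")
    return frame


known = range(3)  # internal frames 0, 1, 2

# A 0-based item id of 0 is shifted out of range.
try:
    to_internal_frame(0, known)
    message = None
except ValueError as error:
    message = str(error)
print(message)

# 1-based item ids map cleanly onto the internal frames.
print([to_internal_frame(i, known) for i in (1, 2, 3)])
```

Under that assumption, a reader that emits 0-based ids and a consumer that subtracts 1 together produce frame -1, which matches the shape of the reported error.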

@zhiltsov-max
Contributor

Yes I use a seqinfo file.

Then, probably, the fix should be done here: https://github.com/openvinotoolkit/datumaro/blob/develop/datumaro/plugins/mot_format.py#L125-L126

We should just start from 1 instead of 0.

@maartenvds
Contributor Author

We should just start from 1 instead of 0.

That also works and does not require modifications to the CVAT tool.
I implemented the following to test your suggestion:

if self._seq_info:
    for frame_id in range(1, self._seq_info['seqlength'] + 1):  # base-1 frame ids
        items[frame_id] = DatasetItem(
            id=frame_id,

And it works!

Shall I make a PR for this?
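For readers who want to try the idea outside datumaro, here is a self-contained sketch of building 1-based items from a seqinfo file. It assumes the usual MOT seqinfo.ini layout with a [Sequence] section and a seqLength key, and it uses Python's configparser, which lowercases option names by default; whether the datumaro reader parses the file the same way is an assumption. Plain dicts stand in for DatasetItem, and the sequence name is made up.

```python
import configparser

# Example seqinfo content; the values are invented for illustration.
SEQINFO = """\
[Sequence]
name=example-sequence
seqLength=3
"""

parser = configparser.ConfigParser()
parser.read_string(SEQINFO)
# configparser lowercases option names, so seqLength becomes 'seqlength'.
seq_info = dict(parser["Sequence"])

items = {}
# Start at 1 and include seqLength so the keys match 1-based MOT frames.
for frame_id in range(1, int(seq_info["seqlength"]) + 1):
    items[frame_id] = {"id": frame_id}  # stand-in for DatasetItem

print(sorted(items))
```

The values read by configparser are strings, hence the explicit int() conversion here; the snippet quoted above appears to work with an already converted value.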

@zhiltsov-max
Contributor

Glad to hear this.

Shall I make a PR for this?

Yes, it would be great!

@zhiltsov-max zhiltsov-max added BUG Something isn't working data formats PR is related to dataset formats labels Nov 24, 2021
@zhiltsov-max zhiltsov-max changed the title Bug in mot_format.py Reading MOT dataset with seqinfo produces 0-based indexing in frames Nov 24, 2021
maartenvds added a commit to maartenvds/datumaro that referenced this issue Nov 24, 2021
zhiltsov-max pushed a commit that referenced this issue Nov 24, 2021
* Suggested fix for upstream issue #560

* Added unit test for mot_format.py that covers a dataset with seqinfo.ini

* Updated changelog with bugfix info
@zhiltsov-max zhiltsov-max linked a pull request Dec 21, 2021 that will close this issue