Reproduction of Perception Models #107

Open
mrabiabrn opened this issue Nov 14, 2024 · 4 comments

@mrabiabrn

Hi,
When I checked the CVT results in Table 1 and Table 3, I noticed a discrepancy. In Table 1, the oracle performance for vehicle segmentation is reported as 33.66, which I understood to be your reproduction of CVT. However, in Table 3, the performance is listed as 36.0, consistent with the original paper. Additionally, it seems that the augmentation performance in Table 3 is added on top of the originally reported results.

Does the increase come from 33.66 or 36.0? Could you clarify this?

Thank you

@flymin
Member

flymin commented Nov 18, 2024

33.66 is measured with 224x400 inputs: the real images are first downsampled and then upsampled again, so that they go through the same pipeline we use to test the generated views.

36.00 is from the original settings of CVT, using the raw data from the validation set. Our reproduction matched this performance.

@mrabiabrn
Author

So the evaluation result of 36.00 corresponds to the 224x448 resolution (as in the original paper), using the raw nuScenes validation set.
The result of 33.66 was obtained by evaluating at 224x400 (the model was trained at 224x448).
For Table 3, the model was trained at 224x448 using a mixed dataset (real + generated).

@flymin
Member

flymin commented Nov 18, 2024

If the original CVT setting is 224x448, then yes (I forgot the details of CVT).

We did not change the original data processing pipeline of CVT, which loads from 900x1600 images. Therefore, to use generated views, we upsample and pad to 900x1600 and then go through CVT. Oracle is obtained with 900x1600 -> 224x400 -> 900x1600 -> CVT.
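
For readers following the resolutions, here is a minimal sketch of the shape arithmetic behind the "upsample and pad" step. It is not code from the repository. The function name `roundtrip_shapes` and its return keys are hypothetical, and it assumes the upsampling is a uniform scale chosen to match the 1600-pixel width. The thread does not say where the padding rows are placed or how 900x1600 is reduced to 224x400, so the sketch only reports how many rows are missing.

```python
def roundtrip_shapes(raw_hw=(900, 1600), low_hw=(224, 400)):
    """Shape bookkeeping for low-res -> raw-res upsampling plus padding.

    Assumption (not stated in the thread): the low-res image is scaled
    uniformly so that its width matches the raw width.
    """
    raw_h, raw_w = raw_hw
    low_h, low_w = low_hw

    # Uniform scale factor that restores the raw width.
    scale = raw_w / low_w

    # Height after upsampling with that same factor.
    up_h = round(low_h * scale)

    # Rows still missing to reach the raw height; these must be padded.
    pad_rows = raw_h - up_h

    return {
        "scale": scale,
        "upsampled_hw": (up_h, raw_w),
        "pad_rows": pad_rows,
    }


if __name__ == "__main__":
    print(roundtrip_shapes())
```

Under these assumptions, a 224x400 view scaled by 4 becomes 896x1600, which leaves 4 rows to pad before the image matches the 900x1600 size that CVT's loader expects.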

@mrabiabrn
Author

Thank you for the clarification 👍🏻
