Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CosPlace for global features extraction #257

Merged
merged 2 commits into from
Feb 12, 2023
Merged

Add CosPlace for global features extraction #257

merged 2 commits into from
Feb 12, 2023

Conversation

gmberton
Copy link
Contributor

@gmberton gmberton commented Jan 4, 2023

I added CosPlace as a global features extractor as it is the latest SOTA in image retrieval for geolocalization.
There are multiple models available, I set as default the one with ResNet50 and features dimensionality of 2048, which provides a good trade-off between speed and accuracy.
However, I didn't provide any way to use the other models without changing the hardcoded default parameters in default_conf here. Perhaps I could add the option to select other models? I'm thinking that it might be useful to have lightweight features for large datasets (e.g. CosPlace with ResNet-101 with features dimensionality 128 which still outperforms 30x larger features from NetVLAD and OpenIBL).
Link to CosPlace paper - sorry for the self-promotion 😉 but the link should be useful to back the claims

@sarlinpe
Copy link
Member

Thanks for the PR, this is great! Nice work, self-promotion is of course encouraged :) Just out of curiosity: are you able to run the reconstruction + localization on Aachen v1.1 (using this pipeline) and submit the results to visuallocalization.net so we see how much this actually helps over NetVLAD? You can set --num_loc to 10-20 to magnify the improvements.

@gmberton
Copy link
Contributor Author

gmberton commented Feb 7, 2023

Hello!
It took me a while to reply, because honestly I couldn't make sense of the results (and there was CVPR rebuttal), so I started computing more and more experiments.
Here is a table of my experiments:

Method Backbone Features Dim --num_loc Day Night
NetVLAD VGG-16 4096 10 88.5 / 95.0 / 98.2 78.6 / 88.8 / 95.9
CosPlace ResNet-18 256 10 87.5 / 94.8 / 97.6 79.6 / 85.7 / 92.9
CosPlace ResNet-50 2048 10 88.6 / 95.3 / 98.5 79.6 / 87.8 / 96.9
CosPlace ResNet-152 2048 10 89.1 / 95.8 / 98.9 80.6 / 86.7 / 96.9
NetVLAD VGG-16 4096 20 89.4 / 95.9 / 99.3 84.7 / 92.9 / 99.0
CosPlace ResNet-18 256 20 89.0 / 95.1 / 98.3 80.6 / 89.8 / 96.9
CosPlace ResNet-50 2048 20 89.8 / 96.0 / 99.2 82.7 / 89.8 / 98.0
CosPlace ResNet-152 2048 20 90.4 / 96.0 / 99.3 84.7 / 90.8 / 99.0
NetVLAD VGG-16 4096 50 90.3 / 96.1 / 99.4 83.7 / 93.9 / 100.0
CosPlace ResNet-18 256 50 90.3 / 95.8 / 99.0 85.7 / 93.9 / 99.0
CosPlace ResNet-50 2048 50 90.8 / 96.5 / 99.4 82.7 / 92.9 / 98.0
CosPlace ResNet-152 2048 50 89.9 / 96.4 / 99.4 83.7 / 93.9 / 99.0

I was expecting to see CosPlace vastly outperform NetVLAD (the image below are results on geo-localization retrieval dataset, and the gap between CosPlace and NetVLAD is very wide especially on Tokyo which contains night images), but it seems that results are quite similar. My guess is that for such a small dataset as Aachen, the retrieval method is not that important, and that Aachen is quite saturated as a dataset (at least for what concerns the retrieval method).
What is your take on this? I'm not very familiar with visual localization methods and this repo, so I'm just speculating. Also, do you think it would be helpful to compute results with num_loc in [1, 2, 5]?

backbones

PS: these results are on Aachen Day-Night, not on Aachen Day-Night v1.1. Somehow using the pipeline from Aachen_v1_1, the output file has 922 lines, which is the number of queries in Aachen Day-Night.

PPS: I see that there are a few closed issues (e.g. this) from people who had troubles running the pipeline on Aachen. I had the same issue, because I downloaded the dataset from visuallocalization.net. Once I saw the README in the pipelines/Aachen_v1_v1 then my problems were solved :)

@sarlinpe
Copy link
Member

sarlinpe commented Feb 12, 2023

Nice, thanks for the extensive results! Aachen is indeed pretty saturated, so these results only confirm that CosPlace works at least as well as NetVLAD. Differences of <1% are not reliable because of the small number of queries (especially on night) and the randomness in the pipeline. Aachen v1.1 is a tad better w.r.t. size but I don't expect a large difference. Our LaMAR benchmark is much better on this point so I'll run a few evals when I find the time.

Thanks again for all your work!

@sarlinpe sarlinpe merged commit 61e0cd0 into cvg:master Feb 12, 2023
@gmberton
Copy link
Contributor Author

Hi @sarlinpe @Phil26AT
I'm back on this PR following our conversation on adding different architectures and EigenPlaces.

So here is the list of the architectures available for CosPlace (multiple backbones/output dimensionality). To change the architecture it is only needed to change the two parameters of default_conf

What do you think is the cleanest way to allow the user to change the architecture?

And regarding EigenPlaces, the code is basically the same as for CosPlace (and also for EigenPlaces multiple architectures are available from torch.hub) and it is only required to change the "CosPlace" in this line to "EigenPlaces".

So basically passing 3 parameters (backbone, dimensionality and method, where method can be CosPlace or EigenPlaces) would provide the opportunity to choose among >40 models.

@sarlinpe
Copy link
Member

sarlinpe commented Oct 16, 2023

This is already great then, users can already overwrite any of these parameters when calling extract_features.main from Python. This is currently not supported from the CLI but I'll fix this soon. Could you send a PR to add an additional config entry (say variant) to switch between CosPlace and EigenPlaces? Maybe making the latter as default? Thanks!!

@gmberton
Copy link
Contributor Author

Sure I'd be happy to make a PR.
Just to clarify: are you referring to adding one more entry to this default_conf dict?
If that's what you mean, I think it would be a bit confusing to have the model for EigenPlaces within a file called cosplace.py and in a model called CosPlace. I would create a separate file eigenplaces.py: this has the disadvantage of duplicating some lines of code, but I think it is more readable and anyway I think those parts of the code are very rarely modified, making code duplication not too big of an issue.
What do you think about it?

@sarlinpe
Copy link
Member

Just to clarify: are you referring to adding one more entry to this default_conf dict?

Yes. You might rename the model+file to accommodate both Eigen/CosPlace, but I'd really prefer having a single file since the two models are conceptually very similar.

@sarlinpe
Copy link
Member

Is EigenPlaces always better than CosPlace? if so, can we just replace CosPlace with EigenPlaces? I haven't seen it much used so far anyway.

@gmberton
Copy link
Contributor Author

Yes, EigenPlaces is always better than CosPlace on standard VPR datasets. I'm not sure if this directly translates to better performances when used as a pre-processing step for tasks like SfM and visual localization, but my guess is that yes, it does.
If you don't want to add a new file for EigenPlaces, then I'm okay with adding the variant entry to the dictionary, and set EigenPlaces as the default, while also allowing the user to switch to CosPlace.
Should I set the ResNet101 as the default model (instead of the current ResNet50)? It gives slightly better results.
I'll be waiting for your final confirmation and then create the PR

@sarlinpe
Copy link
Member

I'm okay with adding the variant entry to the dictionary, and set EigenPlaces as the default, while also allowing the user to switch to CosPlace. Should I set the ResNet101 as the default model (instead of the current ResNet50)?

Sounds good, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants