Additional issues trying to finetune on custom (VQA-like) dataset (VizWiz) #105
Hi, I'm afraid there is a misunderstanding. The […]
Thank you for the very quick response. I figured that might be the case, thanks for the clarification. Is there maybe a specific resource on how to prepare the file for a custom dataset? Another minor question: in my created .tsv files, the index starts at 0 for each subset. Is that fine, or do I have to add an offset for the other subsets?
The index of samples from each subset can overlap. The most important thing is to keep the indices of the test samples consistent with the original dataset, so that you get the correct evaluation score when submitting to the official evaluation server.
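(For anyone preparing their own splits, here is a minimal sketch of what this means in practice. It assumes a simple tab-separated layout and uses placeholder field names such as `question_id`; it is not OFA's exact schema.)

```python
import csv

def write_split(samples, path, keep_original_ids=False):
    """Write one subset to a .tsv file.

    Train/val indices may simply restart at 0 (overlap across subsets is fine);
    for the test split, keep the original ids so predictions can be matched
    by the official evaluation server.
    """
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        for i, sample in enumerate(samples):
            # "question_id", "image_id", etc. are placeholder keys for illustration
            uniq_id = sample["question_id"] if keep_original_ids else i
            writer.writerow([uniq_id, sample["image_id"], sample["question"], sample["answer"]])
```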
Thanks again for the extensive answer. So, the contents of the pickle file are the most frequent answers from both the train and validation subset (combined), right? Is there a guideline on how many frequent answers to include? (Small correction for anyone reading this in the future: the overflow problem is mentioned in issue #59)
Thanks a lot for pointing out the wrong issue reference! ❤️ I've corrected my comment above. On VQA, our choice of the candidate answer set mainly follows the common practice of previous works (VLMo, UNITER, etc.), which use a fixed set of 3,129 frequent answers. I would recommend referring to previous works on VizWiz to determine a proper size for the frequent-answer candidate set. Please also make sure that all answers in the pre-processed (possibly filtered) training samples are covered by the pickle dict, to avoid the overflow issue during training.
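(A minimal sketch of building such a pickle, assuming VizWiz-style annotation JSONs where each sample carries an "answers" list; the file name `trainval_ans2label.pkl` and the 3,129 cut-off follow the VQA convention mentioned above and are only defaults here. `train_samples` in the second helper stands for your already pre-processed rows with a single target answer.)

```python
import json
import pickle
from collections import Counter

def build_ans2label(train_json, val_json, out_path="trainval_ans2label.pkl", top_k=3129):
    """Map the top_k most frequent answers (train + val combined) to integer labels."""
    counter = Counter()
    for ann_path in (train_json, val_json):
        with open(ann_path) as f:
            for sample in json.load(f):
                for ans in sample.get("answers", []):
                    counter[ans["answer"]] += 1
    ans2label = {ans: idx for idx, (ans, _) in enumerate(counter.most_common(top_k))}
    with open(out_path, "wb") as f:
        pickle.dump(ans2label, f)
    return ans2label

def uncovered_answers(train_samples, ans2label):
    """Return target answers missing from the candidate set; filter or re-map
    these samples to avoid the overflow issue mentioned above."""
    return {s["answer"] for s in train_samples if s["answer"] not in ans2label}
```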
Many thanks for all the help. It seems to be running pretty well now, given the small size of the dataset.
Hi Velcorn, […]
Hey, sorry for the late answer. I've written this script to generate the .pkl file and .tsv files from the VizWiz-VQA dataset: https://github.com/Velcorn/OFA/blob/main/dataset/preprocess_vizwiz.py
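(Not the linked script itself, just a rough sketch of the kind of conversion involved. It assumes VizWiz-style annotation JSONs and the tab-separated row layout used in OFA's VQA examples, i.e. uniq-id, image-id, question, answer with confidence, object labels, base64 image; double-check the exact column order and answer format against the repo before using it.)

```python
import base64
import json
from pathlib import Path

def image_to_base64(image_path):
    """Encode an image file as a base64 string for the last .tsv column."""
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

def convert_split(ann_json, image_dir, out_tsv):
    """Convert one VizWiz split into a tab-separated file (rough sketch)."""
    with open(ann_json) as f:
        samples = json.load(f)
    with open(out_tsv, "w") as out:
        for i, s in enumerate(samples):
            # "1.0|!+answer" mirrors the confidence/answer format seen in OFA's
            # VQA data; verify it against the repo's own examples.
            answers = s.get("answers", [])
            answer_field = "&&".join(f"1.0|!+{a['answer']}" for a in answers[:1])
            img_b64 = image_to_base64(Path(image_dir) / s["image"])
            # object labels column left empty in this sketch
            row = [str(i), s["image"], s["question"], answer_field, "", img_b64]
            out.write("\t".join(row) + "\n")
```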
Thank you so much Velcorn for sharing! It helps a lot!
You're welcome. Let me know if you have any questions!
Hello, first I'd like to thank you for your amazing work and especially all the detailed answers to issues.
I've been following the different issues on finetuning on a custom dataset (VizWiz) and produced the .tsv files according to your format. You stated in issue #76 that the trainval_ans2label.pkl file is not used with beam-search evaluation - is this correct?
I've skipped its creation, and training does run for the first epoch. However, during validation on the valid subset I get an assertion error in sequence_generator.py. I've tracked down the error and can "fix" it by removing the one extra step that is for the EOS marker, but my understanding of how to properly fix it is limited.
To give some more information on how the .tsv files look, I have attached an image for the train and val subsets.
Thank you very much for any kind of input in advance!
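(For context on the "extra step": fairseq-style sequence generators run the decoding loop one step past the maximum target length so that EOS can still be emitted for full-length hypotheses. The toy snippet below only illustrates that idea; it is not the actual OFA/fairseq code, and removing the "+ 1" mainly means full-length hypotheses never get an EOS appended.)

```python
# Toy greedy decoder, illustrative only (not the OFA/fairseq implementation).
# With max_len content positions, the EOS token needs one extra loop step.
EOS = 2

def toy_decode(picked_tokens, max_len):
    """picked_tokens: token ids a toy 'model' would choose at each step."""
    tokens = []
    for step in range(max_len + 1):  # one extra step for the EOS marker
        next_token = picked_tokens[step] if step < max_len else EOS
        tokens.append(next_token)
        if next_token == EOS:
            break
    return tokens

print(toy_decode([5, 7, 9], max_len=3))  # -> [5, 7, 9, 2]
```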