This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Training GQNs on CLEVR dataset #15

Open
loganbruns opened this issue Jun 19, 2019 · 0 comments

I have run some limited experiments training Generative Query Networks (GQNs) on the CLEVR dataset, so that I can try using them in place of ResNet-based embeddings for CLEVR VQA.

I modified the CLEVR dataset generator to create additional images and metadata so that GQNs can be trained on the CLEVR domain. Specifically, it renders multiple views of the same scene from different perspectives, using a camera that moves along a ring. It also records the camera pose for every image, including the original one, so the pose can be used both for GQN training and for generating embeddings of the original CLEVR image.
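As a rough illustration of the ring of viewpoints, here is a minimal sketch that places cameras evenly on a horizontal circle, each looking at the scene center, and returns a pose per view. It is not the code from the branch: the function name `ring_camera_poses`, its parameters, the z-up axis convention, and the (position, yaw, pitch) pose layout are all assumptions made for this example.

```python
import math


def ring_camera_poses(num_views, radius, height, target=(0.0, 0.0, 0.0)):
    """Return poses for cameras evenly spaced on a horizontal ring.

    Hypothetical helper: assumes a z-up world and describes each pose as a
    position plus the yaw and pitch (in radians) that point at `target`.
    """
    poses = []
    for i in range(num_views):
        angle = 2.0 * math.pi * i / num_views
        # Camera position on the ring, `height` above the target.
        x = target[0] + radius * math.cos(angle)
        y = target[1] + radius * math.sin(angle)
        z = target[2] + height
        # Direction from the camera to the target.
        dx, dy, dz = target[0] - x, target[1] - y, target[2] - z
        yaw = math.atan2(dy, dx)                    # rotation about the z axis
        pitch = math.atan2(dz, math.hypot(dx, dy))  # negative when looking down
        poses.append({"position": (x, y, z), "yaw": yaw, "pitch": pitch})
    return poses


if __name__ == "__main__":
    for pose in ring_camera_poses(num_views=8, radius=7.0, height=5.0):
        print(pose)
```

Storing the pose next to each rendered image is what lets the same metadata serve both as GQN training input and as the query pose for the original CLEVR image.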

Here is an example from training. For GQNs, the objective is to train the model to predict a new, previously unseen view after being given multiple context views from different angles.

[Image: Screen Shot 2019-06-11 at 6 04 24 AM]

You can see that even with limited training time it does a decent job of predicting the new view, including mostly accurate shadows. Below is a test-time example.

[Image: Screen Shot 2019-06-16 at 1 41 00 PM]
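The objective described above (several context views in, one held-out view predicted) implies splitting each scene's rendered views into a context set and a query. A minimal sketch of that split follows, assuming each view is an (image, pose) pair. The function name `split_context_query` and its parameters are hypothetical and are not taken from the branch.

```python
import random


def split_context_query(views, num_context, rng=None):
    """Pick `num_context` context views and one distinct query view.

    Hypothetical helper: `views` is any sequence of per-scene views,
    for example (image, pose) pairs rendered from the camera ring.
    """
    if not 1 <= num_context < len(views):
        raise ValueError("need at least one context view and one left over for the query")
    rng = rng or random.Random()
    # Sample without replacement so the query view never appears in the context.
    chosen = rng.sample(list(views), num_context + 1)
    context, query = chosen[:-1], chosen[-1]
    return context, query


if __name__ == "__main__":
    # Stand-in views: integers in place of (image, pose) pairs.
    context, query = split_context_query(range(8), num_context=3)
    print("context:", context, "query:", query)
```

During training the model would see the context pairs plus the query pose, and be scored on how well it reconstructs the query image.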

I saw some promising preliminary results using the baseline models in clevr-iep, and I think this might be an interesting area for others to investigate too. The intuition, at least, is that neural scene representations could improve scene understanding.

Before I clean up my code for a pull request, I wanted to ask whether there would be interest in one. Below is the branch with the code that I would generalize and clean up.

master...loganbruns:clevr_gqn

Thanks,
logan
