
some questions on evaluating #5

Open

ALR-alr opened this issue Aug 26, 2024 · 4 comments

Comments

ALR-alr commented Aug 26, 2024

How can the model be evaluated on GLUE tasks? These tasks are text-only, but the paper says “Similar to PLM, when prefix image is none, this task will degenerate into “text-to-image generation” task, forcing the model to generate an image with the input caption”, so how can the model complete text-only tasks?

ALR-alr (Author) commented Aug 26, 2024

Maybe I should only use the transformer encoder in DAVINCI?

shizhediao (Owner) commented Aug 27, 2024

Yes, you can just use the text encoder separately.
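Something like the following should work (a rough sketch using BART as a stand-in, since DAVINCI's text backbone follows BART; the exact module names in this codebase may differ):

```python
import torch
from transformers import BartTokenizer, BartModel

# Stand-in example: run only the encoder of an encoder-decoder model on
# a pure-text input. BartModel is used for illustration; swap in the
# DAVINCI text encoder in practice.
tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartModel.from_pretrained("facebook/bart-base")

inputs = tok("A pure-text input sentence.", return_tensors="pt")
with torch.no_grad():
    enc_out = model.get_encoder()(**inputs)

print(enc_out.last_hidden_state.shape)  # (batch, seq_len, hidden)
```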

ALR-alr (Author) commented Aug 27, 2024

> Yes, you can just use the text encoder separately.

It is said in the paper that "We follow the practice of BART (Lewis et al., 2020) and feed the same input to the encoder and decoder, and the hidden state of the final decoder token is fed into a new multi-class linear classifier or regression head." In my understanding, isn't the decoder input here similar to the one in the original Transformer, where already-generated tokens are used as the decoder input to generate the next token autoregressively? Why does the paper say the decoder input is passed in manually and is the same as the encoder input?
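For reference, this is how I read the BART practice in HuggingFace's `BartForSequenceClassification` (just my reference point; DAVINCI itself may differ):

```python
from transformers import BartTokenizer, BartForSequenceClassification

tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForSequenceClassification.from_pretrained(
    "facebook/bart-base", num_labels=3
)

inputs = tok("The movie was great.", return_tensors="pt")
# No autoregressive generation happens here: the decoder consumes a
# (right-shifted) copy of the encoder input in a single forward pass,
# and the hidden state of the final token feeds the classification head.
out = model(**inputs)
print(out.logits.shape)  # torch.Size([1, 3])
```

So classification would be a single forward pass rather than token-by-token decoding, which is where my confusion comes from.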

ALR-alr (Author) commented Aug 27, 2024

> How can the model be evaluated on GLUE tasks? These tasks are text-only, but the paper says “Similar to PLM, when prefix image is none, this task will degenerate into “text-to-image generation” task, forcing the model to generate an image with the input caption”, so how can the model complete text-only tasks?

As in the code, where `gen_text=text` is passed, the same text is fed to the DAVINCI model, but we need to predict the relationship between two sentences. Why can the hidden state from just one sequence be passed to the classifier to predict the relationship between two sentences?
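Or are the two sentences already joined into one sequence before being fed in? Something like this (my guess, using the BART tokenizer as a stand-in):

```python
from transformers import BartTokenizer

# Pair inputs are concatenated with separator tokens into one sequence,
# so a single final hidden state can attend over both sentences.
tok = BartTokenizer.from_pretrained("facebook/bart-base")
enc = tok("The cat sat on the mat.", "An animal was resting.",
          return_tensors="pt")
print(tok.decode(enc["input_ids"][0]))
# <s>The cat sat on the mat.</s></s>An animal was resting.</s>
```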
I'm sorry that I have so many beginner-level questions...
