Hi,
I'm curious about the input template used when generating queries in V2.
In V1, I found it in convert_msmarco_doc_to_t5_format.py.
In V2, my guess is that it looks something like the following:
When training doc2query-T5, we only use the qrels, where each passage comes without additional information such as doc_title or doc_headings. In the query generation stage, however, we concatenate all of the available information about each passage. Is there a distribution mismatch between training and generation inputs that could affect the final performance? Or is it actually better to use this additional information?
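To make the question concrete, here is a minimal sketch of the two input formats I have in mind. The function names, the field order (title, headings, passage), and the single-space separator are all my own assumptions for illustration; they are not taken from the repository's scripts.

```python
def build_training_input(passage: str) -> str:
    """Training-time input as I understand it: the passage text only."""
    return passage.strip()


def build_generation_input(
    passage: str,
    doc_title: str = "",
    doc_headings: str = "",
    sep: str = " ",  # hypothetical separator, not confirmed by the repo
) -> str:
    """Generation-time input: title, headings, and passage concatenated.

    Empty fields are skipped so a passage without metadata yields the
    same string as build_training_input.
    """
    fields = [doc_title, doc_headings, passage]
    return sep.join(f.strip() for f in fields if f and f.strip())


if __name__ == "__main__":
    passage = "The passage body goes here."
    print(build_training_input(passage))
    print(build_generation_input(passage, "Some title", "Some heading"))
```

If the model only ever sees the first format during training but the second at generation time, that is the mismatch I am asking about.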