RRF limited to 1000 docs? #22

Closed
lintool opened this issue May 26, 2020 · 3 comments
@lintool

lintool commented May 26, 2020

RRF allows the user to specify max_docs:

def reciprocal_rank_fusion(trec_runs, k=60, max_docs=1000, output=sys.stdout):

However, it seems to read only up to 1000 hits from each run:

docs_for_run = r.get_top_documents(topic, n=1000)

Shouldn't max_docs be passed into n?

@x65han

x65han commented Jun 2, 2020

I got this

@joaopalotti
Owner

Thanks, Jimmy, for reporting it!
Johnson Han, thanks for taking care of it. I really appreciate it!
Looking forward to receiving your pull request.

@joaopalotti
Owner

I had a look at this issue and we do not necessarily want to have:
docs_for_run = r.get_top_documents(topic, n=max_docs)

The max_docs parameter only limits the final list of documents after merging. A separate parameter could be added to restrict each run to its top X documents before merging, if there is a use case for that.
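To make the distinction concrete, here is a minimal, self-contained sketch of reciprocal rank fusion for a single topic. It is not the library's code: the function name `rrf_fuse`, the plain-list input format, and the `per_run_depth` parameter are all hypothetical and only illustrate how a per-run cutoff could be kept separate from `max_docs`.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60, max_docs=1000, per_run_depth=None):
    """Fuse ranked lists of doc ids (one list per run) for a single topic.

    per_run_depth is a hypothetical parameter: how many top documents to
    read from each run before merging (None means no per-run cutoff).
    max_docs is applied only to the final fused list.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        top = ranking if per_run_depth is None else ranking[:per_run_depth]
        for rank, doc_id in enumerate(top, start=1):
            # Standard RRF contribution: 1 / (k + rank), with ranks starting at 1.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by doc id for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:max_docs]


# Toy runs, best document first (made-up ids for illustration).
run_a = ["d1", "d2", "d3", "d4"]
run_b = ["d3", "d1", "d5", "d6"]

# Per-run cutoff of 3, final cutoff of 2: the two cutoffs act independently.
top_two = rrf_fuse([run_a, run_b], max_docs=2, per_run_depth=3)
```

With this split, changing `max_docs` never changes which documents each run contributes. It only truncates the merged ranking.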

Closing this issue for now.
