Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search results are different when logged in #4452

Closed
davidism opened this issue Jul 30, 2018 · 22 comments
Closed

Search results are different when logged in #4452

davidism opened this issue Jul 30, 2018 · 22 comments
Labels
Accepted Accepted issue on our roadmap Bug A bug

Comments

@davidism
Copy link

davidism commented Jul 30, 2018

Details

Expected Result

Searching using the Sphinx-generated search page should return the same results whether I am logged in to RTD or not.

Actual Result

When I am logged in to RTD, search returns different results that are missing API links, context, and highlighting. If I do the same search in a private windows, I get the expected results that match our current hosted docs. @davidfischer got the correct results even though he was logged in, the difference is that I'm an owner of the project.

@stsewd stsewd added the Bug A bug label Jul 30, 2018
@davidfischer
Copy link
Contributor

This definitely looks like an API bug. It seems like there is a search API that 404s when a user is logged out.

@davidfischer
Copy link
Contributor

This API 404s when the version is not public. I'm still investigating this to see exactly what the benefits and tradeoffs are.

@safwanrahman safwanrahman added this to the Search improvements milestone Jul 31, 2018
@safwanrahman
Copy link
Member

I do not think its a bug. If the project is not public, we do not show to them to public user.

@davidism
Copy link
Author

Not sure what you mean. I accessed the docs when logged out, so they must have been accessible to the public user. And I got different, correct results doing that than what was served when I was logged in.

@safwanrahman
Copy link
Member

@davidism So the logic is according to permission. In the readthedocs.org site, all the docs are accessed publically, but that does not mean the project information is disclosed publically. So user can see the doc, but they can not search inside the doc. I think @ericholscher can explain more about the feature. While I was working on search rewrite, I was also confused by this issue!

@davidism
Copy link
Author

davidism commented Jul 31, 2018

How is getting different, incorrect results when logged in not a bug? Why is the Sphinx-generated search calling out to the RTD API and changing the results when logged in? RTD should only be serving the built docs at that point, there should be no authentication or API involved.

@safwanrahman
Copy link
Member

With sphinix, its not possible to have many search features. That can be a different design issue.
This can be discussed if we can open up the search regardless of permission.

@safwanrahman
Copy link
Member

I strongly hope our ongoing search upgrade will have more accurate result than current one.

@davidfischer
Copy link
Contributor

I think a combination of proceeding on the search upgrade and an opt-out flag for custom search seems like a good path forward.

@ericholscher
Copy link
Member

ericholscher commented Aug 1, 2018

Showing different results is a bug with privacy levels, which shouldn't effect projects once they are live and public. The poor search results are another issue.

The goal here is to provide better search than Sphinx can with it's basic JS backed search indexes. Sphinx search isn't full-text, and doesn't provide results for a large number of words in the normal documentation text. Our goal is to build something better built on top of elastic search.

The specific issue that would address this is #4289 -- currently we are just doing full-text indexing, without taking into account the code-level objects. Indexing the code objects will make this work much better, and the goal is to have it done in the next couple weeks as part of our Search Improvement GSOC, which will also offer many other search improvements.

Interestingly, the results should be in the pages that we are indexing, so I'm not 100% sure why they aren't showing up.

@ericholscher
Copy link
Member

ericholscher commented Aug 1, 2018

I tried to test flask against our revamped search, but ended up with an issue building the docs with the theme:

Traceback (most recent call last):
  File "/Users/eric/projects/readthedocs.org/user_builds/flask/envs/rtd/lib/python3.5/site-packages/sphinx/builders/html.py", line 1030, in handle_page
    output = self.templates.render(templatename, ctx)
  File "/Users/eric/projects/readthedocs.org/user_builds/flask/envs/rtd/lib/python3.5/site-packages/readthedocs_ext/readthedocs.py", line 135, in rtd_render
    content = old_render(template, render_context)
  File "/Users/eric/projects/readthedocs.org/user_builds/flask/envs/rtd/lib/python3.5/site-packages/sphinx/jinja2glue.py", line 196, in render
    return self.environment.get_template(template).render(context)
  File "/Users/eric/projects/readthedocs.org/user_builds/flask/envs/rtd/lib/python3.5/site-packages/jinja2/environment.py", line 1008, in render
    return self.environment.handle_exception(exc_info, True)
  File "/Users/eric/projects/readthedocs.org/user_builds/flask/envs/rtd/lib/python3.5/site-packages/jinja2/environment.py", line 780, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/eric/projects/readthedocs.org/user_builds/flask/envs/rtd/lib/python3.5/site-packages/jinja2/_compat.py", line 37, in reraise
    raise value.with_traceback(tb)
  File "/Users/eric/projects/readthedocs.org/user_builds/flask/envs/rtd/lib/python3.5/site-packages/sphinx/themes/basic/page.html", line 10, in top-level template code
    {%- extends "layout.html" %}
  File "/Users/eric/projects/readthedocs.org/user_builds/flask/envs/rtd/lib/python3.5/site-packages/pallets_sphinx_themes/pocoo/layout.html", line 16, in top-level template code
    {% set version_warning = current_version.banner() if current_version %}
  File "/Users/eric/projects/readthedocs.org/user_builds/flask/envs/rtd/lib/python3.5/site-packages/jinja2/sandbox.py", line 427, in call
    return __context.call(__obj, *args, **kwargs)
  File "/Users/eric/projects/readthedocs.org/user_builds/flask/envs/rtd/lib/python3.5/site-packages/pallets_sphinx_themes/versions.py", line 141, in banner
    ).format(latest=latest.name, href=latest.href(context))
AttributeError: 'NoneType' object has no attribute 'name'

Seems related to the removal of versions from the flask conf.py and the theme still wanting it.

Edit: Got it working -- looks like it depended on a tagged version being active

@ericholscher
Copy link
Member

I've got results for this in the new branch, and we're working to make sure the proper API reference gets weighted properly.

@davidism
Copy link
Author

davidism commented Aug 1, 2018

When I said "API result", I meant the link to specific API methods, not the literal "API" link, although that does appear to be weighted differently too. The header and api links have a #id hash that takes you to the specific location on the page. None of the results in the RTD search version have these hashes, whereas the Sphinx search version has these at the top in addition to the 18 results. The ability to link to specific sections of docs is one feature Sphinx has over Google and other searches, so I don't want to lose that.

@stale
Copy link

stale bot commented Jan 10, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: stale Issue will be considered inactive soon label Jan 10, 2019
@davidism
Copy link
Author

Seems to still be an issue.

@stale stale bot removed the Status: stale Issue will be considered inactive soon label Jan 10, 2019
@stsewd stsewd added the Accepted Accepted issue on our roadmap label Jan 10, 2019
@davidism
Copy link
Author

davidism commented Jan 28, 2019

Looks like the flag added by #4570 can't be overridden in <head> because the RTD extension always injects its content at the end of <head> irrespective of what the theme templates do. I want to add the following to <head> to disable new search until it returns the same API results as old search:

  {%- if READTHEDOCS and not readthedocs_docsearch %}
    <script>
      if (typeof READTHEDOCS_DATA !== 'undefined') {
        if (!READTHEDOCS_DATA.features) {
          READTHEDOCS_DATA.features = {};
        }
        READTHEDOCS_DATA.features.docsearch_disabled = true;
      }
    </script>
  {%- endif %}

However, RTD's JS, which is injected after the code above, doesn't check if READTHEDOCS_DATA is already defined, and just overwrites it, so the flag gets wiped away.

I can put my override script in the {% block footer %} block,. However, it would be nice to be able to control where RTD injects code.

@davidism
Copy link
Author

Fix for search disable flag in Pallets-Sphinx-Themes: pallets/pallets-sphinx-themes#20

@davidfischer
Copy link
Contributor

However, it would be nice to be able to control where RTD injects code.

Ya, this isn't ideal. We try to inject late for performance reasons but that isn't ideal. I think the longer term solution here is to have a better way to turn off docsearch, the flyout menu and other features (probably in the YAML file).

@dojutsu-user
Copy link
Member

dojutsu-user commented Sep 6, 2019

@davidism
I think this is fixed now.
You can see the search results for flask project here: https://readthedocs.org/projects/flask/search/?q=render_template_string&type=file&project=flask&version=master

@davidism
Copy link
Author

davidism commented Sep 6, 2019

I'm referring to the ones within Sphinx, not from the readthedocs project page.

@dojutsu-user
Copy link
Member

dojutsu-user commented Sep 8, 2019

@davidism
The search results shown on the readthedocs project page will be same within Sphinx.
I have built a demo project for flask which uses Read the Docs's search.
https://flask-dojutsu-user.readthedocs.io/en/latest/search.html?q=render_template_string

@stsewd
Copy link
Member

stsewd commented May 13, 2020

We dropped the privacy levels from .org, and we don't use that to filter results anymore. We are also indexing more content for search now (#7063). The last PR should be deployed next week.

@stsewd stsewd closed this as completed May 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Accepted Accepted issue on our roadmap Bug A bug
Projects
None yet
Development

No branches or pull requests

6 participants