Page views: use origin URL instead of page name #7293

stsewd · 2020-07-15T00:39:27Z

Page name is a Sphinx-only concept,
so we don't have correct page view data for MkDocs projects.

Using the origin URL plus the unresolver gives us the correct path.
Also, we now always store the full path (index.html instead of "/"),
so we don't get duplicates (let me know if we do want to keep duplicates).

closes #7131
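
As a rough illustration of the path normalization described above, the sketch below expands a directory-style URL to its `index.html` form so that both spellings count as the same page. The name `normalize_page_path` and the example domain are hypothetical. The sketch only handles the URL path; it does not resolve the project, language, or version the way the real unresolver does.

```python
from urllib.parse import urlparse


def normalize_page_path(url):
    """Return the URL's path, expanding directory-style paths to index.html.

    Hypothetical helper for illustration only, not the project's unresolver.
    """
    # An empty path (bare domain) is treated the same as "/".
    path = urlparse(url).path or "/"
    if path.endswith("/"):
        path += "index.html"
    return path


# Both spellings of the same page map to one stored path.
print(normalize_page_path("https://docs.example.com/en/latest/"))
print(normalize_page_path("https://docs.example.com/en/latest/index.html"))
```

With this kind of normalization, a view of "/" and a view of "index.html" would be counted under a single entry.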

stsewd · 2020-07-15T00:44:34Z

readthedocs/api/v2/views/footer_views.py

+    - docroot: Path where all the source documents are.
+      Used to build the ``edit_on`` URL.
+    - source_suffix: Suffix from the source document.
+      Used to build the ``edit_on`` URL.


We are passing subproject too, but we are not using it. I can remove it in another PR if that's OK.

stsewd · 2020-07-15T01:04:54Z

readthedocs/rtd_tests/tests/test_footer.py

@@ -418,7 +419,7 @@ class TestFooterPerformance(APITestCase):

    # The expected number of queries for generating the footer
    # This shouldn't increase unless we modify the footer API
-    EXPECTED_QUERIES = 14
+    EXPECTED_QUERIES = 13


We have one query fewer because the test doesn't send the origin parameter, so the check returns early, before doing the query for the feature flag.
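
The missing query comes from short-circuit evaluation of `or`. A minimal sketch of that behavior follows. The `has_feature` function here is a hypothetical stand-in that records each call instead of hitting the database, and `should_store_page_view` is an illustrative name, not code from this PR.

```python
calls = []


def has_feature(name):
    """Hypothetical stand-in for a feature-flag lookup costing one query."""
    calls.append(name)
    return True


def should_store_page_view(origin):
    # `or` short-circuits: when origin is empty, has_feature() is never
    # called, so the feature-flag query is never issued.
    if not origin or not has_feature("STORE_PAGEVIEWS"):
        return False
    return True


should_store_page_view("")  # no lookup recorded
should_store_page_view("https://docs.example.com/en/latest/")  # one lookup
print(len(calls))
```

A test that omits the origin therefore never exercises the feature-flag lookup, which is why the query count drops by one.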

ericholscher

This is a good change logically, but I'd like to see how many queries we're really adding. We should test a request with a domain & a subproject or translation, as that will likely add the most queries.

ericholscher · 2020-07-15T02:46:44Z

readthedocs/rtd_tests/tests/test_footer.py

@@ -418,7 +419,7 @@ class TestFooterPerformance(APITestCase):

    # The expected number of queries for generating the footer
    # This shouldn't increase unless we modify the footer API
-    EXPECTED_QUERIES = 14
+    EXPECTED_QUERIES = 13


We should test this with an origin. I expect the unresolve method to actually add a few queries, which we should be careful about :/

ericholscher · 2020-07-15T02:47:33Z

readthedocs/analytics/signals.py

+    if not origin or not project.has_feature(Feature.STORE_PAGEVIEWS):
+        return
+
+    unresolved = unresolve(origin)


I'm not sure if this is the best approach right now. I think this is going to add a good number of queries to the footer, which is already our most expensive view. If we can just pull the path off the request, I think that is probably best, but maybe calling unresolve is correct, and we should focus on optimizations there?

stsewd · 2020-07-15T22:48:29Z

OK, I added a new method, unresolve_from_request. When we are in proxito (proxied footer), the request has already passed through our middlewares, which set the request.slug attribute and everything else that the unresolver needs.

I updated the tests to run the whole request instead of just the footer view; that's why we have more queries. The unresolver itself is only adding one query. We may see more queries for translations/subprojects, but that applies in general, not only to the unresolver. I'll add more tests later.
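
To make the idea concrete, here is a minimal sketch, assuming the middleware has already attached the project slug to the request as `request.slug`. The `FakeRequest` class, the `unresolve_path_from_request` name, its signature, and the returned tuple are all hypothetical; this is not the method added in this PR.

```python
from dataclasses import dataclass


@dataclass
class FakeRequest:
    """Hypothetical stand-in for a request already processed by middleware."""

    slug: str  # project slug the middleware resolved from the domain


def unresolve_path_from_request(request, path):
    """Reuse what the middleware resolved instead of parsing the domain again.

    Hypothetical sketch: the project comes from request.slug, so no extra
    domain lookup is needed here.
    """
    if path.endswith("/"):
        path += "index.html"
    return request.slug, path


request = FakeRequest(slug="pip")
print(unresolve_path_from_request(request, "/en/latest/"))
```

The design point is only that data computed once by the middleware is read back from the request rather than recomputed.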

humitos

I don't have the full context of this PR, but the changes look good to me.

humitos · 2020-07-22T09:32:02Z

readthedocs/api/v2/views/footer_views.py

+    - version
+    - page: Sphinx's page name, used for path operations,
+      like change between languages (deprecated in favor of ``origin``).
+    - origin: Full path with domain, used for path operations.


Why don't we call it full_path?

or absolute_uri as Django calls it when it has the domain on it.

I'm avoiding naming this path since it includes the domain as well; we use full path to refer to a path in other parts of the code. Not sure about uri.

ericholscher · 2020-08-26T18:15:31Z

readthedocs/api/v2/views/footer_views.py

+    - version
+    - page: Sphinx's page name, used for path operations,
+      like change between languages (deprecated in favor of ``origin``).
+    - origin: Full path with domain, used for path operations.


Yea, origin isn't the right name, since that's used in HTTP for just the origin hostname: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Origin

I think absolute_uri or similar is best: https://docs.djangoproject.com/en/3.1/ref/request-response/#django.http.HttpRequest.build_absolute_uri
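
For readers unfamiliar with the distinction: the HTTP `Origin` header carries only the scheme, host, and optional port, while an absolute URI also includes the path. A small standard-library sketch of the difference, where the `http_origin` helper name and the example URL are illustrative assumptions:

```python
from urllib.parse import urlsplit


def http_origin(url):
    """Return the scheme://host[:port] part of a URL (illustrative helper)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


absolute_uri = "https://docs.example.com/en/latest/install.html"
print(http_origin(absolute_uri))     # https://docs.example.com
print(urlsplit(absolute_uri).path)   # /en/latest/install.html
```

Since the footer needs the path to work out which page was viewed, a name that implies the full URI describes the parameter more accurately than `origin`.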

readthedocs/core/unresolver.py

ericholscher

This looks right. It will change how the data is stored for users, but that's probably fine?

stsewd · 2020-08-27T15:41:49Z

> It will change how the data is stored for users, but that's probably fine?

Yeah, we will have duplicates from previous pages, like index.html and /, or page and page.html. But that should settle down in a month or so, after the previous page views are deleted.

stsewd requested a review from a team July 15, 2020 00:44

Fix test

c27e4df

ericholscher reviewed Jul 15, 2020

stsewd added 3 commits July 15, 2020 17:38

Add unresolve_from_request

d267903

Use unresolve_from_request

70ad4b8

Update tests

65fcc7c

stsewd added 2 commits July 15, 2020 18:07

Fix tests

2e5aa58

Typo

552becc

stsewd requested a review from ericholscher July 15, 2020 23:16

stsewd mentioned this pull request Jul 16, 2020

Search: weight page views into search results #7297

Closed

stsewd requested a review from a team July 20, 2020 16:50

humitos reviewed Jul 22, 2020

ericholscher reviewed Aug 26, 2020

stsewd added 2 commits August 26, 2020 16:03

Merge branch 'master' into page-views

b726251

Rename origin -> absolute_uri

7a18594

stsewd requested a review from ericholscher August 26, 2020 23:01

ericholscher approved these changes Aug 27, 2020

stsewd merged commit 29a9f7e into master Aug 27, 2020

stsewd deleted the page-views branch August 27, 2020 15:43
