Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Page views: use origin URL instead of page name #7293

Merged
merged 9 commits into from
Aug 27, 2020
Merged

Page views: use origin URL instead of page name #7293

merged 9 commits into from
Aug 27, 2020

Conversation

stsewd
Copy link
Member

@stsewd stsewd commented Jul 15, 2020

Page name is a sphinx only concept,
so we don't have correct data for page views from mkdocs projects.

Using the origin url + the unresolver gives us the correct path.
Also, we are always storing the full path (index.html instead of "/"),
so we don't have duplicates (let me know if we do want to have duplicates).

closes #7131

Page name is a sphinx only concept,
so we don't have correct data for page views from mkdocs projects.

Using the origin url + the unresolver gives us the correct path.
Also, we are always storing the full path (index.html instead of "/"),
so we don't have duplicates (let me know if we do want to have duplicates).
- docroot: Path where all the source documents are.
Used to build the ``edit_on`` URL.
- source_suffix: Suffix from the source document.
Used to build the ``edit_on`` URL.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are passing subproject too, but we are not using it. I can remove it in another PR if that's ok

@stsewd stsewd requested a review from a team July 15, 2020 00:44
@@ -418,7 +419,7 @@ class TestFooterPerformance(APITestCase):

# The expected number of queries for generating the footer
# This shouldn't increase unless we modify the footer API
EXPECTED_QUERIES = 14
EXPECTED_QUERIES = 13
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have one query less bc we are not sending the origin parameter, so it defaults to false before doing a query for the feature flag.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should test this with an origin. I expect the unresolve method to actually add a few queries, which we should be careful about :/

Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good change logically, but I'd like to see how many queries we're really adding. We should test a request with a domain & a subproject or translation, as that will likely add the most queries.

@@ -418,7 +419,7 @@ class TestFooterPerformance(APITestCase):

# The expected number of queries for generating the footer
# This shouldn't increase unless we modify the footer API
EXPECTED_QUERIES = 14
EXPECTED_QUERIES = 13
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should test this with an origin. I expect the unresolve method to actually add a few queries, which we should be careful about :/

if not origin or not project.has_feature(Feature.STORE_PAGEVIEWS):
return

unresolved = unresolve(origin)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is the best approach right now. I think this is going to add a good number of queries to the footer, which is already our most expensive view. If we can just pull the path off the request, I think that is probably best, but maybe calling unresolve is correct, and we should focus on optimizations there?

@stsewd
Copy link
Member Author

stsewd commented Jul 15, 2020

Ok, I added a new method unresolve_from_request, since when we are in proxito (proxied footer) the request is passed through our middlewares that already set the request.slug attribute and everything else that the unresolver needs.

I updated the tests to run the whole request instead of just the footer view, that's why we have more queries. But the unresolver is only adding one query. We may be seeing more queries for translations/subprojects, but that's in general, not only for the unresolver, I'll add more tests later.

@stsewd stsewd requested a review from ericholscher July 15, 2020 23:16
@stsewd stsewd requested a review from a team July 20, 2020 16:50
Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have the full context of this PR but the changes looks good to me.

- version
- page: Sphinx's page name, used for path operations,
like change between languages (deprecated in favor of ``origin``).
- origin: Full path with domain, used for path operations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we call it full_path?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or absolute_uri as Django calls it when it has the domain on it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm avoiding naming this path since it includes the domain as well, we use full path to refer to a path in other parts of the code. Not sure about uri

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, origin isn't the right name, since that's used in HTTP for just the origin hostname: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Origin

I think absolute_uri or similar is best: https://docs.djangoproject.com/en/3.1/ref/request-response/#django.http.HttpRequest.build_absolute_uri

- version
- page: Sphinx's page name, used for path operations,
like change between languages (deprecated in favor of ``origin``).
- origin: Full path with domain, used for path operations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, origin isn't the right name, since that's used in HTTP for just the origin hostname: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Origin

I think absolute_uri or similar is best: https://docs.djangoproject.com/en/3.1/ref/request-response/#django.http.HttpRequest.build_absolute_uri

readthedocs/core/unresolver.py Outdated Show resolved Hide resolved
@stsewd stsewd requested a review from ericholscher August 26, 2020 23:01
Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks right. It will change how the data is stored for users, but thats probably fine?

@stsewd
Copy link
Member Author

stsewd commented Aug 27, 2020

It will change how the data is stored for users, but thats probably fine?

Yeah, we will have duplicates from previous pages, like index.html and /, page and page.html. But it would settle down in a month or so after previous page views are deleted.

@stsewd stsewd merged commit 29a9f7e into master Aug 27, 2020
@stsewd stsewd deleted the page-views branch August 27, 2020 15:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Show full path in traffict analytics
3 participants