Generate a sitemap.xml #557
Comments
Is this feature being worked on? I can work on this if no one has worked on it yet. |
A theme that is doing this with an extension: https://github.com/guzzle/guzzle_sphinx_theme/blob/master/guzzle_sphinx_theme/__init__.py#L30 |
Another interesting approach: https://github.com/openstack/openstack-doc-tools/tree/master/sitemap |
I took the sitemap logic out of guzzle_sphinx_theme and made it an extension/package here: https://github.com/jdillard/sphinx-sitemap It's my first time making a package, but several people are using it successfully and I have it running in a few production environments myself. Some are even using it on RTD, for example:
|
Neat, this is definitely something we could incorporate into the standard build process. |
Great! This has been a project to help me learn new things and I'm very much still learning, so let me know if you need anything from me. |
I'd also support this feature being available in the standard build process, as it might be especially relevant for multilingual RTD projects, see https://en.wikipedia.org/wiki/Sitemaps#Multilingual_and_multinational_Sitemaps Context: For Godot Engine, we recently put up localized RTD instances, most of which are still over 80% English text while translators work on things string by string. Search engines seem to have taken a particular liking to the Ukrainian instance for English queries, which puzzles many users. I hope the sitemap trick mentioned in the above link could fix that. (I'll try jdillard's extension in the meantime) |
@akien-mga I created a PR, jdillard/sphinx-sitemap#15, on my extension that adds support for multilingual sitemaps if you want to test it out and leave feedback there. I don't have much first-hand experience with multilingual Sphinx/RTD setups, so I might have missed some nuances. |
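For readers following along, here is a rough sketch of the multilingual sitemap idea from the Wikipedia link above: each `<url>` entry lists every translation of the page as an `xhtml:link` alternate with an `hreflang` attribute. This is not how sphinx-sitemap or RTD implement it; the function name, the `/<language>/<version>/<page>` URL layout, and the example domain are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def multilingual_urlset(base_url, version, page, languages):
    """Return a <urlset> string with one <url> per language.

    Hypothetical helper: assumes pages live at
    <base_url>/<language>/<version>/<page>.
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")

    # Every translation of the same page, keyed by language code.
    locations = {lang: f"{base_url}/{lang}/{version}/{page}" for lang in languages}

    for lang in languages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = locations[lang]
        # List all translations (including this one) as alternates.
        for alt_lang, href in locations.items():
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                rel="alternate",
                hreflang=alt_lang,
                href=href,
            )
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(multilingual_urlset("https://docs.example.org", "latest", "index.html", ["en", "uk"]))
```

Whether search engines then prefer the right translation for a given query is up to them; the alternates are only a hint.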
@ericholscher What's your idea to accomplish this? I'm thinking on installing the What do you think? Is this the path to follow? |
There are a few challenges with sitemaps. One challenge is that a sitemap is normally at One possibility is to make https://project.readthedocs.io/sitemap.xml a dynamic page which scans the different versions and translations under that domain for |
@davidfischer You can also use a sitemapindex to manage multiple sitemaps. I'm not sure if the RTD build process could create that file (containing links to the sub-sitemaps) and place in the root directory. |
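To make the sitemapindex suggestion concrete, a minimal sketch of building such an index from a list of per-version sitemap URLs might look like the following. The function name and the example URLs are hypothetical, and nothing here reflects how RTD actually discovers or stores the sub-sitemaps.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> string pointing at each sub-sitemap URL."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical per-version sitemaps under one project domain.
    print(
        build_sitemap_index(
            [
                "https://project.readthedocs.io/en/latest/sitemap.xml",
                "https://project.readthedocs.io/en/stable/sitemap.xml",
            ]
        )
    )
```

The index itself would be served from the root of the domain, which is the part the build process (or a dynamic view) would have to own.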
@humitos I didn't realize there was already a |
This is exactly what I'm thinking! It is possible that RTD could dynamically generate the root sitemap rather than creating/updating it when builds happen. |
Just to put all together and continue with the next step. We need to decide,
I personally like the idea of doing all of this automatically, but in that case we need to consider whether some users might not want this for some particular reason (it could also be an option in the admin). |
How about we do a combination of both! Here's my proposal:
I don't think any users will actively not want this so I don't know if being able to disable it is critical in the first implementation. |
I like your proposal, @davidfischer. I think we have something actionable now, and we can implement it. I'd love to see/get/receive a PR for this.
We will need to install a new dependency, which could affect build time (not too much, though) and could bring new issues. That was my only concern, but I think we are fine installing and running this by default. It's a new feature that will benefit all projects and may have a minimal impact on some particular projects (we could add a feature flag if we find problems with it) |
I proposed that we not add the extra Sphinx extension for generating sitemaps by default. I think users should opt in to it. If users choose not to opt in, the sitemap we display would just point to the active versions:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://django.readthedocs.org/en/latest/</loc>
<lastmod>2013-12-01T19:20:30.45+01:00</lastmod>
<changefreq>daily</changefreq>
<priority>1</priority>
</url>
<url>
<loc>http://django.readthedocs.org/en/1.6.x/</loc>
<lastmod>2013-11-30T19:20:30.45+01:00</lastmod>
<changefreq>weekly</changefreq>
<priority>0.9</priority>
</url>
<url>
<loc>http://django.readthedocs.org/en/1.5.x/</loc>
<lastmod>2013-10-03T19:20:30.45+01:00</lastmod>
<changefreq>never</changefreq>
<priority>0.8</priority>
</url>
<url>
<loc>http://django.readthedocs.org/en/0.1.x/</loc>
<lastmod>2013-10-03T19:20:30.45+01:00</lastmod>
<changefreq>never</changefreq>
<priority>0.1</priority>
</url>
</urlset> |
Opt in sounds good (at least for now) -- we should however write a guide about how to enable it, documenting our integration and how users can enable it (once we build the integration :D) |
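As a starting point for such a guide, opting in with the sphinx-sitemap extension mentioned earlier in the thread might look roughly like this in a project's `conf.py`. This is a sketch, not RTD's integration: it assumes the package is installed in the build environment, that the installed version reads Sphinx's `html_baseurl` setting (check the extension's README for your version), and the URL is a placeholder.

```python
# conf.py (sketch, assuming sphinx-sitemap is installed in the build environment)

extensions = [
    "sphinx_sitemap",  # generates sitemap.xml as part of the HTML build
]

# Placeholder base URL for the published docs; replace with your own.
html_baseurl = "https://project.readthedocs.io/en/latest/"
```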
I am on the fence (only slightly!) on making this a Django view. We've been talking more about pushing docs off our servers and to Azure storage and historically served docs entirely from nginx on the community side. However, we could maybe redirect to a .org endpoint in Azure storage (similar to S3 redirects), or could reverse proxy the request to an API endpoint through Nginx. Worst case for an Azure implementation is we could just plop the sitemap index on the storage on any project save. I would however be into making this a more integrated feature, like a |
I created a proposal for this at #5122. This initial version does not allow users to serve their own generated |
If I add |
The PR with the general sitemap.xml generation is about to get merged. However, I want to link this comment from David here, since it's an important one to consider when working on the next phase (sitemap indexes and more) |
This is already implemented in #5122 |
Note: This is about enhancing SEO.
Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling.
The detailed technical specifications are available here.
Why?
Proposals
- priority as:
  - 1 for the pages of the latest or stable version. This option could be set in conf.py.
  - 0.1 less at each version
  - 0.1 for the pages of other versions if there are more than 9 versions
- lastmod
- changefreq as:
  - daily for the pages of the latest version
  - weekly for the pages of the last tag version
  - never for the pages of other versions
Example
Implementation
We currently have logic in the code base for determining version order. We could just subtract .1 from the versions that are supported until we hit 0.1. We could also change the logic for tags and branches, since tags should never change, they can be updated much less frequently.
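The subtract-until-0.1 idea, combined with the changefreq proposal, could be sketched like this. It assumes the versions are already sorted newest first with `latest` at the front, and it treats the second entry as "the last tag version"; both are simplifications, and the function name is made up for illustration rather than taken from the code base.

```python
def sitemap_hints(version_slugs):
    """Return (slug, priority, changefreq) tuples for ordered versions.

    Assumes version_slugs is sorted newest first, with "latest" first.
    """
    hints = []
    for position, slug in enumerate(version_slugs):
        # Work in tenths to avoid float drift: 1.0, 0.9, ... with a floor of 0.1.
        tenths = max(10 - position, 1)
        priority = f"{tenths / 10:.1f}"

        if position == 0:
            changefreq = "daily"    # the latest version changes often
        elif position == 1:
            changefreq = "weekly"   # assumed to be the last tagged version
        else:
            changefreq = "never"    # older versions should not change
        hints.append((slug, priority, changefreq))
    return hints


if __name__ == "__main__":
    for slug, priority, changefreq in sitemap_hints(["latest", "1.6.x", "1.5.x"]):
        print(slug, priority, changefreq)
```

With more than nine older versions, everything past the ninth simply stays at 0.1, which matches the floor described in the proposal.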
Bonus Points